Quick Answer: RTX 5090 offers 32GB VRAM and starts around current market pricing. It delivers approximately 450 tokens/sec on Deepseek AI Deepseek Ocr 2. It typically draws 575W under load.

RTX 5090

Check availability

By NVIDIAReleased 2025-01MSRP $1,999.00

This GPU offers reliable throughput for local AI workloads. Pair it with the right model quantization to hit your desired tokens/sec, and monitor prices below to catch the best deal.

Search on Amazon View Benchmarks

Specs snapshot

Key hardware metrics for AI workloads.

VRAM32GB

Cores21,760

TDP575W

ArchitectureBlackwell

Key Takeaways

32GB VRAM - runs models up to ~80B parameters
Flagship-class compute for maximum throughput
High power draw (575W) - requires robust PSU (850W+ recommended)
Strong price-to-VRAM value

What this means for you

With 32GB VRAM, RTX 5090 can run models up to approximately 80B parameters using 4-bit quantization. This handles most popular models including Llama 3 70B, Mistral 7B, and larger.

Who should buy

Running 70B parameter models at good speeds
Multiple model instances simultaneously
Production deployments on single GPU

Looking to upgrade?

Consider RTX 4090 or RTX 6000 Ada — 24GB Ada offers better efficiency than Ampere.

AI benchmarks

Showing 80 of 80 benchmark rows.

Model	Size	Quantization	Tokens/sec	VRAM used
Deepseek AI Deepseek Ocr 2	Unknown	Q4	449.65 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek Math V2	Unknown	Q4	449.65 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek V2 5	Unknown	Q4	449.65 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek V3	Unknown	Q4	449.65 tok/sEstimated Static estimation (DB-independent)	2GB
Deepseek AI Deepseek Coder V2 Lite Instruct	Unknown	Q4	449.65 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek V3.1	Unknown	Q4	449.65 tok/sEstimated Static estimation (DB-independent)	2GB
Deepseek AI Deepseek Coder 1.3B Instruct	1.3B	Q4	449.65 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek R1 Distill Qwen 1.5B	1.5B	Q4	449.65 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek Ocr	Unknown	Q4	374.71 tok/sEstimated Static estimation (DB-independent)	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bit	8B	Q4	374.71 tok/sEstimated Static estimation (DB-independent)	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bit	8B	Q4	374.71 tok/sEstimated Static estimation (DB-independent)	4GB
Deepseek AI Deepseek R1	Unknown	Q4	374.71 tok/sEstimated Static estimation (DB-independent)	4GB
Deepseek AI Deepseek R1 0528	Unknown	Q4	374.71 tok/sEstimated Static estimation (DB-independent)	4GB
Deepseek AI Deepseek R1 Distill Llama 8B	8B	Q4	374.71 tok/sEstimated Static estimation (DB-independent)	4GB
Deepseek AI Deepseek R1 Distill Qwen 7B	7B	Q4	374.71 tok/sEstimated Static estimation (DB-independent)	4GB
Nineninesix Kani Tts 2 En	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Nanbeige Nanbeige4 1 3B	3B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Minimaxai Minimax M2 5	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Minimaxai Minimax M2 1	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Stepfun AI Step 3 5 Flash	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Qwen Qwen3 Coder Next	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Moonshotai Kimi K2 5	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Xiaomimimo Mimo V2 Flash	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Nari Labs Dia2 2B	2B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Google Embeddinggemma 300M	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Facebook Sam3	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Black Forest Labs Flux 2 Dev	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Moonshotai Kimi K2 Thinking	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Phi 3 5 Mini Instruct	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Meta Llama Llama 3 2 3B Instruct	3B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Qwen Qwen3 1.7B Base	1.7B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Dicta Il Dictalm2.0 Instruct	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2 0.5B Instruct	0.5B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Alibaba Nlp Gte Qwen2 1.5B Instruct	1.5B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Apple Openelm 1 1B Instruct	1B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 3B	3B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Unsloth Gemma 3 1B It	1B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Bigcode Starcoder2 3B	3B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Ibm Granite Granite Docling 258M	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Skt Kogpt2 Base V2	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Google Gemma 3 270M It	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Eleutherai Pythia 70M Deduped	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Vibevoice 1.5B	1.5B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Ibm Granite Granite 3.3 2B Instruct	2B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Google Gemma 2B	2B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Trl Internal Testing Tiny Llamaforcausallm 3.2	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Llamafactory Tiny Random Llama 3	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Unsloth Llama 3.2 1B Instruct	1B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Numind Nuextract 1.5	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Hmellor Tiny Random Llamaforcausallm	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Sshleifer Tiny Gpt2	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Openai Community Gpt2 Xl	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Ibm Research Powermoe 3B	3B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Unsloth Llama 3.2 3B Instruct	3B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Meta Llama Llama 3.2 3B	3B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Eleutherai Gpt Neo 125M	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Meta Llama Llama Guard 3 1B	1B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2 1.5B Instruct	1.5B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Google Gemma 2 2B It	2B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Phi 3.5 Mini Instruct	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Microsoft Phi 3.5 Vision Instruct	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Rinna Japanese Gpt Neox Small	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 Coder 1.5B	1.5B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Dialogpt Small	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen3 0.6B Base	0.6B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Openai Community Gpt2 Medium	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Trl Internal Testing Tiny Random Llamaforcausallm	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 Math 1.5B	1.5B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Huggingfacetb Smollm 135M	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Liquidai Lfm2 1.2B	1.2B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2 0.5B	0.5B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Minimaxai Minimax M2	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Huggingfacetb Smollm2 135M	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Phi 2	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 0.5B	0.5B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 1.5B	1.5B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen3 Reranker 0.6B	0.6B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Google T5 T5 3B	3B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	2GB
Qwen Qwen3 1.7B	1.7B	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB
Openai Community Gpt2 Large	Unknown	Q4	359.72 tok/sEstimated Static estimation (DB-independent)	1GB

Deepseek AI Deepseek Ocr 2

Q4 · Unknown

1GB

449.65 tok/sEstimated