Quick Answer: RTX 4080 Super offers 16GB VRAM and starts around current market pricing. It delivers approximately 190 tokens/sec on Deepseek AI Deepseek Ocr 2. It typically draws 320W under load.

RTX 4080 Super

Check availability

By NVIDIAReleased 2024-01MSRP $999.00

This GPU offers reliable throughput for local AI workloads. Pair it with the right model quantization to hit your desired tokens/sec, and monitor prices below to catch the best deal.

Search on Amazon View Benchmarks

Specs snapshot

Key hardware metrics for AI workloads.

VRAM16GB

Cores10,240

TDP320W

ArchitectureAda Lovelace

Key Takeaways

16GB VRAM - runs models up to ~40B parameters
High-end compute for demanding workloads
Moderate power draw (320W) - 750W PSU typically sufficient
Strong price-to-VRAM value

What this means for you

With 16GB VRAM, RTX 4080 Super can run models up to approximately 40B parameters using 4-bit quantization. This handles most popular models including Llama 3 70B, Mistral 7B, and larger.

Who should buy

Running 13B-34B parameter models
Stable Diffusion XL at high resolution
Enthusiast-level local AI experimentation

Looking to upgrade?

Consider RTX 4090 — Double the VRAM for larger models.

AI benchmarks

Showing 80 of 80 benchmark rows.

Model	Size	Quantization	Tokens/sec	VRAM used
Deepseek AI Deepseek Ocr 2	Unknown	Q4	190.04 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek Math V2	Unknown	Q4	190.04 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek V2 5	Unknown	Q4	190.04 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek V3	Unknown	Q4	190.04 tok/sEstimated Static estimation (DB-independent)	2GB
Deepseek AI Deepseek Coder V2 Lite Instruct	Unknown	Q4	190.04 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek V3.1	Unknown	Q4	190.04 tok/sEstimated Static estimation (DB-independent)	2GB
Deepseek AI Deepseek Coder 1.3B Instruct	1.3B	Q4	190.04 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek R1 Distill Qwen 1.5B	1.5B	Q4	190.04 tok/sEstimated Static estimation (DB-independent)	1GB
Deepseek AI Deepseek Ocr	Unknown	Q4	158.37 tok/sEstimated Static estimation (DB-independent)	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bit	8B	Q4	158.37 tok/sEstimated Static estimation (DB-independent)	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bit	8B	Q4	158.37 tok/sEstimated Static estimation (DB-independent)	4GB
Deepseek AI Deepseek R1	Unknown	Q4	158.37 tok/sEstimated Static estimation (DB-independent)	4GB
Deepseek AI Deepseek R1 0528	Unknown	Q4	158.37 tok/sEstimated Static estimation (DB-independent)	4GB
Deepseek AI Deepseek R1 Distill Llama 8B	8B	Q4	158.37 tok/sEstimated Static estimation (DB-independent)	4GB
Deepseek AI Deepseek R1 Distill Qwen 7B	7B	Q4	158.37 tok/sEstimated Static estimation (DB-independent)	4GB
Nineninesix Kani Tts 2 En	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Nanbeige Nanbeige4 1 3B	3B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Minimaxai Minimax M2 5	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Minimaxai Minimax M2 1	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Stepfun AI Step 3 5 Flash	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Qwen Qwen3 Coder Next	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Moonshotai Kimi K2 5	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Xiaomimimo Mimo V2 Flash	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Nari Labs Dia2 2B	2B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Google Embeddinggemma 300M	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Facebook Sam3	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Black Forest Labs Flux 2 Dev	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Moonshotai Kimi K2 Thinking	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Phi 3 5 Mini Instruct	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Meta Llama Llama 3 2 3B Instruct	3B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Qwen Qwen3 1.7B Base	1.7B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Dicta Il Dictalm2.0 Instruct	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2 0.5B Instruct	0.5B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Alibaba Nlp Gte Qwen2 1.5B Instruct	1.5B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Apple Openelm 1 1B Instruct	1B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 3B	3B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Unsloth Gemma 3 1B It	1B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Bigcode Starcoder2 3B	3B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Ibm Granite Granite Docling 258M	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Skt Kogpt2 Base V2	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Google Gemma 3 270M It	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Eleutherai Pythia 70M Deduped	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Vibevoice 1.5B	1.5B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Ibm Granite Granite 3.3 2B Instruct	2B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Google Gemma 2B	2B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Trl Internal Testing Tiny Llamaforcausallm 3.2	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Llamafactory Tiny Random Llama 3	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Unsloth Llama 3.2 1B Instruct	1B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Numind Nuextract 1.5	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Hmellor Tiny Random Llamaforcausallm	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Sshleifer Tiny Gpt2	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Openai Community Gpt2 Xl	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Ibm Research Powermoe 3B	3B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Unsloth Llama 3.2 3B Instruct	3B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Meta Llama Llama 3.2 3B	3B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Eleutherai Gpt Neo 125M	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Meta Llama Llama Guard 3 1B	1B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2 1.5B Instruct	1.5B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Google Gemma 2 2B It	2B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Phi 3.5 Mini Instruct	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Microsoft Phi 3.5 Vision Instruct	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Rinna Japanese Gpt Neox Small	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 Coder 1.5B	1.5B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Dialogpt Small	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen3 0.6B Base	0.6B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Openai Community Gpt2 Medium	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Trl Internal Testing Tiny Random Llamaforcausallm	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 Math 1.5B	1.5B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Huggingfacetb Smollm 135M	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Liquidai Lfm2 1.2B	1.2B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2 0.5B	0.5B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Minimaxai Minimax M2	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Huggingfacetb Smollm2 135M	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Microsoft Phi 2	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 0.5B	0.5B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen2.5 1.5B	1.5B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Qwen Qwen3 Reranker 0.6B	0.6B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Google T5 T5 3B	3B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	2GB
Qwen Qwen3 1.7B	1.7B	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB
Openai Community Gpt2 Large	Unknown	Q4	152.04 tok/sEstimated Static estimation (DB-independent)	1GB

Deepseek AI Deepseek Ocr 2

Q4 · Unknown

1GB

190.04 tok/sEstimated