Quick Answer: The NVIDIA L40S offers 48GB of VRAM at an MSRP of $10,000. Its estimated throughput is approximately 248 tokens/sec on Deepseek AI Deepseek Ocr 2 at Q4 quantization, and its TDP is 350W.

NVIDIA L40S

By NVIDIA · Released 2023-08 · MSRP $10,000.00

This GPU offers reliable throughput for local AI workloads. Pair it with the right model quantization to hit your desired tokens/sec, and monitor prices below to catch the best deal.

Specs snapshot

Key hardware metrics for AI workloads.

VRAM: 48GB
CUDA cores: 18,176
TDP: 350W
Architecture: Ada Lovelace

Key Takeaways

48GB VRAM - fits roughly 70B-class models at 4-bit quantization
Flagship-class compute for maximum throughput
Moderate power draw (350W) - 750W PSU typically sufficient
Strong price-to-VRAM value
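
The PSU guidance above can be sanity-checked with simple arithmetic. This is a rough sketch, assuming a hypothetical 200W for the rest of the system (CPU, drives, fans) and a target of keeping sustained load at or below 80% of the PSU rating. Both figures and the function name are illustrative assumptions, not values from this page.

```python
import math


def min_psu_watts(gpu_tdp_w: float, rest_of_system_w: float, max_load_fraction: float = 0.8) -> int:
    """Smallest PSU rating (in watts) that keeps sustained load under the target fraction.

    rest_of_system_w and max_load_fraction are assumptions chosen for illustration.
    """
    if not 0 < max_load_fraction <= 1:
        raise ValueError("max_load_fraction must be in (0, 1]")
    return math.ceil((gpu_tdp_w + rest_of_system_w) / max_load_fraction)


if __name__ == "__main__":
    needed = min_psu_watts(gpu_tdp_w=350, rest_of_system_w=200)
    print(f"Minimum PSU rating: {needed} W")
    print("750 W PSU sufficient:", needed <= 750)
```

A system with a hungrier CPU or several drives would raise `rest_of_system_w` and may push the requirement past 750W.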

What this means for you

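With 48GB of VRAM, the NVIDIA L40S can hold models of roughly 70B parameters at 4-bit quantization on a single card. 4-bit weights take about 0.5 bytes per parameter, so a 70B model needs about 35GB for weights, and the remainder is left for the KV cache and runtime overhead. That covers popular models such as Llama 3 70B and anything smaller, such as Mistral 7B.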
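
A rough way to check whether a given model fits in 48GB is to multiply the parameter count by the bytes per weight and add headroom for the KV cache and runtime buffers. The sketch below is a minimal estimate, not a measurement: the function names are hypothetical and the 20% overhead allowance is an assumption that varies with context length and inference runtime.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float = 48.0, overhead_fraction: float = 0.2) -> bool:
    """True if weights plus an assumed overhead allowance fit in vram_gb.

    overhead_fraction (KV cache, activations, runtime buffers) is an assumption.
    """
    needed_gb = weight_memory_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed_gb <= vram_gb


if __name__ == "__main__":
    for params, bits in [(7, 4), (70, 4), (70, 16)]:
        weights = weight_memory_gb(params, bits)
        verdict = "fits" if fits_in_vram(params, bits) else "does not fit"
        print(f"{params}B at {bits}-bit: ~{weights:.1f} GB weights, {verdict} in 48 GB")
```

Long context windows grow the KV cache well beyond a flat percentage, so treat a narrow pass as a sign to test on the actual hardware.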

Who should buy

Professional AI workloads requiring maximum VRAM
Running 70B-class models with 4-bit quantization on a single card

Looking to upgrade?

Consider the H100 or MI300X for more VRAM in enterprise workloads.

AI benchmarks

Showing 80 of 80 benchmark rows.

Model	Size	Quantization	Tokens/sec (static estimate, DB-independent)	VRAM used
Deepseek AI Deepseek Ocr 2	Unknown	Q4	248.45 tok/s	1GB
Deepseek AI Deepseek Math V2	Unknown	Q4	248.45 tok/s	1GB
Deepseek AI Deepseek V2 5	Unknown	Q4	248.45 tok/s	1GB
Deepseek AI Deepseek V3	Unknown	Q4	248.45 tok/s	2GB
Deepseek AI Deepseek Coder V2 Lite Instruct	Unknown	Q4	248.45 tok/s	1GB
Deepseek AI Deepseek V3.1	Unknown	Q4	248.45 tok/s	2GB
Deepseek AI Deepseek Coder 1.3B Instruct	1.3B	Q4	248.45 tok/s	1GB
Deepseek AI Deepseek R1 Distill Qwen 1.5B	1.5B	Q4	248.45 tok/s	1GB
Deepseek AI Deepseek Ocr	Unknown	Q4	207.05 tok/s	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bit	8B	Q4	207.05 tok/s	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bit	8B	Q4	207.05 tok/s	4GB
Deepseek AI Deepseek R1	Unknown	Q4	207.05 tok/s	4GB
Deepseek AI Deepseek R1 0528	Unknown	Q4	207.05 tok/s	4GB
Deepseek AI Deepseek R1 Distill Llama 8B	8B	Q4	207.05 tok/s	4GB
Deepseek AI Deepseek R1 Distill Qwen 7B	7B	Q4	207.05 tok/s	4GB
Nineninesix Kani Tts 2 En	Unknown	Q4	198.76 tok/s	1GB
Nanbeige Nanbeige4 1 3B	3B	Q4	198.76 tok/s	2GB
Minimaxai Minimax M2 5	Unknown	Q4	198.76 tok/s	1GB
Minimaxai Minimax M2 1	Unknown	Q4	198.76 tok/s	1GB
Stepfun AI Step 3 5 Flash	Unknown	Q4	198.76 tok/s	2GB
Qwen Qwen3 Coder Next	Unknown	Q4	198.76 tok/s	2GB
Moonshotai Kimi K2 5	Unknown	Q4	198.76 tok/s	1GB
Xiaomimimo Mimo V2 Flash	Unknown	Q4	198.76 tok/s	1GB
Nari Labs Dia2 2B	2B	Q4	198.76 tok/s	1GB
Google Embeddinggemma 300M	Unknown	Q4	198.76 tok/s	1GB
Facebook Sam3	Unknown	Q4	198.76 tok/s	2GB
Black Forest Labs Flux 2 Dev	Unknown	Q4	198.76 tok/s	1GB
Moonshotai Kimi K2 Thinking	Unknown	Q4	198.76 tok/s	1GB
Microsoft Phi 3 5 Mini Instruct	Unknown	Q4	198.76 tok/s	2GB
Meta Llama Llama 3 2 3B Instruct	3B	Q4	198.76 tok/s	2GB
Qwen Qwen3 1.7B Base	1.7B	Q4	198.76 tok/s	1GB
Dicta Il Dictalm2.0 Instruct	Unknown	Q4	198.76 tok/s	1GB
Qwen Qwen2 0.5B Instruct	0.5B	Q4	198.76 tok/s	1GB
Alibaba Nlp Gte Qwen2 1.5B Instruct	1.5B	Q4	198.76 tok/s	1GB
Apple Openelm 1 1B Instruct	1B	Q4	198.76 tok/s	1GB
Qwen Qwen2.5 3B	3B	Q4	198.76 tok/s	2GB
Unsloth Gemma 3 1B It	1B	Q4	198.76 tok/s	1GB
Bigcode Starcoder2 3B	3B	Q4	198.76 tok/s	2GB
Ibm Granite Granite Docling 258M	Unknown	Q4	198.76 tok/s	1GB
Skt Kogpt2 Base V2	Unknown	Q4	198.76 tok/s	1GB
Google Gemma 3 270M It	Unknown	Q4	198.76 tok/s	1GB
Eleutherai Pythia 70M Deduped	Unknown	Q4	198.76 tok/s	1GB
Microsoft Vibevoice 1.5B	1.5B	Q4	198.76 tok/s	1GB
Ibm Granite Granite 3.3 2B Instruct	2B	Q4	198.76 tok/s	1GB
Google Gemma 2B	2B	Q4	198.76 tok/s	1GB
Trl Internal Testing Tiny Llamaforcausallm 3.2	Unknown	Q4	198.76 tok/s	2GB
Llamafactory Tiny Random Llama 3	Unknown	Q4	198.76 tok/s	2GB
Unsloth Llama 3.2 1B Instruct	1B	Q4	198.76 tok/s	1GB
Numind Nuextract 1.5	Unknown	Q4	198.76 tok/s	1GB
Hmellor Tiny Random Llamaforcausallm	Unknown	Q4	198.76 tok/s	1GB
Sshleifer Tiny Gpt2	Unknown	Q4	198.76 tok/s	1GB
Openai Community Gpt2 Xl	Unknown	Q4	198.76 tok/s	1GB
Ibm Research Powermoe 3B	3B	Q4	198.76 tok/s	2GB
Unsloth Llama 3.2 3B Instruct	3B	Q4	198.76 tok/s	2GB
Meta Llama Llama 3.2 3B	3B	Q4	198.76 tok/s	2GB
Eleutherai Gpt Neo 125M	Unknown	Q4	198.76 tok/s	1GB
Meta Llama Llama Guard 3 1B	1B	Q4	198.76 tok/s	1GB
Qwen Qwen2 1.5B Instruct	1.5B	Q4	198.76 tok/s	1GB
Google Gemma 2 2B It	2B	Q4	198.76 tok/s	1GB
Microsoft Phi 3.5 Mini Instruct	Unknown	Q4	198.76 tok/s	2GB
Microsoft Phi 3.5 Vision Instruct	Unknown	Q4	198.76 tok/s	2GB
Rinna Japanese Gpt Neox Small	Unknown	Q4	198.76 tok/s	1GB
Qwen Qwen2.5 Coder 1.5B	1.5B	Q4	198.76 tok/s	1GB
Microsoft Dialogpt Small	Unknown	Q4	198.76 tok/s	1GB
Qwen Qwen3 0.6B Base	0.6B	Q4	198.76 tok/s	1GB
Openai Community Gpt2 Medium	Unknown	Q4	198.76 tok/s	1GB
Trl Internal Testing Tiny Random Llamaforcausallm	Unknown	Q4	198.76 tok/s	1GB
Qwen Qwen2.5 Math 1.5B	1.5B	Q4	198.76 tok/s	1GB
Huggingfacetb Smollm 135M	Unknown	Q4	198.76 tok/s	1GB
Liquidai Lfm2 1.2B	1.2B	Q4	198.76 tok/s	1GB
Qwen Qwen2 0.5B	0.5B	Q4	198.76 tok/s	1GB
Minimaxai Minimax M2	Unknown	Q4	198.76 tok/s	1GB
Huggingfacetb Smollm2 135M	Unknown	Q4	198.76 tok/s	1GB
Microsoft Phi 2	Unknown	Q4	198.76 tok/s	1GB
Qwen Qwen2.5 0.5B	0.5B	Q4	198.76 tok/s	1GB
Qwen Qwen2.5 1.5B	1.5B	Q4	198.76 tok/s	1GB
Qwen Qwen3 Reranker 0.6B	0.6B	Q4	198.76 tok/s	1GB
Google T5 T5 3B	3B	Q4	198.76 tok/s	2GB
Qwen Qwen3 1.7B	1.7B	Q4	198.76 tok/s	1GB
Openai Community Gpt2 Large	Unknown	Q4	198.76 tok/s	1GB
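
The tokens/sec figures in this table are static estimates rather than measured runs. One common rule of thumb for single-stream decoding, which may or may not be how this page derives its numbers, is that each generated token reads every weight once, so throughput is bounded above by memory bandwidth divided by the size of the weights. The sketch below assumes a memory bandwidth of 864 GB/s for the L40S (taken from NVIDIA's public datasheet, not from this page); the function name is hypothetical.

```python
# Assumption: L40S memory bandwidth from NVIDIA's public datasheet, not from this page.
L40S_BANDWIDTH_GB_S = 864.0


def decode_tokens_per_sec_upper_bound(weights_gb: float,
                                      bandwidth_gb_s: float = L40S_BANDWIDTH_GB_S) -> float:
    """Bandwidth-bound ceiling on single-stream decode speed.

    Ignores KV-cache reads, compute limits, and kernel overhead, so real
    throughput will be lower than this figure.
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # Weight sizes in GB, e.g. 35 GB for a 70B model at 4-bit quantization.
    for weights_gb in (1, 2, 4, 35):
        bound = decode_tokens_per_sec_upper_bound(weights_gb)
        print(f"{weights_gb:>3} GB of weights: at most ~{bound:.0f} tok/s")
```

Batched serving can exceed this per-card figure in aggregate because several requests share each pass over the weights, so the bound applies to one stream at a time.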
