Quick Answer: The NVIDIA H200 SXM 141GB offers 141GB of VRAM at a listed MSRP of $35,000. It is estimated to deliver approximately 1,034 tokens/sec on Deepseek AI Deepseek Coder 1.3B Instruct at Q4, and it typically draws 700W under load.

NVIDIA H200 SXM 141GB

By NVIDIA · Released 2023-11 · MSRP $35,000.00

This GPU offers reliable throughput for local AI workloads. Pair it with a quantization level that fits your model within 141GB to hit your target tokens/sec.

Specs snapshot

Key hardware metrics for AI workloads.

VRAM: 141GB

Cores: 16,896

TDP: 700W

Architecture: Hopper

Key Takeaways

141GB VRAM - runs models up to ~352B parameters at about 3.2 bits per weight, or ~282B at a strict 4 bits (weights only)
Flagship-class compute for maximum throughput
High power draw (700W) - SXM modules are installed in HGX server platforms, so size power and cooling at the system level rather than around a desktop PSU
About $248 per GB of VRAM at the $35,000 MSRP

What this means for you

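With 141GB VRAM, the NVIDIA H200 SXM 141GB can hold the weights of models up to approximately 282B parameters at a strict 4 bits per weight (0.5 bytes per parameter); the ~352B figure requires about 3.2 bits per weight. Either way, leave headroom for the KV cache and activations. This covers most popular models, including Mistral 7B and Llama 3 70B, with room for larger ones.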
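The capacity arithmetic can be checked with a short sketch. The function names and the `overhead_fraction` parameter are illustrative assumptions, not part of any benchmark tool, and 1GB is treated as 10^9 bytes. The result is a weights-only ceiling; actual limits depend on the runtime, context length, and quantization format.

```python
def max_params_billion(vram_gb: float, bits_per_weight: float,
                       overhead_fraction: float = 0.0) -> float:
    """Largest parameter count (in billions) whose weights fit in vram_gb.

    overhead_fraction is a hypothetical share of VRAM reserved for the
    KV cache, activations, and runtime buffers (an assumption, not a spec).
    """
    usable_gb = vram_gb * (1.0 - overhead_fraction)
    bytes_per_param = bits_per_weight / 8.0
    # GB divided by bytes-per-parameter gives billions of parameters.
    return usable_gb / bytes_per_param


def implied_bits_per_weight(vram_gb: float, params_billion: float) -> float:
    """Bits per weight needed for params_billion parameters to fill vram_gb."""
    return vram_gb / params_billion * 8.0


if __name__ == "__main__":
    print(f"4-bit, weights only: {max_params_billion(141, 4):.0f}B")
    print(f"4-bit, 20% reserved: {max_params_billion(141, 4, 0.2):.1f}B")
    print(f"Bits/weight for 352B: {implied_bits_per_weight(141, 352):.2f}")
```

Reserving a share of memory through `overhead_fraction` lowers the ceiling further, which is why a model near the limit may load but fail at longer context lengths.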

Who should buy

Professional AI workloads requiring maximum VRAM
Running 100B+ parameter models quantized to 8-bit or lower on a single GPU (FP16 weights for a 100B model need about 200GB)
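As a rough check on which precisions fit in 141GB, the sketch below computes weight size per precision. The `BYTES_PER_PARAM` table and function names are assumptions for illustration; real Q4 formats carry extra scale metadata, so treating Q4 as exactly 0.5 bytes per parameter is slightly optimistic, and the KV cache is not counted.

```python
# Approximate storage cost per parameter (assumed values, weights only).
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "Q4": 0.5}


def weights_gb(params_billion: float, precision: str) -> float:
    """Approximate weight size in GB (1GB = 10^9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits(params_billion: float, precision: str, vram_gb: float = 141.0) -> bool:
    """True if the weights alone fit within vram_gb."""
    return weights_gb(params_billion, precision) <= vram_gb


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        size = weights_gb(100, precision)
        verdict = "fits" if fits(100, precision) else "does not fit"
        print(f"100B @ {precision}: {size:.0f}GB, {verdict} in 141GB")
```

Under these assumptions, a 100B model fits at INT8 and Q4 but not at FP16 or FP32, while a 70B model at FP16 (about 140GB) fits only with almost no headroom.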

Looking to upgrade?

Consider the MI300X (192GB) for more VRAM per GPU. The H100 (80GB) is the earlier Hopper part and offers less memory than the H200.

AI benchmarks

Showing 80 of 80 benchmark rows.

Model	Size	Quantization	Tokens/sec (static estimate, DB-independent)	VRAM used
Deepseek AI Deepseek Coder 1.3B Instruct	1.3B	Q4	1033.90 tok/s	1GB
Deepseek AI Deepseek Coder V2 Lite Instruct	Unknown	Q4	1033.90 tok/s	1GB
Deepseek AI Deepseek Math V2	Unknown	Q4	1033.90 tok/s	1GB
Deepseek AI Deepseek Ocr 2	Unknown	Q4	1033.90 tok/s	1GB
Deepseek AI Deepseek R1 Distill Qwen 1.5B	1.5B	Q4	1033.90 tok/s	1GB
Deepseek AI Deepseek V2 5	Unknown	Q4	1033.90 tok/s	1GB
Deepseek AI Deepseek V3	Unknown	Q4	1033.90 tok/s	2GB
Deepseek AI Deepseek V3.1	Unknown	Q4	1033.90 tok/s	2GB
Deepseek AI Deepseek Ocr	Unknown	Q4	861.58 tok/s	4GB
Deepseek AI Deepseek R1	Unknown	Q4	861.58 tok/s	4GB
Deepseek AI Deepseek R1 0528	Unknown	Q4	861.58 tok/s	4GB
Deepseek AI Deepseek R1 Distill Llama 8B	8B	Q4	861.58 tok/s	4GB
Deepseek AI Deepseek R1 Distill Qwen 7B	7B	Q4	861.58 tok/s	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bit	8B	Q4	861.58 tok/s	4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bit	8B	Q4	861.58 tok/s	4GB
Alibaba Nlp Gte Qwen2 1.5B Instruct	1.5B	Q4	827.12 tok/s	1GB
Allenai Olmo 2 0425 1B	1B	Q4	827.12 tok/s	1GB
Apple Openelm 1 1B Instruct	1B	Q4	827.12 tok/s	1GB
Bigcode Starcoder2 3B	3B	Q4	827.12 tok/s	2GB
Bigscience Bloomz 560M	Unknown	Q4	827.12 tok/s	1GB
Black Forest Labs Flux 2 Dev	Unknown	Q4	827.12 tok/s	1GB
Context Labs Meta Llama Llama 3.2 3B Instruct FP16	3B	Q4	827.12 tok/s	2GB
Dicta Il Dictalm2.0 Instruct	Unknown	Q4	827.12 tok/s	1GB
Distilbert Distilgpt2	Unknown	Q4	827.12 tok/s	1GB
Eleutherai Gpt Neo 125M	Unknown	Q4	827.12 tok/s	1GB
Eleutherai Pythia 70M Deduped	Unknown	Q4	827.12 tok/s	1GB
Facebook Opt 125M	Unknown	Q4	827.12 tok/s	1GB
Facebook Sam3	Unknown	Q4	827.12 tok/s	2GB
Gensyn Qwen2.5 0.5B Instruct	0.5B	Q4	827.12 tok/s	1GB
Google Embeddinggemma 300M	Unknown	Q4	827.12 tok/s	1GB
Google Gemma 2 2B It	2B	Q4	827.12 tok/s	1GB
Google Gemma 2B	2B	Q4	827.12 tok/s	1GB
Google Gemma 3 1B It	1B	Q4	827.12 tok/s	1GB
Google Gemma 3 270M It	Unknown	Q4	827.12 tok/s	1GB
Google T5 T5 3B	3B	Q4	827.12 tok/s	2GB
Hmellor Tiny Random Llamaforcausallm	Unknown	Q4	827.12 tok/s	1GB
Huggingfacetb Smollm 135M	Unknown	Q4	827.12 tok/s	1GB
Huggingfacetb Smollm2 135M	Unknown	Q4	827.12 tok/s	1GB
Ibm Granite Granite 3.3 2B Instruct	2B	Q4	827.12 tok/s	1GB
Ibm Granite Granite Docling 258M	Unknown	Q4	827.12 tok/s	1GB
Ibm Research Powermoe 3B	3B	Q4	827.12 tok/s	2GB
Inference Net Schematron 3B	3B	Q4	827.12 tok/s	2GB
Liquidai Lfm2 1.2B	1.2B	Q4	827.12 tok/s	1GB
Llamafactory Tiny Random Llama 3	Unknown	Q4	827.12 tok/s	2GB
Meta Llama Llama 3 2 3B Instruct	3B	Q4	827.12 tok/s	2GB
Meta Llama Llama 3.2 1B	1B	Q4	827.12 tok/s	1GB
Meta Llama Llama 3.2 1B Instruct	1B	Q4	827.12 tok/s	1GB
Meta Llama Llama 3.2 3B	3B	Q4	827.12 tok/s	2GB
Meta Llama Llama 3.2 3B Instruct	3B	Q4	827.12 tok/s	2GB
Meta Llama Llama Guard 3 1B	1B	Q4	827.12 tok/s	1GB
Microsoft Dialogpt Small	Unknown	Q4	827.12 tok/s	1GB
Microsoft Phi 2	Unknown	Q4	827.12 tok/s	1GB
Microsoft Phi 3 5 Mini Instruct	Unknown	Q4	827.12 tok/s	2GB
Microsoft Phi 3.5 Mini Instruct	Unknown	Q4	827.12 tok/s	2GB
Microsoft Phi 3.5 Vision Instruct	Unknown	Q4	827.12 tok/s	2GB
Microsoft Vibevoice 1.5B	1.5B	Q4	827.12 tok/s	1GB
Minimaxai Minimax M2	Unknown	Q4	827.12 tok/s	1GB
Minimaxai Minimax M2 1	Unknown	Q4	827.12 tok/s	1GB
Minimaxai Minimax M2 5	Unknown	Q4	827.12 tok/s	1GB
Moonshotai Kimi K2 5	Unknown	Q4	827.12 tok/s	1GB
Moonshotai Kimi K2 Thinking	Unknown	Q4	827.12 tok/s	1GB
Nanbeige Nanbeige4 1 3B	3B	Q4	827.12 tok/s	2GB
Nari Labs Dia2 2B	2B	Q4	827.12 tok/s	1GB
Nineninesix Kani Tts 2 En	Unknown	Q4	827.12 tok/s	1GB
Numind Nuextract 1.5	Unknown	Q4	827.12 tok/s	1GB
Openai Community Gpt2	Unknown	Q4	827.12 tok/s	1GB
Openai Community Gpt2 Large	Unknown	Q4	827.12 tok/s	1GB
Openai Community Gpt2 Medium	Unknown	Q4	827.12 tok/s	1GB
Openai Community Gpt2 Xl	Unknown	Q4	827.12 tok/s	1GB
Petals Team Stablebeluga2	Unknown	Q4	827.12 tok/s	1GB
Qwen Qwen2 0.5B	0.5B	Q4	827.12 tok/s	1GB
Qwen Qwen2 0.5B Instruct	0.5B	Q4	827.12 tok/s	1GB
Qwen Qwen2 1.5B Instruct	1.5B	Q4	827.12 tok/s	1GB
Qwen Qwen2.5 0.5B	0.5B	Q4	827.12 tok/s	1GB
Qwen Qwen2.5 0.5B Instruct	0.5B	Q4	827.12 tok/s	1GB
Qwen Qwen2.5 1.5B	1.5B	Q4	827.12 tok/s	1GB
Qwen Qwen2.5 1.5B Instruct	1.5B	Q4	827.12 tok/s	1GB
Qwen Qwen2.5 3B	3B	Q4	827.12 tok/s	2GB
Qwen Qwen2.5 3B Instruct	3B	Q4	827.12 tok/s	2GB
Qwen Qwen2.5 Coder 1.5B	1.5B	Q4	827.12 tok/s	1GB
