Quick Answer: The RTX 3080 offers 10GB of VRAM and starts around $449.99. It delivers an estimated 166 tokens/sec on google-bert/bert-base-uncased at Q4 quantization. It typically draws 320W under load.

RTX 3080

In Stock

By NVIDIA · Released 2020-09 · MSRP $699.00

This GPU offers reliable throughput for local AI workloads. Pair it with the right model quantization to hit your desired tokens/sec, and monitor prices below to catch the best deal.

Specs snapshot

Key hardware metrics for AI workloads.

VRAM: 10GB

Cores: 8,704

TDP: 320W

Architecture: Ampere

Where to Buy

Buy directly on Amazon with fast shipping and reliable customer service.

Amazon: In Stock, $449.99

💡 Not ready to buy? Try cloud GPUs first

Test RTX 3080 performance in the cloud before investing in hardware. Pay by the hour with no commitment.

Vast.ai: from $0.20/hr · RunPod: from $0.30/hr · Lambda Labs: enterprise-grade
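To weigh renting against buying, a rough break-even calculation can help. The sketch below divides the listed $449.99 purchase price by the hourly saving from owning the card. The $0.20/hr rate is the "from" price listed above and may not match an actual RTX 3080 instance. The electricity price of $0.15/kWh is purely an assumption for illustration, as is the function name.

```python
def break_even_hours(purchase_price, rental_rate_per_hour,
                     board_power_watts=0.0, electricity_price_per_kwh=0.0):
    """Hours of use after which buying costs less than renting.

    Electricity is the only running cost modelled for the owned card;
    resale value, the rest of the PC, and idle time are ignored.
    """
    # Hourly cost of running the owned card: kW multiplied by $/kWh.
    own_cost_per_hour = (board_power_watts / 1000.0) * electricity_price_per_kwh
    saving_per_hour = rental_rate_per_hour - own_cost_per_hour
    if saving_per_hour <= 0:
        # Renting is never more expensive per hour, so buying never pays off.
        return float("inf")
    return purchase_price / saving_per_hour


# Listed purchase price vs. the cheapest listed rental rate, ignoring power.
hours_no_power = break_even_hours(449.99, 0.20)

# Same comparison with 320 W draw and an ASSUMED $0.15/kWh electricity price.
hours_with_power = break_even_hours(449.99, 0.20, 320, 0.15)

print(f"Break-even without power cost: {hours_no_power:.0f} h")
print(f"Break-even with power cost:    {hours_with_power:.0f} h")
```

If expected usage is well below the break-even figure, renting by the hour is likely the cheaper way to evaluate the card.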

AI benchmarks

Model	Quantization	Tokens/sec (estimated, auto-generated)	VRAM used
google-bert/bert-base-uncased	Q4	165.90 tok/s	1GB
context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16	Q4	164.92 tok/s	2GB
meta-llama/Llama-3.2-3B-Instruct	Q4	164.59 tok/s	2GB
ibm-granite/granite-3.3-2b-instruct	Q4	162.95 tok/s	1GB
deepseek-ai/DeepSeek-OCR	Q4	162.84 tok/s	2GB
ibm-research/PowerMoE-3b	Q4	162.03 tok/s	2GB
unsloth/Llama-3.2-3B-Instruct	Q4	161.10 tok/s	2GB
nari-labs/Dia2-2B	Q4	160.77 tok/s	2GB
inference-net/Schematron-3B	Q4	160.41 tok/s	2GB
Qwen/Qwen2.5-3B	Q4	159.97 tok/s	2GB
WeiboAI/VibeThinker-1.5B	Q4	158.73 tok/s	1GB
bigcode/starcoder2-3b	Q4	158.60 tok/s	2GB
meta-llama/Llama-3.2-1B-Instruct	Q4	157.16 tok/s	1GB
TinyLlama/TinyLlama-1.1B-Chat-v1.0	Q4	155.42 tok/s	1GB
deepseek-ai/deepseek-coder-1.3b-instruct	Q4	154.90 tok/s	2GB
tencent/HunyuanOCR	Q4	154.89 tok/s	1GB
apple/OpenELM-1_1B-Instruct	Q4	152.45 tok/s	1GB
facebook/sam3	Q4	151.35 tok/s	1GB
meta-llama/Llama-Guard-3-1B	Q4	150.35 tok/s	1GB
google/embeddinggemma-300m	Q4	149.94 tok/s	1GB
google/gemma-2-2b-it	Q4	145.74 tok/s	1GB
unsloth/gemma-3-1b-it	Q4	145.07 tok/s	1GB
meta-llama/Llama-3.2-1B	Q4	144.41 tok/s	1GB
google-t5/t5-3b	Q4	143.36 tok/s	2GB
allenai/OLMo-2-0425-1B	Q4	143.24 tok/s	1GB
meta-llama/Llama-3.2-3B	Q4	142.91 tok/s	2GB
Qwen/Qwen2.5-3B-Instruct	Q4	140.40 tok/s	2GB
unsloth/Llama-3.2-1B-Instruct	Q4	140.09 tok/s	1GB
google/gemma-2b	Q4	140.07 tok/s	1GB
google/gemma-3-1b-it	Q4	138.73 tok/s	1GB
Qwen/Qwen2.5-1.5B-Instruct	Q4	138.08 tok/s	3GB
deepseek-ai/DeepSeek-V3-0324	Q4	137.90 tok/s	4GB
LiquidAI/LFM2-1.2B	Q4	137.86 tok/s	1GB
vikhyatk/moondream2	Q4	137.37 tok/s	4GB
GSAI-ML/LLaDA-8B-Instruct	Q4	137.31 tok/s	4GB
zai-org/GLM-4.6-FP8	Q4	137.03 tok/s	4GB
Alibaba-NLP/gte-Qwen2-1.5B-instruct	Q4	136.97 tok/s	3GB
Qwen/Qwen2-1.5B-Instruct	Q4	136.47 tok/s	3GB
MiniMaxAI/MiniMax-M2	Q4	135.86 tok/s	4GB
Qwen/Qwen2.5-7B-Instruct	Q4	135.09 tok/s	4GB
deepseek-ai/DeepSeek-V3	Q4	135.03 tok/s	4GB
meta-llama/Llama-3.1-8B-Instruct	Q4	135.02 tok/s	4GB
dicta-il/dictalm2.0-instruct	Q4	134.86 tok/s	4GB
microsoft/phi-4	Q4	134.85 tok/s	4GB
trl-internal-testing/tiny-Qwen2ForCausalLM-2.5	Q4	134.44 tok/s	4GB
HuggingFaceTB/SmolLM-135M	Q4	134.06 tok/s	4GB
Qwen/Qwen2.5-0.5B	Q4	133.77 tok/s	3GB
tencent/HunyuanVideo-1.5	Q4	133.57 tok/s	4GB
unsloth/Meta-Llama-3.1-8B-Instruct	Q4	133.52 tok/s	4GB
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct	Q4	132.94 tok/s	4GB

Note: Performance figures are calculated estimates, not measured benchmarks. Real results may vary.
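Because the figures above are calculated rather than measured, it can be worth timing a model on your own card. The following is a minimal sketch of a timing harness, not the methodology used for this page. `measure_tokens_per_second` and `fake_generate` are hypothetical names; in practice you would pass a function that streams tokens from whichever runtime you use.

```python
import time
from typing import Callable, Iterable


def measure_tokens_per_second(generate: Callable[[], Iterable[object]],
                              clock: Callable[[], float] = time.perf_counter) -> float:
    """Time one generation run and return tokens per second.

    `generate` must return an iterable that yields one item per token.
    The measurement includes prompt processing, so short runs will
    understate steady-state decoding speed.
    """
    start = clock()
    token_count = sum(1 for _ in generate())  # consume the whole stream
    elapsed = clock() - start
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return token_count / elapsed


def fake_generate():
    """Stand-in for a real model: yields 50 'tokens' with a small delay."""
    for i in range(50):
        time.sleep(0.001)
        yield i


print(f"{measure_tokens_per_second(fake_generate):.1f} tok/s (stand-in generator)")
```

Running the same prompt several times and averaging the results gives a steadier number than a single run.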

Model compatibility

Model	Quantization	Verdict	Estimated speed	VRAM needed
meta-llama/Meta-Llama-3-8B-Instruct	Q4	Fits comfortably	123.79 tok/s	4GB (have 10GB)
Qwen/Qwen2.5-1.5B	Q4	Fits comfortably	121.15 tok/s	3GB (have 10GB)
meta-llama/Meta-Llama-3-8B	Q4	Fits comfortably	122.26 tok/s	4GB (have 10GB)
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B	Q8	Fits comfortably	84.50 tok/s	7GB (have 10GB)
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B	FP16	Not supported	49.55 tok/s	15GB (have 10GB)
deepseek-ai/DeepSeek-R1-Distill-Llama-8B	Q8	Fits (tight)	92.11 tok/s	9GB (have 10GB)
ibm-research/PowerMoE-3b	Q8	Fits comfortably	109.87 tok/s	3GB (have 10GB)
ibm-research/PowerMoE-3b	FP16	Fits comfortably	57.52 tok/s	6GB (have 10GB)
lmstudio-community/Qwen3-4B-Thinking-2507-MLX-6bit	Q4	Fits comfortably	124.62 tok/s	2GB (have 10GB)
lmstudio-community/Qwen3-4B-Thinking-2507-MLX-6bit	Q8	Fits comfortably	94.99 tok/s	4GB (have 10GB)
Qwen/Qwen3-1.7B-Base	FP16	Not supported	43.21 tok/s	15GB (have 10GB)
baichuan-inc/Baichuan-M2-32B	Q8	Not supported	29.86 tok/s	33GB (have 10GB)
baichuan-inc/Baichuan-M2-32B	FP16	Not supported	16.22 tok/s	66GB (have 10GB)
ai-forever/ruGPT-3.5-13B	Q4	Fits comfortably	103.60 tok/s	7GB (have 10GB)
tencent/HunyuanOCR	Q4	Fits comfortably	154.89 tok/s	1GB (have 10GB)
tencent/HunyuanOCR	Q8	Fits comfortably	95.41 tok/s	2GB (have 10GB)
tencent/HunyuanOCR	FP16	Fits comfortably	58.12 tok/s	3GB (have 10GB)
facebook/sam3	Q4	Fits comfortably	151.35 tok/s	1GB (have 10GB)
facebook/sam3	Q8	Fits comfortably	105.55 tok/s	1GB (have 10GB)
facebook/sam3	FP16	Fits comfortably	58.82 tok/s	2GB (have 10GB)
Qwen/Qwen-Image-Edit-2509	Q4	Fits comfortably	115.89 tok/s	4GB (have 10GB)
google-bert/bert-base-uncased	Q8	Fits comfortably	112.65 tok/s	1GB (have 10GB)
google-bert/bert-base-uncased	FP16	Fits comfortably	56.55 tok/s	1GB (have 10GB)
MiniMaxAI/MiniMax-VL-01	Q4	Not supported	14.82 tok/s	256GB (have 10GB)
baichuan-inc/Baichuan-M2-32B	Q4	Not supported	42.25 tok/s	16GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit	Q4	Not supported	69.91 tok/s	15GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit	Q8	Not supported	47.97 tok/s	31GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-6bit	Q8	Not supported	51.71 tok/s	31GB (have 10GB)
deepseek-ai/DeepSeek-Math-V2	Q4	Not supported	17.17 tok/s	383GB (have 10GB)
mistralai/Mistral-7B-Instruct-v0.1	Q4	Fits comfortably	124.36 tok/s	4GB (have 10GB)
mistralai/Mistral-7B-Instruct-v0.1	Q8	Fits comfortably	93.28 tok/s	7GB (have 10GB)
Qwen/Qwen2-7B-Instruct	Q8	Fits comfortably	94.83 tok/s	7GB (have 10GB)
deepseek-ai/DeepSeek-V2.5	Q8	Not supported	33.71 tok/s	656GB (have 10GB)
Qwen/Qwen2.5-7B-Instruct	FP16	Not supported	47.55 tok/s	15GB (have 10GB)
Qwen/Qwen3-0.6B	Q4	Fits comfortably	121.19 tok/s	3GB (have 10GB)
Qwen/Qwen2.5-3B-Instruct	Q4	Fits comfortably	140.40 tok/s	2GB (have 10GB)
Qwen/Qwen2.5-3B-Instruct	Q8	Fits comfortably	98.32 tok/s	3GB (have 10GB)
Qwen/Qwen2.5-3B-Instruct	FP16	Fits comfortably	57.64 tok/s	6GB (have 10GB)
vikhyatk/moondream2	Q4	Fits comfortably	137.37 tok/s	4GB (have 10GB)
vikhyatk/moondream2	Q8	Fits comfortably	81.74 tok/s	7GB (have 10GB)
vikhyatk/moondream2	FP16	Not supported	49.42 tok/s	15GB (have 10GB)
petals-team/StableBeluga2	Q4	Fits comfortably	123.30 tok/s	4GB (have 10GB)
Qwen/Qwen3-0.6B-Base	Q4	Fits comfortably	130.69 tok/s	3GB (have 10GB)
microsoft/Phi-4-mini-instruct	Q8	Fits comfortably	89.90 tok/s	7GB (have 10GB)
microsoft/Phi-4-mini-instruct	FP16	Not supported	46.24 tok/s	15GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit	Q8	Not supported	48.55 tok/s	31GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit	FP16	Not supported	25.26 tok/s	61GB (have 10GB)
moonshotai/Kimi-Linear-48B-A3B-Instruct	Q4	Not supported	39.91 tok/s	25GB (have 10GB)
moonshotai/Kimi-Linear-48B-A3B-Instruct	Q8	Not supported	29.40 tok/s	50GB (have 10GB)
openai/gpt-oss-20b	Q8	Not supported	51.68 tok/s	20GB (have 10GB)

Note: Performance figures are calculated estimates, not measured benchmarks. Real results may vary.
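The verdicts in the compatibility table compare the VRAM a model needs with the 10GB available. The sketch below reproduces that comparison and then picks the highest-precision quantization that both fits and meets a target speed. The page does not publish its exact rule, so the 80% cutoff between "Fits comfortably" and "Fits (tight)" is an assumption chosen to agree with the 7GB and 9GB rows. Both function names are hypothetical. The sample figures are the Qwen/Qwen2.5-3B-Instruct rows from the table.

```python
def fit_verdict(vram_needed_gb, vram_available_gb=10.0, tight_fraction=0.8):
    """Classify a model by how much of the card's VRAM it would occupy.

    tight_fraction is an ASSUMED cutoff, not a value stated on this page.
    """
    if vram_needed_gb > vram_available_gb:
        return "Not supported"
    if vram_needed_gb > tight_fraction * vram_available_gb:
        return "Fits (tight)"
    return "Fits comfortably"


def pick_quantization(options, target_tok_s, vram_available_gb=10.0):
    """Return the first option that fits in VRAM and meets the target speed.

    `options` is ordered from highest to lowest precision; each entry is
    (name, estimated_tok_s, vram_needed_gb). Returns None if nothing qualifies.
    """
    for name, tok_s, vram_gb in options:
        fits = fit_verdict(vram_gb, vram_available_gb) != "Not supported"
        if fits and tok_s >= target_tok_s:
            return name
    return None


# Estimated rows for Qwen/Qwen2.5-3B-Instruct, taken from the table above.
qwen_3b = [("FP16", 57.64, 6), ("Q8", 98.32, 3), ("Q4", 140.40, 2)]

print(fit_verdict(9))                   # the DeepSeek-R1-Distill-Llama-8B Q8 row
print(pick_quantization(qwen_3b, 90))   # highest precision at 90 tok/s or more
```

Ordering the options from highest to lowest precision means the first match keeps as much model quality as the speed target allows.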

GPU FAQs

Data-backed answers pulled from community benchmarks, manufacturer specs, and live pricing.

What speed can a 10 GB RTX 3080 hit on Qwen 30B?

Owners running Qwen3-30B-A3B on a 10 GB RTX 3080 report roughly 15 tokens/sec after tuning, keeping interactive coding prompts responsive.

Source: Reddit – /r/LocalLLaMA (mquvxwc)

Why do benchmark sheets overstate requirements?

Some spec sheets assume more VRAM than a workload strictly needs. Users report already achieving ~10 tok/sec on a 10 GB 3080, which suggests that tuning can matter more than blanket requirements.

Source: Reddit – /r/LocalLLaMA (mj408ke)

Why does Ollama still offload to CPU on 12B models?

With larger context windows, Ollama reports 40% of layers moving to system RAM even on 12B models. This illustrates the need to tune the number of GPU-offloaded layers (gpu_layers; in Ollama this is the num_gpu option) on 10 GB cards.

Source: Reddit – /r/LocalLLaMA (mnspe0d)
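As a rough illustration of the tuning mentioned above, the sketch below builds a request for Ollama's local `/api/generate` endpoint with an explicit layer count and context size. It assumes `num_gpu` (layers to place on the GPU) and `num_ctx` (context length) are accepted under `options` in your Ollama version, so check its documentation before relying on them. The model name, the layer count of 30, and the helper names are placeholders, not recommendations.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt, gpu_layers, context_tokens):
    """Build a non-streaming generate request with explicit offload settings."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {
            "num_gpu": gpu_layers,      # layers to keep on the GPU (assumed option name)
            "num_ctx": context_tokens,  # smaller contexts leave more VRAM for layers
        },
    }


def send(payload, url=OLLAMA_URL):
    """POST the payload to Ollama and return the decoded JSON response."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Placeholder values: replace the model tag and tune gpu_layers for your card.
payload = build_generate_request("your-12b-model", "Hello", gpu_layers=30,
                                 context_tokens=4096)
print(json.dumps(payload, indent=2))
# send(payload)  # uncomment when an Ollama server is running locally
```

A practical approach is to lower the context size first, then raise the layer count until the model no longer fits, and step back one notch.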

What are the core specs of RTX 3080?

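The RTX 3080 Founders Edition has 10 GB of GDDR6X memory and a board power of 320 W. It uses a single 12-pin power connector (supplied with an adapter for two 8-pin cables), while many partner cards use two or three 8-pin connectors. NVIDIA recommends a 750 W PSU.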

Source: TechPowerUp – RTX 3080 Specs

How much does an RTX 3080 cost right now?

Latest snapshot (Nov 2025): Amazon at $699 (check current availability).

Source: Supabase price tracker snapshot – 2025-11-03

Alternative GPUs

RTX 3090

24GB

Explore how RTX 3090 stacks up for local inference workloads.

RTX 3070

8GB

Explore how RTX 3070 stacks up for local inference workloads.

RTX 4070

12GB

Explore how RTX 4070 stacks up for local inference workloads.

RX 6800 XT

16GB

Explore how RX 6800 XT stacks up for local inference workloads.

RTX 4090

24GB

Explore how RTX 4090 stacks up for local inference workloads.

Compare RTX 3080

RTX 3080 vs RTX 3070

Side-by-side VRAM, throughput, efficiency, and pricing benchmarks for both GPUs.

RTX 3080 vs RTX 3090

Side-by-side VRAM, throughput, efficiency, and pricing benchmarks for both GPUs.

RTX 3080 vs RX 6900 XT

Side-by-side VRAM, throughput, efficiency, and pricing benchmarks for both GPUs.

Quick Answer: RTX 3080 offers 10GB VRAM and starts around $449.99. It delivers approximately 166 tokens/sec on google-bert/bert-base-uncased. It typically draws 320W under load.

RTX 3080

In Stock

By NVIDIAReleased 2020-09MSRP $699.00

This GPU offers reliable throughput for local AI workloads. Pair it with the right model quantization to hit your desired tokens/sec, and monitor prices below to catch the best deal.

Buy on Amazon - $449.99 View Benchmarks

Specs snapshot

Key hardware metrics for AI workloads.

VRAM10GB

Cores8,704

TDP320W

ArchitectureAmpere

Where to Buy

Buy directly on Amazon with fast shipping and reliable customer service.

AmazonIn Stock

$449.99

Buy on Amazon

More Amazon options

Rotate out primary variants whenever validation flags an issue.

💡 Not ready to buy? Try cloud GPUs first

Test RTX 3080 performance in the cloud before investing in hardware. Pay by the hour with no commitment.

Vast.aifrom $0.20/hr RunPodfrom $0.30/hr Lambda Labsenterprise-grade

AI benchmarks

Model	Quantization	Tokens/sec	VRAM used
google-bert/bert-base-uncased	Q4	165.90 tok/sEstimated Auto-generated benchmark	1GB
context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16	Q4	164.92 tok/sEstimated Auto-generated benchmark	2GB
meta-llama/Llama-3.2-3B-Instruct	Q4	164.59 tok/sEstimated Auto-generated benchmark	2GB
ibm-granite/granite-3.3-2b-instruct	Q4	162.95 tok/sEstimated Auto-generated benchmark	1GB
deepseek-ai/DeepSeek-OCR	Q4	162.84 tok/sEstimated Auto-generated benchmark	2GB
ibm-research/PowerMoE-3b	Q4	162.03 tok/sEstimated Auto-generated benchmark	2GB
unsloth/Llama-3.2-3B-Instruct	Q4	161.10 tok/sEstimated Auto-generated benchmark	2GB
nari-labs/Dia2-2B	Q4	160.77 tok/sEstimated Auto-generated benchmark	2GB
inference-net/Schematron-3B	Q4	160.41 tok/sEstimated Auto-generated benchmark	2GB
Qwen/Qwen2.5-3B	Q4	159.97 tok/sEstimated Auto-generated benchmark	2GB
WeiboAI/VibeThinker-1.5B	Q4	158.73 tok/sEstimated Auto-generated benchmark	1GB
bigcode/starcoder2-3b	Q4	158.60 tok/sEstimated Auto-generated benchmark	2GB
meta-llama/Llama-3.2-1B-Instruct	Q4	157.16 tok/sEstimated Auto-generated benchmark	1GB
TinyLlama/TinyLlama-1.1B-Chat-v1.0	Q4	155.42 tok/sEstimated Auto-generated benchmark	1GB
deepseek-ai/deepseek-coder-1.3b-instruct	Q4	154.90 tok/sEstimated Auto-generated benchmark	2GB
tencent/HunyuanOCR	Q4	154.89 tok/sEstimated Auto-generated benchmark	1GB
apple/OpenELM-1_1B-Instruct	Q4	152.45 tok/sEstimated Auto-generated benchmark	1GB
facebook/sam3	Q4	151.35 tok/sEstimated Auto-generated benchmark	1GB
meta-llama/Llama-Guard-3-1B	Q4	150.35 tok/sEstimated Auto-generated benchmark	1GB
google/embeddinggemma-300m	Q4	149.94 tok/sEstimated Auto-generated benchmark	1GB
google/gemma-2-2b-it	Q4	145.74 tok/sEstimated Auto-generated benchmark	1GB
unsloth/gemma-3-1b-it	Q4	145.07 tok/sEstimated Auto-generated benchmark	1GB
meta-llama/Llama-3.2-1B	Q4	144.41 tok/sEstimated Auto-generated benchmark	1GB
google-t5/t5-3b	Q4	143.36 tok/sEstimated Auto-generated benchmark	2GB
allenai/OLMo-2-0425-1B	Q4	143.24 tok/sEstimated Auto-generated benchmark	1GB
meta-llama/Llama-3.2-3B	Q4	142.91 tok/sEstimated Auto-generated benchmark	2GB
Qwen/Qwen2.5-3B-Instruct	Q4	140.40 tok/sEstimated Auto-generated benchmark	2GB
unsloth/Llama-3.2-1B-Instruct	Q4	140.09 tok/sEstimated Auto-generated benchmark	1GB
google/gemma-2b	Q4	140.07 tok/sEstimated Auto-generated benchmark	1GB
google/gemma-3-1b-it	Q4	138.73 tok/sEstimated Auto-generated benchmark	1GB
Qwen/Qwen2.5-1.5B-Instruct	Q4	138.08 tok/sEstimated Auto-generated benchmark	3GB
deepseek-ai/DeepSeek-V3-0324	Q4	137.90 tok/sEstimated Auto-generated benchmark	4GB
LiquidAI/LFM2-1.2B	Q4	137.86 tok/sEstimated Auto-generated benchmark	1GB
vikhyatk/moondream2	Q4	137.37 tok/sEstimated Auto-generated benchmark	4GB
GSAI-ML/LLaDA-8B-Instruct	Q4	137.31 tok/sEstimated Auto-generated benchmark	4GB
zai-org/GLM-4.6-FP8	Q4	137.03 tok/sEstimated Auto-generated benchmark	4GB
Alibaba-NLP/gte-Qwen2-1.5B-instruct	Q4	136.97 tok/sEstimated Auto-generated benchmark	3GB
Qwen/Qwen2-1.5B-Instruct	Q4	136.47 tok/sEstimated Auto-generated benchmark	3GB
MiniMaxAI/MiniMax-M2	Q4	135.86 tok/sEstimated Auto-generated benchmark	4GB
Qwen/Qwen2.5-7B-Instruct	Q4	135.09 tok/sEstimated Auto-generated benchmark	4GB
deepseek-ai/DeepSeek-V3	Q4	135.03 tok/sEstimated Auto-generated benchmark	4GB
meta-llama/Llama-3.1-8B-Instruct	Q4	135.02 tok/sEstimated Auto-generated benchmark	4GB
dicta-il/dictalm2.0-instruct	Q4	134.86 tok/sEstimated Auto-generated benchmark	4GB
microsoft/phi-4	Q4	134.85 tok/sEstimated Auto-generated benchmark	4GB
trl-internal-testing/tiny-Qwen2ForCausalLM-2.5	Q4	134.44 tok/sEstimated Auto-generated benchmark	4GB
HuggingFaceTB/SmolLM-135M	Q4	134.06 tok/sEstimated Auto-generated benchmark	4GB
Qwen/Qwen2.5-0.5B	Q4	133.77 tok/sEstimated Auto-generated benchmark	3GB
tencent/HunyuanVideo-1.5	Q4	133.57 tok/sEstimated Auto-generated benchmark	4GB
unsloth/Meta-Llama-3.1-8B-Instruct	Q4	133.52 tok/sEstimated Auto-generated benchmark	4GB
deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct	Q4	132.94 tok/sEstimated Auto-generated benchmark	4GB

google-bert/bert-base-uncased

1GB

165.90 tok/sEstimated

Auto-generated benchmark

context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16

2GB

164.92 tok/sEstimated

Auto-generated benchmark

meta-llama/Llama-3.2-3B-Instruct

2GB

164.59 tok/sEstimated

Auto-generated benchmark

ibm-granite/granite-3.3-2b-instruct

1GB

162.95 tok/sEstimated

Auto-generated benchmark

deepseek-ai/DeepSeek-OCR

2GB

162.84 tok/sEstimated

Auto-generated benchmark

ibm-research/PowerMoE-3b

2GB

162.03 tok/sEstimated

Auto-generated benchmark

unsloth/Llama-3.2-3B-Instruct

2GB

161.10 tok/sEstimated

Auto-generated benchmark

nari-labs/Dia2-2B

2GB

160.77 tok/sEstimated

Auto-generated benchmark

inference-net/Schematron-3B

2GB

160.41 tok/sEstimated

Auto-generated benchmark

Qwen/Qwen2.5-3B

2GB

159.97 tok/sEstimated

Auto-generated benchmark

WeiboAI/VibeThinker-1.5B

1GB

158.73 tok/sEstimated

Auto-generated benchmark

bigcode/starcoder2-3b

2GB

158.60 tok/sEstimated

Auto-generated benchmark

meta-llama/Llama-3.2-1B-Instruct

1GB

157.16 tok/sEstimated

Auto-generated benchmark

TinyLlama/TinyLlama-1.1B-Chat-v1.0

1GB

155.42 tok/sEstimated

Auto-generated benchmark

deepseek-ai/deepseek-coder-1.3b-instruct

2GB

154.90 tok/sEstimated

Auto-generated benchmark

tencent/HunyuanOCR

1GB

154.89 tok/sEstimated

Auto-generated benchmark

apple/OpenELM-1_1B-Instruct

1GB

152.45 tok/sEstimated

Auto-generated benchmark

facebook/sam3

1GB

151.35 tok/sEstimated

Auto-generated benchmark

meta-llama/Llama-Guard-3-1B

1GB

150.35 tok/sEstimated

Auto-generated benchmark

google/embeddinggemma-300m

1GB

149.94 tok/sEstimated

Auto-generated benchmark

google/gemma-2-2b-it

1GB

145.74 tok/sEstimated

Auto-generated benchmark

unsloth/gemma-3-1b-it

1GB

145.07 tok/sEstimated

Auto-generated benchmark

meta-llama/Llama-3.2-1B

1GB

144.41 tok/sEstimated

Auto-generated benchmark

google-t5/t5-3b

2GB

143.36 tok/sEstimated

Auto-generated benchmark

allenai/OLMo-2-0425-1B

1GB

143.24 tok/sEstimated

Auto-generated benchmark

meta-llama/Llama-3.2-3B

2GB

142.91 tok/sEstimated

Auto-generated benchmark

Qwen/Qwen2.5-3B-Instruct

2GB

140.40 tok/sEstimated

Auto-generated benchmark

unsloth/Llama-3.2-1B-Instruct

1GB

140.09 tok/sEstimated

Auto-generated benchmark

google/gemma-2b

1GB

140.07 tok/sEstimated

Auto-generated benchmark

google/gemma-3-1b-it

1GB

138.73 tok/sEstimated

Auto-generated benchmark

Qwen/Qwen2.5-1.5B-Instruct

3GB

138.08 tok/sEstimated

Auto-generated benchmark

deepseek-ai/DeepSeek-V3-0324

4GB

137.90 tok/sEstimated

Auto-generated benchmark

LiquidAI/LFM2-1.2B

1GB

137.86 tok/sEstimated

Auto-generated benchmark

vikhyatk/moondream2

4GB

137.37 tok/sEstimated

Auto-generated benchmark

GSAI-ML/LLaDA-8B-Instruct

4GB

137.31 tok/sEstimated

Auto-generated benchmark

zai-org/GLM-4.6-FP8

4GB

137.03 tok/sEstimated

Auto-generated benchmark

Alibaba-NLP/gte-Qwen2-1.5B-instruct

3GB

136.97 tok/sEstimated

Auto-generated benchmark

Qwen/Qwen2-1.5B-Instruct

3GB

136.47 tok/sEstimated

Auto-generated benchmark

MiniMaxAI/MiniMax-M2

4GB

135.86 tok/sEstimated

Auto-generated benchmark

Qwen/Qwen2.5-7B-Instruct

4GB

135.09 tok/sEstimated

Auto-generated benchmark

deepseek-ai/DeepSeek-V3

4GB

135.03 tok/sEstimated

Auto-generated benchmark

meta-llama/Llama-3.1-8B-Instruct

4GB

135.02 tok/sEstimated

Auto-generated benchmark

dicta-il/dictalm2.0-instruct

4GB

134.86 tok/sEstimated

Auto-generated benchmark

microsoft/phi-4

4GB

134.85 tok/sEstimated

Auto-generated benchmark

trl-internal-testing/tiny-Qwen2ForCausalLM-2.5

4GB

134.44 tok/sEstimated

Auto-generated benchmark

HuggingFaceTB/SmolLM-135M

4GB

134.06 tok/sEstimated

Auto-generated benchmark

Qwen/Qwen2.5-0.5B

3GB

133.77 tok/sEstimated

Auto-generated benchmark

tencent/HunyuanVideo-1.5

4GB

133.57 tok/sEstimated

Auto-generated benchmark

unsloth/Meta-Llama-3.1-8B-Instruct

4GB

133.52 tok/sEstimated

Auto-generated benchmark

deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct

4GB

132.94 tok/sEstimated

Auto-generated benchmark

Note: Performance estimates are calculated. Real results may vary. Methodology · Submit real data

Model compatibility

Model	Quantization	Verdict	Estimated speed	VRAM needed
meta-llama/Meta-Llama-3-8B-Instruct	Q4	Fits comfortably	123.79 tok/s	4GB (have 10GB)
Qwen/Qwen2.5-1.5B	Q4	Fits comfortably	121.15 tok/s	3GB (have 10GB)
meta-llama/Meta-Llama-3-8B	Q4	Fits comfortably	122.26 tok/s	4GB (have 10GB)
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B	Q8	Fits comfortably	84.50 tok/s	7GB (have 10GB)
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B	FP16	Not supported	49.55 tok/s	15GB (have 10GB)
deepseek-ai/DeepSeek-R1-Distill-Llama-8B	Q8	Fits (tight)	92.11 tok/s	9GB (have 10GB)
ibm-research/PowerMoE-3b	Q8	Fits comfortably	109.87 tok/s	3GB (have 10GB)
ibm-research/PowerMoE-3b	FP16	Fits comfortably	57.52 tok/s	6GB (have 10GB)
lmstudio-community/Qwen3-4B-Thinking-2507-MLX-6bit	Q4	Fits comfortably	124.62 tok/s	2GB (have 10GB)
lmstudio-community/Qwen3-4B-Thinking-2507-MLX-6bit	Q8	Fits comfortably	94.99 tok/s	4GB (have 10GB)
Qwen/Qwen3-1.7B-Base	FP16	Not supported	43.21 tok/s	15GB (have 10GB)
baichuan-inc/Baichuan-M2-32B	Q8	Not supported	29.86 tok/s	33GB (have 10GB)
baichuan-inc/Baichuan-M2-32B	FP16	Not supported	16.22 tok/s	66GB (have 10GB)
ai-forever/ruGPT-3.5-13B	Q4	Fits comfortably	103.60 tok/s	7GB (have 10GB)
tencent/HunyuanOCR	Q4	Fits comfortably	154.89 tok/s	1GB (have 10GB)
tencent/HunyuanOCR	Q8	Fits comfortably	95.41 tok/s	2GB (have 10GB)
tencent/HunyuanOCR	FP16	Fits comfortably	58.12 tok/s	3GB (have 10GB)
facebook/sam3	Q4	Fits comfortably	151.35 tok/s	1GB (have 10GB)
facebook/sam3	Q8	Fits comfortably	105.55 tok/s	1GB (have 10GB)
facebook/sam3	FP16	Fits comfortably	58.82 tok/s	2GB (have 10GB)
Qwen/Qwen-Image-Edit-2509	Q4	Fits comfortably	115.89 tok/s	4GB (have 10GB)
google-bert/bert-base-uncased	Q8	Fits comfortably	112.65 tok/s	1GB (have 10GB)
google-bert/bert-base-uncased	FP16	Fits comfortably	56.55 tok/s	1GB (have 10GB)
MiniMaxAI/MiniMax-VL-01	Q4	Not supported	14.82 tok/s	256GB (have 10GB)
baichuan-inc/Baichuan-M2-32B	Q4	Not supported	42.25 tok/s	16GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit	Q4	Not supported	69.91 tok/s	15GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit	Q8	Not supported	47.97 tok/s	31GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-6bit	Q8	Not supported	51.71 tok/s	31GB (have 10GB)
deepseek-ai/DeepSeek-Math-V2	Q4	Not supported	17.17 tok/s	383GB (have 10GB)
mistralai/Mistral-7B-Instruct-v0.1	Q4	Fits comfortably	124.36 tok/s	4GB (have 10GB)
mistralai/Mistral-7B-Instruct-v0.1	Q8	Fits comfortably	93.28 tok/s	7GB (have 10GB)
Qwen/Qwen2-7B-Instruct	Q8	Fits comfortably	94.83 tok/s	7GB (have 10GB)
deepseek-ai/DeepSeek-V2.5	Q8	Not supported	33.71 tok/s	656GB (have 10GB)
Qwen/Qwen2.5-7B-Instruct	FP16	Not supported	47.55 tok/s	15GB (have 10GB)
Qwen/Qwen3-0.6B	Q4	Fits comfortably	121.19 tok/s	3GB (have 10GB)
Qwen/Qwen2.5-3B-Instruct	Q4	Fits comfortably	140.40 tok/s	2GB (have 10GB)
Qwen/Qwen2.5-3B-Instruct	Q8	Fits comfortably	98.32 tok/s	3GB (have 10GB)
Qwen/Qwen2.5-3B-Instruct	FP16	Fits comfortably	57.64 tok/s	6GB (have 10GB)
vikhyatk/moondream2	Q4	Fits comfortably	137.37 tok/s	4GB (have 10GB)
vikhyatk/moondream2	Q8	Fits comfortably	81.74 tok/s	7GB (have 10GB)
vikhyatk/moondream2	FP16	Not supported	49.42 tok/s	15GB (have 10GB)
petals-team/StableBeluga2	Q4	Fits comfortably	123.30 tok/s	4GB (have 10GB)
Qwen/Qwen3-0.6B-Base	Q4	Fits comfortably	130.69 tok/s	3GB (have 10GB)
microsoft/Phi-4-mini-instruct	Q8	Fits comfortably	89.90 tok/s	7GB (have 10GB)
microsoft/Phi-4-mini-instruct	FP16	Not supported	46.24 tok/s	15GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit	Q8	Not supported	48.55 tok/s	31GB (have 10GB)
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit	FP16	Not supported	25.26 tok/s	61GB (have 10GB)
moonshotai/Kimi-Linear-48B-A3B-Instruct	Q4	Not supported	39.91 tok/s	25GB (have 10GB)
moonshotai/Kimi-Linear-48B-A3B-Instruct	Q8	Not supported	29.40 tok/s	50GB (have 10GB)
openai/gpt-oss-20b	Q8	Not supported	51.68 tok/s	20GB (have 10GB)

Note: Performance estimates are calculated, not measured. Real results may vary.
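
As a rough illustration of how a "VRAM needed" figure and a fit verdict could be derived, the sketch below estimates weight memory from parameter count and quantization width, then compares it with the card's 10GB. It is not the site's published methodology: the function names, the nominal bits-per-weight values, and the 85% "tight" cutoff are all assumptions chosen for illustration, and the estimate covers weights only (no KV cache or runtime overhead).

```python
# Nominal bits per weight for the quantization labels used in the table.
# Assumption: real Q4/Q8 formats add a little extra for scales and metadata.
BITS_PER_WEIGHT = {"Q4": 4, "Q8": 8, "FP16": 16}


def weight_memory_gb(params_billion: float, quant: str) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def verdict(needed_gb: float, available_gb: float = 10.0,
            tight_ratio: float = 0.85) -> str:
    """Classify a VRAM requirement against the card's capacity.

    tight_ratio is a hypothetical cutoff, not a documented rule.
    """
    if needed_gb > available_gb:
        return "Not supported"
    if needed_gb > tight_ratio * available_gb:
        return "Fits (tight)"
    return "Fits comfortably"


if __name__ == "__main__":
    # A generic 7B-parameter model at each quantization level.
    for quant in ("Q4", "Q8", "FP16"):
        need = weight_memory_gb(7, quant)
        print(f"7B {quant}: ~{need:.1f} GB -> {verdict(need)}")
```

With these assumed thresholds, a 7GB requirement is classified as comfortable, 9GB as tight, and 15GB as not supported, which matches the pattern of verdicts in the table.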

GPU FAQs

Data-backed answers pulled from community benchmarks, manufacturer specs, and live pricing.

What speed can a 10 GB RTX 3080 hit on Qwen 30B?

Owners running Qwen3-30B-A3B on a 10 GB RTX 3080 report roughly 15 tokens/sec after tuning, keeping interactive coding prompts responsive.

Source: Reddit – /r/LocalLLaMA (mquvxwc)

Why do benchmark sheets overstate requirements?

Some spec sheets assume higher VRAM ceilings than a workload strictly needs. Users report already reaching about 10 tok/sec on a 10 GB 3080, which suggests that tuning matters more than blanket requirements.

Source: Reddit – /r/LocalLLaMA (mj408ke)

Why does Ollama still offload to CPU on 12B models?

With larger context windows, Ollama reports 40% of layers moving to system RAM even on 12B models, which illustrates the need to tune the number of layers offloaded to the GPU (num_gpu in Ollama) on 10 GB cards.

Source: Reddit – /r/LocalLLaMA (mnspe0d)
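
To make the offloading trade-off concrete, here is a minimal sketch that estimates how many layers fit on the GPU once the KV cache and a fixed reserve are subtracted from VRAM. Everything in it is an assumption for illustration: the function name, the equal-size-layers simplification, and the example model size, layer count, and cache sizes are hypothetical and do not come from Ollama or the cited thread.

```python
import math


def gpu_layers_that_fit(model_gb: float, n_layers: int, vram_gb: float = 10.0,
                        kv_cache_gb: float = 1.0, reserve_gb: float = 0.5) -> int:
    """Estimate how many transformer layers fit in VRAM.

    Simplifications (assumed): every layer is the same size, and the KV
    cache plus a fixed reserve for the runtime stay on the GPU.
    """
    per_layer_gb = model_gb / n_layers
    budget_gb = vram_gb - kv_cache_gb - reserve_gb
    if budget_gb <= 0:
        return 0
    return min(n_layers, math.floor(budget_gb / per_layer_gb))


if __name__ == "__main__":
    # Hypothetical 12B-class model: 7.5 GB of quantized weights, 40 layers.
    for kv in (1.0, 4.0):
        on_gpu = gpu_layers_that_fit(7.5, 40, kv_cache_gb=kv)
        print(f"KV cache {kv} GB: {on_gpu}/40 layers on GPU")
```

In this toy setup, growing the KV cache from 1 GB to 4 GB drops the model from fully resident to 29 of 40 layers on the GPU, which is the kind of shift a longer context window can cause on a 10 GB card.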

What are the core specs of RTX 3080?

The RTX 3080 Founders Edition has 10 GB of GDDR6X memory, 320 W board power, and a single 12-pin power connector (many partner cards use two or three 8-pin connectors instead). NVIDIA recommends a 750 W PSU.

Source: TechPowerUp – RTX 3080 Specs

How much does an RTX 3080 cost right now?

Latest snapshot (Nov 2025): Amazon at $699 (check current availability).

Source: Supabase price tracker snapshot – 2025-11-03