© 2025 localai.computer. Hardware recommendations for running AI models locally.

ℹ️We earn from qualifying purchases through affiliate links at no extra cost to you. This supports our free content and research.


Quick Answer: The RTX 3080 offers 10GB of VRAM and starts around $449.99. It delivers an estimated 166 tokens/sec on google-bert/bert-base-uncased (Q4) and typically draws 320W under load.

RTX 3080

In Stock
By NVIDIA · Released September 2020 · MSRP $699.00

This GPU offers reliable throughput for local AI workloads. Pair it with the right model quantization to hit your desired tokens/sec, and monitor prices below to catch the best deal.
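A useful rule of thumb when picking a quantization: a model's weight footprint is roughly its parameter count times the bits per weight, plus runtime overhead. A minimal sketch (the 1.2× overhead factor is an assumption, not a measured value):

```python
def estimated_vram_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    """Rule-of-thumb VRAM footprint for a model at a given quantization.

    Weights take params * (bits / 8) bytes; the overhead factor is an
    assumed ~20% allowance for activations and runtime buffers.
    """
    weight_gb = params_billion * (bits / 8)  # 1B params at 8-bit ~= 1 GB
    return weight_gb * overhead

# A 3B model at Q4 lands under 2 GB; an 8B model at Q4 needs
# roughly 5 GB -- both comfortable on a 10 GB card.
print(round(estimated_vram_gb(3, 4), 1))  # 1.8
print(round(estimated_vram_gb(8, 4), 1))  # 4.8
```

The same arithmetic explains why FP16 (16 bits/weight) pushes 7B-class models past 10GB while Q4 keeps them well inside it.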

Buy on Amazon ($449.99) · View Benchmarks

Specs snapshot

Key hardware metrics for AI workloads.

  • VRAM: 10GB
  • Cores: 8,704
  • TDP: 320W
  • Architecture: Ampere

Where to Buy

Buy directly on Amazon with fast shipping and reliable customer service.

Amazon · In Stock · $449.99 · Buy on Amazon


💡 Not ready to buy? Try cloud GPUs first

Test RTX 3080 performance in the cloud before investing in hardware. Pay by the hour with no commitment.

  • Vast.ai - from $0.20/hr
  • RunPod - from $0.30/hr
  • Lambda Labs - enterprise-grade
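Whether renting beats buying comes down to expected usage hours. A quick sketch using the prices above (electricity and resale value are ignored, which shifts the break-even point in practice):

```python
def breakeven_hours(card_price: float, cloud_rate_per_hr: float) -> float:
    """Hours of cloud rental that would cost as much as buying the card."""
    return card_price / cloud_rate_per_hr

# At the $449.99 street price vs. Vast.ai's $0.20/hr rate:
print(round(breakeven_hours(449.99, 0.20)))  # 2250
```

Roughly 2,250 rental hours before buying wins, so occasional experimenters may be better served by the cloud, while daily users recoup the card quickly.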

AI benchmarks

All speeds below are auto-generated estimates at Q4 quantization.

| Model | Quantization | Tokens/sec (est.) | VRAM used |
|---|---|---|---|
| google-bert/bert-base-uncased | Q4 | 165.90 | 1GB |
| context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16 | Q4 | 164.92 | 2GB |
| meta-llama/Llama-3.2-3B-Instruct | Q4 | 164.59 | 2GB |
| ibm-granite/granite-3.3-2b-instruct | Q4 | 162.95 | 1GB |
| deepseek-ai/DeepSeek-OCR | Q4 | 162.84 | 2GB |
| ibm-research/PowerMoE-3b | Q4 | 162.03 | 2GB |
| unsloth/Llama-3.2-3B-Instruct | Q4 | 161.10 | 2GB |
| nari-labs/Dia2-2B | Q4 | 160.77 | 2GB |
| inference-net/Schematron-3B | Q4 | 160.41 | 2GB |
| Qwen/Qwen2.5-3B | Q4 | 159.97 | 2GB |
| WeiboAI/VibeThinker-1.5B | Q4 | 158.73 | 1GB |
| bigcode/starcoder2-3b | Q4 | 158.60 | 2GB |
| meta-llama/Llama-3.2-1B-Instruct | Q4 | 157.16 | 1GB |
| TinyLlama/TinyLlama-1.1B-Chat-v1.0 | Q4 | 155.42 | 1GB |
| deepseek-ai/deepseek-coder-1.3b-instruct | Q4 | 154.90 | 2GB |
| tencent/HunyuanOCR | Q4 | 154.89 | 1GB |
| apple/OpenELM-1_1B-Instruct | Q4 | 152.45 | 1GB |
| facebook/sam3 | Q4 | 151.35 | 1GB |
| meta-llama/Llama-Guard-3-1B | Q4 | 150.35 | 1GB |
| google/embeddinggemma-300m | Q4 | 149.94 | 1GB |
| google/gemma-2-2b-it | Q4 | 145.74 | 1GB |
| unsloth/gemma-3-1b-it | Q4 | 145.07 | 1GB |
| meta-llama/Llama-3.2-1B | Q4 | 144.41 | 1GB |
| google-t5/t5-3b | Q4 | 143.36 | 2GB |
| allenai/OLMo-2-0425-1B | Q4 | 143.24 | 1GB |
| meta-llama/Llama-3.2-3B | Q4 | 142.91 | 2GB |
| Qwen/Qwen2.5-3B-Instruct | Q4 | 140.40 | 2GB |
| unsloth/Llama-3.2-1B-Instruct | Q4 | 140.09 | 1GB |
| google/gemma-2b | Q4 | 140.07 | 1GB |
| google/gemma-3-1b-it | Q4 | 138.73 | 1GB |
| Qwen/Qwen2.5-1.5B-Instruct | Q4 | 138.08 | 3GB |
| deepseek-ai/DeepSeek-V3-0324 | Q4 | 137.90 | 4GB |
| LiquidAI/LFM2-1.2B | Q4 | 137.86 | 1GB |
| vikhyatk/moondream2 | Q4 | 137.37 | 4GB |
| GSAI-ML/LLaDA-8B-Instruct | Q4 | 137.31 | 4GB |
| zai-org/GLM-4.6-FP8 | Q4 | 137.03 | 4GB |
| Alibaba-NLP/gte-Qwen2-1.5B-instruct | Q4 | 136.97 | 3GB |
| Qwen/Qwen2-1.5B-Instruct | Q4 | 136.47 | 3GB |
| MiniMaxAI/MiniMax-M2 | Q4 | 135.86 | 4GB |
| Qwen/Qwen2.5-7B-Instruct | Q4 | 135.09 | 4GB |
| deepseek-ai/DeepSeek-V3 | Q4 | 135.03 | 4GB |
| meta-llama/Llama-3.1-8B-Instruct | Q4 | 135.02 | 4GB |
| dicta-il/dictalm2.0-instruct | Q4 | 134.86 | 4GB |
| microsoft/phi-4 | Q4 | 134.85 | 4GB |
| trl-internal-testing/tiny-Qwen2ForCausalLM-2.5 | Q4 | 134.44 | 4GB |
| HuggingFaceTB/SmolLM-135M | Q4 | 134.06 | 4GB |
| Qwen/Qwen2.5-0.5B | Q4 | 133.77 | 3GB |
| tencent/HunyuanVideo-1.5 | Q4 | 133.57 | 4GB |
| unsloth/Meta-Llama-3.1-8B-Instruct | Q4 | 133.52 | 4GB |
| deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct | Q4 | 132.94 | 4GB |

Note: Performance estimates are calculated. Real results may vary. Methodology · Submit real data
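To replace these estimates with measured numbers on your own card, time generation locally. A backend-agnostic sketch; the `generate` callable is a stand-in for whatever runtime you use (a wrapper around llama.cpp, Ollama, etc.), not a real API:

```python
import time

def measure_tokens_per_sec(generate, prompt: str, rounds: int = 3) -> float:
    """Average tokens/sec over several runs of a generation callable.

    `generate` takes a prompt and returns the number of tokens it
    produced; wrap your actual inference runtime to match this shape.
    """
    total_tokens, total_time = 0, 0.0
    for _ in range(rounds):
        start = time.perf_counter()
        total_tokens += generate(prompt)
        total_time += time.perf_counter() - start
    return total_tokens / total_time

# Smoke test with a dummy backend: 128 tokens in ~10 ms per round.
def fake_generate(prompt):
    time.sleep(0.01)
    return 128

print(f"{measure_tokens_per_sec(fake_generate, 'hello'):.0f} tok/s")
```

Averaging over several rounds smooths out warm-up effects (first-run kernel compilation and cache population tend to depress the initial measurement).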

Model compatibility

Verdicts assume the RTX 3080's 10GB of VRAM; all speeds are auto-generated estimates.

| Model | Quantization | Verdict | Estimated speed (tok/s) | VRAM needed |
|---|---|---|---|---|
| meta-llama/Meta-Llama-3-8B-Instruct | Q4 | Fits comfortably | 123.79 | 4GB |
| Qwen/Qwen2.5-1.5B | Q4 | Fits comfortably | 121.15 | 3GB |
| meta-llama/Meta-Llama-3-8B | Q4 | Fits comfortably | 122.26 | 4GB |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | Q8 | Fits comfortably | 84.50 | 7GB |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | FP16 | Not supported | 49.55 | 15GB |
| deepseek-ai/DeepSeek-R1-Distill-Llama-8B | Q8 | Fits (tight) | 92.11 | 9GB |
| ibm-research/PowerMoE-3b | Q8 | Fits comfortably | 109.87 | 3GB |
| ibm-research/PowerMoE-3b | FP16 | Fits comfortably | 57.52 | 6GB |
| lmstudio-community/Qwen3-4B-Thinking-2507-MLX-6bit | Q4 | Fits comfortably | 124.62 | 2GB |
| lmstudio-community/Qwen3-4B-Thinking-2507-MLX-6bit | Q8 | Fits comfortably | 94.99 | 4GB |
| Qwen/Qwen3-1.7B-Base | FP16 | Not supported | 43.21 | 15GB |
| baichuan-inc/Baichuan-M2-32B | Q8 | Not supported | 29.86 | 33GB |
| baichuan-inc/Baichuan-M2-32B | FP16 | Not supported | 16.22 | 66GB |
| ai-forever/ruGPT-3.5-13B | Q4 | Fits comfortably | 103.60 | 7GB |
| tencent/HunyuanOCR | Q4 | Fits comfortably | 154.89 | 1GB |
| tencent/HunyuanOCR | Q8 | Fits comfortably | 95.41 | 2GB |
| tencent/HunyuanOCR | FP16 | Fits comfortably | 58.12 | 3GB |
| facebook/sam3 | Q4 | Fits comfortably | 151.35 | 1GB |
| facebook/sam3 | Q8 | Fits comfortably | 105.55 | 1GB |
| facebook/sam3 | FP16 | Fits comfortably | 58.82 | 2GB |
| Qwen/Qwen-Image-Edit-2509 | Q4 | Fits comfortably | 115.89 | 4GB |
| google-bert/bert-base-uncased | Q8 | Fits comfortably | 112.65 | 1GB |
| google-bert/bert-base-uncased | FP16 | Fits comfortably | 56.55 | 1GB |
| MiniMaxAI/MiniMax-VL-01 | Q4 | Not supported | 14.82 | 256GB |
| baichuan-inc/Baichuan-M2-32B | Q4 | Not supported | 42.25 | 16GB |
| lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit | Q4 | Not supported | 69.91 | 15GB |
| lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-8bit | Q8 | Not supported | 47.97 | 31GB |
| lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-6bit | Q8 | Not supported | 51.71 | 31GB |
| deepseek-ai/DeepSeek-Math-V2 | Q4 | Not supported | 17.17 | 383GB |
| mistralai/Mistral-7B-Instruct-v0.1 | Q4 | Fits comfortably | 124.36 | 4GB |
| mistralai/Mistral-7B-Instruct-v0.1 | Q8 | Fits comfortably | 93.28 | 7GB |
| Qwen/Qwen2-7B-Instruct | Q8 | Fits comfortably | 94.83 | 7GB |
| deepseek-ai/DeepSeek-V2.5 | Q8 | Not supported | 33.71 | 656GB |
| Qwen/Qwen2.5-7B-Instruct | FP16 | Not supported | 47.55 | 15GB |
| Qwen/Qwen3-0.6B | Q4 | Fits comfortably | 121.19 | 3GB |
| Qwen/Qwen2.5-3B-Instruct | Q4 | Fits comfortably | 140.40 | 2GB |
| Qwen/Qwen2.5-3B-Instruct | Q8 | Fits comfortably | 98.32 | 3GB |
| Qwen/Qwen2.5-3B-Instruct | FP16 | Fits comfortably | 57.64 | 6GB |
| vikhyatk/moondream2 | Q4 | Fits comfortably | 137.37 | 4GB |
| vikhyatk/moondream2 | Q8 | Fits comfortably | 81.74 | 7GB |
| vikhyatk/moondream2 | FP16 | Not supported | 49.42 | 15GB |
| petals-team/StableBeluga2 | Q4 | Fits comfortably | 123.30 | 4GB |
| Qwen/Qwen3-0.6B-Base | Q4 | Fits comfortably | 130.69 | 3GB |
| microsoft/Phi-4-mini-instruct | Q8 | Fits comfortably | 89.90 | 7GB |
| microsoft/Phi-4-mini-instruct | FP16 | Not supported | 46.24 | 15GB |
| lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit | Q8 | Not supported | 48.55 | 31GB |
| lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit | FP16 | Not supported | 25.26 | 61GB |
| moonshotai/Kimi-Linear-48B-A3B-Instruct | Q4 | Not supported | 39.91 | 25GB |
| moonshotai/Kimi-Linear-48B-A3B-Instruct | Q8 | Not supported | 29.40 | 50GB |
| openai/gpt-oss-20b | Q8 | Not supported | 51.68 | 20GB |

Note: Performance estimates are calculated. Real results may vary. Methodology · Submit real data
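The verdict column reduces to a budget comparison between required and available VRAM. A sketch reproducing it; the 80% "comfortable" threshold is an assumption inferred from the rows above (7GB of 10GB reads as comfortable, 9GB as tight), not a documented rule:

```python
def fit_verdict(required_gb: float, available_gb: float = 10.0) -> str:
    """Classify a model's VRAM requirement against a card's capacity.

    Assumed thresholds: comfortable below ~80% of VRAM, tight up to
    100%, otherwise not supported (the model spills past the card).
    """
    if required_gb > available_gb:
        return "Not supported"
    if required_gb > 0.8 * available_gb:
        return "Fits (tight)"
    return "Fits comfortably"

print(fit_verdict(4))   # Fits comfortably (e.g. Meta-Llama-3-8B at Q4)
print(fit_verdict(9))   # Fits (tight) (DeepSeek-R1-Distill-Llama-8B at Q8)
print(fit_verdict(15))  # Not supported (7B-class models at FP16)
```

Note that "tight" fits leave little headroom for long context windows, so real-world usable context may be shorter than the model supports.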

GPU FAQs

Data-backed answers pulled from community benchmarks, manufacturer specs, and live pricing.

What speed can a 10 GB RTX 3080 hit on Qwen 30B?

Owners running Qwen3-30B-A3B on a 10 GB RTX 3080 report roughly 15 tokens/sec after tuning, keeping interactive coding prompts responsive.

Source: Reddit – /r/LocalLLaMA (mquvxwc)

Why do benchmark sheets overstate requirements?

Some spec sheets assume higher ceilings, but real-world users note they already achieve ~10 tok/sec on a 10 GB 3080—showing how tuning beats blanket requirements.

Source: Reddit – /r/LocalLLaMA (mj408ke)

Why does Ollama still offload to CPU on 12B models?

With larger context windows, Ollama reports 40% of layers moving to system RAM even on 12B models—illustrating the need to tune gpu_layers on 10 GB cards.

Source: Reddit – /r/LocalLLaMA (mnspe0d)
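The layer-offload behavior described above follows from simple arithmetic: a runtime can keep on the GPU only as many layers as fit in the VRAM left over after the context reserve, and that reserve grows with the context window. A sketch assuming roughly equal-sized layers (all sizes here are illustrative, not measured):

```python
def gpu_layers_that_fit(total_layers: int, model_gb: float,
                        vram_gb: float = 10.0, reserved_gb: float = 2.0) -> int:
    """Estimate how many transformer layers fit on the GPU.

    Divides the VRAM left after the reserve (KV cache, context,
    CUDA buffers -- an assumed figure) by the per-layer weight size.
    """
    per_layer_gb = model_gb / total_layers
    budget = max(vram_gb - reserved_gb, 0.0)
    return min(total_layers, int(budget / per_layer_gb))

# A 12B-class model at Q4 (~7 GB of weights across ~40 layers) with a
# larger reserve for a long context window:
print(gpu_layers_that_fit(40, 7.0, reserved_gb=3.5))
```

Runtimes expose this as a tunable (e.g. `n_gpu_layers` in llama.cpp or `num_gpu` in an Ollama Modelfile); lowering the context window frees reserve and lets more layers stay on the card.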

What are the core specs of RTX 3080?

The RTX 3080 Founders Edition includes 10 GB of GDDR6X, a 320 W board power rating, and a single 12-pin power connector (an adapter to dual 8-pin is included); NVIDIA recommends a 750 W PSU.

Source: TechPowerUp – RTX 3080 Specs

How much does an RTX 3080 cost right now?

Latest snapshot (Nov 2025): Amazon at $699 (check current availability).

Source: Supabase price tracker snapshot – 2025-11-03

Alternative GPUs

RTX 3090
24GB

Explore how RTX 3090 stacks up for local inference workloads.

RTX 3070
8GB

Explore how RTX 3070 stacks up for local inference workloads.

RTX 4070
12GB

Explore how RTX 4070 stacks up for local inference workloads.

RX 6800 XT
16GB

Explore how RX 6800 XT stacks up for local inference workloads.

RTX 4090
24GB

Explore how RTX 4090 stacks up for local inference workloads.

Compare RTX 3080

RTX 3080 vs RTX 3070

Side-by-side VRAM, throughput, efficiency, and pricing benchmarks for both GPUs.

RTX 3080 vs RTX 3090

Side-by-side VRAM, throughput, efficiency, and pricing benchmarks for both GPUs.

RTX 3080 vs RX 6900 XT

Side-by-side VRAM, throughput, efficiency, and pricing benchmarks for both GPUs.