localai.computer
© 2025 localai.computer. Hardware recommendations for running AI models locally.


Quick Answer: The AMD Instinct MI210 offers 64GB of VRAM and launched at a $6,000 MSRP in November 2021. It delivers approximately 313 tokens/sec on meta-llama/Llama-Guard-3-1B (Q4, estimated) and typically draws 300W under load.

AMD Instinct MI210

By AMD · Released November 2021 · MSRP $6,000.00

This GPU offers reliable throughput for local AI workloads. Pair it with the right model quantization to hit your target tokens/sec, and monitor prices to catch the best deal.

Specs snapshot

Key hardware metrics for AI workloads:

  • VRAM: 64GB
  • Cores: 6,656
  • TDP: 300W
  • Architecture: CDNA 2
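The VRAM figures in the compatibility table further down track a standard rule of thumb: model weights take roughly (parameter count × bits per weight ÷ 8) bytes, plus some headroom for activations and KV cache. A minimal sketch of that estimate — the site's exact formula isn't published, and the ~10% overhead factor here is an assumption:

```python
def estimate_vram_gb(n_params: float, bits_per_weight: int, overhead: float = 1.10) -> float:
    """Rough VRAM (GB) needed to hold model weights at a given quantization.

    n_params: parameter count (e.g. 8e9 for an 8B model)
    bits_per_weight: 4 for Q4, 8 for Q8, 16 for FP16
    overhead: assumed ~10% extra for activations/KV cache (not from this page)
    """
    return n_params * bits_per_weight / 8 * overhead / 1e9

# An 8B model at Q4 needs roughly 4-5GB, far under the MI210's 64GB;
# at FP16 the same model needs ~17-18GB, matching the table's 17GB figure.
print(round(estimate_vram_gb(8e9, 4), 1))   # ~4.4
print(round(estimate_vram_gb(8e9, 16), 1))  # ~17.6
```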

Where to Buy

No direct purchase links are available yet. Try the Amazon search results to find this GPU.

💡 Not ready to buy? Try cloud GPUs first

Test AMD Instinct MI210 performance in the cloud before investing in hardware. Pay by the hour with no commitment.

  • Vast.ai — from $0.20/hr
  • RunPod — from $0.30/hr
  • Lambda Labs — enterprise-grade
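Putting the hourly rates above against the $6,000 MSRP gives a quick break-even sketch — street prices and real utilization will shift these numbers, so treat it as a rough bound:

```python
msrp = 6000.00     # MI210 launch MSRP, from this page
cloud_rate = 0.20  # Vast.ai starting rate in $/hr, from this page

breakeven_hours = msrp / cloud_rate
breakeven_years = breakeven_hours / (24 * 365)

print(f"{breakeven_hours:.0f} hours ≈ {breakeven_years:.1f} years of continuous rental")
# 30000 hours ≈ 3.4 years
```

In other words, at these list prices you would need years of near-continuous use before buying beats renting, which is why trying the card in the cloud first is low-risk.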

AI benchmarks

All results below use Q4 quantization. Speeds are auto-generated estimates, not measured runs.

Model | Quantization | Tokens/sec | VRAM used
meta-llama/Llama-Guard-3-1B | Q4 | 313.28 | 1GB
unsloth/gemma-3-1b-it | Q4 | 311.17 | 1GB
meta-llama/Llama-3.2-1B | Q4 | 310.04 | 1GB
facebook/sam3 | Q4 | 306.67 | 1GB
allenai/OLMo-2-0425-1B | Q4 | 306.48 | 1GB
google-t5/t5-3b | Q4 | 305.19 | 2GB
WeiboAI/VibeThinker-1.5B | Q4 | 303.85 | 1GB
google/gemma-3-1b-it | Q4 | 303.02 | 1GB
ibm-granite/granite-3.3-2b-instruct | Q4 | 299.41 | 1GB
Qwen/Qwen2.5-3B | Q4 | 298.98 | 2GB
unsloth/Llama-3.2-3B-Instruct | Q4 | 298.34 | 2GB
LiquidAI/LFM2-1.2B | Q4 | 294.14 | 1GB
meta-llama/Llama-3.2-3B-Instruct | Q4 | 291.87 | 2GB
deepseek-ai/DeepSeek-OCR | Q4 | 291.52 | 2GB
Qwen/Qwen2.5-3B-Instruct | Q4 | 284.59 | 2GB
deepseek-ai/deepseek-coder-1.3b-instruct | Q4 | 281.97 | 2GB
context-labs/meta-llama-Llama-3.2-3B-Instruct-FP16 | Q4 | 280.16 | 2GB
google/embeddinggemma-300m | Q4 | 278.18 | 1GB
meta-llama/Llama-3.2-3B | Q4 | 276.71 | 2GB
nari-labs/Dia2-2B | Q4 | 275.82 | 2GB
google/gemma-2-2b-it | Q4 | 275.13 | 1GB
google/gemma-2b | Q4 | 272.63 | 1GB
bigcode/starcoder2-3b | Q4 | 272.50 | 2GB
TinyLlama/TinyLlama-1.1B-Chat-v1.0 | Q4 | 268.81 | 1GB
tencent/HunyuanOCR | Q4 | 266.68 | 1GB
ibm-research/PowerMoE-3b | Q4 | 266.19 | 2GB
apple/OpenELM-1_1B-Instruct | Q4 | 263.00 | 1GB
meta-llama/Llama-3.2-1B-Instruct | Q4 | 261.99 | 1GB
Qwen/Qwen2.5-1.5B-Instruct | Q4 | 260.84 | 3GB
meta-llama/Llama-3.2-3B-Instruct | Q4 | 260.81 | 2GB
inference-net/Schematron-3B | Q4 | 260.77 | 2GB
microsoft/phi-2 | Q4 | 260.16 | 4GB
Qwen/Qwen2.5-0.5B-Instruct | Q4 | 260.08 | 3GB
meta-llama/Llama-Guard-3-8B | Q4 | 258.95 | 4GB
unsloth/Llama-3.2-1B-Instruct | Q4 | 258.79 | 1GB
meta-llama/Meta-Llama-3-8B | Q4 | 258.52 | 4GB
ibm-granite/granite-3.3-8b-instruct | Q4 | 258.48 | 4GB
meta-llama/Llama-3.1-8B | Q4 | 257.69 | 4GB
parler-tts/parler-tts-large-v1 | Q4 | 257.67 | 4GB
Qwen/Qwen3-8B-Base | Q4 | 257.20 | 4GB
google-bert/bert-base-uncased | Q4 | 256.96 | 1GB
microsoft/VibeVoice-1.5B | Q4 | 256.93 | 3GB
Qwen/Qwen3-1.7B-Base | Q4 | 256.63 | 4GB
Qwen/Qwen2.5-Math-1.5B | Q4 | 256.46 | 3GB
microsoft/Phi-3-mini-128k-instruct | Q4 | 256.17 | 4GB
deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B | Q4 | 255.34 | 3GB
microsoft/DialoGPT-small | Q4 | 254.55 | 4GB
lmstudio-community/DeepSeek-R1-0528-Qwen3-8B-MLX-4bit | Q4 | 254.51 | 4GB
mistralai/Mistral-7B-v0.1 | Q4 | 254.10 | 4GB
ibm-granite/granite-docling-258M | Q4 | 254.07 | 4GB

Note: Performance estimates are calculated. Real results may vary. Methodology · Submit real data
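To translate these throughput figures into wall-clock latency, divide the number of tokens you want by the tok/s estimate. A small sketch using the top benchmark figure from the table (this ignores prompt-processing time, which adds to the total):

```python
def generation_time_s(n_tokens: int, tokens_per_sec: float) -> float:
    """Seconds to generate n_tokens at a steady decode rate (prompt processing not included)."""
    return n_tokens / tokens_per_sec

# 313.28 tok/s is the estimated Llama-Guard-3-1B Q4 figure above
print(round(generation_time_s(500, 313.28), 2))  # ~1.6 s for a 500-token reply
```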

Model compatibility

All speeds are auto-generated estimates; each verdict compares the VRAM a model needs against the MI210's 64GB.

Model | Quantization | Verdict | Estimated speed | VRAM needed
Qwen/Qwen3-0.6B | FP16 | Fits comfortably | 93.54 tok/s | 13GB
Gensyn/Qwen2.5-0.5B-Instruct | Q4 | Fits comfortably | 218.84 tok/s | 3GB
google/gemma-3-1b-it | Q4 | Fits comfortably | 303.02 tok/s | 1GB
google/gemma-3-1b-it | Q8 | Fits comfortably | 206.78 tok/s | 1GB
google/gemma-3-1b-it | FP16 | Fits comfortably | 117.47 tok/s | 2GB
Qwen/Qwen3-4B-Instruct-2507 | FP16 | Fits comfortably | 86.15 tok/s | 9GB
meta-llama/Llama-3.2-1B-Instruct | Q4 | Fits comfortably | 261.99 tok/s | 1GB
Qwen/Qwen3-8B | Q8 | Fits comfortably | 168.98 tok/s | 9GB
Qwen/Qwen3-8B | FP16 | Fits comfortably | 88.42 tok/s | 17GB
meta-llama/Meta-Llama-3-8B | FP16 | Fits comfortably | 98.52 tok/s | 17GB
Qwen/Qwen2.5-7B | Q4 | Fits comfortably | 235.85 tok/s | 4GB
Qwen/Qwen2.5-7B | Q8 | Fits comfortably | 154.48 tok/s | 7GB
Qwen/Qwen2.5-7B | FP16 | Fits comfortably | 91.71 tok/s | 15GB
Qwen/Qwen3-0.6B-Base | Q8 | Fits comfortably | 173.51 tok/s | 6GB
Qwen/Qwen3-0.6B-Base | FP16 | Fits comfortably | 92.22 tok/s | 13GB
Qwen/Qwen3-30B-A3B | Q8 | Fits comfortably | 82.87 tok/s | 31GB
Qwen/Qwen3-30B-A3B | FP16 | Fits comfortably | 47.62 tok/s | 61GB
microsoft/Phi-3.5-vision-instruct | Q8 | Fits comfortably | 175.09 tok/s | 7GB
Qwen/Qwen2-7B-Instruct | Q8 | Fits comfortably | 172.25 tok/s | 7GB
Qwen/Qwen2-7B-Instruct | FP16 | Fits comfortably | 88.63 tok/s | 15GB
Qwen/Qwen3-4B-Base | Q4 | Fits comfortably | 225.62 tok/s | 2GB
lmstudio-community/Qwen3-4B-Thinking-2507-MLX-4bit | Q4 | Fits comfortably | 241.68 tok/s | 2GB
lmstudio-community/Qwen3-4B-Thinking-2507-MLX-4bit | Q8 | Fits comfortably | 168.39 tok/s | 4GB
lmstudio-community/Qwen3-4B-Thinking-2507-MLX-4bit | FP16 | Fits comfortably | 85.74 tok/s | 9GB
unsloth/Llama-3.2-3B-Instruct | Q4 | Fits comfortably | 298.34 tok/s | 2GB
unsloth/Llama-3.2-3B-Instruct | Q8 | Fits comfortably | 186.17 tok/s | 3GB
OpenPipe/Qwen3-14B-Instruct | FP16 | Fits comfortably | 73.55 tok/s | 29GB
openai-community/gpt2-xl | Q4 | Fits comfortably | 218.65 tok/s | 4GB
openai-community/gpt2-xl | Q8 | Fits comfortably | 167.09 tok/s | 7GB
openai-community/gpt2-xl | FP16 | Fits comfortably | 97.89 tok/s | 15GB
microsoft/Phi-3-mini-128k-instruct | Q4 | Fits comfortably | 256.17 tok/s | 4GB
microsoft/Phi-3-mini-128k-instruct | Q8 | Fits comfortably | 150.93 tok/s | 7GB
GSAI-ML/LLaDA-8B-Instruct | Q4 | Fits comfortably | 227.64 tok/s | 4GB
GSAI-ML/LLaDA-8B-Instruct | Q8 | Fits comfortably | 163.69 tok/s | 9GB
lmstudio-community/DeepSeek-R1-0528-Qwen3-8B-MLX-4bit | Q8 | Fits comfortably | 166.47 tok/s | 9GB
skt/kogpt2-base-v2 | Q8 | Fits comfortably | 180.06 tok/s | 7GB
skt/kogpt2-base-v2 | FP16 | Fits comfortably | 91.11 tok/s | 15GB
ibm-granite/granite-docling-258M | Q4 | Fits comfortably | 254.07 tok/s | 4GB
Qwen/Qwen3-Next-80B-A3B-Thinking | Q4 | Fits comfortably | 50.19 tok/s | 39GB
Qwen/Qwen3-Next-80B-A3B-Thinking | Q8 | Not supported | 34.41 tok/s | 78GB
Qwen/Qwen3-Next-80B-A3B-Thinking | FP16 | Not supported | 18.47 tok/s | 156GB
meta-llama/Llama-2-13b-chat-hf | Q4 | Fits comfortably | 173.60 tok/s | 7GB
NousResearch/Meta-Llama-3.1-8B-Instruct | FP16 | Fits comfortably | 98.95 tok/s | 17GB
apple/OpenELM-1_1B-Instruct | Q4 | Fits comfortably | 263.00 tok/s | 1GB
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit | Q4 | Fits comfortably | 125.81 tok/s | 15GB
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit | Q8 | Fits comfortably | 94.83 tok/s | 31GB
lmstudio-community/Qwen3-Coder-30B-A3B-Instruct-MLX-5bit | FP16 | Fits comfortably | 49.67 tok/s | 61GB
Alibaba-NLP/gte-Qwen2-1.5B-instruct | Q4 | Fits comfortably | 230.57 tok/s | 3GB
Qwen/QwQ-32B-Preview | Q8 | Fits comfortably | 54.52 tok/s | 34GB
Qwen/Qwen3-0.6B | Q8 | Fits comfortably | 175.34 tok/s | 6GB

Note: Performance estimates are calculated. Real results may vary. Methodology · Submit real data
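The verdicts above reduce to comparing a model's required VRAM against the MI210's 64GB. This sketch reproduces the rows shown with a binary rule; the site's full rule set (e.g. whether there is a "tight fit" tier between the two verdicts) isn't documented, so the threshold behavior is an assumption:

```python
def fit_verdict(required_gb: float, available_gb: float = 64) -> str:
    """Classify a model/quantization combo the way the compatibility table does (assumed binary rule)."""
    return "Fits comfortably" if required_gb <= available_gb else "Not supported"

print(fit_verdict(39))   # Qwen3-Next-80B Q4  -> Fits comfortably
print(fit_verdict(78))   # Qwen3-Next-80B Q8  -> Not supported
print(fit_verdict(61))   # Qwen3-30B-A3B FP16 -> Fits comfortably
```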

Alternative GPUs

Explore how these cards stack up for local inference workloads:

  • RTX 5070 — 12GB
  • RTX 4060 Ti 16GB — 16GB
  • RX 6800 XT — 16GB
  • RTX 4070 Super — 12GB
  • RTX 3080 — 10GB