localai.computer
© 2026 localai.computer. Hardware recommendations for running AI models locally.

We earn from qualifying purchases through affiliate links at no extra cost to you. This supports our free content and research.


Quick Answer: RTX 4090 offers 24GB of VRAM at an MSRP of $1,599.00 (street prices vary). Its estimated throughput is about 270 tokens/sec on Deepseek AI Deepseek Ocr 2 at Q4, and it typically draws 450W under load.

RTX 4090

By NVIDIA · Released 2022-10 · MSRP $1,599.00

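RTX 4090 remains a go-to GPU for local AI workloads. At Q4 it runs models up to the 34B class entirely in its 24GB of VRAM, delivers some of the fastest consumer inference speeds, and anchors premium builds that scale to production deployments. 70B-class models need roughly 36GB at Q4, so they do not fit on a single card without offloading, lower-bit quantization, or more VRAM.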

Specs snapshot
Key hardware metrics for AI workloads.
VRAM: 24GB
Cores: 16,384
TDP: 450W
Architecture: Ada Lovelace
Key Takeaways
  • 24GB VRAM - runs models up to roughly 34B parameters at Q4 with headroom; 70B-class models need about 36GB
  • Flagship-class compute for maximum throughput
  • High power draw (450W) - requires robust PSU (850W+ recommended)
  • Strong price-to-VRAM value

What this means for you

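With 24GB of VRAM, RTX 4090 can hold the weights of models up to approximately 48B parameters at 4-bit quantization, and it comfortably runs models up to the 34B class once context memory is taken into account. This covers popular models such as Mistral 7B and most 32B to 34B models. Llama 3 70B needs about 35GB at Q4, so it does not fit in VRAM on a single card.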
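
As a rough cross-check of the sizing above, a weights-only estimate can be sketched in Python. The bytes-per-parameter figures (0.5 for Q4, 1 for Q8, 2 for FP16) are assumptions that happen to match the "VRAM needed" values in this page's compatibility rows. They ignore KV cache, activations, and runtime overhead, so real headroom is smaller. The function names are illustrative only.

```python
import math

# Assumed bytes per parameter for each quantization level (weights only).
BYTES_PER_PARAM = {"Q4": 0.5, "Q8": 1.0, "FP16": 2.0}


def weights_vram_gb(params_billion: float, quant: str) -> int:
    """Estimate VRAM for model weights in GB, rounded up to a whole GB."""
    return math.ceil(params_billion * BYTES_PER_PARAM[quant])


def max_params_billion(vram_gb: float, quant: str) -> float:
    """Largest parameter count (in billions) whose weights fit in vram_gb."""
    return vram_gb / BYTES_PER_PARAM[quant]


if __name__ == "__main__":
    for size in (7, 34, 72):
        print(f"{size}B at Q4: ~{weights_vram_gb(size, 'Q4')}GB of weights")
    limit = max_params_billion(24, "Q4")
    print(f"24GB holds up to ~{limit:.0f}B at Q4 (weights only)")
```

Under these assumptions a 34B model needs about 17GB at Q4 and a 72B model about 36GB, which is why the first fits in 24GB and the second does not.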

Who should buy

  • Running 32B to 34B parameter models at good speeds (about 63 to 79 tok/s estimated at Q4)
  • Multiple model instances simultaneously
  • Production deployments on single GPU

Looking to upgrade?

Consider RTX 6000 Ada, which keeps the Ada Lovelace architecture and doubles VRAM to 48GB. Ada also offers better efficiency than Ampere.

AI benchmarks

Showing 80 of 80 benchmark rows.

All rows are static estimates (DB-independent).

Model · Size · Quantization · Tokens/sec (estimated) · VRAM used
Deepseek AI Deepseek Ocr 2 · Unknown · Q4 · 270.00 tok/s · 1GB
Deepseek AI Deepseek Math V2 · Unknown · Q4 · 270.00 tok/s · 1GB
Deepseek AI Deepseek V2 5 · Unknown · Q4 · 270.00 tok/s · 1GB
Deepseek AI Deepseek V3 · Unknown · Q4 · 270.00 tok/s · 2GB
Deepseek AI Deepseek Coder V2 Lite Instruct · Unknown · Q4 · 270.00 tok/s · 1GB
Deepseek AI Deepseek V3.1 · Unknown · Q4 · 270.00 tok/s · 2GB
Deepseek AI Deepseek Coder 1.3B Instruct · 1.3B · Q4 · 270.00 tok/s · 1GB
Deepseek AI Deepseek R1 Distill Qwen 1.5B · 1.5B · Q4 · 270.00 tok/s · 1GB
Deepseek AI Deepseek Ocr · Unknown · Q4 · 225.00 tok/s · 4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bit · 8B · Q4 · 225.00 tok/s · 4GB
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bit · 8B · Q4 · 225.00 tok/s · 4GB
Deepseek AI Deepseek R1 · Unknown · Q4 · 225.00 tok/s · 4GB
Deepseek AI Deepseek R1 0528 · Unknown · Q4 · 225.00 tok/s · 4GB
Deepseek AI Deepseek R1 Distill Llama 8B · 8B · Q4 · 225.00 tok/s · 4GB
Deepseek AI Deepseek R1 Distill Qwen 7B · 7B · Q4 · 225.00 tok/s · 4GB
Nineninesix Kani Tts 2 En · Unknown · Q4 · 216.00 tok/s · 1GB
Nanbeige Nanbeige4 1 3B · 3B · Q4 · 216.00 tok/s · 2GB
Minimaxai Minimax M2 5 · Unknown · Q4 · 216.00 tok/s · 1GB
Minimaxai Minimax M2 1 · Unknown · Q4 · 216.00 tok/s · 1GB
Stepfun AI Step 3 5 Flash · Unknown · Q4 · 216.00 tok/s · 2GB
Qwen Qwen3 Coder Next · Unknown · Q4 · 216.00 tok/s · 2GB
Moonshotai Kimi K2 5 · Unknown · Q4 · 216.00 tok/s · 1GB
Xiaomimimo Mimo V2 Flash · Unknown · Q4 · 216.00 tok/s · 1GB
Nari Labs Dia2 2B · 2B · Q4 · 216.00 tok/s · 1GB
Google Embeddinggemma 300M · Unknown · Q4 · 216.00 tok/s · 1GB
Facebook Sam3 · Unknown · Q4 · 216.00 tok/s · 2GB
Black Forest Labs Flux 2 Dev · Unknown · Q4 · 216.00 tok/s · 1GB
Moonshotai Kimi K2 Thinking · Unknown · Q4 · 216.00 tok/s · 1GB
Microsoft Phi 3 5 Mini Instruct · Unknown · Q4 · 216.00 tok/s · 2GB
Meta Llama Llama 3 2 3B Instruct · 3B · Q4 · 216.00 tok/s · 2GB
Qwen Qwen3 1.7B Base · 1.7B · Q4 · 216.00 tok/s · 1GB
Dicta Il Dictalm2.0 Instruct · Unknown · Q4 · 216.00 tok/s · 1GB
Qwen Qwen2 0.5B Instruct · 0.5B · Q4 · 216.00 tok/s · 1GB
Alibaba Nlp Gte Qwen2 1.5B Instruct · 1.5B · Q4 · 216.00 tok/s · 1GB
Apple Openelm 1 1B Instruct · 1B · Q4 · 216.00 tok/s · 1GB
Qwen Qwen2.5 3B · 3B · Q4 · 216.00 tok/s · 2GB
Unsloth Gemma 3 1B It · 1B · Q4 · 216.00 tok/s · 1GB
Bigcode Starcoder2 3B · 3B · Q4 · 216.00 tok/s · 2GB
Ibm Granite Granite Docling 258M · Unknown · Q4 · 216.00 tok/s · 1GB
Skt Kogpt2 Base V2 · Unknown · Q4 · 216.00 tok/s · 1GB
Google Gemma 3 270M It · Unknown · Q4 · 216.00 tok/s · 1GB
Eleutherai Pythia 70M Deduped · Unknown · Q4 · 216.00 tok/s · 1GB
Microsoft Vibevoice 1.5B · 1.5B · Q4 · 216.00 tok/s · 1GB
Ibm Granite Granite 3.3 2B Instruct · 2B · Q4 · 216.00 tok/s · 1GB
Google Gemma 2B · 2B · Q4 · 216.00 tok/s · 1GB
Trl Internal Testing Tiny Llamaforcausallm 3.2 · Unknown · Q4 · 216.00 tok/s · 2GB
Llamafactory Tiny Random Llama 3 · Unknown · Q4 · 216.00 tok/s · 2GB
Unsloth Llama 3.2 1B Instruct · 1B · Q4 · 216.00 tok/s · 1GB
Numind Nuextract 1.5 · Unknown · Q4 · 216.00 tok/s · 1GB
Hmellor Tiny Random Llamaforcausallm · Unknown · Q4 · 216.00 tok/s · 1GB
Sshleifer Tiny Gpt2 · Unknown · Q4 · 216.00 tok/s · 1GB
Openai Community Gpt2 Xl · Unknown · Q4 · 216.00 tok/s · 1GB
Ibm Research Powermoe 3B · 3B · Q4 · 216.00 tok/s · 2GB
Unsloth Llama 3.2 3B Instruct · 3B · Q4 · 216.00 tok/s · 2GB
Meta Llama Llama 3.2 3B · 3B · Q4 · 216.00 tok/s · 2GB
Eleutherai Gpt Neo 125M · Unknown · Q4 · 216.00 tok/s · 1GB
Meta Llama Llama Guard 3 1B · 1B · Q4 · 216.00 tok/s · 1GB
Qwen Qwen2 1.5B Instruct · 1.5B · Q4 · 216.00 tok/s · 1GB
Google Gemma 2 2B It · 2B · Q4 · 216.00 tok/s · 1GB
Microsoft Phi 3.5 Mini Instruct · Unknown · Q4 · 216.00 tok/s · 2GB
Microsoft Phi 3.5 Vision Instruct · Unknown · Q4 · 216.00 tok/s · 2GB
Rinna Japanese Gpt Neox Small · Unknown · Q4 · 216.00 tok/s · 1GB
Qwen Qwen2.5 Coder 1.5B · 1.5B · Q4 · 216.00 tok/s · 1GB
Microsoft Dialogpt Small · Unknown · Q4 · 216.00 tok/s · 1GB
Qwen Qwen3 0.6B Base · 0.6B · Q4 · 216.00 tok/s · 1GB
Openai Community Gpt2 Medium · Unknown · Q4 · 216.00 tok/s · 1GB
Trl Internal Testing Tiny Random Llamaforcausallm · Unknown · Q4 · 216.00 tok/s · 1GB
Qwen Qwen2.5 Math 1.5B · 1.5B · Q4 · 216.00 tok/s · 1GB
Huggingfacetb Smollm 135M · Unknown · Q4 · 216.00 tok/s · 1GB
Liquidai Lfm2 1.2B · 1.2B · Q4 · 216.00 tok/s · 1GB
Qwen Qwen2 0.5B · 0.5B · Q4 · 216.00 tok/s · 1GB
Minimaxai Minimax M2 · Unknown · Q4 · 216.00 tok/s · 1GB
Huggingfacetb Smollm2 135M · Unknown · Q4 · 216.00 tok/s · 1GB
Microsoft Phi 2 · Unknown · Q4 · 216.00 tok/s · 1GB
Qwen Qwen2.5 0.5B · 0.5B · Q4 · 216.00 tok/s · 1GB
Qwen Qwen2.5 1.5B · 1.5B · Q4 · 216.00 tok/s · 1GB
Qwen Qwen3 Reranker 0.6B · 0.6B · Q4 · 216.00 tok/s · 1GB
Google T5 T5 3B · 3B · Q4 · 216.00 tok/s · 2GB
Qwen Qwen3 1.7B · 1.7B · Q4 · 216.00 tok/s · 1GB
Openai Community Gpt2 Large · Unknown · Q4 · 216.00 tok/s · 1GB

Model compatibility

Showing 240 of 240 compatibility rows.
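
The compatibility rows on this page follow a regular pattern that can be sketched in Python. VRAM needed is the parameter count times an assumed bytes-per-parameter figure, rounded up; the verdict compares that against the card's 24GB; and the Q8 and FP16 speeds are 0.70 and 0.38 times the Q4 estimate. These ratios and the two-way verdict are inferred from the listed rows, not taken from the site's methodology, and the function name is hypothetical.

```python
import math

# Inferred from the listed rows (assumptions, not the site's published method):
# quantization -> (bytes per parameter, speed as a percentage of the Q4 estimate)
QUANTS = {"Q4": (0.5, 100), "Q8": (1.0, 70), "FP16": (2.0, 38)}


def compatibility_rows(params_billion, q4_tok_s, vram_gb=24):
    """Return (quant, vram_needed_gb, verdict, tok_s) for Q4, Q8 and FP16."""
    rows = []
    for quant, (bytes_per_param, speed_pct) in QUANTS.items():
        needed = math.ceil(params_billion * bytes_per_param)
        # Assumed rule: a model fits if its weights are within the card's VRAM.
        verdict = "Fits comfortably" if needed <= vram_gb else "Not supported"
        rows.append((quant, needed, verdict, q4_tok_s * speed_pct / 100))
    return rows


if __name__ == "__main__":
    # Example input: a 34B model with a 63 tok/s Q4 estimate.
    for quant, needed, verdict, tok_s in compatibility_rows(34, 63):
        print(f"{quant}: {needed}GB (have 24GB), {verdict}, {tok_s:.2f} tok/s")
```

For a 34B model this reproduces the 17GB, 34GB, and 68GB figures and shows why only the Q4 variant fits in 24GB.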

Model · Size · Quantization · Verdict · Estimated speed · VRAM needed
01 AI Yi 1 5 34B Chat · 34B · Q4 · Fits comfortably · 63.00 tok/s · 17GB (have 24GB)
01 AI Yi 1 5 34B Chat · 34B · Q8 · Not supported · 44.10 tok/s · 34GB (have 24GB)
01 AI Yi 1 5 34B Chat · 34B · FP16 · Not supported · 23.94 tok/s · 68GB (have 24GB)
AI Forever Rugpt 3.5 13B · 13B · Q4 · Fits comfortably · 135.00 tok/s · 7GB (have 24GB)
AI Forever Rugpt 3.5 13B · 13B · Q8 · Fits comfortably · 94.50 tok/s · 13GB (have 24GB)
AI Forever Rugpt 3.5 13B · 13B · FP16 · Not supported · 51.30 tok/s · 26GB (have 24GB)
AI Mo Kimina Prover 72B · 72B · Q4 · Not supported · 36.00 tok/s · 36GB (have 24GB)
AI Mo Kimina Prover 72B · 72B · Q8 · Not supported · 25.20 tok/s · 72GB (have 24GB)
AI Mo Kimina Prover 72B · 72B · FP16 · Not supported · 13.68 tok/s · 144GB (have 24GB)
Alibaba Nlp Gte Qwen2 1.5B Instruct · 1.5B · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Alibaba Nlp Gte Qwen2 1.5B Instruct · 1.5B · Q8 · Fits comfortably · 151.20 tok/s · 2GB (have 24GB)
Alibaba Nlp Gte Qwen2 1.5B Instruct · 1.5B · FP16 · Fits comfortably · 82.08 tok/s · 3GB (have 24GB)
Allenai Olmo 2 0425 1B · 1B · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Allenai Olmo 2 0425 1B · 1B · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Allenai Olmo 2 0425 1B · 1B · FP16 · Fits comfortably · 82.08 tok/s · 2GB (have 24GB)
Allenai Olmo 3 7B Think · 7B · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Allenai Olmo 3 7B Think · 7B · Q8 · Fits comfortably · 126.00 tok/s · 7GB (have 24GB)
Allenai Olmo 3 7B Think · 7B · FP16 · Fits comfortably · 68.40 tok/s · 14GB (have 24GB)
Apple Openelm 1 1B Instruct · 1B · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Apple Openelm 1 1B Instruct · 1B · Q8 · Fits comfortably · 151.20 tok/s · 2GB (have 24GB)
Apple Openelm 1 1B Instruct · 1B · FP16 · Fits comfortably · 82.08 tok/s · 3GB (have 24GB)
Baichuan Inc Baichuan M2 32B · 32B · Q4 · Fits comfortably · 63.00 tok/s · 16GB (have 24GB)
Baichuan Inc Baichuan M2 32B · 32B · Q8 · Not supported · 44.10 tok/s · 32GB (have 24GB)
Baichuan Inc Baichuan M2 32B · 32B · FP16 · Not supported · 23.94 tok/s · 64GB (have 24GB)
Bigcode Starcoder2 3B · 3B · Q4 · Fits comfortably · 216.00 tok/s · 2GB (have 24GB)
Bigcode Starcoder2 3B · 3B · Q8 · Fits comfortably · 151.20 tok/s · 3GB (have 24GB)
Bigcode Starcoder2 3B · 3B · FP16 · Fits comfortably · 82.08 tok/s · 6GB (have 24GB)
Bigscience Bloomz 560M · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Bigscience Bloomz 560M · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Bigscience Bloomz 560M · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 2GB (have 24GB)
Black Forest Labs Flux 1 Dev · Unknown · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Black Forest Labs Flux 1 Dev · Unknown · Q8 · Fits comfortably · 126.00 tok/s · 7GB (have 24GB)
Black Forest Labs Flux 1 Dev · Unknown · FP16 · Fits comfortably · 68.40 tok/s · 14GB (have 24GB)
Black Forest Labs Flux 2 Dev · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Black Forest Labs Flux 2 Dev · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 2GB (have 24GB)
Black Forest Labs Flux 2 Dev · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 4GB (have 24GB)
Bsc Lt Salamandrata 7B Instruct · 7B · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Bsc Lt Salamandrata 7B Instruct · 7B · Q8 · Fits comfortably · 126.00 tok/s · 7GB (have 24GB)
Bsc Lt Salamandrata 7B Instruct · 7B · FP16 · Fits comfortably · 68.40 tok/s · 14GB (have 24GB)
Codellama Codellama 34B HF · 34B · Q4 · Fits comfortably · 63.00 tok/s · 17GB (have 24GB)
Codellama Codellama 34B HF · 34B · Q8 · Not supported · 44.10 tok/s · 34GB (have 24GB)
Codellama Codellama 34B HF · 34B · FP16 · Not supported · 23.94 tok/s · 68GB (have 24GB)
Context Labs Meta Llama Llama 3.2 3B Instruct FP16 · 3B · Q4 · Fits comfortably · 216.00 tok/s · 2GB (have 24GB)
Context Labs Meta Llama Llama 3.2 3B Instruct FP16 · 3B · Q8 · Fits comfortably · 151.20 tok/s · 3GB (have 24GB)
Context Labs Meta Llama Llama 3.2 3B Instruct FP16 · 3B · FP16 · Fits comfortably · 82.08 tok/s · 6GB (have 24GB)
Deepseek AI Deepseek Coder 1.3B Instruct · 1.3B · Q4 · Fits comfortably · 270.00 tok/s · 1GB (have 24GB)
Deepseek AI Deepseek Coder 1.3B Instruct · 1.3B · Q8 · Fits comfortably · 189.00 tok/s · 2GB (have 24GB)
Deepseek AI Deepseek Coder 1.3B Instruct · 1.3B · FP16 · Fits comfortably · 102.60 tok/s · 3GB (have 24GB)
Deepseek AI Deepseek Coder 33B Instruct · 33B · Q4 · Fits comfortably · 78.75 tok/s · 17GB (have 24GB)
Deepseek AI Deepseek Coder 33B Instruct · 33B · Q8 · Not supported · 55.12 tok/s · 33GB (have 24GB)
Deepseek AI Deepseek Coder 33B Instruct · 33B · FP16 · Not supported · 29.92 tok/s · 66GB (have 24GB)
Deepseek AI Deepseek Coder V2 Instruct 0724 · Unknown · Q4 · Not supported · 45.00 tok/s · 36GB (have 24GB)
Deepseek AI Deepseek Coder V2 Instruct 0724 · Unknown · Q8 · Not supported · 31.50 tok/s · 72GB (have 24GB)
Deepseek AI Deepseek Coder V2 Instruct 0724 · Unknown · FP16 · Not supported · 17.10 tok/s · 144GB (have 24GB)
Deepseek AI Deepseek Coder V2 Lite Instruct · Unknown · Q4 · Fits comfortably · 270.00 tok/s · 1GB (have 24GB)
Deepseek AI Deepseek Coder V2 Lite Instruct · Unknown · Q8 · Fits comfortably · 189.00 tok/s · 2GB (have 24GB)
Deepseek AI Deepseek Coder V2 Lite Instruct · Unknown · FP16 · Fits comfortably · 102.60 tok/s · 4GB (have 24GB)
Deepseek AI Deepseek Math V2 · Unknown · Q4 · Fits comfortably · 270.00 tok/s · 1GB (have 24GB)
Deepseek AI Deepseek Math V2 · Unknown · Q8 · Fits comfortably · 189.00 tok/s · 2GB (have 24GB)
Deepseek AI Deepseek Math V2 · Unknown · FP16 · Fits comfortably · 102.60 tok/s · 4GB (have 24GB)
Deepseek AI Deepseek Ocr · Unknown · Q4 · Fits comfortably · 225.00 tok/s · 4GB (have 24GB)
Deepseek AI Deepseek Ocr · Unknown · Q8 · Fits comfortably · 157.50 tok/s · 7GB (have 24GB)
Deepseek AI Deepseek Ocr · Unknown · FP16 · Fits comfortably · 85.50 tok/s · 14GB (have 24GB)
Deepseek AI Deepseek Ocr 2 · Unknown · Q4 · Fits comfortably · 270.00 tok/s · 1GB (have 24GB)
Deepseek AI Deepseek Ocr 2 · Unknown · Q8 · Fits comfortably · 189.00 tok/s · 2GB (have 24GB)
Deepseek AI Deepseek Ocr 2 · Unknown · FP16 · Fits comfortably · 102.60 tok/s · 4GB (have 24GB)
Deepseek AI Deepseek R1 · Unknown · Q4 · Fits comfortably · 225.00 tok/s · 4GB (have 24GB)
Deepseek AI Deepseek R1 · Unknown · Q8 · Fits comfortably · 157.50 tok/s · 7GB (have 24GB)
Deepseek AI Deepseek R1 · Unknown · FP16 · Fits comfortably · 85.50 tok/s · 14GB (have 24GB)
Deepseek AI Deepseek R1 0528 · Unknown · Q4 · Fits comfortably · 225.00 tok/s · 4GB (have 24GB)
Deepseek AI Deepseek R1 0528 · Unknown · Q8 · Fits comfortably · 157.50 tok/s · 8GB (have 24GB)
Deepseek AI Deepseek R1 0528 · Unknown · FP16 · Fits comfortably · 85.50 tok/s · 16GB (have 24GB)
Deepseek AI Deepseek R1 Distill Llama 8B · 8B · Q4 · Fits comfortably · 225.00 tok/s · 4GB (have 24GB)
Deepseek AI Deepseek R1 Distill Llama 8B · 8B · Q8 · Fits comfortably · 157.50 tok/s · 8GB (have 24GB)
Deepseek AI Deepseek R1 Distill Llama 8B · 8B · FP16 · Fits comfortably · 85.50 tok/s · 16GB (have 24GB)
Deepseek AI Deepseek R1 Distill Qwen 1.5B · 1.5B · Q4 · Fits comfortably · 270.00 tok/s · 1GB (have 24GB)
Deepseek AI Deepseek R1 Distill Qwen 1.5B · 1.5B · Q8 · Fits comfortably · 189.00 tok/s · 2GB (have 24GB)
Deepseek AI Deepseek R1 Distill Qwen 1.5B · 1.5B · FP16 · Fits comfortably · 102.60 tok/s · 3GB (have 24GB)
Deepseek AI Deepseek R1 Distill Qwen 32B · 32B · Q4 · Fits comfortably · 78.75 tok/s · 16GB (have 24GB)
Deepseek AI Deepseek R1 Distill Qwen 32B · 32B · Q8 · Not supported · 55.12 tok/s · 32GB (have 24GB)
Deepseek AI Deepseek R1 Distill Qwen 32B · 32B · FP16 · Not supported · 29.92 tok/s · 64GB (have 24GB)
Deepseek AI Deepseek R1 Distill Qwen 7B · 7B · Q4 · Fits comfortably · 225.00 tok/s · 4GB (have 24GB)
Deepseek AI Deepseek R1 Distill Qwen 7B · 7B · Q8 · Fits comfortably · 157.50 tok/s · 7GB (have 24GB)
Deepseek AI Deepseek R1 Distill Qwen 7B · 7B · FP16 · Fits comfortably · 85.50 tok/s · 14GB (have 24GB)
Deepseek AI Deepseek V2 5 · Unknown · Q4 · Fits comfortably · 270.00 tok/s · 1GB (have 24GB)
Deepseek AI Deepseek V2 5 · Unknown · Q8 · Fits comfortably · 189.00 tok/s · 2GB (have 24GB)
Deepseek AI Deepseek V2 5 · Unknown · FP16 · Fits comfortably · 102.60 tok/s · 4GB (have 24GB)
Deepseek AI Deepseek V3 · Unknown · Q4 · Fits comfortably · 270.00 tok/s · 2GB (have 24GB)
Deepseek AI Deepseek V3 · Unknown · Q8 · Fits comfortably · 189.00 tok/s · 3GB (have 24GB)
Deepseek AI Deepseek V3 · Unknown · FP16 · Fits comfortably · 102.60 tok/s · 6GB (have 24GB)
Deepseek AI Deepseek V3 0324 · Unknown · Q4 · Fits comfortably · 78.75 tok/s · 16GB (have 24GB)
Deepseek AI Deepseek V3 0324 · Unknown · Q8 · Not supported · 55.12 tok/s · 32GB (have 24GB)
Deepseek AI Deepseek V3 0324 · Unknown · FP16 · Not supported · 29.92 tok/s · 64GB (have 24GB)
Deepseek AI Deepseek V3.1 · Unknown · Q4 · Fits comfortably · 270.00 tok/s · 2GB (have 24GB)
Deepseek AI Deepseek V3.1 · Unknown · Q8 · Fits comfortably · 189.00 tok/s · 3GB (have 24GB)
Deepseek AI Deepseek V3.1 · Unknown · FP16 · Fits comfortably · 102.60 tok/s · 6GB (have 24GB)
Dicta Il Dictalm2.0 Instruct · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Dicta Il Dictalm2.0 Instruct · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 2GB (have 24GB)
Dicta Il Dictalm2.0 Instruct · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 4GB (have 24GB)
Distilbert Distilgpt2 · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Distilbert Distilgpt2 · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 2GB (have 24GB)
Distilbert Distilgpt2 · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 4GB (have 24GB)
Dphn Dolphin 2.9.1 Yi 1.5 34B · 34B · Q4 · Fits comfortably · 63.00 tok/s · 17GB (have 24GB)
Dphn Dolphin 2.9.1 Yi 1.5 34B · 34B · Q8 · Not supported · 44.10 tok/s · 34GB (have 24GB)
Dphn Dolphin 2.9.1 Yi 1.5 34B · 34B · FP16 · Not supported · 23.94 tok/s · 68GB (have 24GB)
Eleutherai Gpt Neo 125M · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Eleutherai Gpt Neo 125M · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Eleutherai Gpt Neo 125M · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 1GB (have 24GB)
Eleutherai Pythia 70M Deduped · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Eleutherai Pythia 70M Deduped · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Eleutherai Pythia 70M Deduped · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 1GB (have 24GB)
Essentialai Rnj 1 · Unknown · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Essentialai Rnj 1 · Unknown · Q8 · Fits comfortably · 126.00 tok/s · 7GB (have 24GB)
Essentialai Rnj 1 · Unknown · FP16 · Fits comfortably · 68.40 tok/s · 14GB (have 24GB)
Facebook Opt 125M · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Facebook Opt 125M · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Facebook Opt 125M · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 1GB (have 24GB)
Facebook Sam3 · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 2GB (have 24GB)
Facebook Sam3 · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 3GB (have 24GB)
Facebook Sam3 · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 6GB (have 24GB)
Fireredteam Firered Image Edit 1 0 · Unknown · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Fireredteam Firered Image Edit 1 0 · Unknown · Q8 · Fits comfortably · 126.00 tok/s · 7GB (have 24GB)
Fireredteam Firered Image Edit 1 0 · Unknown · FP16 · Fits comfortably · 68.40 tok/s · 14GB (have 24GB)
Gensyn Qwen2.5 0.5B Instruct · 0.5B · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Gensyn Qwen2.5 0.5B Instruct · 0.5B · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Gensyn Qwen2.5 0.5B Instruct · 0.5B · FP16 · Fits comfortably · 82.08 tok/s · 1GB (have 24GB)
Google Bert Bert Base Uncased · Unknown · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Google Bert Bert Base Uncased · Unknown · Q8 · Fits comfortably · 126.00 tok/s · 7GB (have 24GB)
Google Bert Bert Base Uncased · Unknown · FP16 · Fits comfortably · 68.40 tok/s · 14GB (have 24GB)
Google Embeddinggemma 300M · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Google Embeddinggemma 300M · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Google Embeddinggemma 300M · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 1GB (have 24GB)
Google Gemma 2 27B It · 27B · Q4 · Fits comfortably · 99.00 tok/s · 14GB (have 24GB)
Google Gemma 2 27B It · 27B · Q8 · Not supported · 69.30 tok/s · 27GB (have 24GB)
Google Gemma 2 27B It · 27B · FP16 · Not supported · 37.62 tok/s · 54GB (have 24GB)
Google Gemma 2 2B It · 2B · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Google Gemma 2 2B It · 2B · Q8 · Fits comfortably · 151.20 tok/s · 2GB (have 24GB)
Google Gemma 2 2B It · 2B · FP16 · Fits comfortably · 82.08 tok/s · 4GB (have 24GB)
Google Gemma 2 9B It · 9B · Q4 · Fits comfortably · 135.00 tok/s · 5GB (have 24GB)
Google Gemma 2 9B It · 9B · Q8 · Fits comfortably · 94.50 tok/s · 9GB (have 24GB)
Google Gemma 2 9B It · 9B · FP16 · Fits comfortably · 51.30 tok/s · 18GB (have 24GB)
Google Gemma 2B · 2B · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Google Gemma 2B · 2B · Q8 · Fits comfortably · 151.20 tok/s · 2GB (have 24GB)
Google Gemma 2B · 2B · FP16 · Fits comfortably · 82.08 tok/s · 4GB (have 24GB)
Google Gemma 3 1B It · 1B · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Google Gemma 3 1B It · 1B · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Google Gemma 3 1B It · 1B · FP16 · Fits comfortably · 82.08 tok/s · 2GB (have 24GB)
Google Gemma 3 270M It · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Google Gemma 3 270M It · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Google Gemma 3 270M It · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 1GB (have 24GB)
Google T5 T5 3B · 3B · Q4 · Fits comfortably · 216.00 tok/s · 2GB (have 24GB)
Google T5 T5 3B · 3B · Q8 · Fits comfortably · 151.20 tok/s · 3GB (have 24GB)
Google T5 T5 3B · 3B · FP16 · Fits comfortably · 82.08 tok/s · 6GB (have 24GB)
Gsai Ml Llada 8B Base · 8B · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Gsai Ml Llada 8B Base · 8B · Q8 · Fits comfortably · 126.00 tok/s · 8GB (have 24GB)
Gsai Ml Llada 8B Base · 8B · FP16 · Fits comfortably · 68.40 tok/s · 16GB (have 24GB)
Gsai Ml Llada 8B Instruct · 8B · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Gsai Ml Llada 8B Instruct · 8B · Q8 · Fits comfortably · 126.00 tok/s · 8GB (have 24GB)
Gsai Ml Llada 8B Instruct · 8B · FP16 · Fits comfortably · 68.40 tok/s · 16GB (have 24GB)
Hmellor Tiny Random Llamaforcausallm · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Hmellor Tiny Random Llamaforcausallm · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Hmellor Tiny Random Llamaforcausallm · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 1GB (have 24GB)
Huggingfaceh4 Zephyr 7B Beta · 7B · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Huggingfaceh4 Zephyr 7B Beta · 7B · Q8 · Fits comfortably · 126.00 tok/s · 7GB (have 24GB)
Huggingfaceh4 Zephyr 7B Beta · 7B · FP16 · Fits comfortably · 68.40 tok/s · 14GB (have 24GB)
Huggingfacem4 Tiny Random Llamaforcausallm · Unknown · Q4 · Fits comfortably · 180.00 tok/s · 2GB (have 24GB)
Huggingfacem4 Tiny Random Llamaforcausallm · Unknown · Q8 · Fits comfortably · 126.00 tok/s · 4GB (have 24GB)
Huggingfacem4 Tiny Random Llamaforcausallm · Unknown · FP16 · Fits comfortably · 68.40 tok/s · 8GB (have 24GB)
Huggingfacetb Smollm 135M · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Huggingfacetb Smollm 135M · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Huggingfacetb Smollm 135M · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 1GB (have 24GB)
Huggingfacetb Smollm2 135M · Unknown · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Huggingfacetb Smollm2 135M · Unknown · Q8 · Fits comfortably · 151.20 tok/s · 1GB (have 24GB)
Huggingfacetb Smollm2 135M · Unknown · FP16 · Fits comfortably · 82.08 tok/s · 1GB (have 24GB)
Huggyllama Llama 7B · 7B · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Huggyllama Llama 7B · 7B · Q8 · Fits comfortably · 126.00 tok/s · 7GB (have 24GB)
Huggyllama Llama 7B · 7B · FP16 · Fits comfortably · 68.40 tok/s · 14GB (have 24GB)
Ibm Granite Granite 3.3 2B Instruct · 2B · Q4 · Fits comfortably · 216.00 tok/s · 1GB (have 24GB)
Ibm Granite Granite 3.3 2B Instruct · 2B · Q8 · Fits comfortably · 151.20 tok/s · 2GB (have 24GB)
Ibm Granite Granite 3.3 2B Instruct · 2B · FP16 · Fits comfortably · 82.08 tok/s · 4GB (have 24GB)
Ibm Granite Granite 3.3 8B Instruct · 8B · Q4 · Fits comfortably · 180.00 tok/s · 4GB (have 24GB)
Ibm Granite Granite 3.3 8B Instruct · 8B · Q8 · Fits comfortably · 126.00 tok/s · 8GB (have 24GB)
Ibm Granite Granite 3.3 8B Instruct8BFP16Fits comfortably
68.40 tok/sEstimated
16GB (have 24GB)
Ibm Granite Granite Docling 258MUnknownQ4Fits comfortably
216.00 tok/sEstimated
1GB (have 24GB)
Ibm Granite Granite Docling 258MUnknownQ8Fits comfortably
151.20 tok/sEstimated
1GB (have 24GB)
Ibm Granite Granite Docling 258MUnknownFP16Fits comfortably
82.08 tok/sEstimated
1GB (have 24GB)
Ibm Research Powermoe 3B3BQ4Fits comfortably
216.00 tok/sEstimated
2GB (have 24GB)
Ibm Research Powermoe 3B3BQ8Fits comfortably
151.20 tok/sEstimated
3GB (have 24GB)
Ibm Research Powermoe 3B3BFP16Fits comfortably
82.08 tok/sEstimated
6GB (have 24GB)
Ilyagusev Saiga Llama3 8B · 8B · Q4 · Fits comfortably · 180.00 tok/s (estimated) · 2GB (have 24GB)
Ilyagusev Saiga Llama3 8B · 8B · Q8 · Fits comfortably · 126.00 tok/s (estimated) · 4GB (have 24GB)
Ilyagusev Saiga Llama3 8B · 8B · FP16 · Fits comfortably · 68.40 tok/s (estimated) · 8GB (have 24GB)
Inference Net Schematron 3B · 3B · Q4 · Fits comfortably · 216.00 tok/s (estimated) · 2GB (have 24GB)
Inference Net Schematron 3B · 3B · Q8 · Fits comfortably · 151.20 tok/s (estimated) · 3GB (have 24GB)
Inference Net Schematron 3B · 3B · FP16 · Fits comfortably · 82.08 tok/s (estimated) · 6GB (have 24GB)
Kaitchup Phi 3 Mini 4K Instruct Gptq 4bit · Unknown · Q4 · Fits comfortably · 180.00 tok/s (estimated) · 2GB (have 24GB)
Kaitchup Phi 3 Mini 4K Instruct Gptq 4bit · Unknown · Q8 · Fits comfortably · 126.00 tok/s (estimated) · 4GB (have 24GB)
Kaitchup Phi 3 Mini 4K Instruct Gptq 4bit · Unknown · FP16 · Fits comfortably · 68.40 tok/s (estimated) · 8GB (have 24GB)
Liquidai Lfm2 1.2B · 1.2B · Q4 · Fits comfortably · 216.00 tok/s (estimated) · 1GB (have 24GB)
Liquidai Lfm2 1.2B · 1.2B · Q8 · Fits comfortably · 151.20 tok/s (estimated) · 2GB (have 24GB)
Liquidai Lfm2 1.2B · 1.2B · FP16 · Fits comfortably · 82.08 tok/s (estimated) · 3GB (have 24GB)
Liuhaotian Llava V1.5 7B · 7B · Q4 · Fits comfortably · 180.00 tok/s (estimated) · 4GB (have 24GB)
Liuhaotian Llava V1.5 7B · 7B · Q8 · Fits comfortably · 126.00 tok/s (estimated) · 7GB (have 24GB)
Liuhaotian Llava V1.5 7B · 7B · FP16 · Fits comfortably · 68.40 tok/s (estimated) · 14GB (have 24GB)
Llamafactory Tiny Random Llama 3 · Unknown · Q4 · Fits comfortably · 216.00 tok/s (estimated) · 2GB (have 24GB)
Llamafactory Tiny Random Llama 3 · Unknown · Q8 · Fits comfortably · 151.20 tok/s (estimated) · 3GB (have 24GB)
Llamafactory Tiny Random Llama 3 · Unknown · FP16 · Fits comfortably · 82.08 tok/s (estimated) · 6GB (have 24GB)
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bit · 8B · Q4 · Fits comfortably · 225.00 tok/s (estimated) · 4GB (have 24GB)
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bit · 8B · Q8 · Fits comfortably · 157.50 tok/s (estimated) · 8GB (have 24GB)
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bit · 8B · FP16 · Fits comfortably · 85.50 tok/s (estimated) · 16GB (have 24GB)
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bit · 8B · Q4 · Fits comfortably · 225.00 tok/s (estimated) · 4GB (have 24GB)
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bit · 8B · Q8 · Fits comfortably · 157.50 tok/s (estimated) · 8GB (have 24GB)
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bit · 8B · FP16 · Fits comfortably · 85.50 tok/s (estimated) · 16GB (have 24GB)
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 4bit · 4B · Q4 · Fits comfortably · 180.00 tok/s (estimated) · 2GB (have 24GB)
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 4bit · 4B · Q8 · Fits comfortably · 126.00 tok/s (estimated) · 4GB (have 24GB)
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 4bit · 4B · FP16 · Fits comfortably · 68.40 tok/s (estimated) · 8GB (have 24GB)
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 6bit · 4B · Q4 · Fits comfortably · 180.00 tok/s (estimated) · 2GB (have 24GB)
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 6bit · 4B · Q8 · Fits comfortably · 126.00 tok/s (estimated) · 4GB (have 24GB)
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 6bit · 4B · FP16 · Fits comfortably · 68.40 tok/s (estimated) · 8GB (have 24GB)
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 8bit · 4B · Q4 · Fits comfortably · 180.00 tok/s (estimated) · 2GB (have 24GB)
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 8bit · 4B · Q8 · Fits comfortably · 126.00 tok/s (estimated) · 4GB (have 24GB)
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 8bit · 4B · FP16 · Fits comfortably · 68.40 tok/s (estimated) · 8GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 4bit · 30B · Q4 · Fits comfortably · 99.00 tok/s (estimated) · 15GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 4bit · 30B · Q8 · Not supported · 69.30 tok/s (estimated) · 30GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 4bit · 30B · FP16 · Not supported · 37.62 tok/s (estimated) · 60GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 5bit · 30B · Q4 · Fits comfortably · 99.00 tok/s (estimated) · 15GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 5bit · 30B · Q8 · Not supported · 69.30 tok/s (estimated) · 30GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 5bit · 30B · FP16 · Not supported · 37.62 tok/s (estimated) · 60GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 6bit · 30B · Q4 · Fits comfortably · 99.00 tok/s (estimated) · 15GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 6bit · 30B · Q8 · Not supported · 69.30 tok/s (estimated) · 30GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 6bit · 30B · FP16 · Not supported · 37.62 tok/s (estimated) · 60GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 8bit · 30B · Q4 · Fits comfortably · 99.00 tok/s (estimated) · 15GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 8bit · 30B · Q8 · Not supported · 69.30 tok/s (estimated) · 30GB (have 24GB)
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 8bit · 30B · FP16 · Not supported · 37.62 tok/s (estimated) · 60GB (have 24GB)
Lmsys Vicuna 7B V1.5 · 7B · Q4 · Fits comfortably · 180.00 tok/s (estimated) · 4GB (have 24GB)
Lmsys Vicuna 7B V1.5 · 7B · Q8 · Fits comfortably · 126.00 tok/s (estimated) · 7GB (have 24GB)
Lmsys Vicuna 7B V1.5 · 7B · FP16 · Fits comfortably · 68.40 tok/s (estimated) · 14GB (have 24GB)
Meta Llama Llama 2 13B Chat HF · 13B · Q4 · Fits comfortably · 135.00 tok/s (estimated) · 7GB (have 24GB)
Meta Llama Llama 2 13B Chat HF · 13B · Q8 · Fits comfortably · 94.50 tok/s (estimated) · 13GB (have 24GB)
Meta Llama Llama 2 13B Chat HF · 13B · FP16 · Not supported · 51.30 tok/s (estimated) · 26GB (have 24GB)
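For rows with a known parameter count, the VRAM figures listed here look consistent with a simple rule of thumb: parameters times bytes per weight, rounded up to a whole GB, compared against the 24GB on the card. The Q8 and FP16 speeds likewise appear to be the Q4 speed scaled by roughly 0.70 and 0.38. The sketch below reproduces that arithmetic. It is inferred from the numbers in the table, not from the site's documented method, and the function names, the 24GB cutoff, and the two scaling factors are all assumptions made for illustration. Real usage also needs headroom for the KV cache and activations, which this ignores.

```python
import math

# Assumed bits per weight for each quantization label used in the table.
BITS_PER_WEIGHT = {"Q4": 4, "Q8": 8, "FP16": 16}

# Assumed speed factors relative to Q4, inferred from rows such as
# 216.00 -> 151.20 -> 82.08 tok/s. Not a documented formula.
SPEED_VS_Q4 = {"Q4": 1.0, "Q8": 0.70, "FP16": 0.38}


def estimate_vram_gb(params_billion: float, quant: str) -> int:
    """Weights-only VRAM estimate in whole GB (hypothetical helper)."""
    return math.ceil(params_billion * BITS_PER_WEIGHT[quant] / 8)


def fit_label(required_gb: int, available_gb: int = 24) -> str:
    """Map a requirement to the two labels seen in these rows.

    The cutoff at exactly the available VRAM is an assumption; the page may
    use additional labels or a safety margin.
    """
    return "Fits comfortably" if required_gb <= available_gb else "Not supported"


def scale_speed(q4_tok_s: float, quant: str) -> float:
    """Scale a Q4 throughput estimate to another quantization."""
    return round(q4_tok_s * SPEED_VS_Q4[quant], 2)


if __name__ == "__main__":
    # Example: a 13B model, using 135.00 tok/s as the Q4 estimate.
    for quant in ("Q4", "Q8", "FP16"):
        need = estimate_vram_gb(13, quant)
        print(quant, f"{need}GB", fit_label(need), scale_speed(135.0, quant))
```

Rows whose size is listed as Unknown cannot be checked this way, so their VRAM figures should be treated with more caution.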
Google Gemma 2 2B ItQ4
Size: 2B
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Google Gemma 2 2B ItQ8
Size: 2B
Fits comfortably2GB required · 24GB available
151.20 tok/sEstimated
Google Gemma 2 2B ItFP16
Size: 2B
Fits comfortably4GB required · 24GB available
82.08 tok/sEstimated
Google Gemma 2 9B ItQ4
Size: 9B
Fits comfortably5GB required · 24GB available
135.00 tok/sEstimated
Google Gemma 2 9B ItQ8
Size: 9B
Fits comfortably9GB required · 24GB available
94.50 tok/sEstimated
Google Gemma 2 9B ItFP16
Size: 9B
Fits comfortably18GB required · 24GB available
51.30 tok/sEstimated
Google Gemma 2BQ4
Size: 2B
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Google Gemma 2BQ8
Size: 2B
Fits comfortably2GB required · 24GB available
151.20 tok/sEstimated
Google Gemma 2BFP16
Size: 2B
Fits comfortably4GB required · 24GB available
82.08 tok/sEstimated
Google Gemma 3 1B ItQ4
Size: 1B
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Google Gemma 3 1B ItQ8
Size: 1B
Fits comfortably1GB required · 24GB available
151.20 tok/sEstimated
Google Gemma 3 1B ItFP16
Size: 1B
Fits comfortably2GB required · 24GB available
82.08 tok/sEstimated
Google Gemma 3 270M ItQ4
Size: Unknown
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Google Gemma 3 270M ItQ8
Size: Unknown
Fits comfortably1GB required · 24GB available
151.20 tok/sEstimated
Google Gemma 3 270M ItFP16
Size: Unknown
Fits comfortably1GB required · 24GB available
82.08 tok/sEstimated
Google T5 T5 3BQ4
Size: 3B
Fits comfortably2GB required · 24GB available
216.00 tok/sEstimated
Google T5 T5 3BQ8
Size: 3B
Fits comfortably3GB required · 24GB available
151.20 tok/sEstimated
Google T5 T5 3BFP16
Size: 3B
Fits comfortably6GB required · 24GB available
82.08 tok/sEstimated
Gsai Ml Llada 8B BaseQ4
Size: 8B
Fits comfortably4GB required · 24GB available
180.00 tok/sEstimated
Gsai Ml Llada 8B BaseQ8
Size: 8B
Fits comfortably8GB required · 24GB available
126.00 tok/sEstimated
Gsai Ml Llada 8B BaseFP16
Size: 8B
Fits comfortably16GB required · 24GB available
68.40 tok/sEstimated
Gsai Ml Llada 8B InstructQ4
Size: 8B
Fits comfortably4GB required · 24GB available
180.00 tok/sEstimated
Gsai Ml Llada 8B InstructQ8
Size: 8B
Fits comfortably8GB required · 24GB available
126.00 tok/sEstimated
Gsai Ml Llada 8B InstructFP16
Size: 8B
Fits comfortably16GB required · 24GB available
68.40 tok/sEstimated
Hmellor Tiny Random LlamaforcausallmQ4
Size: Unknown
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Hmellor Tiny Random LlamaforcausallmQ8
Size: Unknown
Fits comfortably1GB required · 24GB available
151.20 tok/sEstimated
Hmellor Tiny Random LlamaforcausallmFP16
Size: Unknown
Fits comfortably1GB required · 24GB available
82.08 tok/sEstimated
Huggingfaceh4 Zephyr 7B BetaQ4
Size: 7B
Fits comfortably4GB required · 24GB available
180.00 tok/sEstimated
Huggingfaceh4 Zephyr 7B BetaQ8
Size: 7B
Fits comfortably7GB required · 24GB available
126.00 tok/sEstimated
Huggingfaceh4 Zephyr 7B BetaFP16
Size: 7B
Fits comfortably14GB required · 24GB available
68.40 tok/sEstimated
Huggingfacem4 Tiny Random LlamaforcausallmQ4
Size: Unknown
Fits comfortably2GB required · 24GB available
180.00 tok/sEstimated
Huggingfacem4 Tiny Random LlamaforcausallmQ8
Size: Unknown
Fits comfortably4GB required · 24GB available
126.00 tok/sEstimated
Huggingfacem4 Tiny Random LlamaforcausallmFP16
Size: Unknown
Fits comfortably8GB required · 24GB available
68.40 tok/sEstimated
Huggingfacetb Smollm 135MQ4
Size: Unknown
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Huggingfacetb Smollm 135MQ8
Size: Unknown
Fits comfortably1GB required · 24GB available
151.20 tok/sEstimated
Huggingfacetb Smollm 135MFP16
Size: Unknown
Fits comfortably1GB required · 24GB available
82.08 tok/sEstimated
Huggingfacetb Smollm2 135MQ4
Size: Unknown
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Huggingfacetb Smollm2 135MQ8
Size: Unknown
Fits comfortably1GB required · 24GB available
151.20 tok/sEstimated
Huggingfacetb Smollm2 135MFP16
Size: Unknown
Fits comfortably1GB required · 24GB available
82.08 tok/sEstimated
Huggyllama Llama 7BQ4
Size: 7B
Fits comfortably4GB required · 24GB available
180.00 tok/sEstimated
Huggyllama Llama 7BQ8
Size: 7B
Fits comfortably7GB required · 24GB available
126.00 tok/sEstimated
Huggyllama Llama 7BFP16
Size: 7B
Fits comfortably14GB required · 24GB available
68.40 tok/sEstimated
Ibm Granite Granite 3.3 2B InstructQ4
Size: 2B
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Ibm Granite Granite 3.3 2B InstructQ8
Size: 2B
Fits comfortably2GB required · 24GB available
151.20 tok/sEstimated
Ibm Granite Granite 3.3 2B InstructFP16
Size: 2B
Fits comfortably4GB required · 24GB available
82.08 tok/sEstimated
Ibm Granite Granite 3.3 8B InstructQ4
Size: 8B
Fits comfortably4GB required · 24GB available
180.00 tok/sEstimated
Ibm Granite Granite 3.3 8B InstructQ8
Size: 8B
Fits comfortably8GB required · 24GB available
126.00 tok/sEstimated
Ibm Granite Granite 3.3 8B InstructFP16
Size: 8B
Fits comfortably16GB required · 24GB available
68.40 tok/sEstimated
Ibm Granite Granite Docling 258MQ4
Size: Unknown
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Ibm Granite Granite Docling 258MQ8
Size: Unknown
Fits comfortably1GB required · 24GB available
151.20 tok/sEstimated
Ibm Granite Granite Docling 258MFP16
Size: Unknown
Fits comfortably1GB required · 24GB available
82.08 tok/sEstimated
Ibm Research Powermoe 3BQ4
Size: 3B
Fits comfortably2GB required · 24GB available
216.00 tok/sEstimated
Ibm Research Powermoe 3BQ8
Size: 3B
Fits comfortably3GB required · 24GB available
151.20 tok/sEstimated
Ibm Research Powermoe 3BFP16
Size: 3B
Fits comfortably6GB required · 24GB available
82.08 tok/sEstimated
Ilyagusev Saiga Llama3 8BQ4
Size: 8B
Fits comfortably2GB required · 24GB available
180.00 tok/sEstimated
Ilyagusev Saiga Llama3 8BQ8
Size: 8B
Fits comfortably4GB required · 24GB available
126.00 tok/sEstimated
Ilyagusev Saiga Llama3 8BFP16
Size: 8B
Fits comfortably8GB required · 24GB available
68.40 tok/sEstimated
Inference Net Schematron 3BQ4
Size: 3B
Fits comfortably2GB required · 24GB available
216.00 tok/sEstimated
Inference Net Schematron 3BQ8
Size: 3B
Fits comfortably3GB required · 24GB available
151.20 tok/sEstimated
Inference Net Schematron 3BFP16
Size: 3B
Fits comfortably6GB required · 24GB available
82.08 tok/sEstimated
Kaitchup Phi 3 Mini 4K Instruct Gptq 4bitQ4
Size: Unknown
Fits comfortably2GB required · 24GB available
180.00 tok/sEstimated
Kaitchup Phi 3 Mini 4K Instruct Gptq 4bitQ8
Size: Unknown
Fits comfortably4GB required · 24GB available
126.00 tok/sEstimated
Kaitchup Phi 3 Mini 4K Instruct Gptq 4bitFP16
Size: Unknown
Fits comfortably8GB required · 24GB available
68.40 tok/sEstimated
Liquidai Lfm2 1.2BQ4
Size: 1.2B
Fits comfortably1GB required · 24GB available
216.00 tok/sEstimated
Liquidai Lfm2 1.2BQ8
Size: 1.2B
Fits comfortably2GB required · 24GB available
151.20 tok/sEstimated
Liquidai Lfm2 1.2BFP16
Size: 1.2B
Fits comfortably3GB required · 24GB available
82.08 tok/sEstimated
Liuhaotian Llava V1.5 7BQ4
Size: 7B
Fits comfortably4GB required · 24GB available
180.00 tok/sEstimated
Liuhaotian Llava V1.5 7BQ8
Size: 7B
Fits comfortably7GB required · 24GB available
126.00 tok/sEstimated
Liuhaotian Llava V1.5 7BFP16
Size: 7B
Fits comfortably14GB required · 24GB available
68.40 tok/sEstimated
Llamafactory Tiny Random Llama 3Q4
Size: Unknown
Fits comfortably2GB required · 24GB available
216.00 tok/sEstimated
Llamafactory Tiny Random Llama 3Q8
Size: Unknown
Fits comfortably3GB required · 24GB available
151.20 tok/sEstimated
Llamafactory Tiny Random Llama 3FP16
Size: Unknown
Fits comfortably6GB required · 24GB available
82.08 tok/sEstimated
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bitQ4
Size: 8B
Fits comfortably4GB required · 24GB available
225.00 tok/sEstimated
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bitQ8
Size: 8B
Fits comfortably8GB required · 24GB available
157.50 tok/sEstimated
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 4bitFP16
Size: 8B
Fits comfortably16GB required · 24GB available
85.50 tok/sEstimated
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bitQ4
Size: 8B
Fits comfortably4GB required · 24GB available
225.00 tok/sEstimated
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bitQ8
Size: 8B
Fits comfortably8GB required · 24GB available
157.50 tok/sEstimated
Lmstudio Community Deepseek R1 0528 Qwen3 8B Mlx 8bitFP16
Size: 8B
Fits comfortably16GB required · 24GB available
85.50 tok/sEstimated
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 4bit · Q4
Size: 4B
Fits comfortably · 2GB required · 24GB available
180.00 tok/s · Estimated
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 4bit · Q8
Size: 4B
Fits comfortably · 4GB required · 24GB available
126.00 tok/s · Estimated
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 4bit · FP16
Size: 4B
Fits comfortably · 8GB required · 24GB available
68.40 tok/s · Estimated
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 6bit · Q4
Size: 4B
Fits comfortably · 2GB required · 24GB available
180.00 tok/s · Estimated
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 6bit · Q8
Size: 4B
Fits comfortably · 4GB required · 24GB available
126.00 tok/s · Estimated
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 6bit · FP16
Size: 4B
Fits comfortably · 8GB required · 24GB available
68.40 tok/s · Estimated
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 8bit · Q4
Size: 4B
Fits comfortably · 2GB required · 24GB available
180.00 tok/s · Estimated
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 8bit · Q8
Size: 4B
Fits comfortably · 4GB required · 24GB available
126.00 tok/s · Estimated
Lmstudio Community Qwen3 4B Thinking 2507 Mlx 8bit · FP16
Size: 4B
Fits comfortably · 8GB required · 24GB available
68.40 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 4bit · Q4
Size: 30B
Fits comfortably · 15GB required · 24GB available
99.00 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 4bit · Q8
Size: 30B
Not supported · 30GB required · 24GB available
69.30 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 4bit · FP16
Size: 30B
Not supported · 60GB required · 24GB available
37.62 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 5bit · Q4
Size: 30B
Fits comfortably · 15GB required · 24GB available
99.00 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 5bit · Q8
Size: 30B
Not supported · 30GB required · 24GB available
69.30 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 5bit · FP16
Size: 30B
Not supported · 60GB required · 24GB available
37.62 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 6bit · Q4
Size: 30B
Fits comfortably · 15GB required · 24GB available
99.00 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 6bit · Q8
Size: 30B
Not supported · 30GB required · 24GB available
69.30 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 6bit · FP16
Size: 30B
Not supported · 60GB required · 24GB available
37.62 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 8bit · Q4
Size: 30B
Fits comfortably · 15GB required · 24GB available
99.00 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 8bit · Q8
Size: 30B
Not supported · 30GB required · 24GB available
69.30 tok/s · Estimated
Lmstudio Community Qwen3 Coder 30B A3b Instruct Mlx 8bit · FP16
Size: 30B
Not supported · 60GB required · 24GB available
37.62 tok/s · Estimated
Lmsys Vicuna 7B V1.5 · Q4
Size: 7B
Fits comfortably · 4GB required · 24GB available
180.00 tok/s · Estimated
Lmsys Vicuna 7B V1.5 · Q8
Size: 7B
Fits comfortably · 7GB required · 24GB available
126.00 tok/s · Estimated
Lmsys Vicuna 7B V1.5 · FP16
Size: 7B
Fits comfortably · 14GB required · 24GB available
68.40 tok/s · Estimated
Meta Llama Llama 2 13B Chat HF · Q4
Size: 13B
Fits comfortably · 7GB required · 24GB available
135.00 tok/s · Estimated
Meta Llama Llama 2 13B Chat HF · Q8
Size: 13B
Fits comfortably · 13GB required · 24GB available
94.50 tok/s · Estimated
Meta Llama Llama 2 13B Chat HF · FP16
Size: 13B
Not supported · 26GB required · 24GB available
51.30 tok/s · Estimated

Popular model checks on RTX 4090

Open direct compatibility pages for this GPU with VRAM fit and estimated speed.

Can rtx-4090 run xgen-universe-capybara?

Static benchmark coverage for xgen-universe-capybara on rtx-4090.

Can rtx-4090 run nineninesix-kani-tts-2-en?

Static benchmark coverage for nineninesix-kani-tts-2-en on rtx-4090.

Can rtx-4090 run fireredteam-firered-image-edit-1-0?

Static benchmark coverage for fireredteam-firered-image-edit-1-0 on rtx-4090.

Can rtx-4090 run nanbeige-nanbeige4-1-3b?

Static benchmark coverage for nanbeige-nanbeige4-1-3b on rtx-4090.

Can rtx-4090 run zai-org-glm-5?

Static benchmark coverage for zai-org-glm-5 on rtx-4090.

Can rtx-4090 run minimaxai-minimax-m2-5?

Static benchmark coverage for minimaxai-minimax-m2-5 on rtx-4090.

Can rtx-4090 run qwen-qwen3-tts-12hz-1-7b-customvoice?

Static benchmark coverage for qwen-qwen3-tts-12hz-1-7b-customvoice on rtx-4090.

Can rtx-4090 run minimaxai-minimax-m2-1?

Static benchmark coverage for minimaxai-minimax-m2-1 on rtx-4090.

Can rtx-4090 run microsoft-vibevoice-asr?

Static benchmark coverage for microsoft-vibevoice-asr on rtx-4090.

Can rtx-4090 run zai-org-glm-ocr?

Static benchmark coverage for zai-org-glm-ocr on rtx-4090.

Can rtx-4090 run zai-org-glm-4-7-flash?

Static benchmark coverage for zai-org-glm-4-7-flash on rtx-4090.

Can rtx-4090 run deepseek-ai-deepseek-ocr-2?

Static benchmark coverage for deepseek-ai-deepseek-ocr-2 on rtx-4090.

Browse all compatibility checks

Note: Performance estimates are calculated, not measured, so real-world results may vary. Methodology · Submit real data
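
The Q8 and FP16 figures in the benchmark rows on this page appear to follow fixed ratios of the Q4 estimate, roughly 0.70 and 0.38 (for example 180.00, 126.00, and 68.40 tok/s). The sketch below reproduces that pattern in Python. The factors are inferred from the rows, not taken from the site's published methodology, and the function name `estimate_tokens_per_sec` is hypothetical.

```python
# Scaling factors inferred from the estimated rows on this page
# (e.g. 180.00 -> 126.00 -> 68.40 tok/s). This is an assumption,
# not the site's documented formula.
QUANT_SPEED_FACTOR = {"Q4": 1.00, "Q8": 0.70, "FP16": 0.38}


def estimate_tokens_per_sec(q4_tokens_per_sec: float, quant: str) -> float:
    """Scale a Q4 throughput estimate to another quantization level."""
    return round(q4_tokens_per_sec * QUANT_SPEED_FACTOR[quant], 2)


if __name__ == "__main__":
    # Reproduce the three rows listed for a 7B model.
    for quant in ("Q4", "Q8", "FP16"):
        print(quant, estimate_tokens_per_sec(180.0, quant), "tok/s")
```

Because the output is a fixed multiple of one baseline, these figures say nothing about context length, batch size, or runtime, which is one reason measured results can differ.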

Complete Your Build

Essential accessories to pair with RTX 4090

Corsair RM750x (2025) 750W
750W is a common minimum for the RTX 40 series; for the 450W RTX 4090, 850W or more is recommended
$119
Corsair Vengeance 32GB DDR5-6000
32GB ideal for AI workloads
$129
Noctua NF-A12x25 PWM
Quiet and efficient cooling
$35
Thermal Grizzly Kryonaut
Premium thermal paste for optimal cooling
$15

Total Bundle Price

All items from Amazon

$298
Individual: $298

💡 Not ready to buy? Try cloud GPUs first

Test RTX 4090 performance in the cloud before investing in hardware. Pay by the hour with no commitment.

Vast.ai from $0.20/hr · RunPod from $0.30/hr · Lambda Labs enterprise-grade

GPU FAQs

Data-backed answers pulled from community benchmarks, manufacturer specs, and live pricing.

What throughput does RTX 4090 deliver on modern 30B models?

Community llama.cpp benchmarks of the ubergarm/Qwen3-30B-A3B-GGUF build show the RTX 4090 sustaining roughly 150–160 tokens/sec with CUDA kernels, keeping decode latency under 7 ms per token.

Source: Reddit – /r/LocalLLaMA (mq59v1k)
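
The latency figure in this answer follows directly from the throughput: per-token latency is the reciprocal of tokens per second. A minimal Python sketch of that conversion, with a hypothetical helper name `ms_per_token`:

```python
def ms_per_token(tokens_per_sec: float) -> float:
    """Average decode latency per token, in milliseconds."""
    return 1000.0 / tokens_per_sec


if __name__ == "__main__":
    # The two ends of the reported 150-160 tok/s range.
    for rate in (150, 160):
        print(f"{rate} tok/s -> {ms_per_token(rate):.2f} ms/token")
    # prints:
    # 150 tok/s -> 6.67 ms/token
    # 160 tok/s -> 6.25 ms/token
```

Both ends of the range land under 7 ms, consistent with the claim above. This is an average over the decode phase and does not include prompt processing time.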

Can a single RTX 4090 keep Llama 3.1 70B Q4 fully in VRAM?

No. Builders loading Llama 3.1 70B Q4_K_M report roughly half the tensor pages spilling to system RAM on a 24 GB 4090, which drags throughput because PCIe becomes the bottleneck. Multi-GPU setups or 48 GB cards avoid the spill.

Source: Reddit – /r/LocalLLaMA (mqcouez)
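
The "GB required" values in the benchmark rows on this page are consistent with a simple weights-only rule: parameters in billions times bytes per parameter, rounded up. The Python sketch below applies that rule to a 70B model. The byte counts are inferred from the rows (Q4 as 0.5, Q8 as 1, FP16 as 2), the function names are hypothetical, and the rule ignores KV cache and runtime overhead, so actual requirements are higher.

```python
import math

# Approximate bytes per parameter, inferred from the rows on this page
# (7B Q4 -> 4GB, 13B Q8 -> 13GB, 30B FP16 -> 60GB). Assumption only:
# real formats such as Q4_K_M use somewhat more than 4 bits per weight.
BYTES_PER_PARAM = {"Q4": 0.5, "Q8": 1.0, "FP16": 2.0}


def weights_gb(params_billion: float, quant: str) -> int:
    """Weights-only memory estimate in GB, rounded up."""
    return math.ceil(params_billion * BYTES_PER_PARAM[quant])


def fits(params_billion: float, quant: str, vram_gb: int = 24) -> bool:
    """True if the weights alone fit within the given VRAM."""
    return weights_gb(params_billion, quant) <= vram_gb


if __name__ == "__main__":
    for size, quant in ((13, "Q4"), (30, "Q4"), (70, "Q4")):
        need = weights_gb(size, quant)
        verdict = "fits" if fits(size, quant) else "does not fit"
        print(f"{size}B {quant}: ~{need}GB, {verdict} in 24GB")
```

Under this rule a 70B model at Q4 needs about 35GB for weights alone, which exceeds 24GB and matches the spill to system RAM described in the answer above.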

How many large models can RTX 4090 run simultaneously?

Power users running multi-4090 racks note that a single 4090 comfortably hosts one 32B-class model; parallel agents or MoE workloads need tensor parallelism across multiple GPUs to keep speeds high.

Source: Reddit – /r/LocalLLaMA (mqwkgv3)

What power supply and connectors does RTX 4090 require?

NVIDIA rates the RTX 4090 at 450 W board power and recommends at least an 850 W PSU with the 16-pin 12VHPWR connector to maintain headroom for AI workloads.

Source: TechPowerUp – RTX 4090 Specs

What is the current street price for RTX 4090?

Our price tracker (Nov 2025) shows the RTX 4090 in stock on Amazon at $1,599.

Source: Supabase price tracker snapshot – 2025-11-03

Alternative GPUs

RTX 4080
16GB

Explore how RTX 4080 stacks up for local inference workloads.

RTX 4070 Ti
12GB

Explore how RTX 4070 Ti stacks up for local inference workloads.

RTX 3090
24GB

Explore how RTX 3090 stacks up for local inference workloads.

RX 7900 XTX
24GB

Explore how RX 7900 XTX stacks up for local inference workloads.

RTX 4070
12GB

Explore how RTX 4070 stacks up for local inference workloads.

Can it play popular games?

Cyberpunk 2077
8GB VRAM

RPG • 2020

Baldur's Gate 3
8GB VRAM

RPG • 2023

Hogwarts Legacy
12GB VRAM

Action RPG • 2023

Starfield
8GB VRAM

RPG • 2023

Alan Wake 2
12GB VRAM

Survival Horror • 2023

Elden Ring
8GB VRAM

Action RPG • 2022

Black Myth: Wukong
12GB VRAM

Action RPG • 2024

Grand Theft Auto VI
12GB VRAM

Action Adventure • 2026

Resident Evil 4 Remake
12GB VRAM

Survival Horror • 2023

Marvel's Spider-Man Remastered
12GB VRAM

Action • 2022

The Last of Us Part I
12GB VRAM

Action Adventure • 2023

Red Dead Redemption 2
8GB VRAM

Action Adventure • 2019

View all 64 compatible games