NVIDIA RTX 4090 vs NVIDIA RTX 3090

How these GPUs compare for running local LLMs — VRAM, bandwidth, price, and per-model fit across popular open-weights models.

54 models compared24 GB vs 24 GB VRAM1008 GB/s vs 936 GB/s$1,999 vs $850
GPU A

NVIDIA RTX 4090

VRAM
24 GB
Bandwidth
1008 GB/s
Street price
$1,999
Vendor
nvidia
GPU BBetter value

NVIDIA RTX 3090

VRAM
24 GB
Bandwidth
936 GB/s
Street price
$850
Vendor
nvidia

The short answer

Both GPUs run all 33 models at nearly identical speeds (33 of 33 models are a tie, 0 models go to one side by a small margin). The specs show a bandwidth difference of just 8% — not enough to matter in practice. NVIDIA RTX 3090 costs $1,149 less (135% cheaper) — NVIDIA RTX 3090 is the better value.

33 equal (both run, <20% diff)
21 too large for either
NVIDIA RTX 3090 saves you $1,149 (135% cheaper) for the same real-world performance.

Model-by-model fit

Click any row for the full breakdown. Tie% shown in the Winner column when both GPUs run the model within 20% of each other.

ModelNVIDIA RTX 4090NVIDIA RTX 3090Winner
c4ai-command-r-v01 35B
35B · command
42 tok/s · Q4_K_M39 tok/s · Q4_K_M≈tie A +8%
Command-R+ 104B
104B · command
Too largeToo large
DeepSeek R1 Distill Llama 8B
8B · deepseek
52 tok/s · FP1648 tok/s · FP16≈tie A +8%
DeepSeek R1 Distill Qwen 14B
14.8B · deepseek
56 tok/s · Q8_052 tok/s · Q8_0≈tie A +8%
DeepSeek R1 Distill Llama 70B
70.6B · deepseek
Too largeToo large
DeepSeek R1 671B
671B · deepseek
Too largeToo large
DeepSeek-V3 685B
685B · deepseek
Too largeToo large
DeepSeek-V3.2 685.4B
685.4B · deepseek
Too largeToo large
gemma-2-9b
9.2B · gemma
45 tok/s · FP1642 tok/s · FP16≈tie A +7%
gemma-2-27b
27.2B · gemma
38 tok/s · Q6_K35 tok/s · Q6_K≈tie A +9%
Llama 3.1 8B Compact
8B · llama
47 tok/s · FP1643 tok/s · FP16≈tie A +9%
CodeLlama 34B
34B · llama
44 tok/s · Q4_K_M41 tok/s · Q4_K_M≈tie A +7%
CodeLlama 34B
34B · llama
44 tok/s · Q4_K_M41 tok/s · Q4_K_M≈tie A +7%
Llama 3.3 70B
70.6B · llama
Too largeToo large
Llama 3.1 70B
70.6B · llama
Too largeToo large
Llama 4 Scout 17B
109B · llama
Too largeToo large
Llama-4-Maverick-17B-128E
400B · llama
Too largeToo large
Llama 3.1 405B
405B · llama
Too largeToo large
Mistral 7B v0.1
7.25B · mistral
57 tok/s · FP1653 tok/s · FP16≈tie A +8%
Codestral 22B
22.2B · mistral
46 tok/s · Q6_K42 tok/s · Q6_K≈tie A +10%
Mixtral 8x7B Instruct v0.1
47B · mixtral
Too largeToo large
Mistral Large 2 123B
123B · mistral
Too largeToo large
Phi-4-mini 3.8B
3.8B · phi
106 tok/s · FP1698 tok/s · FP16≈tie A +8%
Phi-4 14B
14B · phi
53 tok/s · Q8_049 tok/s · Q8_0≈tie A +8%
Qwen 2.5 1.5B
1.5B · qwen
246 tok/s · FP16228 tok/s · FP16≈tie A +8%
Qwen 2.5 3B
3.1B · qwen
128 tok/s · FP16119 tok/s · FP16≈tie A +8%
Qwen3.5-4B
4.7B · qwen
87 tok/s · FP1680 tok/s · FP16≈tie A +9%
Qwen 2.5 7B
7.6B · qwen
55 tok/s · FP1651 tok/s · FP16≈tie A +8%
Qwen 2.5 7B
7.6B · qwen
55 tok/s · FP1651 tok/s · FP16≈tie A +8%
Qwen 3 8B
8B · qwen
47 tok/s · FP1643 tok/s · FP16≈tie A +9%
Qwen3.5-9B
9.7B · qwen
43 tok/s · FP1640 tok/s · FP16≈tie A +8%
Qwen 3 32B
32B · qwen
41 tok/s · Q4_K_M38 tok/s · Q4_K_M≈tie A +8%
Qwen3.5-35B-A3B
36B · qwen
41 tok/s · Q4_K_M38 tok/s · Q4_K_M≈tie A +8%
Qwen 2.5 72B
72.7B · qwen
Too largeToo large
Qwen 2.5 72B
72.7B · qwen
Too largeToo large
Llama 3.2 1B
1.24B · llama
300 tok/s · FP16279 tok/s · FP16≈tie A +8%
Llama 4 Scout 17B
109B · llama
Too largeToo large
DeepSeek R1 671B
671B · deepseek
Too largeToo large
Gemma 3 27B
27B · gemma
40 tok/s · Q5_K_M37 tok/s · Q5_K_M≈tie A +8%
Qwen 3 8B
8B · qwen
47 tok/s · FP1643 tok/s · FP16≈tie A +9%
Qwen 3 32B
32B · qwen
41 tok/s · Q4_K_M38 tok/s · Q4_K_M≈tie A +8%
Llama 3.1 8B Compact
8B · llama
47 tok/s · FP1643 tok/s · FP16≈tie A +9%
Mixtral 8x7B Instruct v0.1
47B · mixtral
Too largeToo large
Mistral Small 3.2 24B
24B · mistral
45 tok/s · Q6_K42 tok/s · Q6_K≈tie A +7%
Command A 111B
111B · command
Too largeToo large
DeepSeek R1 0528
685B · deepseek
Too largeToo large
DeepSeek-V3-0324
684.5B · deepseek
Too largeToo large
DeepSeek-R1-0528-Qwen3-8B
8.2B · qwen
53 tok/s · FP1650 tok/s · FP16≈tie A +6%
Qwen3-235B-A22B-Instruct-2507
235B · qwen
Too largeToo large
Qwen3-30B-A3B-Instruct-2507
30B · qwen
52 tok/s · Q4_K_M48 tok/s · Q4_K_M≈tie A +8%
Qwen3-4B-Instruct-2507
4B · qwen
110 tok/s · FP16102 tok/s · FP16≈tie A +8%
gemma-4-E4B-it
8B · gemma
55 tok/s · FP1651 tok/s · FP16≈tie A +8%
gemma-4-26B-A4B-it
26.5B · gemma
41 tok/s · Q6_K38 tok/s · Q6_K≈tie A +8%
gemma-4-31B-it
32.7B · gemma
48 tok/s · Q4_K_M44 tok/s · Q4_K_M≈tie A +9%

Want a different pairing? Browse all comparisons →

Stay ahead of local AI