NVIDIA RTX 4080 SUPER vs AMD RX 7900 XTX

How these GPUs compare for running local LLMs — VRAM, bandwidth, price, and per-model fit across popular open-weights models.

54 models compared · 16 GB vs 24 GB VRAM · 736 GB/s vs 960 GB/s · $1,295 vs $879
GPU A: NVIDIA RTX 4080 SUPER

VRAM: 16 GB
Bandwidth: 736 GB/s
Street price: $1,295
Vendor: NVIDIA

GPU B: AMD RX 7900 XTX (runs more models)

VRAM: 24 GB
Bandwidth: 960 GB/s
Street price: $879
Vendor: AMD
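
Two derived figures follow directly from the specs listed above: dollars per gigabyte of VRAM and the memory bandwidth ratio. The sketch below is only arithmetic on those numbers; the dictionary layout and the helper names are illustrative assumptions, not something the comparison itself defines.

```python
# Spec values copied from the two GPU cards above.
# The dict layout and function names are hypothetical, for illustration only.
GPUS = {
    "NVIDIA RTX 4080 SUPER": {"vram_gb": 16, "bandwidth_gbs": 736, "price_usd": 1295},
    "AMD RX 7900 XTX": {"vram_gb": 24, "bandwidth_gbs": 960, "price_usd": 879},
}


def price_per_gb(spec: dict) -> float:
    """Street price divided by VRAM capacity, in USD per GB."""
    return spec["price_usd"] / spec["vram_gb"]


def bandwidth_ratio(spec_b: dict, spec_a: dict) -> float:
    """How many times more memory bandwidth spec_b has than spec_a."""
    return spec_b["bandwidth_gbs"] / spec_a["bandwidth_gbs"]


if __name__ == "__main__":
    a = GPUS["NVIDIA RTX 4080 SUPER"]
    b = GPUS["AMD RX 7900 XTX"]
    print(f"A: ${price_per_gb(a):.1f} per GB of VRAM")   # A: $80.9 per GB of VRAM
    print(f"B: ${price_per_gb(b):.1f} per GB of VRAM")   # B: $36.6 per GB of VRAM
    print(f"B/A bandwidth: {bandwidth_ratio(b, a):.2f}x")  # B/A bandwidth: 1.30x
```

At these street prices GPU B costs less than half as much per gigabyte of VRAM and has roughly 30% more memory bandwidth. The rows further down where both cards run FP16 show a similar gap (for example 54 vs 42 tok/s for Mistral 7B v0.1), though that is an observation from the listed numbers rather than a stated rule.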

The short answer

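AMD RX 7900 XTX can run 10 models that NVIDIA RTX 4080 SUPER can't fit in VRAM, all in the 27B to 36B range. Of the 23 models both can handle, NVIDIA RTX 4080 SUPER is 20%+ faster on 11, AMD RX 7900 XTX is 20%+ faster on 7, and 5 are within 20% of each other. Every NVIDIA win in the table pairs Q8_0 on the 16 GB card against FP16 on the 24 GB card, so those wins partly reflect a lower-precision quantization rather than an identical workload. If you want headroom for bigger models, AMD RX 7900 XTX is the clear choice.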

10 models B runs, A can't
11 models A is 20%+ faster
7 models B is 20%+ faster
5 equal (both run, <20% diff)
21 too large for either
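
The five buckets above can be reproduced from per-model throughput with a simple rule. The sketch below assumes the lead is measured relative to the slower card and rounded to a whole percent, which matches the tie labels shown in the table (for example 53 vs 50 tok/s gives "B +6%"); the function name and the use of None for "Too large" are hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Optional

TIE_THRESHOLD = 0.20  # the "within 20%" rule quoted in the summary


def classify(a_tps: Optional[float], b_tps: Optional[float]) -> str:
    """Return a Winner-column style label for one model.

    a_tps / b_tps are tokens per second on GPU A / GPU B,
    or None when the model is too large for that GPU.
    """
    if a_tps is None and b_tps is None:
        return "too large for either"
    if a_tps is None:
        return "B (only)"
    if b_tps is None:
        return "A (only)"
    fast, slow = max(a_tps, b_tps), min(a_tps, b_tps)
    lead = fast / slow - 1.0  # assumption: lead is relative to the slower card
    leader = "A" if a_tps >= b_tps else "B"
    if lead < TIE_THRESHOLD:
        return f"≈tie {leader} +{round(lead * 100)}%"
    return f"{leader} (faster)"


if __name__ == "__main__":
    # A handful of rows taken from the table: (model, A tok/s, B tok/s)
    rows = [
        ("c4ai-command-r-v01 35B", None, 40),
        ("DeepSeek R1 Distill Llama 8B", 74, 49),
        ("DeepSeek R1 Distill Qwen 14B", 50, 53),
        ("Mistral 7B v0.1", 42, 54),
        ("Command-R+ 104B", None, None),
    ]
    for name, a, b in rows:
        print(f"{name}: {classify(a, b)}")
    # Collapse tie labels into one bucket before counting.
    tally = Counter(classify(a, b).split(" +")[0] for _, a, b in rows)
    print(dict(tally))
```

Note that the rule compares whatever quantization each card ends up running, so a "faster" label does not imply both cards ran the same precision.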

Model-by-model fit

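The Winner column shows a tie percentage when both GPUs run the model within 20% of each other. Each speed is listed with the quantization that GPU runs, and the Winner column is blank when the model is too large for both cards.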

| Model | Parameters · family | NVIDIA RTX 4080 SUPER | AMD RX 7900 XTX | Winner |
|---|---|---|---|---|
| c4ai-command-r-v01 35B | 35B · command | Too large | 40 tok/s · Q4_K_M | B (only) |
| Command-R+ 104B | 104B · command | Too large | Too large | |
| DeepSeek R1 Distill Llama 8B | 8B · deepseek | 74 tok/s · Q8_0 | 49 tok/s · FP16 | A (faster) |
| DeepSeek R1 Distill Qwen 14B | 14.8B · deepseek | 50 tok/s · Q6_K | 53 tok/s · Q8_0 | ≈tie B +6% |
| DeepSeek R1 Distill Llama 70B | 70.6B · deepseek | Too large | Too large | |
| DeepSeek R1 671B | 671B · deepseek | Too large | Too large | |
| DeepSeek-V3 685B | 685B · deepseek | Too large | Too large | |
| DeepSeek-V3.2 685.4B | 685.4B · deepseek | Too large | Too large | |
| gemma-2-9b | 9.2B · gemma | 65 tok/s · Q8_0 | 43 tok/s · FP16 | A (faster) |
| gemma-2-27b | 27.2B · gemma | Too large | 36 tok/s · Q6_K | B (only) |
| Llama 3.1 8B Compact | 8B · llama | 68 tok/s · Q8_0 | 44 tok/s · FP16 | A (faster) |
| CodeLlama 34B | 34B · llama | Too large | 42 tok/s · Q4_K_M | B (only) |
| CodeLlama 34B | 34B · llama | Too large | 42 tok/s · Q4_K_M | B (only) |
| Llama 3.3 70B | 70.6B · llama | Too large | Too large | |
| Llama 3.1 70B | 70.6B · llama | Too large | Too large | |
| Llama 4 Scout 17B | 109B · llama | Too large | Too large | |
| Llama-4-Maverick-17B-128E | 400B · llama | Too large | Too large | |
| Llama 3.1 405B | 405B · llama | Too large | Too large | |
| Mistral 7B v0.1 | 7.25B · mistral | 42 tok/s · FP16 | 54 tok/s · FP16 | B (faster) |
| Codestral 22B | 22.2B · mistral | 47 tok/s · Q4_K_M | 43 tok/s · Q6_K | ≈tie A +9% |
| Mixtral 8x7B Instruct v0.1 | 47B · mixtral | Too large | Too large | |
| Mistral Large 2 123B | 123B · mistral | Too large | Too large | |
| Phi-4-mini 3.8B | 3.8B · phi | 77 tok/s · FP16 | 101 tok/s · FP16 | B (faster) |
| Phi-4 14B | 14B · phi | 48 tok/s · Q6_K | 51 tok/s · Q8_0 | ≈tie B +6% |
| Qwen 2.5 1.5B | 1.5B · qwen | 179 tok/s · FP16 | 234 tok/s · FP16 | B (faster) |
| Qwen 2.5 3B | 3.1B · qwen | 94 tok/s · FP16 | 122 tok/s · FP16 | B (faster) |
| Qwen3.5-4B | 4.7B · qwen | 63 tok/s · FP16 | 83 tok/s · FP16 | B (faster) |
| Qwen 2.5 7B | 7.6B · qwen | 77 tok/s · Q8_0 | 52 tok/s · FP16 | A (faster) |
| Qwen 2.5 7B | 7.6B · qwen | 77 tok/s · Q8_0 | 52 tok/s · FP16 | A (faster) |
| Qwen 3 8B | 8B · qwen | 68 tok/s · Q8_0 | 44 tok/s · FP16 | A (faster) |
| Qwen3.5-9B | 9.7B · qwen | 61 tok/s · Q8_0 | 41 tok/s · FP16 | A (faster) |
| Qwen 3 32B | 32B · qwen | Too large | 39 tok/s · Q4_K_M | B (only) |
| Qwen3.5-35B-A3B | 36B · qwen | Too large | 39 tok/s · Q4_K_M | B (only) |
| Qwen 2.5 72B | 72.7B · qwen | Too large | Too large | |
| Qwen 2.5 72B | 72.7B · qwen | Too large | Too large | |
| Llama 3.2 1B | 1.24B · llama | 219 tok/s · FP16 | 286 tok/s · FP16 | B (faster) |
| Llama 4 Scout 17B | 109B · llama | Too large | Too large | |
| DeepSeek R1 671B | 671B · deepseek | Too large | Too large | |
| Gemma 3 27B | 27B · gemma | Too large | 38 tok/s · Q5_K_M | B (only) |
| Qwen 3 8B | 8B · qwen | 68 tok/s · Q8_0 | 44 tok/s · FP16 | A (faster) |
| Qwen 3 32B | 32B · qwen | Too large | 39 tok/s · Q4_K_M | B (only) |
| Llama 3.1 8B Compact | 8B · llama | 68 tok/s · Q8_0 | 44 tok/s · FP16 | A (faster) |
| Mixtral 8x7B Instruct v0.1 | 47B · mixtral | Too large | Too large | |
| Mistral Small 3.2 24B | 24B · mistral | 47 tok/s · Q4_K_M | 43 tok/s · Q6_K | ≈tie A +9% |
| Command A 111B | 111B · command | Too large | Too large | |
| DeepSeek R1 0528 | 685B · deepseek | Too large | Too large | |
| DeepSeek-V3-0324 | 684.5B · deepseek | Too large | Too large | |
| DeepSeek-R1-0528-Qwen3-8B | 8.2B · qwen | 78 tok/s · Q8_0 | 51 tok/s · FP16 | A (faster) |
| Qwen3-235B-A22B-Instruct-2507 | 235B · qwen | Too large | Too large | |
| Qwen3-30B-A3B-Instruct-2507 | 30B · qwen | Too large | 50 tok/s · Q4_K_M | B (only) |
| Qwen3-4B-Instruct-2507 | 4B · qwen | 80 tok/s · FP16 | 104 tok/s · FP16 | B (faster) |
| gemma-4-E4B-it | 8B · gemma | 80 tok/s · Q8_0 | 52 tok/s · FP16 | A (faster) |
| gemma-4-26B-A4B-it | 26.5B · gemma | 43 tok/s · Q4_K_M | 39 tok/s · Q6_K | ≈tie A +10% |
| gemma-4-31B-it | 32.7B · gemma | Too large | 45 tok/s · Q4_K_M | B (only) |
