Apple M2 Ultra (192GB) vs AMD RX 7900 XTX

How these GPUs compare for running local LLMs: VRAM, bandwidth, price, and per-model fit across popular open-weights models.

54 models compared · 192 GB vs 24 GB VRAM · 800 GB/s vs 960 GB/s · $5,499 vs $879
GPU A (runs more models): Apple M2 Ultra (192GB)

VRAM: 192 GB
Bandwidth: 800 GB/s
Street price: $5,499
Vendor: Apple

GPU B: AMD RX 7900 XTX

VRAM: 24 GB
Bandwidth: 960 GB/s
Street price: $879
Vendor: AMD
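
The bandwidth figures matter because token generation is usually limited by how fast the weights can be read from memory. As a rough sketch (not necessarily the method behind this page's numbers), the ceiling on decode speed for a dense model is memory bandwidth divided by the size of the weights. The function name and the `efficiency` parameter below are hypothetical, added only for illustration.

```python
def estimate_decode_tok_s(bandwidth_gb_s: float,
                          params_billion: float,
                          bits_per_weight: float,
                          efficiency: float = 1.0) -> float:
    """Rough ceiling on tokens/s, assuming every generated token
    streams all weights from memory once (dense model, batch size 1)."""
    # 1e9 params * (bits / 8) bytes each = size of the weights in GB
    weight_gb = params_billion * bits_per_weight / 8
    # efficiency < 1.0 is a hypothetical fudge factor for real-world overhead
    return efficiency * bandwidth_gb_s / weight_gb


# An 8B model at FP16 (16 bits per weight) on each GPU's listed bandwidth
m2_ultra = estimate_decode_tok_s(800, 8, 16)
rx_7900_xtx = estimate_decode_tok_s(960, 8, 16)
print(m2_ultra, rx_7900_xtx, rx_7900_xtx / m2_ultra)
```

For an 8B model at FP16 this gives ceilings of 50 and 60 tok/s, a little above the 41 and 49 tok/s listed for DeepSeek R1 Distill Llama 8B. At the same precision the ratio between the two GPUs is the bandwidth ratio, 960 / 800 = 1.2.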

The short answer

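Apple M2 Ultra (192GB) can run 13 models that AMD RX 7900 XTX can't fit in VRAM, mostly the larger ones. Of the 33 models both can run, AMD RX 7900 XTX is at least 20% faster on 24, and the two are within 20% of each other on the remaining 9. On several of those 24, the AMD card gets its speed by running a lower-precision quantization (for example Q4_K_M against FP16 on the Apple side). If you want headroom for bigger models, Apple M2 Ultra (192GB) is the clear choice.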
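
Whether a model "fits" comes down to the size of its weights at a given quantization against the available memory. The sketch below is one way to estimate that; it is not this page's actual rule. The bits-per-weight values are approximate averages assumed for llama.cpp-style quantizations, the 10% headroom for KV cache and runtime overhead is a guess, and `best_fit` is a hypothetical helper, so its picks may differ from the table.

```python
from typing import Optional

# Approximate average bits per weight (assumed values, not exact),
# ordered from highest to lowest precision.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q6_K": 6.56,
    "Q5_K_M": 5.69,
    "Q4_K_M": 4.85,
}


def weight_gb(params_billion: float, quant: str) -> float:
    """Size of the weights alone, in GB."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def best_fit(params_billion: float, vram_gb: float,
             headroom: float = 0.9) -> Optional[str]:
    """Highest-precision quantization whose weights fit in
    headroom * vram_gb, or None if even Q4_K_M is too large."""
    budget = vram_gb * headroom  # hypothetical 10% reserve for KV cache etc.
    for quant in BITS_PER_WEIGHT:  # dicts keep insertion order
        if weight_gb(params_billion, quant) <= budget:
            return quant
    return None


print(best_fit(70.6, 192))  # a 70.6B model on 192 GB
print(best_fit(70.6, 24))   # the same model on 24 GB
```

Under these assumptions a 70.6B model fits at FP16 in 192 GB (about 141 GB of weights) but not in 24 GB at any listed quantization (about 43 GB even at Q4_K_M).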

13 models A runs, B can't
24 models B is 20%+ faster
9 equal (both run, <20% diff)
8 too large for either
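
The four buckets above can be sketched as a small classification function. This assumes the 20% difference is measured relative to the slower GPU, which the page does not state explicitly; `classify` and its labels are illustrative. Because the displayed tok/s values are rounded, a row sitting right at the threshold may land in a different bucket than the one shown.

```python
from typing import Optional


def classify(a_tok_s: Optional[float], b_tok_s: Optional[float],
             threshold_pct: int = 20) -> str:
    """Bucket one model row. None means the model does not fit on that GPU."""
    if a_tok_s is None and b_tok_s is None:
        return "too large for either"
    if b_tok_s is None:
        return "A (only)"
    if a_tok_s is None:
        return "B (only)"
    slow, fast = sorted((a_tok_s, b_tok_s))
    # fast >= slow * (1 + threshold), written without division
    if fast * 100 >= slow * (100 + threshold_pct):
        return "A (faster)" if a_tok_s > b_tok_s else "B (faster)"
    return "tie"


print(classify(41, 49))      # 49 is 19.5% above 41
print(classify(45, 54))      # 54 is exactly 20% above 45
print(classify(5, None))     # fits only on GPU A
print(classify(None, None))  # fits on neither
```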

Model-by-model fit

The Winner column shows a tie, with the percentage difference, when both GPUs run the model within 20% of each other.

| Model | Apple M2 Ultra (192GB) | AMD RX 7900 XTX | Winner |
|---|---|---|---|
| c4ai-command-r-v01 35B (35B · command) | 10 tok/s · FP16 | 40 tok/s · Q4_K_M | B (faster) |
| Command-R+ 104B (104B · command) | 6 tok/s · Q8_0 | Too large | A (only) |
| DeepSeek R1 Distill Llama 8B (8B · deepseek) | 41 tok/s · FP16 | 49 tok/s · FP16 | ≈ tie, B +20% |
| DeepSeek R1 Distill Qwen 14B (14.8B · deepseek) | 23 tok/s · FP16 | 53 tok/s · Q8_0 | B (faster) |
| DeepSeek R1 Distill Llama 70B (70.6B · deepseek) | 5 tok/s · FP16 | Too large | A (only) |
| DeepSeek R1 671B (671B · deepseek) | Too large | Too large | |
| DeepSeek-V3 685B (685B · deepseek) | Too large | Too large | |
| DeepSeek-V3.2 685.4B (685.4B · deepseek) | Too large | Too large | |
| gemma-2-9b (9.2B · gemma) | 36 tok/s · FP16 | 43 tok/s · FP16 | ≈ tie, B +19% |
| gemma-2-27b (27.2B · gemma) | 12 tok/s · FP16 | 36 tok/s · Q6_K | B (faster) |
| Llama 3.1 8B Compact (8B · llama) | 37 tok/s · FP16 | 44 tok/s · FP16 | ≈ tie, B +19% |
| CodeLlama 34B (34B · llama) | 10 tok/s · FP16 | 42 tok/s · Q4_K_M | B (faster) |
| CodeLlama 34B (34B · llama) | 10 tok/s · FP16 | 42 tok/s · Q4_K_M | B (faster) |
| Llama 3.3 70B (70.6B · llama) | 5 tok/s · FP16 | Too large | A (only) |
| Llama 3.1 70B (70.6B · llama) | 5 tok/s · FP16 | Too large | A (only) |
| Llama 4 Scout 17B (109B · llama) | 5 tok/s · Q8_0 | Too large | A (only) |
| Llama-4-Maverick-17B-128E (400B · llama) | Too large | Too large | |
| Llama 3.1 405B (405B · llama) | Too large | Too large | |
| Mistral 7B v0.1 (7.25B · mistral) | 45 tok/s · FP16 | 54 tok/s · FP16 | B (faster) |
| Codestral 22B (22.2B · mistral) | 15 tok/s · FP16 | 43 tok/s · Q6_K | B (faster) |
| Mixtral 8x7B Instruct v0.1 (47B · mixtral) | 6 tok/s · FP16 | Too large | A (only) |
| Mistral Large 2 123B (123B · mistral) | 5 tok/s · Q8_0 | Too large | A (only) |
| Phi-4-mini 3.8B (3.8B · phi) | 84 tok/s · FP16 | 101 tok/s · FP16 | B (faster) |
| Phi-4 14B (14B · phi) | 21 tok/s · FP16 | 51 tok/s · Q8_0 | B (faster) |
| Qwen 2.5 1.5B (1.5B · qwen) | 195 tok/s · FP16 | 234 tok/s · FP16 | B (faster) |
| Qwen 2.5 3B (3.1B · qwen) | 102 tok/s · FP16 | 122 tok/s · FP16 | ≈ tie, B +20% |
| Qwen3.5-4B (4.7B · qwen) | 69 tok/s · FP16 | 83 tok/s · FP16 | B (faster) |
| Qwen 2.5 7B (7.6B · qwen) | 43 tok/s · FP16 | 52 tok/s · FP16 | B (faster) |
| Qwen 2.5 7B (7.6B · qwen) | 43 tok/s · FP16 | 52 tok/s · FP16 | B (faster) |
| Qwen 3 8B (8B · qwen) | 37 tok/s · FP16 | 44 tok/s · FP16 | ≈ tie, B +19% |
| Qwen3.5-9B (9.7B · qwen) | 34 tok/s · FP16 | 41 tok/s · FP16 | B (faster) |
| Qwen 3 32B (32B · qwen) | 9 tok/s · FP16 | 39 tok/s · Q4_K_M | B (faster) |
| Qwen3.5-35B-A3B (36B · qwen) | 9 tok/s · FP16 | 39 tok/s · Q4_K_M | B (faster) |
| Qwen 2.5 72B (72.7B · qwen) | 5 tok/s · FP16 | Too large | A (only) |
| Qwen 2.5 72B (72.7B · qwen) | 5 tok/s · FP16 | Too large | A (only) |
| Llama 3.2 1B (1.24B · llama) | 238 tok/s · FP16 | 286 tok/s · FP16 | B (faster) |
| Llama 4 Scout 17B (109B · llama) | 5 tok/s · Q8_0 | Too large | A (only) |
| DeepSeek R1 671B (671B · deepseek) | Too large | Too large | |
| Gemma 3 27B (27B · gemma) | 11 tok/s · FP16 | 38 tok/s · Q5_K_M | B (faster) |
| Qwen 3 8B (8B · qwen) | 37 tok/s · FP16 | 44 tok/s · FP16 | ≈ tie, B +19% |
| Qwen 3 32B (32B · qwen) | 9 tok/s · FP16 | 39 tok/s · Q4_K_M | B (faster) |
| Llama 3.1 8B Compact (8B · llama) | 37 tok/s · FP16 | 44 tok/s · FP16 | ≈ tie, B +19% |
| Mixtral 8x7B Instruct v0.1 (47B · mixtral) | 6 tok/s · FP16 | Too large | A (only) |
| Mistral Small 3.2 24B (24B · mistral) | 15 tok/s · FP16 | 43 tok/s · Q6_K | B (faster) |
| Command A 111B (111B · command) | 6 tok/s · Q8_0 | Too large | A (only) |
| DeepSeek R1 0528 (685B · deepseek) | Too large | Too large | |
| DeepSeek-V3-0324 (684.5B · deepseek) | Too large | Too large | |
| DeepSeek-R1-0528-Qwen3-8B (8.2B · qwen) | 42 tok/s · FP16 | 51 tok/s · FP16 | B (faster) |
| Qwen3-235B-A22B-Instruct-2507 (235B · qwen) | 5 tok/s · Q4_K_M | Too large | A (only) |
| Qwen3-30B-A3B-Instruct-2507 (30B · qwen) | 23 tok/s · Q8_0 | 50 tok/s · Q4_K_M | B (faster) |
| Qwen3-4B-Instruct-2507 (4B · qwen) | 87 tok/s · FP16 | 104 tok/s · FP16 | ≈ tie, B +20% |
| gemma-4-E4B-it (8B · gemma) | 44 tok/s · FP16 | 52 tok/s · FP16 | ≈ tie, B +18% |
| gemma-4-26B-A4B-it (26.5B · gemma) | 26 tok/s · Q8_0 | 39 tok/s · Q6_K | B (faster) |
| gemma-4-31B-it (32.7B · gemma) | 21 tok/s · Q8_0 | 45 tok/s · Q4_K_M | B (faster) |
