Qwen3-235B-A22B-Instruct-2507

Architecture: Mixture of Experts
Total Parameters: 235.0B
Active Parameters: 22.0B

Model Specifications

Layers: 94
Hidden Dimension: 4,096
Attention Heads: 64
KV Heads: 4
Max Context: 262K tokens
Vocabulary Size: 151,936

VRAM Requirements

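VRAM usage for every combination of weight quantization and KV cache format. Each context column shows the total VRAM, with the KV cache share in parentheses. Base overhead: 1.5 GB (CUDA context + activations). Cache formats marked (Exp) are experimental.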

| Quantization | Cache Format | Model Weights | 8K Context | 16K Context | 32K Context | 65K Context | 131K Context | 262K Context |
|---|---|---|---|---|---|---|---|---|
| FP16 16.0 bpw | FP32 | 493.5 GB | 497.94 GB (+2.94 KV) | 500.88 GB (+5.88 KV) | 506.75 GB (+11.75 KV) | 518.5 GB (+23.5 KV) | 542.0 GB (+47.0 KV) | 589.0 GB (+94.0 KV) |
| FP16 16.0 bpw | FP16 | 493.5 GB | 496.47 GB (+1.47 KV) | 497.94 GB (+2.94 KV) | 500.88 GB (+5.88 KV) | 506.75 GB (+11.75 KV) | 518.5 GB (+23.5 KV) | 542.0 GB (+47.0 KV) |
| FP16 16.0 bpw | Q8_0 | 493.5 GB | 495.81 GB (+0.81 KV) | 496.62 GB (+1.62 KV) | 498.23 GB (+3.23 KV) | 501.46 GB (+6.46 KV) | 507.93 GB (+12.93 KV) | 520.85 GB (+25.85 KV) |
| FP16 16.0 bpw | FP8 (Exp) | 493.5 GB | 495.73 GB (+0.73 KV) | 496.47 GB (+1.47 KV) | 497.94 GB (+2.94 KV) | 500.88 GB (+5.88 KV) | 506.75 GB (+11.75 KV) | 518.5 GB (+23.5 KV) |
| FP16 16.0 bpw | Q4_0 (Exp) | 493.5 GB | 495.44 GB (+0.44 KV) | 495.88 GB (+0.88 KV) | 496.76 GB (+1.76 KV) | 498.52 GB (+3.52 KV) | 502.05 GB (+7.05 KV) | 509.1 GB (+14.1 KV) |
| Q8_0 8.0 bpw | FP32 | 246.75 GB | 251.19 GB (+2.94 KV) | 254.12 GB (+5.88 KV) | 260.0 GB (+11.75 KV) | 271.75 GB (+23.5 KV) | 295.25 GB (+47.0 KV) | 342.25 GB (+94.0 KV) |
| Q8_0 8.0 bpw | FP16 | 246.75 GB | 249.72 GB (+1.47 KV) | 251.19 GB (+2.94 KV) | 254.12 GB (+5.88 KV) | 260.0 GB (+11.75 KV) | 271.75 GB (+23.5 KV) | 295.25 GB (+47.0 KV) |
| Q8_0 8.0 bpw | Q8_0 | 246.75 GB | 249.06 GB (+0.81 KV) | 249.87 GB (+1.62 KV) | 251.48 GB (+3.23 KV) | 254.71 GB (+6.46 KV) | 261.18 GB (+12.93 KV) | 274.1 GB (+25.85 KV) |
| Q8_0 8.0 bpw | FP8 (Exp) | 246.75 GB | 248.98 GB (+0.73 KV) | 249.72 GB (+1.47 KV) | 251.19 GB (+2.94 KV) | 254.12 GB (+5.88 KV) | 260.0 GB (+11.75 KV) | 271.75 GB (+23.5 KV) |
| Q8_0 8.0 bpw | Q4_0 (Exp) | 246.75 GB | 248.69 GB (+0.44 KV) | 249.13 GB (+0.88 KV) | 250.01 GB (+1.76 KV) | 251.78 GB (+3.52 KV) | 255.3 GB (+7.05 KV) | 262.35 GB (+14.1 KV) |
| Q4_K_M 4.65 bpw | FP32 | 143.42 GB | 147.86 GB (+2.94 KV) | 150.8 GB (+5.88 KV) | 156.67 GB (+11.75 KV) | 168.42 GB (+23.5 KV) | 191.92 GB (+47.0 KV) | 238.92 GB (+94.0 KV) |
| Q4_K_M 4.65 bpw | FP16 | 143.42 GB | 146.39 GB (+1.47 KV) | 147.86 GB (+2.94 KV) | 150.8 GB (+5.88 KV) | 156.67 GB (+11.75 KV) | 168.42 GB (+23.5 KV) | 191.92 GB (+47.0 KV) |
| Q4_K_M 4.65 bpw | Q8_0 | 143.42 GB | 145.73 GB (+0.81 KV) | 146.54 GB (+1.62 KV) | 148.15 GB (+3.23 KV) | 151.39 GB (+6.46 KV) | 157.85 GB (+12.93 KV) | 170.77 GB (+25.85 KV) |
| Q4_K_M 4.65 bpw | FP8 (Exp) | 143.42 GB | 145.66 GB (+0.73 KV) | 146.39 GB (+1.47 KV) | 147.86 GB (+2.94 KV) | 150.8 GB (+5.88 KV) | 156.67 GB (+11.75 KV) | 168.42 GB (+23.5 KV) |
| Q4_K_M 4.65 bpw | Q4_0 (Exp) | 143.42 GB | 145.36 GB (+0.44 KV) | 145.8 GB (+0.88 KV) | 146.69 GB (+1.76 KV) | 148.45 GB (+3.52 KV) | 151.97 GB (+7.05 KV) | 159.02 GB (+14.1 KV) |
| Q4_K_S 4.58 bpw | FP32 | 141.26 GB | 145.7 GB (+2.94 KV) | 148.64 GB (+5.88 KV) | 154.51 GB (+11.75 KV) | 166.26 GB (+23.5 KV) | 189.76 GB (+47.0 KV) | 236.76 GB (+94.0 KV) |
| Q4_K_S 4.58 bpw | FP16 | 141.26 GB | 144.23 GB (+1.47 KV) | 145.7 GB (+2.94 KV) | 148.64 GB (+5.88 KV) | 154.51 GB (+11.75 KV) | 166.26 GB (+23.5 KV) | 189.76 GB (+47.0 KV) |
| Q4_K_S 4.58 bpw | Q8_0 | 141.26 GB | 143.57 GB (+0.81 KV) | 144.38 GB (+1.62 KV) | 146.0 GB (+3.23 KV) | 149.23 GB (+6.46 KV) | 155.69 GB (+12.93 KV) | 168.61 GB (+25.85 KV) |
| Q4_K_S 4.58 bpw | FP8 (Exp) | 141.26 GB | 143.5 GB (+0.73 KV) | 144.23 GB (+1.47 KV) | 145.7 GB (+2.94 KV) | 148.64 GB (+5.88 KV) | 154.51 GB (+11.75 KV) | 166.26 GB (+23.5 KV) |
| Q4_K_S 4.58 bpw | Q4_0 (Exp) | 141.26 GB | 143.21 GB (+0.44 KV) | 143.65 GB (+0.88 KV) | 144.53 GB (+1.76 KV) | 146.29 GB (+3.52 KV) | 149.81 GB (+7.05 KV) | 156.86 GB (+14.1 KV) |
| Q3_K_M 3.91 bpw | FP32 | 120.6 GB | 125.04 GB (+2.94 KV) | 127.97 GB (+5.88 KV) | 133.85 GB (+11.75 KV) | 145.6 GB (+23.5 KV) | 169.1 GB (+47.0 KV) | 216.1 GB (+94.0 KV) |
| Q3_K_M 3.91 bpw | FP16 | 120.6 GB | 123.57 GB (+1.47 KV) | 125.04 GB (+2.94 KV) | 127.97 GB (+5.88 KV) | 133.85 GB (+11.75 KV) | 145.6 GB (+23.5 KV) | 169.1 GB (+47.0 KV) |
| Q3_K_M 3.91 bpw | Q8_0 | 120.6 GB | 122.91 GB (+0.81 KV) | 123.71 GB (+1.62 KV) | 125.33 GB (+3.23 KV) | 128.56 GB (+6.46 KV) | 135.02 GB (+12.93 KV) | 147.95 GB (+25.85 KV) |
| Q3_K_M 3.91 bpw | FP8 (Exp) | 120.6 GB | 122.83 GB (+0.73 KV) | 123.57 GB (+1.47 KV) | 125.04 GB (+2.94 KV) | 127.97 GB (+5.88 KV) | 133.85 GB (+11.75 KV) | 145.6 GB (+23.5 KV) |
| Q3_K_M 3.91 bpw | Q4_0 (Exp) | 120.6 GB | 122.54 GB (+0.44 KV) | 122.98 GB (+0.88 KV) | 123.86 GB (+1.76 KV) | 125.62 GB (+3.52 KV) | 129.15 GB (+7.05 KV) | 136.2 GB (+14.1 KV) |
| Q2_K 2.63 bpw | FP32 | 81.12 GB | 85.56 GB (+2.94 KV) | 88.49 GB (+5.88 KV) | 94.37 GB (+11.75 KV) | 106.12 GB (+23.5 KV) | 129.62 GB (+47.0 KV) | 176.62 GB (+94.0 KV) |
| Q2_K 2.63 bpw | FP16 | 81.12 GB | 84.09 GB (+1.47 KV) | 85.56 GB (+2.94 KV) | 88.49 GB (+5.88 KV) | 94.37 GB (+11.75 KV) | 106.12 GB (+23.5 KV) | 129.62 GB (+47.0 KV) |
| Q2_K 2.63 bpw | Q8_0 | 81.12 GB | 83.43 GB (+0.81 KV) | 84.23 GB (+1.62 KV) | 85.85 GB (+3.23 KV) | 89.08 GB (+6.46 KV) | 95.54 GB (+12.93 KV) | 108.47 GB (+25.85 KV) |
| Q2_K 2.63 bpw | FP8 (Exp) | 81.12 GB | 83.35 GB (+0.73 KV) | 84.09 GB (+1.47 KV) | 85.56 GB (+2.94 KV) | 88.49 GB (+5.88 KV) | 94.37 GB (+11.75 KV) | 106.12 GB (+23.5 KV) |
| Q2_K 2.63 bpw | Q4_0 (Exp) | 81.12 GB | 83.06 GB (+0.44 KV) | 83.5 GB (+0.88 KV) | 84.38 GB (+1.76 KV) | 86.14 GB (+3.52 KV) | 89.67 GB (+7.05 KV) | 96.72 GB (+14.1 KV) |
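
The KV figures in parentheses grow linearly with context length and with the number of bytes stored per cached element. The following is a minimal sketch of one common way to estimate them, not necessarily how this page computes its numbers. Several inputs are assumptions: a per-head dimension of 128 is not listed in the specifications above (4,096 / 64 would suggest 64, which does not reproduce the table), the 1.1 and 0.6 bytes per element for Q8_0 and Q4_0 are inferred from the table's ratios, and the result is divided by 2**30 because that is what matches the listed values. The function name `kv_cache_size` is hypothetical.

```python
# Bytes per cached element for each cache format.
# FP32/FP16/FP8 follow from their bit widths. The Q8_0 and Q4_0 values are
# ASSUMPTIONS inferred from the ratios in the table, not stated by the page.
BYTES_PER_ELEMENT = {
    "FP32": 4.0,
    "FP16": 2.0,
    "Q8_0": 1.1,
    "FP8": 1.0,
    "Q4_0": 0.6,
}


def kv_cache_size(tokens, cache_format="FP16", layers=94, kv_heads=4, head_dim=128):
    """Estimate KV cache size for a given context length.

    layers and kv_heads come from the specifications; head_dim=128 is an
    ASSUMPTION chosen because it reproduces the table's KV figures.
    """
    # One key vector and one value vector per layer, per KV head, per token.
    elements = 2 * layers * kv_heads * head_dim * tokens
    # Dividing by 2**30 (not 10**9) is what matches the table.
    return elements * BYTES_PER_ELEMENT[cache_format] / 2**30


if __name__ == "__main__":
    for label, tokens in [("8K", 8192), ("262K", 262144)]:
        for fmt in BYTES_PER_ELEMENT:
            print(f"{label:>4} {fmt:<5} {kv_cache_size(tokens, fmt):6.2f}")
```

Under these assumptions the estimate lands within rounding of the table, for example about 2.94 for FP32 at 8K tokens and 94.0 for FP32 at 262K tokens.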

Total VRAM = Model Weights + KV Cache + 1.5 GB overhead. Actual usage may vary ±5% based on inference engine and optimizations.
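
As a rough illustration of the formula above, the sketch below adds the three terms and applies the stated ±5% variance as a pessimistic margin when comparing against available memory. The function names `total_vram_gb` and `fits_in_vram` and the `available_gb` parameter are hypothetical; the input figures are taken from the table.

```python
OVERHEAD_GB = 1.5  # base overhead stated on this page (CUDA context + activations)


def total_vram_gb(weights_gb, kv_cache_gb, overhead_gb=OVERHEAD_GB):
    """Total VRAM = Model Weights + KV Cache + overhead."""
    return weights_gb + kv_cache_gb + overhead_gb


def fits_in_vram(total_gb, available_gb, variance=0.05):
    """Pessimistic check: assume actual usage may run `variance` above the estimate."""
    return total_gb * (1 + variance) <= available_gb


if __name__ == "__main__":
    # Q4_K_M weights with an FP16 cache at 8K context, values from the table.
    total = total_vram_gb(weights_gb=143.42, kv_cache_gb=1.47)
    print(f"estimated total: {total:.2f} GB")
    # available_gb=160 is a hypothetical combined memory budget across GPUs.
    print("fits in 160 GB:", fits_in_vram(total, available_gb=160))
```

Treating the 5% as headroom rather than a symmetric band is a conservative design choice. When weights are split across several GPUs, per-device overhead may differ from the single 1.5 GB figure used here.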
