Qwen3-Coder-480B-A35B-Instruct

Mixture of Experts, 480.0B Total Parameters

Active Parameters: 35.0B

Model Specifications

Layers: 62
Hidden Dimension: 6,144
Attention Heads: 96
KV Heads: 8
Max Context: 262K tokens
Vocabulary Size: 151,936
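
The layer and KV head counts above are what drive KV cache size. The following is a minimal sketch of that arithmetic, not the page's actual calculator. It assumes a per-head dimension of 128, which is not listed in the specifications (it is inferred here because it reproduces the FP16 cache figures in the table below), and it assumes "GB" means 2^30 bytes. The function name `kv_cache_gib` and its parameters are hypothetical.

```python
def kv_cache_gib(context_tokens, layers=62, kv_heads=8, head_dim=128,
                 bytes_per_elem=2.0):
    """Estimate KV cache size in GiB for a single sequence.

    head_dim=128 is an assumption (not listed in the spec sheet);
    bytes_per_elem=2.0 corresponds to an FP16 cache, 4.0 to FP32.
    """
    # Each token stores one key and one value vector per layer per KV head.
    elems_per_token = 2 * layers * kv_heads * head_dim
    return context_tokens * elems_per_token * bytes_per_elem / 2**30


if __name__ == "__main__":
    for ctx in (8192, 32768, 262144):
        print(f"{ctx:>7} tokens: {kv_cache_gib(ctx):.2f} GiB (FP16 cache)")
```

Under these assumptions an FP16 cache at 8,192 tokens comes to about 1.94 GiB and at 262,144 tokens to 62.0 GiB, which matches the "+1.94 KV" and "+62.0 KV" entries in the FP16 cache rows.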

VRAM Requirements

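Estimated VRAM usage for every combination of weight quantization and KV cache format. Weight quantization is given in bits per weight (bpw). Each context column shows total VRAM, with the KV cache share in parentheses. Base overhead: 1.5 GB (CUDA context + activations).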

| Quantization | Cache Format | Model Weights | 8K Context | 16K Context | 32K Context | 65K Context | 131K Context | 262K Context |
|---|---|---|---|---|---|---|---|---|
| FP16 16.0 bpw | FP32 | 1008.0 GB | 1013.38 GB (+3.88 KV) | 1017.25 GB (+7.75 KV) | 1025.0 GB (+15.5 KV) | 1040.5 GB (+31.0 KV) | 1071.5 GB (+62.0 KV) | 1133.5 GB (+124.0 KV) |
| FP16 16.0 bpw | FP16 | 1008.0 GB | 1011.44 GB (+1.94 KV) | 1013.38 GB (+3.88 KV) | 1017.25 GB (+7.75 KV) | 1025.0 GB (+15.5 KV) | 1040.5 GB (+31.0 KV) | 1071.5 GB (+62.0 KV) |
| FP16 16.0 bpw | Q8_0 | 1008.0 GB | 1010.57 GB (+1.07 KV) | 1011.63 GB (+2.13 KV) | 1013.76 GB (+4.26 KV) | 1018.02 GB (+8.53 KV) | 1026.55 GB (+17.05 KV) | 1043.6 GB (+34.1 KV) |
| FP16 16.0 bpw | FP8 (Exp) | 1008.0 GB | 1010.47 GB (+0.97 KV) | 1011.44 GB (+1.94 KV) | 1013.38 GB (+3.88 KV) | 1017.25 GB (+7.75 KV) | 1025.0 GB (+15.5 KV) | 1040.5 GB (+31.0 KV) |
| FP16 16.0 bpw | Q4_0 (Exp) | 1008.0 GB | 1010.08 GB (+0.58 KV) | 1010.66 GB (+1.16 KV) | 1011.83 GB (+2.32 KV) | 1014.15 GB (+4.65 KV) | 1018.8 GB (+9.3 KV) | 1028.1 GB (+18.6 KV) |
| Q8_0 8.0 bpw | FP32 | 504.0 GB | 509.38 GB (+3.88 KV) | 513.25 GB (+7.75 KV) | 521.0 GB (+15.5 KV) | 536.5 GB (+31.0 KV) | 567.5 GB (+62.0 KV) | 629.5 GB (+124.0 KV) |
| Q8_0 8.0 bpw | FP16 | 504.0 GB | 507.44 GB (+1.94 KV) | 509.38 GB (+3.88 KV) | 513.25 GB (+7.75 KV) | 521.0 GB (+15.5 KV) | 536.5 GB (+31.0 KV) | 567.5 GB (+62.0 KV) |
| Q8_0 8.0 bpw | Q8_0 | 504.0 GB | 506.57 GB (+1.07 KV) | 507.63 GB (+2.13 KV) | 509.76 GB (+4.26 KV) | 514.02 GB (+8.53 KV) | 522.55 GB (+17.05 KV) | 539.6 GB (+34.1 KV) |
| Q8_0 8.0 bpw | FP8 (Exp) | 504.0 GB | 506.47 GB (+0.97 KV) | 507.44 GB (+1.94 KV) | 509.38 GB (+3.88 KV) | 513.25 GB (+7.75 KV) | 521.0 GB (+15.5 KV) | 536.5 GB (+31.0 KV) |
| Q8_0 8.0 bpw | Q4_0 (Exp) | 504.0 GB | 506.08 GB (+0.58 KV) | 506.66 GB (+1.16 KV) | 507.82 GB (+2.32 KV) | 510.15 GB (+4.65 KV) | 514.8 GB (+9.3 KV) | 524.1 GB (+18.6 KV) |
| Q4_K_M 4.65 bpw | FP32 | 292.95 GB | 298.32 GB (+3.88 KV) | 302.2 GB (+7.75 KV) | 309.95 GB (+15.5 KV) | 325.45 GB (+31.0 KV) | 356.45 GB (+62.0 KV) | 418.45 GB (+124.0 KV) |
| Q4_K_M 4.65 bpw | FP16 | 292.95 GB | 296.39 GB (+1.94 KV) | 298.32 GB (+3.88 KV) | 302.2 GB (+7.75 KV) | 309.95 GB (+15.5 KV) | 325.45 GB (+31.0 KV) | 356.45 GB (+62.0 KV) |
| Q4_K_M 4.65 bpw | Q8_0 | 292.95 GB | 295.52 GB (+1.07 KV) | 296.58 GB (+2.13 KV) | 298.71 GB (+4.26 KV) | 302.97 GB (+8.53 KV) | 311.5 GB (+17.05 KV) | 328.55 GB (+34.1 KV) |
| Q4_K_M 4.65 bpw | FP8 (Exp) | 292.95 GB | 295.42 GB (+0.97 KV) | 296.39 GB (+1.94 KV) | 298.32 GB (+3.88 KV) | 302.2 GB (+7.75 KV) | 309.95 GB (+15.5 KV) | 325.45 GB (+31.0 KV) |
| Q4_K_M 4.65 bpw | Q4_0 (Exp) | 292.95 GB | 295.03 GB (+0.58 KV) | 295.61 GB (+1.16 KV) | 296.77 GB (+2.32 KV) | 299.1 GB (+4.65 KV) | 303.75 GB (+9.3 KV) | 313.05 GB (+18.6 KV) |
| Q4_K_S 4.58 bpw | FP32 | 288.54 GB | 293.92 GB (+3.88 KV) | 297.79 GB (+7.75 KV) | 305.54 GB (+15.5 KV) | 321.04 GB (+31.0 KV) | 352.04 GB (+62.0 KV) | 414.04 GB (+124.0 KV) |
| Q4_K_S 4.58 bpw | FP16 | 288.54 GB | 291.98 GB (+1.94 KV) | 293.92 GB (+3.88 KV) | 297.79 GB (+7.75 KV) | 305.54 GB (+15.5 KV) | 321.04 GB (+31.0 KV) | 352.04 GB (+62.0 KV) |
| Q4_K_S 4.58 bpw | Q8_0 | 288.54 GB | 291.11 GB (+1.07 KV) | 292.17 GB (+2.13 KV) | 294.3 GB (+4.26 KV) | 298.56 GB (+8.53 KV) | 307.09 GB (+17.05 KV) | 324.14 GB (+34.1 KV) |
| Q4_K_S 4.58 bpw | FP8 (Exp) | 288.54 GB | 291.01 GB (+0.97 KV) | 291.98 GB (+1.94 KV) | 293.92 GB (+3.88 KV) | 297.79 GB (+7.75 KV) | 305.54 GB (+15.5 KV) | 321.04 GB (+31.0 KV) |
| Q4_K_S 4.58 bpw | Q4_0 (Exp) | 288.54 GB | 290.62 GB (+0.58 KV) | 291.2 GB (+1.16 KV) | 292.37 GB (+2.32 KV) | 294.69 GB (+4.65 KV) | 299.34 GB (+9.3 KV) | 308.64 GB (+18.6 KV) |
| Q3_K_M 3.91 bpw | FP32 | 246.33 GB | 251.71 GB (+3.88 KV) | 255.58 GB (+7.75 KV) | 263.33 GB (+15.5 KV) | 278.83 GB (+31.0 KV) | 309.83 GB (+62.0 KV) | 371.83 GB (+124.0 KV) |
| Q3_K_M 3.91 bpw | FP16 | 246.33 GB | 249.77 GB (+1.94 KV) | 251.71 GB (+3.88 KV) | 255.58 GB (+7.75 KV) | 263.33 GB (+15.5 KV) | 278.83 GB (+31.0 KV) | 309.83 GB (+62.0 KV) |
| Q3_K_M 3.91 bpw | Q8_0 | 246.33 GB | 248.9 GB (+1.07 KV) | 249.96 GB (+2.13 KV) | 252.09 GB (+4.26 KV) | 256.36 GB (+8.53 KV) | 264.88 GB (+17.05 KV) | 281.93 GB (+34.1 KV) |
| Q3_K_M 3.91 bpw | FP8 (Exp) | 246.33 GB | 248.8 GB (+0.97 KV) | 249.77 GB (+1.94 KV) | 251.71 GB (+3.88 KV) | 255.58 GB (+7.75 KV) | 263.33 GB (+15.5 KV) | 278.83 GB (+31.0 KV) |
| Q3_K_M 3.91 bpw | Q4_0 (Exp) | 246.33 GB | 248.41 GB (+0.58 KV) | 248.99 GB (+1.16 KV) | 250.16 GB (+2.32 KV) | 252.48 GB (+4.65 KV) | 257.13 GB (+9.3 KV) | 266.43 GB (+18.6 KV) |
| Q2_K 2.63 bpw | FP32 | 165.69 GB | 171.06 GB (+3.88 KV) | 174.94 GB (+7.75 KV) | 182.69 GB (+15.5 KV) | 198.19 GB (+31.0 KV) | 229.19 GB (+62.0 KV) | 291.19 GB (+124.0 KV) |
| Q2_K 2.63 bpw | FP16 | 165.69 GB | 169.13 GB (+1.94 KV) | 171.06 GB (+3.88 KV) | 174.94 GB (+7.75 KV) | 182.69 GB (+15.5 KV) | 198.19 GB (+31.0 KV) | 229.19 GB (+62.0 KV) |
| Q2_K 2.63 bpw | Q8_0 | 165.69 GB | 168.26 GB (+1.07 KV) | 169.32 GB (+2.13 KV) | 171.45 GB (+4.26 KV) | 175.72 GB (+8.53 KV) | 184.24 GB (+17.05 KV) | 201.29 GB (+34.1 KV) |
| Q2_K 2.63 bpw | FP8 (Exp) | 165.69 GB | 168.16 GB (+0.97 KV) | 169.13 GB (+1.94 KV) | 171.06 GB (+3.88 KV) | 174.94 GB (+7.75 KV) | 182.69 GB (+15.5 KV) | 198.19 GB (+31.0 KV) |
| Q2_K 2.63 bpw | Q4_0 (Exp) | 165.69 GB | 167.77 GB (+0.58 KV) | 168.35 GB (+1.16 KV) | 169.51 GB (+2.32 KV) | 171.84 GB (+4.65 KV) | 176.49 GB (+9.3 KV) | 185.79 GB (+18.6 KV) |

Total VRAM = Model Weights + KV Cache + 1.5 GB overhead. Actual usage may vary ±5% based on inference engine and optimizations.
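
The formula above can be sketched in a few lines. This is an illustration under stated assumptions, not the page's own implementation: the 1.05 multiplier on weight size is inferred from the table (480B parameters at 16 bpw is 960 GB, while the table lists 1008.0 GB) and is not stated anywhere in the text. The names `model_weights_gb`, `total_vram_gb`, and `WEIGHT_FACTOR` are hypothetical.

```python
PARAMS_BILLIONS = 480.0   # total parameters, from the spec sheet
OVERHEAD_GB = 1.5         # base overhead (CUDA context + activations)
WEIGHT_FACTOR = 1.05      # assumption: inferred from the table, not documented


def model_weights_gb(bits_per_weight):
    """Approximate weight size in GB for a given quantization (bpw)."""
    return PARAMS_BILLIONS * bits_per_weight / 8 * WEIGHT_FACTOR


def total_vram_gb(bits_per_weight, kv_cache_gb):
    """Total VRAM = Model Weights + KV Cache + 1.5 GB overhead."""
    return model_weights_gb(bits_per_weight) + kv_cache_gb + OVERHEAD_GB


if __name__ == "__main__":
    # Q4_K_M (4.65 bpw) with an FP16 cache at 32K context (+7.75 GB KV).
    total = total_vram_gb(4.65, 7.75)
    # The stated +/-5% variation gives a rough planning range.
    print(f"estimate: {total:.2f} GB "
          f"(range {total * 0.95:.2f} to {total * 1.05:.2f} GB)")
```

With these inputs the estimate is about 302.2 GB, in line with the Q4_K_M / FP16 row at 32K context. Applying the ±5% band before comparing against available GPU memory leaves some headroom for engine-specific differences.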
