MiniMax-M2.1

Mixture of Experts, 230.0B Parameters

Active Parameters: 10.0B

Model Specifications

Layers: 62
Hidden Dimension: 3,072
Attention Heads: 48
KV Heads: 8
Max Context: 200K tokens
Vocabulary Size: 200,064
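
The layer count and KV head count above drive the KV cache size at a given context length. The sketch below is a rough estimate, not the method used to build this page. The head dimension is not listed here, so `HEAD_DIM = 128` is an assumption, chosen only because it reproduces the FP16 cache figures in the VRAM table when the result is divided by 2^30. Dividing the hidden dimension by the attention head count would give 64 instead, which yields half those figures. The byte widths for the quantized cache formats (Q8_0, Q4_0) are omitted because their block overhead is not stated on this page.

```python
# Specifications listed on this page.
LAYERS = 62
KV_HEADS = 8

# ASSUMPTION: the head dimension is not listed on this page. 128 is a guess
# that happens to reproduce the FP16 cache figures in the VRAM table.
HEAD_DIM = 128

# Bytes per cached element for the unquantized cache formats.
BYTES_PER_ELEMENT = {"FP32": 4.0, "FP16": 2.0, "FP8": 1.0}


def kv_cache_size(context_tokens: int, cache_format: str = "FP16") -> float:
    """Estimate KV cache size for one sequence, in units of 2**30 bytes.

    Each token stores one key and one value vector per layer per KV head,
    hence the leading factor of 2.
    """
    bytes_per_token = 2 * LAYERS * KV_HEADS * HEAD_DIM * BYTES_PER_ELEMENT[cache_format]
    return bytes_per_token * context_tokens / 2**30


if __name__ == "__main__":
    for context in (8192, 16384, 32768):
        print(context, round(kv_cache_size(context), 2))
```

Because the estimate is linear in both context length and bytes per element, doubling the context or switching from FP16 to FP32 doubles the cache size.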

VRAM Requirements

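VRAM usage for every combination of weight quantization and KV cache format. Each cell shows the total VRAM at that context length, with the KV cache portion in parentheses. Base overhead: 1.5 GB (CUDA context + activations).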

| Quantization | Cache Format | Model Weights | 8K Context | 16K Context | 32K Context | 65K Context | 131K Context | 200K Context |
|---|---|---|---|---|---|---|---|---|
| FP16 16.0 bpw | FP32 | 483.0 GB | 488.38 GB (+3.88 KV) | 492.25 GB (+7.75 KV) | 500.0 GB (+15.5 KV) | 515.5 GB (+31.0 KV) | 546.5 GB (+62.0 KV) | 579.13 GB (+94.63 KV) |
| FP16 16.0 bpw | FP16 | 483.0 GB | 486.44 GB (+1.94 KV) | 488.38 GB (+3.88 KV) | 492.25 GB (+7.75 KV) | 500.0 GB (+15.5 KV) | 515.5 GB (+31.0 KV) | 531.82 GB (+47.32 KV) |
| FP16 16.0 bpw | Q8_0 | 483.0 GB | 485.57 GB (+1.07 KV) | 486.63 GB (+2.13 KV) | 488.76 GB (+4.26 KV) | 493.02 GB (+8.53 KV) | 501.55 GB (+17.05 KV) | 510.52 GB (+26.02 KV) |
| FP16 16.0 bpw | FP8 (Exp) | 483.0 GB | 485.47 GB (+0.97 KV) | 486.44 GB (+1.94 KV) | 488.38 GB (+3.88 KV) | 492.25 GB (+7.75 KV) | 500.0 GB (+15.5 KV) | 508.16 GB (+23.66 KV) |
| FP16 16.0 bpw | Q4_0 (Exp) | 483.0 GB | 485.08 GB (+0.58 KV) | 485.66 GB (+1.16 KV) | 486.82 GB (+2.32 KV) | 489.15 GB (+4.65 KV) | 493.8 GB (+9.3 KV) | 498.69 GB (+14.19 KV) |
| Q8_0 8.0 bpw | FP32 | 241.5 GB | 246.88 GB (+3.88 KV) | 250.75 GB (+7.75 KV) | 258.5 GB (+15.5 KV) | 274.0 GB (+31.0 KV) | 305.0 GB (+62.0 KV) | 337.63 GB (+94.63 KV) |
| Q8_0 8.0 bpw | FP16 | 241.5 GB | 244.94 GB (+1.94 KV) | 246.88 GB (+3.88 KV) | 250.75 GB (+7.75 KV) | 258.5 GB (+15.5 KV) | 274.0 GB (+31.0 KV) | 290.32 GB (+47.32 KV) |
| Q8_0 8.0 bpw | Q8_0 | 241.5 GB | 244.07 GB (+1.07 KV) | 245.13 GB (+2.13 KV) | 247.26 GB (+4.26 KV) | 251.53 GB (+8.53 KV) | 260.05 GB (+17.05 KV) | 269.02 GB (+26.02 KV) |
| Q8_0 8.0 bpw | FP8 (Exp) | 241.5 GB | 243.97 GB (+0.97 KV) | 244.94 GB (+1.94 KV) | 246.88 GB (+3.88 KV) | 250.75 GB (+7.75 KV) | 258.5 GB (+15.5 KV) | 266.66 GB (+23.66 KV) |
| Q8_0 8.0 bpw | Q4_0 (Exp) | 241.5 GB | 243.58 GB (+0.58 KV) | 244.16 GB (+1.16 KV) | 245.32 GB (+2.32 KV) | 247.65 GB (+4.65 KV) | 252.3 GB (+9.3 KV) | 257.19 GB (+14.19 KV) |
| Q4_K_M 4.65 bpw | FP32 | 140.37 GB | 145.75 GB (+3.88 KV) | 149.62 GB (+7.75 KV) | 157.37 GB (+15.5 KV) | 172.87 GB (+31.0 KV) | 203.87 GB (+62.0 KV) | 236.5 GB (+94.63 KV) |
| Q4_K_M 4.65 bpw | FP16 | 140.37 GB | 143.81 GB (+1.94 KV) | 145.75 GB (+3.88 KV) | 149.62 GB (+7.75 KV) | 157.37 GB (+15.5 KV) | 172.87 GB (+31.0 KV) | 189.19 GB (+47.32 KV) |
| Q4_K_M 4.65 bpw | Q8_0 | 140.37 GB | 142.94 GB (+1.07 KV) | 144.0 GB (+2.13 KV) | 146.13 GB (+4.26 KV) | 150.4 GB (+8.53 KV) | 158.92 GB (+17.05 KV) | 167.9 GB (+26.02 KV) |
| Q4_K_M 4.65 bpw | FP8 (Exp) | 140.37 GB | 142.84 GB (+0.97 KV) | 143.81 GB (+1.94 KV) | 145.75 GB (+3.88 KV) | 149.62 GB (+7.75 KV) | 157.37 GB (+15.5 KV) | 165.53 GB (+23.66 KV) |
| Q4_K_M 4.65 bpw | Q4_0 (Exp) | 140.37 GB | 142.45 GB (+0.58 KV) | 143.03 GB (+1.16 KV) | 144.2 GB (+2.32 KV) | 146.52 GB (+4.65 KV) | 151.17 GB (+9.3 KV) | 156.07 GB (+14.19 KV) |
| Q4_K_S 4.58 bpw | FP32 | 138.26 GB | 143.63 GB (+3.88 KV) | 147.51 GB (+7.75 KV) | 155.26 GB (+15.5 KV) | 170.76 GB (+31.0 KV) | 201.76 GB (+62.0 KV) | 234.39 GB (+94.63 KV) |
| Q4_K_S 4.58 bpw | FP16 | 138.26 GB | 141.7 GB (+1.94 KV) | 143.63 GB (+3.88 KV) | 147.51 GB (+7.75 KV) | 155.26 GB (+15.5 KV) | 170.76 GB (+31.0 KV) | 187.08 GB (+47.32 KV) |
| Q4_K_S 4.58 bpw | Q8_0 | 138.26 GB | 140.82 GB (+1.07 KV) | 141.89 GB (+2.13 KV) | 144.02 GB (+4.26 KV) | 148.28 GB (+8.53 KV) | 156.81 GB (+17.05 KV) | 165.78 GB (+26.02 KV) |
| Q4_K_S 4.58 bpw | FP8 (Exp) | 138.26 GB | 140.73 GB (+0.97 KV) | 141.7 GB (+1.94 KV) | 143.63 GB (+3.88 KV) | 147.51 GB (+7.75 KV) | 155.26 GB (+15.5 KV) | 163.42 GB (+23.66 KV) |
| Q4_K_S 4.58 bpw | Q4_0 (Exp) | 138.26 GB | 140.34 GB (+0.58 KV) | 140.92 GB (+1.16 KV) | 142.08 GB (+2.32 KV) | 144.41 GB (+4.65 KV) | 149.06 GB (+9.3 KV) | 153.95 GB (+14.19 KV) |
| Q3_K_M 3.91 bpw | FP32 | 118.03 GB | 123.41 GB (+3.88 KV) | 127.28 GB (+7.75 KV) | 135.03 GB (+15.5 KV) | 150.53 GB (+31.0 KV) | 181.53 GB (+62.0 KV) | 214.17 GB (+94.63 KV) |
| Q3_K_M 3.91 bpw | FP16 | 118.03 GB | 121.47 GB (+1.94 KV) | 123.41 GB (+3.88 KV) | 127.28 GB (+7.75 KV) | 135.03 GB (+15.5 KV) | 150.53 GB (+31.0 KV) | 166.85 GB (+47.32 KV) |
| Q3_K_M 3.91 bpw | Q8_0 | 118.03 GB | 120.6 GB (+1.07 KV) | 121.66 GB (+2.13 KV) | 123.8 GB (+4.26 KV) | 128.06 GB (+8.53 KV) | 136.58 GB (+17.05 KV) | 145.56 GB (+26.02 KV) |
| Q3_K_M 3.91 bpw | FP8 (Exp) | 118.03 GB | 120.5 GB (+0.97 KV) | 121.47 GB (+1.94 KV) | 123.41 GB (+3.88 KV) | 127.28 GB (+7.75 KV) | 135.03 GB (+15.5 KV) | 143.19 GB (+23.66 KV) |
| Q3_K_M 3.91 bpw | Q4_0 (Exp) | 118.03 GB | 120.11 GB (+0.58 KV) | 120.7 GB (+1.16 KV) | 121.86 GB (+2.32 KV) | 124.18 GB (+4.65 KV) | 128.83 GB (+9.3 KV) | 133.73 GB (+14.19 KV) |
| Q2_K 2.63 bpw | FP32 | 79.39 GB | 84.77 GB (+3.88 KV) | 88.64 GB (+7.75 KV) | 96.39 GB (+15.5 KV) | 111.89 GB (+31.0 KV) | 142.89 GB (+62.0 KV) | 175.53 GB (+94.63 KV) |
| Q2_K 2.63 bpw | FP16 | 79.39 GB | 82.83 GB (+1.94 KV) | 84.77 GB (+3.88 KV) | 88.64 GB (+7.75 KV) | 96.39 GB (+15.5 KV) | 111.89 GB (+31.0 KV) | 128.21 GB (+47.32 KV) |
| Q2_K 2.63 bpw | Q8_0 | 79.39 GB | 81.96 GB (+1.07 KV) | 83.02 GB (+2.13 KV) | 85.16 GB (+4.26 KV) | 89.42 GB (+8.53 KV) | 97.94 GB (+17.05 KV) | 106.92 GB (+26.02 KV) |
| Q2_K 2.63 bpw | FP8 (Exp) | 79.39 GB | 81.86 GB (+0.97 KV) | 82.83 GB (+1.94 KV) | 84.77 GB (+3.88 KV) | 88.64 GB (+7.75 KV) | 96.39 GB (+15.5 KV) | 104.55 GB (+23.66 KV) |
| Q2_K 2.63 bpw | Q4_0 (Exp) | 79.39 GB | 81.47 GB (+0.58 KV) | 82.06 GB (+1.16 KV) | 83.22 GB (+2.32 KV) | 85.54 GB (+4.65 KV) | 90.19 GB (+9.3 KV) | 95.09 GB (+14.19 KV) |

Total VRAM = Model Weights + KV Cache + 1.5 GB overhead. Actual usage may vary ±5% based on inference engine and optimizations.
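
The formula above can be applied directly to any row of the table. The following is a minimal sketch: the function names and the 160 GB hardware budget are hypothetical, while the 1.5 GB overhead and ±5% tolerance come from this page. The fit check is deliberately conservative and compares the upper end of the ±5% range against the available memory.

```python
OVERHEAD_GB = 1.5  # base overhead stated on this page
TOLERANCE = 0.05   # the ±5% variation stated on this page


def total_vram_gb(weights_gb: float, kv_cache_gb: float,
                  overhead_gb: float = OVERHEAD_GB) -> float:
    """Total VRAM = model weights + KV cache + fixed overhead."""
    return weights_gb + kv_cache_gb + overhead_gb


def vram_range_gb(total_gb: float, tolerance: float = TOLERANCE) -> tuple[float, float]:
    """Lower and upper bounds after applying the stated tolerance."""
    return total_gb * (1 - tolerance), total_gb * (1 + tolerance)


def fits(total_gb: float, available_gb: float, tolerance: float = TOLERANCE) -> bool:
    """Conservative check: the upper bound must fit in available memory."""
    return vram_range_gb(total_gb, tolerance)[1] <= available_gb


if __name__ == "__main__":
    # Table row: Q4_K_M weights (140.37 GB), FP16 cache at 32K context (+7.75 KV).
    total = total_vram_gb(140.37, 7.75)
    low, high = vram_range_gb(total)
    print(f"total: {total:.2f} GB, range: {low:.2f} to {high:.2f} GB")
    # Hypothetical budget: two 80 GB GPUs with memory pooled across devices.
    print("fits in 160 GB:", fits(total, available_gb=160.0))
```

This treats multi-GPU memory as a single pool, which ignores any per-device duplication an inference engine may introduce.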
