
Kimi-k2.5

MLA (Multi-head Latent Attention), 1000.0B Parameters

Active Parameters: 32.0B

Model Specifications

Layers: 61
Hidden Dimension: 7,168
Attention Heads: 64
Max Context: 262K tokens
Vocabulary Size: 163,840
KV LoRA Rank: 512
RoPE Dimension: 64
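
The KV cache figures in the VRAM table on this page appear consistent with a compressed MLA cache that stores KV LoRA Rank + RoPE Dimension values per token per layer. The sketch below works through that arithmetic. The formula, the function name, and the conversion by 2**30 bytes are assumptions inferred from the published numbers, not something the page states.

```python
# Specs taken from the Model Specifications list.
LAYERS = 61
KV_LORA_RANK = 512
ROPE_DIM = 64


def mla_kv_cache_size(context_tokens: int, bytes_per_element: float) -> float:
    """Estimated MLA KV cache size, in units of 2**30 bytes.

    Assumed formula: tokens x layers x (kv_lora_rank + rope_dim) x bytes.
    """
    elements_per_token = LAYERS * (KV_LORA_RANK + ROPE_DIM)
    return context_tokens * elements_per_token * bytes_per_element / 2**30


# FP16 cache (2 bytes per element) at the shortest and longest listed contexts.
print(round(mla_kv_cache_size(8 * 1024, 2), 2))    # 0.54
print(round(mla_kv_cache_size(256 * 1024, 2), 2))  # 17.16
```

Both results match the "+0.54 KV" and "+17.16 KV" entries for the FP16 cache format, which suggests, but does not prove, that the table was produced this way.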

VRAM Requirements

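Estimated total VRAM for every combination of weight quantization and KV cache format, by context length. Each total includes the model weights, the KV cache (shown in parentheses), and a base overhead of 1.5 GB (CUDA context + activations).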

| Quantization | Cache Format | Model Weights | 8K Context | 16K Context | 32K Context | 65K Context | 131K Context | 262K Context |
|---|---|---|---|---|---|---|---|---|
| FP16 16.0 bpw | FP32 | 2100.0 GB | 2102.57 GB (+1.07 KV) | 2103.64 GB (+2.14 KV) | 2105.79 GB (+4.29 KV) | 2110.08 GB (+8.58 KV) | 2118.66 GB (+17.16 KV) | 2135.81 GB (+34.31 KV) |
| FP16 16.0 bpw | FP16 | 2100.0 GB | 2102.04 GB (+0.54 KV) | 2102.57 GB (+1.07 KV) | 2103.64 GB (+2.14 KV) | 2105.79 GB (+4.29 KV) | 2110.08 GB (+8.58 KV) | 2118.66 GB (+17.16 KV) |
| FP16 16.0 bpw | Q8_0 | 2100.0 GB | 2101.79 GB (+0.29 KV) | 2102.09 GB (+0.59 KV) | 2102.68 GB (+1.18 KV) | 2103.86 GB (+2.36 KV) | 2106.22 GB (+4.72 KV) | 2110.94 GB (+9.44 KV) |
| FP16 16.0 bpw | FP8 (Exp) | 2100.0 GB | 2101.77 GB (+0.27 KV) | 2102.04 GB (+0.54 KV) | 2102.57 GB (+1.07 KV) | 2103.64 GB (+2.14 KV) | 2105.79 GB (+4.29 KV) | 2110.08 GB (+8.58 KV) |
| FP16 16.0 bpw | Q4_0 (Exp) | 2100.0 GB | 2101.66 GB (+0.16 KV) | 2101.82 GB (+0.32 KV) | 2102.14 GB (+0.64 KV) | 2102.79 GB (+1.29 KV) | 2104.07 GB (+2.57 KV) | 2106.65 GB (+5.15 KV) |
| Q8_0 8.0 bpw | FP32 | 1050.0 GB | 1052.57 GB (+1.07 KV) | 1053.64 GB (+2.14 KV) | 1055.79 GB (+4.29 KV) | 1060.08 GB (+8.58 KV) | 1068.66 GB (+17.16 KV) | 1085.81 GB (+34.31 KV) |
| Q8_0 8.0 bpw | FP16 | 1050.0 GB | 1052.04 GB (+0.54 KV) | 1052.57 GB (+1.07 KV) | 1053.64 GB (+2.14 KV) | 1055.79 GB (+4.29 KV) | 1060.08 GB (+8.58 KV) | 1068.66 GB (+17.16 KV) |
| Q8_0 8.0 bpw | Q8_0 | 1050.0 GB | 1051.79 GB (+0.29 KV) | 1052.09 GB (+0.59 KV) | 1052.68 GB (+1.18 KV) | 1053.86 GB (+2.36 KV) | 1056.22 GB (+4.72 KV) | 1060.94 GB (+9.44 KV) |
| Q8_0 8.0 bpw | FP8 (Exp) | 1050.0 GB | 1051.77 GB (+0.27 KV) | 1052.04 GB (+0.54 KV) | 1052.57 GB (+1.07 KV) | 1053.64 GB (+2.14 KV) | 1055.79 GB (+4.29 KV) | 1060.08 GB (+8.58 KV) |
| Q8_0 8.0 bpw | Q4_0 (Exp) | 1050.0 GB | 1051.66 GB (+0.16 KV) | 1051.82 GB (+0.32 KV) | 1052.14 GB (+0.64 KV) | 1052.79 GB (+1.29 KV) | 1054.07 GB (+2.57 KV) | 1056.65 GB (+5.15 KV) |
| Q4_K_M 4.65 bpw | FP32 | 610.31 GB | 612.88 GB (+1.07 KV) | 613.96 GB (+2.14 KV) | 616.1 GB (+4.29 KV) | 620.39 GB (+8.58 KV) | 628.97 GB (+17.16 KV) | 646.12 GB (+34.31 KV) |
| Q4_K_M 4.65 bpw | FP16 | 610.31 GB | 612.35 GB (+0.54 KV) | 612.88 GB (+1.07 KV) | 613.96 GB (+2.14 KV) | 616.1 GB (+4.29 KV) | 620.39 GB (+8.58 KV) | 628.97 GB (+17.16 KV) |
| Q4_K_M 4.65 bpw | Q8_0 | 610.31 GB | 612.11 GB (+0.29 KV) | 612.4 GB (+0.59 KV) | 612.99 GB (+1.18 KV) | 614.17 GB (+2.36 KV) | 616.53 GB (+4.72 KV) | 621.25 GB (+9.44 KV) |
| Q4_K_M 4.65 bpw | FP8 (Exp) | 610.31 GB | 612.08 GB (+0.27 KV) | 612.35 GB (+0.54 KV) | 612.88 GB (+1.07 KV) | 613.96 GB (+2.14 KV) | 616.1 GB (+4.29 KV) | 620.39 GB (+8.58 KV) |
| Q4_K_M 4.65 bpw | Q4_0 (Exp) | 610.31 GB | 611.97 GB (+0.16 KV) | 612.13 GB (+0.32 KV) | 612.46 GB (+0.64 KV) | 613.1 GB (+1.29 KV) | 614.39 GB (+2.57 KV) | 616.96 GB (+5.15 KV) |
| Q4_K_S 4.58 bpw | FP32 | 601.12 GB | 603.7 GB (+1.07 KV) | 604.77 GB (+2.14 KV) | 606.91 GB (+4.29 KV) | 611.2 GB (+8.58 KV) | 619.78 GB (+17.16 KV) | 636.94 GB (+34.31 KV) |
| Q4_K_S 4.58 bpw | FP16 | 601.12 GB | 603.16 GB (+0.54 KV) | 603.7 GB (+1.07 KV) | 604.77 GB (+2.14 KV) | 606.91 GB (+4.29 KV) | 611.2 GB (+8.58 KV) | 619.78 GB (+17.16 KV) |
| Q4_K_S 4.58 bpw | Q8_0 | 601.12 GB | 602.92 GB (+0.29 KV) | 603.21 GB (+0.59 KV) | 603.8 GB (+1.18 KV) | 604.98 GB (+2.36 KV) | 607.34 GB (+4.72 KV) | 612.06 GB (+9.44 KV) |
| Q4_K_S 4.58 bpw | FP8 (Exp) | 601.12 GB | 602.89 GB (+0.27 KV) | 603.16 GB (+0.54 KV) | 603.7 GB (+1.07 KV) | 604.77 GB (+2.14 KV) | 606.91 GB (+4.29 KV) | 611.2 GB (+8.58 KV) |
| Q4_K_S 4.58 bpw | Q4_0 (Exp) | 601.12 GB | 602.79 GB (+0.16 KV) | 602.95 GB (+0.32 KV) | 603.27 GB (+0.64 KV) | 603.91 GB (+1.29 KV) | 605.2 GB (+2.57 KV) | 607.77 GB (+5.15 KV) |
| Q3_K_M 3.91 bpw | FP32 | 513.19 GB | 515.76 GB (+1.07 KV) | 516.83 GB (+2.14 KV) | 518.98 GB (+4.29 KV) | 523.27 GB (+8.58 KV) | 531.84 GB (+17.16 KV) | 549.0 GB (+34.31 KV) |
| Q3_K_M 3.91 bpw | FP16 | 513.19 GB | 515.22 GB (+0.54 KV) | 515.76 GB (+1.07 KV) | 516.83 GB (+2.14 KV) | 518.98 GB (+4.29 KV) | 523.27 GB (+8.58 KV) | 531.84 GB (+17.16 KV) |
| Q3_K_M 3.91 bpw | Q8_0 | 513.19 GB | 514.98 GB (+0.29 KV) | 515.28 GB (+0.59 KV) | 515.87 GB (+1.18 KV) | 517.05 GB (+2.36 KV) | 519.41 GB (+4.72 KV) | 524.12 GB (+9.44 KV) |
| Q3_K_M 3.91 bpw | FP8 (Exp) | 513.19 GB | 514.96 GB (+0.27 KV) | 515.22 GB (+0.54 KV) | 515.76 GB (+1.07 KV) | 516.83 GB (+2.14 KV) | 518.98 GB (+4.29 KV) | 523.27 GB (+8.58 KV) |
| Q3_K_M 3.91 bpw | Q4_0 (Exp) | 513.19 GB | 514.85 GB (+0.16 KV) | 515.01 GB (+0.32 KV) | 515.33 GB (+0.64 KV) | 515.97 GB (+1.29 KV) | 517.26 GB (+2.57 KV) | 519.83 GB (+5.15 KV) |
| Q2_K 2.63 bpw | FP32 | 345.19 GB | 347.76 GB (+1.07 KV) | 348.83 GB (+2.14 KV) | 350.98 GB (+4.29 KV) | 355.27 GB (+8.58 KV) | 363.84 GB (+17.16 KV) | 381.0 GB (+34.31 KV) |
| Q2_K 2.63 bpw | FP16 | 345.19 GB | 347.22 GB (+0.54 KV) | 347.76 GB (+1.07 KV) | 348.83 GB (+2.14 KV) | 350.98 GB (+4.29 KV) | 355.27 GB (+8.58 KV) | 363.84 GB (+17.16 KV) |
| Q2_K 2.63 bpw | Q8_0 | 345.19 GB | 346.98 GB (+0.29 KV) | 347.28 GB (+0.59 KV) | 347.87 GB (+1.18 KV) | 349.05 GB (+2.36 KV) | 351.41 GB (+4.72 KV) | 356.12 GB (+9.44 KV) |
| Q2_K 2.63 bpw | FP8 (Exp) | 345.19 GB | 346.96 GB (+0.27 KV) | 347.22 GB (+0.54 KV) | 347.76 GB (+1.07 KV) | 348.83 GB (+2.14 KV) | 350.98 GB (+4.29 KV) | 355.27 GB (+8.58 KV) |
| Q2_K 2.63 bpw | Q4_0 (Exp) | 345.19 GB | 346.85 GB (+0.16 KV) | 347.01 GB (+0.32 KV) | 347.33 GB (+0.64 KV) | 347.97 GB (+1.29 KV) | 349.26 GB (+2.57 KV) | 351.83 GB (+5.15 KV) |

Total VRAM = Model Weights + KV Cache + 1.5 GB overhead. Actual usage may vary ±5% based on inference engine and optimizations.
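
The formula above can be sketched in Python. Three things here are assumptions rather than facts from the page. First, the listed weight sizes appear consistent with 1050 × bpw / 8 GB, about 5% above what 1000.0B parameters alone would give; the page does not explain that factor. Second, the per-element cache sizes in `CACHE_BYTES` were chosen to match the table's KV column. Third, all function and constant names are illustrative.

```python
OVERHEAD_GB = 1.5  # base overhead stated on the page

# Assumed effective bytes per cached element (inferred from the KV column).
CACHE_BYTES = {"FP32": 4.0, "FP16": 2.0, "Q8_0": 1.1, "FP8": 1.0, "Q4_0": 0.6}


def weights_gb(bpw: float, effective_params_b: float = 1050.0) -> float:
    """Weight size in GB. The 1050 (billions) default is inferred, not stated."""
    return effective_params_b * bpw / 8


def kv_cache_gb(tokens: int, cache_format: str) -> float:
    """Assumed MLA cache: 61 layers x (512 + 64) elements per token,
    converted with 2**30 bytes per unit to match the table."""
    return tokens * 61 * (512 + 64) * CACHE_BYTES[cache_format] / 2**30


def total_vram_gb(bpw: float, tokens: int, cache_format: str) -> float:
    """Total VRAM = Model Weights + KV Cache + 1.5 GB overhead."""
    return weights_gb(bpw) + kv_cache_gb(tokens, cache_format) + OVERHEAD_GB


# Q4_K_M weights (4.65 bpw), Q8_0 cache, 32K context.
print(round(total_vram_gb(4.65, 32 * 1024, "Q8_0"), 2))  # 612.99
```

The printed value matches the corresponding table cell. Since the page itself allows ±5% variation by inference engine, treat any figure from this sketch as a planning estimate rather than a guarantee.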
