
Kimi-K2-Base

Attention: MLA (Multi-head Latent Attention)

Total Parameters: 1040.0B

Active Parameters: 32.0B

Model Specifications

Layers: 61
Hidden Dimension: 7,168
Attention Heads: 64
Max Context: 131K tokens
Vocabulary Size: 163,840
KV LoRA Rank: 512
RoPE Dimension: 64
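
The KV LoRA rank and RoPE dimension listed above set the per-token KV cache size under MLA. The following is a minimal sketch, not the page's published method. It assumes the cache stores one compressed latent (KV LoRA rank) plus one RoPE key (RoPE dimension) per layer per token, that "8K" means 8,192 tokens, and that sizes are reported in binary gigabytes (2^30 bytes). The function name `mla_kv_cache_gib` is hypothetical. Under these assumptions the results agree with the "+KV" figures in the FP32 and FP16 rows of the VRAM table.

```python
# Values taken from the Model Specifications list.
LAYERS = 61
KV_LORA_RANK = 512
ROPE_DIM = 64


def mla_kv_cache_gib(context_tokens: int, bytes_per_element: float) -> float:
    """Estimate the MLA KV cache size in GiB for a single sequence.

    Assumption: each layer caches (KV_LORA_RANK + ROPE_DIM) elements per token.
    """
    elements_per_token = LAYERS * (KV_LORA_RANK + ROPE_DIM)
    total_bytes = elements_per_token * context_tokens * bytes_per_element
    return total_bytes / 2**30


if __name__ == "__main__":
    # Assumed mapping of the table's context labels to token counts.
    for label, tokens in (("8K", 8192), ("32K", 32768), ("131K", 131072)):
        fp32 = mla_kv_cache_gib(tokens, 4.0)
        fp16 = mla_kv_cache_gib(tokens, 2.0)
        print(f"{label}: FP32 {fp32:.2f} GiB, FP16 {fp16:.2f} GiB")
```

The quantized cache formats (Q8_0, Q4_0) carry per-block scale data, so their effective bytes per element depend on the inference engine and are left out of this sketch.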

VRAM Requirements

Estimated VRAM usage for every combination of weight quantization and KV cache format, at context lengths from 8K to 131K tokens. Each total includes a base overhead of 1.5 GB (CUDA context + activations).

| Quantization | Cache Format | Model Weights | 8K Context | 16K Context | 32K Context | 65K Context | 131K Context |
|---|---|---|---|---|---|---|---|
| FP16 (16.0 bpw) | FP32 | 2184.0 GB | 2186.57 GB (+1.07 KV) | 2187.64 GB (+2.14 KV) | 2189.79 GB (+4.29 KV) | 2194.08 GB (+8.58 KV) | 2202.66 GB (+17.16 KV) |
| FP16 (16.0 bpw) | FP16 | 2184.0 GB | 2186.04 GB (+0.54 KV) | 2186.57 GB (+1.07 KV) | 2187.64 GB (+2.14 KV) | 2189.79 GB (+4.29 KV) | 2194.08 GB (+8.58 KV) |
| FP16 (16.0 bpw) | Q8_0 | 2184.0 GB | 2185.79 GB (+0.29 KV) | 2186.09 GB (+0.59 KV) | 2186.68 GB (+1.18 KV) | 2187.86 GB (+2.36 KV) | 2190.22 GB (+4.72 KV) |
| FP16 (16.0 bpw) | FP8 (Exp) | 2184.0 GB | 2185.77 GB (+0.27 KV) | 2186.04 GB (+0.54 KV) | 2186.57 GB (+1.07 KV) | 2187.64 GB (+2.14 KV) | 2189.79 GB (+4.29 KV) |
| FP16 (16.0 bpw) | Q4_0 (Exp) | 2184.0 GB | 2185.66 GB (+0.16 KV) | 2185.82 GB (+0.32 KV) | 2186.14 GB (+0.64 KV) | 2186.79 GB (+1.29 KV) | 2188.07 GB (+2.57 KV) |
| Q8_0 (8.0 bpw) | FP32 | 1092.0 GB | 1094.57 GB (+1.07 KV) | 1095.64 GB (+2.14 KV) | 1097.79 GB (+4.29 KV) | 1102.08 GB (+8.58 KV) | 1110.66 GB (+17.16 KV) |
| Q8_0 (8.0 bpw) | FP16 | 1092.0 GB | 1094.04 GB (+0.54 KV) | 1094.57 GB (+1.07 KV) | 1095.64 GB (+2.14 KV) | 1097.79 GB (+4.29 KV) | 1102.08 GB (+8.58 KV) |
| Q8_0 (8.0 bpw) | Q8_0 | 1092.0 GB | 1093.79 GB (+0.29 KV) | 1094.09 GB (+0.59 KV) | 1094.68 GB (+1.18 KV) | 1095.86 GB (+2.36 KV) | 1098.22 GB (+4.72 KV) |
| Q8_0 (8.0 bpw) | FP8 (Exp) | 1092.0 GB | 1093.77 GB (+0.27 KV) | 1094.04 GB (+0.54 KV) | 1094.57 GB (+1.07 KV) | 1095.64 GB (+2.14 KV) | 1097.79 GB (+4.29 KV) |
| Q8_0 (8.0 bpw) | Q4_0 (Exp) | 1092.0 GB | 1093.66 GB (+0.16 KV) | 1093.82 GB (+0.32 KV) | 1094.14 GB (+0.64 KV) | 1094.79 GB (+1.29 KV) | 1096.07 GB (+2.57 KV) |
| Q4_K_M (4.65 bpw) | FP32 | 634.73 GB | 637.3 GB (+1.07 KV) | 638.37 GB (+2.14 KV) | 640.51 GB (+4.29 KV) | 644.8 GB (+8.58 KV) | 653.38 GB (+17.16 KV) |
| Q4_K_M (4.65 bpw) | FP16 | 634.73 GB | 636.76 GB (+0.54 KV) | 637.3 GB (+1.07 KV) | 638.37 GB (+2.14 KV) | 640.51 GB (+4.29 KV) | 644.8 GB (+8.58 KV) |
| Q4_K_M (4.65 bpw) | Q8_0 | 634.73 GB | 636.52 GB (+0.29 KV) | 636.81 GB (+0.59 KV) | 637.4 GB (+1.18 KV) | 638.58 GB (+2.36 KV) | 640.94 GB (+4.72 KV) |
| Q4_K_M (4.65 bpw) | FP8 (Exp) | 634.73 GB | 636.49 GB (+0.27 KV) | 636.76 GB (+0.54 KV) | 637.3 GB (+1.07 KV) | 638.37 GB (+2.14 KV) | 640.51 GB (+4.29 KV) |
| Q4_K_M (4.65 bpw) | Q4_0 (Exp) | 634.73 GB | 636.39 GB (+0.16 KV) | 636.55 GB (+0.32 KV) | 636.87 GB (+0.64 KV) | 637.51 GB (+1.29 KV) | 638.8 GB (+2.57 KV) |
| Q4_K_S (4.58 bpw) | FP32 | 625.17 GB | 627.74 GB (+1.07 KV) | 628.81 GB (+2.14 KV) | 630.96 GB (+4.29 KV) | 635.25 GB (+8.58 KV) | 643.83 GB (+17.16 KV) |
| Q4_K_S (4.58 bpw) | FP16 | 625.17 GB | 627.21 GB (+0.54 KV) | 627.74 GB (+1.07 KV) | 628.81 GB (+2.14 KV) | 630.96 GB (+4.29 KV) | 635.25 GB (+8.58 KV) |
| Q4_K_S (4.58 bpw) | Q8_0 | 625.17 GB | 626.96 GB (+0.29 KV) | 627.26 GB (+0.59 KV) | 627.85 GB (+1.18 KV) | 629.03 GB (+2.36 KV) | 631.39 GB (+4.72 KV) |
| Q4_K_S (4.58 bpw) | FP8 (Exp) | 625.17 GB | 626.94 GB (+0.27 KV) | 627.21 GB (+0.54 KV) | 627.74 GB (+1.07 KV) | 628.81 GB (+2.14 KV) | 630.96 GB (+4.29 KV) |
| Q4_K_S (4.58 bpw) | Q4_0 (Exp) | 625.17 GB | 626.83 GB (+0.16 KV) | 626.99 GB (+0.32 KV) | 627.31 GB (+0.64 KV) | 627.96 GB (+1.29 KV) | 629.24 GB (+2.57 KV) |
| Q3_K_M (3.91 bpw) | FP32 | 533.72 GB | 536.29 GB (+1.07 KV) | 537.36 GB (+2.14 KV) | 539.5 GB (+4.29 KV) | 543.79 GB (+8.58 KV) | 552.37 GB (+17.16 KV) |
| Q3_K_M (3.91 bpw) | FP16 | 533.72 GB | 535.75 GB (+0.54 KV) | 536.29 GB (+1.07 KV) | 537.36 GB (+2.14 KV) | 539.5 GB (+4.29 KV) | 543.79 GB (+8.58 KV) |
| Q3_K_M (3.91 bpw) | Q8_0 | 533.72 GB | 535.51 GB (+0.29 KV) | 535.8 GB (+0.59 KV) | 536.39 GB (+1.18 KV) | 537.57 GB (+2.36 KV) | 539.93 GB (+4.72 KV) |
| Q3_K_M (3.91 bpw) | FP8 (Exp) | 533.72 GB | 535.48 GB (+0.27 KV) | 535.75 GB (+0.54 KV) | 536.29 GB (+1.07 KV) | 537.36 GB (+2.14 KV) | 539.5 GB (+4.29 KV) |
| Q3_K_M (3.91 bpw) | Q4_0 (Exp) | 533.72 GB | 535.38 GB (+0.16 KV) | 535.54 GB (+0.32 KV) | 535.86 GB (+0.64 KV) | 536.5 GB (+1.29 KV) | 537.79 GB (+2.57 KV) |
| Q2_K (2.63 bpw) | FP32 | 359.0 GB | 361.57 GB (+1.07 KV) | 362.64 GB (+2.14 KV) | 364.78 GB (+4.29 KV) | 369.07 GB (+8.58 KV) | 377.65 GB (+17.16 KV) |
| Q2_K (2.63 bpw) | FP16 | 359.0 GB | 361.03 GB (+0.54 KV) | 361.57 GB (+1.07 KV) | 362.64 GB (+2.14 KV) | 364.78 GB (+4.29 KV) | 369.07 GB (+8.58 KV) |
| Q2_K (2.63 bpw) | Q8_0 | 359.0 GB | 360.79 GB (+0.29 KV) | 361.08 GB (+0.59 KV) | 361.67 GB (+1.18 KV) | 362.85 GB (+2.36 KV) | 365.21 GB (+4.72 KV) |
| Q2_K (2.63 bpw) | FP8 (Exp) | 359.0 GB | 360.76 GB (+0.27 KV) | 361.03 GB (+0.54 KV) | 361.57 GB (+1.07 KV) | 362.64 GB (+2.14 KV) | 364.78 GB (+4.29 KV) |
| Q2_K (2.63 bpw) | Q4_0 (Exp) | 359.0 GB | 360.66 GB (+0.16 KV) | 360.82 GB (+0.32 KV) | 361.14 GB (+0.64 KV) | 361.78 GB (+1.29 KV) | 363.07 GB (+2.57 KV) |

Total VRAM = Model Weights + KV Cache + 1.5 GB overhead. Actual usage may vary ±5% based on inference engine and optimizations.
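
The formula above can be sketched in a few lines. This is an illustrative example rather than the site's calculator: the helper names `total_vram_gb`, `vram_range_gb`, and `fits` are hypothetical, the inputs are copied from the Q4_K_M / FP16 cache / 32K context row of the table, and the 640 GB budget is an assumed figure chosen only for demonstration.

```python
BASE_OVERHEAD_GB = 1.5  # CUDA context + activations, as stated on this page


def total_vram_gb(model_weights_gb: float, kv_cache_gb: float) -> float:
    """Total VRAM = Model Weights + KV Cache + base overhead."""
    return model_weights_gb + kv_cache_gb + BASE_OVERHEAD_GB


def vram_range_gb(total_gb: float, tolerance: float = 0.05) -> tuple[float, float]:
    """Apply the stated +/-5% variation to get a (low, high) range."""
    return total_gb * (1 - tolerance), total_gb * (1 + tolerance)


def fits(total_gb: float, available_gb: float, tolerance: float = 0.05) -> bool:
    """Conservative check: require the high end of the range to fit."""
    _, high = vram_range_gb(total_gb, tolerance)
    return high <= available_gb


if __name__ == "__main__":
    # Q4_K_M weights (634.73 GB) with an FP16 cache at 32K context (+2.14 KV).
    total = total_vram_gb(634.73, 2.14)
    low, high = vram_range_gb(total)
    print(f"Estimated total: {total:.2f} GB (range {low:.2f} to {high:.2f} GB)")

    # Assumed budget for illustration only: 640 GB of pooled VRAM.
    print("Fits in 640 GB:", fits(total, 640.0))
```

Checking against the high end of the ±5% range is a deliberately cautious choice; comparing the nominal total instead would accept configurations that may run out of memory on some inference engines.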
