Qwen2.5-Math-1.5B-Instruct

Standard Transformer, 1.5B Parameters

Model Specifications

Layers: 28
Hidden Dimension: 1,536
Attention Heads: 12
KV Heads: 2
Max Context: 4K tokens
Vocabulary Size: 151,936
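
The specifications above are enough to estimate the KV cache size at full context. The sketch below is an illustration, not this page's actual calculator. It assumes the head dimension equals hidden dimension divided by attention heads (128), that "4K" means 4,096 tokens, that "GB" is counted in units of 1024^3 bytes, and that the Q8_0 and Q4_0 cache formats use the ggml block layouts (32 values plus a 2-byte scale). The per-element byte costs are assumptions, not values stated on this page.

```python
# Architecture values copied from the specification list.
LAYERS = 28
HIDDEN_DIM = 1536
ATTENTION_HEADS = 12
KV_HEADS = 2
CONTEXT_TOKENS = 4096  # assumption: "4K tokens" means 4096

# Assumed storage cost per cached element, in bytes.
# Q8_0: 32 int8 values + 2-byte scale; Q4_0: 32 4-bit values + 2-byte scale.
BYTES_PER_ELEMENT = {
    "FP32": 4.0,
    "FP16": 2.0,
    "Q8_0": 34 / 32,
    "FP8": 1.0,
    "Q4_0": 18 / 32,
}

GIB = 1024 ** 3  # assumption: sizes are reported in binary gigabytes


def kv_cache_gb(cache_format: str, tokens: int = CONTEXT_TOKENS) -> float:
    """Estimate the KV cache size for one sequence of `tokens` tokens."""
    head_dim = HIDDEN_DIM // ATTENTION_HEADS  # assumed head dimension: 128
    # Keys and values (factor 2) are cached per layer, per KV head, per token.
    elements = 2 * LAYERS * KV_HEADS * head_dim * tokens
    return elements * BYTES_PER_ELEMENT[cache_format] / GIB


if __name__ == "__main__":
    for fmt in BYTES_PER_ELEMENT:
        print(f"{fmt:5s} {kv_cache_gb(fmt):.2f} GB")
```

Under these assumptions the estimates round to 0.22, 0.11, 0.06, 0.05, and 0.03 GB for FP32, FP16, Q8_0, FP8, and Q4_0, which agrees with the KV figures listed under VRAM Requirements.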

VRAM Requirements

VRAM usage for all quantization and cache format combinations. Base overhead: 0.52 GB (CUDA context + activations).

Quantization       Cache Format  Model Weights  Total at 4K Context
FP16 (16.0 bpw)    FP32          3.15 GB        3.88 GB (+0.22 KV)
FP16 (16.0 bpw)    FP16          3.15 GB        3.77 GB (+0.11 KV)
FP16 (16.0 bpw)    Q8_0          3.15 GB        3.73 GB (+0.06 KV)
FP16 (16.0 bpw)    FP8 (Exp)     3.15 GB        3.72 GB (+0.05 KV)
FP16 (16.0 bpw)    Q4_0 (Exp)    3.15 GB        3.7 GB (+0.03 KV)
Q8_0 (8.0 bpw)     FP32          1.58 GB        2.31 GB (+0.22 KV)
Q8_0 (8.0 bpw)     FP16          1.58 GB        2.2 GB (+0.11 KV)
Q8_0 (8.0 bpw)     Q8_0          1.58 GB        2.15 GB (+0.06 KV)
Q8_0 (8.0 bpw)     FP8 (Exp)     1.58 GB        2.14 GB (+0.05 KV)
Q8_0 (8.0 bpw)     Q4_0 (Exp)    1.58 GB        2.12 GB (+0.03 KV)
Q4_K_M (4.65 bpw)  FP32          0.92 GB        1.65 GB (+0.22 KV)
Q4_K_M (4.65 bpw)  FP16          0.92 GB        1.54 GB (+0.11 KV)
Q4_K_M (4.65 bpw)  Q8_0          0.92 GB        1.49 GB (+0.06 KV)
Q4_K_M (4.65 bpw)  FP8 (Exp)     0.92 GB        1.49 GB (+0.05 KV)
Q4_K_M (4.65 bpw)  Q4_0 (Exp)    0.92 GB        1.46 GB (+0.03 KV)
Q4_K_S (4.58 bpw)  FP32          0.9 GB         1.64 GB (+0.22 KV)
Q4_K_S (4.58 bpw)  FP16          0.9 GB         1.53 GB (+0.11 KV)
Q4_K_S (4.58 bpw)  Q8_0          0.9 GB         1.48 GB (+0.06 KV)
Q4_K_S (4.58 bpw)  FP8 (Exp)     0.9 GB         1.47 GB (+0.05 KV)
Q4_K_S (4.58 bpw)  Q4_0 (Exp)    0.9 GB         1.45 GB (+0.03 KV)
Q3_K_M (3.91 bpw)  FP32          0.77 GB        1.5 GB (+0.22 KV)
Q3_K_M (3.91 bpw)  FP16          0.77 GB        1.39 GB (+0.11 KV)
Q3_K_M (3.91 bpw)  Q8_0          0.77 GB        1.34 GB (+0.06 KV)
Q3_K_M (3.91 bpw)  FP8 (Exp)     0.77 GB        1.34 GB (+0.05 KV)
Q3_K_M (3.91 bpw)  Q4_0 (Exp)    0.77 GB        1.32 GB (+0.03 KV)
Q2_K (2.63 bpw)    FP32          0.52 GB        1.25 GB (+0.22 KV)
Q2_K (2.63 bpw)    FP16          0.52 GB        1.14 GB (+0.11 KV)
Q2_K (2.63 bpw)    Q8_0          0.52 GB        1.09 GB (+0.06 KV)
Q2_K (2.63 bpw)    FP8 (Exp)     0.52 GB        1.09 GB (+0.05 KV)
Q2_K (2.63 bpw)    Q4_0 (Exp)    0.52 GB        1.07 GB (+0.03 KV)

Total VRAM = Model Weights + KV Cache + 0.52 GB overhead. Actual usage may vary ±5% based on inference engine and optimizations.
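
The formula above can be applied directly to check a configuration against a given GPU. The following is a minimal sketch, not the page's own calculator. The helper names (`weights_gb`, `total_vram_gb`, `fits`) are hypothetical, and scaling the FP16 weight size linearly with bits per weight is an assumption that happens to reproduce the Model Weights column after rounding. Because the table appears to be computed from unrounded values, sums of the rounded figures can differ from the listed totals by about 0.01 GB.

```python
OVERHEAD_GB = 0.52      # base overhead stated on this page
FP16_WEIGHTS_GB = 3.15  # FP16 (16.0 bpw) weight size from the table


def weights_gb(bits_per_weight: float) -> float:
    """Assumption: weight size scales linearly with bits per weight."""
    return FP16_WEIGHTS_GB * bits_per_weight / 16.0


def total_vram_gb(weights: float, kv_cache: float) -> float:
    """Total VRAM = Model Weights + KV Cache + overhead."""
    return weights + kv_cache + OVERHEAD_GB


def fits(total_gb: float, gpu_vram_gb: float, margin: float = 0.05) -> bool:
    """Check the estimate against available VRAM, padding by the 5% variance."""
    return total_gb * (1 + margin) <= gpu_vram_gb


if __name__ == "__main__":
    # Q4_K_M weights with an FP16 cache, using figures from the table.
    total = total_vram_gb(weights=0.92, kv_cache=0.11)
    print(f"estimated total: {total:.2f} GB")
    print("fits in 2 GB:", fits(total, gpu_vram_gb=2.0))
```

Padding by the upper end of the stated 5% variance keeps the check conservative. A real GPU also reserves some memory for the display and driver, so the usable figure passed as `gpu_vram_gb` should be lower than the card's nominal capacity.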
