
Qwen2.5-Math-7B-Instruct

Standard Transformer, 7.6B Parameters

Model Specifications

Layers: 28
Hidden Dimension: 3,584
Attention Heads: 28
KV Heads: 4
Max Context: 4K tokens
Vocabulary Size: 152,064
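
The specifications above are enough to estimate the size of an unquantized KV cache. The sketch below uses the common estimate for grouped-query attention (keys plus values, per layer, per KV head). It rests on several assumptions that this page does not state: the head dimension equals Hidden Dimension / Attention Heads, "4K" means 4,096 tokens, and sizes are reported in binary gigabytes. The function and constant names are illustrative only. Under those assumptions the result agrees with the +0.22 KV (FP16) and +0.44 KV (FP32) figures in the VRAM table; quantized cache formats carry per-block overhead and are not covered here.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, context_tokens, bytes_per_element):
    """Estimate KV cache size in bytes for a grouped-query attention model."""
    # Factor of 2: one cached tensor for keys and one for values.
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_element


# Values taken from the Model Specifications list.
LAYERS = 28
HIDDEN_DIM = 3584
ATTENTION_HEADS = 28
KV_HEADS = 4
CONTEXT_TOKENS = 4096  # assumption: "4K tokens" means 4096

# Assumption: head dimension is hidden size divided by attention heads.
HEAD_DIM = HIDDEN_DIM // ATTENTION_HEADS

GIB = 1024 ** 3  # assumption: the page reports binary gigabytes

fp16_gib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT_TOKENS, 2) / GIB
fp32_gib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT_TOKENS, 4) / GIB

print(f"FP16 KV cache: {fp16_gib:.2f} GiB")
print(f"FP32 KV cache: {fp32_gib:.2f} GiB")
```

Because only 4 of the 28 heads store keys and values, the cache is one seventh of what full multi-head attention would need at the same context length.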

VRAM Requirements

Estimated VRAM usage for each combination of weight quantization and KV cache format at the full 4K context. Every total includes a base overhead of 0.58 GB (CUDA context + activations).

Quantization    | Cache Format | Model Weights | Total at 4K Context (KV cache)
FP16 16.0 bpw   | FP32         | 15.96 GB      | 16.97 GB (+0.44 KV)
FP16 16.0 bpw   | FP16         | 15.96 GB      | 16.75 GB (+0.22 KV)
FP16 16.0 bpw   | Q8_0         | 15.96 GB      | 16.66 GB (+0.12 KV)
FP16 16.0 bpw   | FP8 (Exp)    | 15.96 GB      | 16.65 GB (+0.11 KV)
FP16 16.0 bpw   | Q4_0 (Exp)   | 15.96 GB      | 16.6 GB (+0.07 KV)
Q8_0 8.0 bpw    | FP32         | 7.98 GB       | 8.99 GB (+0.44 KV)
Q8_0 8.0 bpw    | FP16         | 7.98 GB       | 8.77 GB (+0.22 KV)
Q8_0 8.0 bpw    | Q8_0         | 7.98 GB       | 8.68 GB (+0.12 KV)
Q8_0 8.0 bpw    | FP8 (Exp)    | 7.98 GB       | 8.67 GB (+0.11 KV)
Q8_0 8.0 bpw    | Q4_0 (Exp)   | 7.98 GB       | 8.62 GB (+0.07 KV)
Q4_K_M 4.65 bpw | FP32         | 4.64 GB       | 5.65 GB (+0.44 KV)
Q4_K_M 4.65 bpw | FP16         | 4.64 GB       | 5.43 GB (+0.22 KV)
Q4_K_M 4.65 bpw | Q8_0         | 4.64 GB       | 5.33 GB (+0.12 KV)
Q4_K_M 4.65 bpw | FP8 (Exp)    | 4.64 GB       | 5.32 GB (+0.11 KV)
Q4_K_M 4.65 bpw | Q4_0 (Exp)   | 4.64 GB       | 5.28 GB (+0.07 KV)
Q4_K_S 4.58 bpw | FP32         | 4.57 GB       | 5.58 GB (+0.44 KV)
Q4_K_S 4.58 bpw | FP16         | 4.57 GB       | 5.36 GB (+0.22 KV)
Q4_K_S 4.58 bpw | Q8_0         | 4.57 GB       | 5.26 GB (+0.12 KV)
Q4_K_S 4.58 bpw | FP8 (Exp)    | 4.57 GB       | 5.25 GB (+0.11 KV)
Q4_K_S 4.58 bpw | Q4_0 (Exp)   | 4.57 GB       | 5.21 GB (+0.07 KV)
Q3_K_M 3.91 bpw | FP32         | 3.9 GB        | 4.91 GB (+0.44 KV)
Q3_K_M 3.91 bpw | FP16         | 3.9 GB        | 4.69 GB (+0.22 KV)
Q3_K_M 3.91 bpw | Q8_0         | 3.9 GB        | 4.6 GB (+0.12 KV)
Q3_K_M 3.91 bpw | FP8 (Exp)    | 3.9 GB        | 4.59 GB (+0.11 KV)
Q3_K_M 3.91 bpw | Q4_0 (Exp)   | 3.9 GB        | 4.54 GB (+0.07 KV)
Q2_K 2.63 bpw   | FP32         | 2.62 GB       | 3.64 GB (+0.44 KV)
Q2_K 2.63 bpw   | FP16         | 2.62 GB       | 3.42 GB (+0.22 KV)
Q2_K 2.63 bpw   | Q8_0         | 2.62 GB       | 3.32 GB (+0.12 KV)
Q2_K 2.63 bpw   | FP8 (Exp)    | 2.62 GB       | 3.31 GB (+0.11 KV)
Q2_K 2.63 bpw   | Q4_0 (Exp)   | 2.62 GB       | 3.27 GB (+0.07 KV)

Total VRAM = Model Weights + KV Cache + 0.58 GB overhead. Actual usage may vary ±5% based on inference engine and optimizations.
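
The formula above can be applied directly to any row of the table. The following is a minimal sketch, not the site's calculator: the function names, the way the ±5% margin is applied (conservatively, as a possible increase), and the example GPU sizes are all assumptions made for illustration. Because the table's components are already rounded, a recomputed total can differ from the listed one by about 0.01 GB.

```python
OVERHEAD_GB = 0.58  # base overhead stated on this page


def total_vram_gb(weights_gb, kv_cache_gb, overhead_gb=OVERHEAD_GB):
    """Total VRAM = model weights + KV cache + fixed overhead."""
    return weights_gb + kv_cache_gb + overhead_gb


def fits(total_gb, gpu_vram_gb, tolerance=0.05):
    """Conservative check: assume real usage may exceed the estimate by `tolerance`."""
    return total_gb * (1 + tolerance) <= gpu_vram_gb


# Example row: Q4_K_M weights (4.64 GB) with an FP16 cache (+0.22 KV).
q4_total = total_vram_gb(4.64, 0.22)
print(f"Q4_K_M + FP16 cache: {q4_total:.2f} GB")

# Hypothetical GPU sizes, chosen only to illustrate the check.
print("Fits in 8 GB:", fits(q4_total, 8.0))
print("FP16 weights + FP16 cache fits in 16 GB:", fits(16.75, 16.0))
```

Treating the 5% margin as a one-sided safety buffer is a design choice: a configuration that only fits when the estimate runs low is unlikely to be reliable in practice.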
