BF16 (Brain Float 16)


A 16-bit floating-point format developed by Google Brain with the same exponent range as FP32, making it more numerically stable for training than FP16.

BF16 (Brain Float 16) keeps the 8-bit exponent of FP32 but truncates the mantissa from 23 bits to 7 (plus one sign bit, for 16 bits total). This preserves the full dynamic range of FP32, avoiding the overflow/underflow issues of FP16, at half the memory footprint of FP32.
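A minimal sketch of the layout relationship, using nothing beyond the standard library: an FP32 value becomes BF16 by keeping only its top 16 bits (real hardware rounds to nearest-even; plain truncation keeps the example short).

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """BF16 is the top 16 bits of FP32: 1 sign + 8 exponent + 7 mantissa.
    Hardware rounds to nearest-even; truncation is used here for brevity."""
    fp32_bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return fp32_bits >> 16  # drop the low 16 mantissa bits

def bf16_bits_to_fp32(bits: int) -> float:
    """Expand BF16 back to FP32 by zero-padding the mantissa."""
    return struct.unpack(">f", struct.pack(">I", bits << 16))[0]

x = 3.14159
bf16 = fp32_to_bf16_bits(x)
print(f"{bf16:016b}")           # 0100000001001001: sign | exponent | mantissa
print(bf16_bits_to_fp32(bf16))  # 3.140625: same range, only ~2-3 decimal digits
```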

Comparison: FP32 vs FP16 vs BF16

| Format | Bits | Exponent bits | Mantissa bits | Dynamic range | Use case |
|--------|------|---------------|---------------|---------------|----------|
| FP32 | 32 | 8 | 23 | ~1.2×10⁻³⁸ – 3.4×10³⁸ | Training (CPU/GPU) |
| FP16 | 16 | 5 | 10 | ~6×10⁻⁸ – 6.5×10⁴ | GPU inference |
| BF16 | 16 | 8 | 7 | Same as FP32 | Training + inference on A100/H100 |
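If PyTorch is available, the same limits can be read programmatically, which is a quick sanity check on the table above:

```python
import torch

# Numeric limits per format; BF16's range matches FP32, FP16's is far narrower.
for dtype in (torch.float32, torch.float16, torch.bfloat16):
    info = torch.finfo(dtype)
    print(f"{str(dtype):15} bits={info.bits:2} "
          f"smallest normal={info.tiny:.2e} max={info.max:.2e} eps={info.eps:.2e}")
```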

Hardware Support

BF16 is natively accelerated on NVIDIA A100, H100, H200, all Google TPUs, and AMD MI300X. Consumer cards (RTX 30xx, 40xx) support BF16 compute but at lower throughput than FP16 — check your card's specs before assuming BF16 is faster.
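Assuming a PyTorch CUDA setup, a quick runtime check saves guessing from spec sheets:

```python
import torch

# torch.cuda.is_bf16_supported() reports whether the current CUDA device
# can run BF16 kernels; compute capability >= 8.0 (Ampere and newer) also
# implies native tensor-core acceleration rather than emulation.
if torch.cuda.is_available():
    name = torch.cuda.get_device_name()
    major, minor = torch.cuda.get_device_capability()
    print(f"{name}: compute capability {major}.{minor}, "
          f"BF16 supported: {torch.cuda.is_bf16_supported()}")
```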

Why It Matters for On-Premise

If you own enterprise-grade hardware (A100, H100), loading models in BF16 gives you near-FP32 quality without the doubled memory cost. For most consumer-grade on-premise setups, FP16 is the better default, since RTX 40xx cards are highly optimised for it. Running a 7B model at BF16 requires ~14 GB of VRAM for the weights alone (2 bytes per parameter × 7B parameters), the same as FP16, but the numerical stability improvements make BF16 the preferred format for fine-tuning runs.
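As an illustrative sketch using Hugging Face transformers (the model ID below is a placeholder; any causal LM loads the same way), picking BF16 comes down to one argument:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder 7B model

# torch_dtype controls the precision the weights are loaded in; bfloat16
# halves memory vs float32 while keeping the same exponent range.
# device_map="auto" (requires the accelerate package) places layers on GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```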