NVIDIA is developing its next-generation language models, Nemotron-3 Super and Ultra, with an innovative technique: pre-training in FP4 format. The approach exploits the computing capabilities of NVIDIA GPUs, but the very low numerical precision poses significant challenges.

FP4 Pre-training

Using the FP4 format for pre-training is an industry first. NVIDIA aims to gain performance and efficiency advantages from the high FP4 throughput its GPUs offer. However, training advanced language models with only four bits of precision requires careful management of numerical stability.
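To make the four-bit constraint concrete, below is a minimal Python sketch of block-wise FP4 (E2M1) quantization. The block size, scaling scheme, and rounding are illustrative assumptions, not NVIDIA's actual training recipe: real FP4 pipelines keep master weights and accumulations in higher precision and use hardware-specific scale formats.

```python
import numpy as np

# E2M1 (FP4) can represent only 16 values:
# sign x {0, 0.5, 1, 1.5, 2, 3, 4, 6}
FP4_MAGNITUDES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.concatenate([-FP4_MAGNITUDES[::-1], FP4_MAGNITUDES])


def quantize_fp4_blockwise(x: np.ndarray, block_size: int = 16):
    """Quantize a 1-D tensor to FP4 with one scale factor per block.

    Returns (codes, scales): the rounded FP4 values (kept as floats
    for clarity) and the per-block scales. Block size 16 is an
    assumption for illustration.
    """
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    # Scale each block so its largest magnitude maps to FP4's max (6.0);
    # without this, values outside [-6, 6] would clip badly.
    amax = np.abs(blocks).max(axis=1, keepdims=True)
    scales = np.where(amax > 0, amax / 6.0, 1.0)
    scaled = blocks / scales
    # Round each scaled value to the nearest representable FP4 value.
    idx = np.abs(scaled[..., None] - FP4_GRID).argmin(axis=-1)
    return FP4_GRID[idx], scales


def dequantize_fp4_blockwise(codes, scales, orig_len):
    """Reconstruct an approximation of the original tensor."""
    return (codes * scales).reshape(-1)[:orig_len]
```

With only 16 representable values per block, quantization error depends heavily on how well each block's scale matches its value distribution; this is the kind of numerical issue that FP4 training must manage at every layer.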

Expected release and development model

The Nemotron-3 Super and Ultra models are expected in the first half of 2026. A distinctive aspect of NVIDIA's culture, highlighted in an interview, is its self-description as a "company of volunteers": a decentralized development model in which teams self-organize and collaborate on projects such as Nemotron.

For those evaluating on-premise deployments, there are trade-offs to weigh. AI-RADAR offers analytical frameworks at /llm-onpremise for assessing these aspects.