StepFun AI is preparing to release Step-3.5-Flash-Base, a new language model, and has promised further announcements to mark the Chinese New Year.

Optimizations and collaboration with NVIDIA

The team also announced that it is in contact with NVIDIA about implementing NVFP4, NVIDIA's 4-bit floating-point quantization format, which could reduce the model's memory footprint and improve inference efficiency. In response to user feedback, work is also underway to optimize token usage.
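
To give a rough sense of what 4-bit floating-point quantization involves, here is a minimal Python sketch of block-scaled quantization onto the eight magnitudes a 4-bit E2M1 float can represent. It is an illustration only, not StepFun's or NVIDIA's implementation: the function names are hypothetical, the block size of 16 is an assumption, and the per-block scale is kept as an ordinary Python float, whereas a real format would also store the scale in reduced precision.

```python
# Magnitudes representable by a 4-bit E2M1 float
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumed block size for this sketch


def quantize_block(block):
    """Return (scale, codes); each code is a signed value from E2M1_GRID."""
    max_abs = max(abs(x) for x in block)
    # Map the largest magnitude in the block onto the largest grid value.
    scale = max_abs / E2M1_GRID[-1] if max_abs > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        # Pick the closest grid value (ties go to the smaller magnitude).
        nearest = min(E2M1_GRID, key=lambda g: abs(g - target))
        codes.append(-nearest if x < 0 else nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate values from a scale and its codes."""
    return [scale * c for c in codes]


def quantize(values, block_size=BLOCK_SIZE):
    """Split values into blocks and quantize each block independently."""
    return [
        quantize_block(values[i:i + block_size])
        for i in range(0, len(values), block_size)
    ]


if __name__ == "__main__":
    weights = [12.0, -3.0, 0.7, 5.1]
    scale, codes = quantize_block(weights)
    print(scale, codes, dequantize_block(scale, codes))
```

Each weight then needs only 4 bits plus a share of its block's scale, which is where the memory savings come from; the cost is rounding error on values that fall between grid points.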

For those evaluating on-premises deployments, AI-RADAR analyzes the performance and cost trade-offs in detail in the /llm-onpremise section.