Introduction

DeepSeek, the Hangzhou-based startup, has announced preview versions of its latest Large Language Models, V4-Pro and V4-Flash. The models are available on the Hugging Face platform, marking a significant step in the evolution of Open Source LLMs. The release lands in a rapidly evolving landscape where performant, accessible models are crucial both for technological innovation and for enterprise deployment strategies.

The choice to release preview versions lets the developer community and businesses begin exploring the models' capabilities and provide feedback that can guide future iterations. This collaborative approach is typical of the Open Source ecosystem and accelerates both adoption and improvement of the technology.

Performance and Positioning

The V4-Pro model, in particular, arrives with ambitious performance claims. According to DeepSeek, V4-Pro excels at coding and mathematics tasks, placing it at the top of the available Open Source models. This specialization makes it particularly interesting for development teams and researchers who need advanced capabilities in those domains.

In terms of general knowledge, the model reportedly ranks just behind Gemini 3.1-Pro, ahead of many other competitors in the LLM landscape. DeepSeek also estimates that V4-Pro trails GPT-5.4 and Gemini 3.1-Pro by roughly three to six months of development, a claim that, if accurate, suggests the company is closing the gap with market leaders quickly.

Implications for On-Premise Deployment

The Open Source nature of V4-Pro and V4-Flash makes them interesting candidates for organizations considering on-premise or hybrid deployment strategies. Open access to the model weights offers greater control over customization, security, and data sovereignty, which are fundamental concerns for regulated sectors and companies with specific compliance requirements. This is a key factor for CTOs and infrastructure architects who must balance performance against regulatory needs.

Deploying LLMs in self-hosted environments requires careful evaluation of the hardware infrastructure, including the VRAM available on GPUs, the throughput needed to sustain inference workloads, and quantization strategies to optimize resource utilization. For those evaluating on-premise deployment, AI-RADAR offers analytical frameworks on /llm-onpremise to assess the trade-offs between cost, performance, and control, providing tools for informed decisions rather than prescriptive recommendations.
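As a first-order illustration of that sizing exercise, the Python sketch below estimates the GPU memory needed just to hold a model's weights at different quantization levels. The parameter counts and the overhead factor are hypothetical placeholders, not published figures for V4-Pro or V4-Flash, and real usage also depends on context length, batch size, and the serving stack.

```python
# Back-of-envelope VRAM estimate for self-hosted LLM inference.
# All model sizes and the overhead factor are illustrative
# assumptions, not published specs for V4-Pro or V4-Flash.

def estimate_vram_gb(params_billion: float,
                     bits_per_weight: int,
                     overhead_factor: float = 1.2) -> float:
    """Approximate GPU memory (GB) needed to hold the weights.

    overhead_factor loosely accounts for KV cache, activations,
    and runtime buffers; actual usage varies with context length,
    batch size, and the serving stack.
    """
    weight_bytes = params_billion * 1e9 * (bits_per_weight / 8)
    return weight_bytes * overhead_factor / 1e9

if __name__ == "__main__":
    # Hypothetical model sizes; compare FP16, INT8, and 4-bit.
    for params in (7, 70):
        for bits in (16, 8, 4):
            gb = estimate_vram_gb(params, bits)
            print(f"{params}B model @ {bits}-bit: ~{gb:.0f} GB VRAM")
```

The pattern to notice is that halving the bits per weight roughly halves the memory footprint, which is why quantization is often the first lever pulled when fitting a large model onto the GPUs actually on hand.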

Future Prospects and Ecosystem Contribution

The release of these models by DeepSeek underscores the growing vitality of the Open Source LLM ecosystem. Offering performant alternatives to proprietary models stimulates innovation and democratizes access to advanced technologies, allowing a greater number of players to develop and implement AI-based solutions. This contributes to a more competitive and dynamic environment.

For CTOs and infrastructure architects, the availability of models like V4-Pro and V4-Flash means more options for building robust, scalable AI solutions, balancing Total Cost of Ownership (TCO) against performance and control requirements. How these models evolve will be a key indicator of future directions in artificial intelligence, particularly the equilibrium between proprietary and Open Source offerings.
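That TCO calculation can itself be framed quantitatively. The sketch below compares the amortized cost per million tokens of a self-hosted GPU node against a pay-per-token API; every number in it (hardware price, power and operations cost, throughput, API rate) is a hypothetical placeholder chosen to illustrate the method, not a quote for any specific model or vendor.

```python
# Toy TCO comparison: self-hosted GPU node vs. pay-per-token API.
# All figures are hypothetical placeholders for illustration only.

def self_hosted_cost_per_mtok(hw_cost: float,
                              lifetime_months: int,
                              monthly_power_ops: float,
                              tokens_per_sec: float,
                              utilization: float) -> float:
    """Amortized cost per million tokens for an on-prem node."""
    monthly_hw = hw_cost / lifetime_months
    monthly_tokens = tokens_per_sec * utilization * 3600 * 24 * 30
    return (monthly_hw + monthly_power_ops) / (monthly_tokens / 1e6)

if __name__ == "__main__":
    onprem = self_hosted_cost_per_mtok(
        hw_cost=250_000,       # hypothetical 8-GPU server
        lifetime_months=36,    # 3-year amortization
        monthly_power_ops=2_000,
        tokens_per_sec=2_500,  # aggregate serving throughput
        utilization=0.4,       # average load over the month
    )
    api_rate = 2.0             # hypothetical $ per 1M tokens
    print(f"on-prem: ~${onprem:.2f}/MTok vs. API: ${api_rate:.2f}/MTok")
```

At high sustained utilization the amortized on-prem figure can undercut API pricing, while at low utilization the fixed hardware cost dominates; locating that crossover is the kind of analysis the frameworks on /llm-onpremise are designed to support.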