The next AI won’t be powered by better models alone

San Francisco is buzzing with engineers and founders during the AI Engineer World’s Fair, yet the loudest conversation – about increasingly capable models – may be masking the more important shift. Vytautas Savickas, CEO of Oxylabs, who has a privileged vantage point on global web scraping, paints a different picture: “For the past three years, AI has largely…” The available quote breaks off there, but the point is clear. The next wave won’t be fueled solely by deeper network architectures or trillions of extra parameters. It will be the data – its provenance, freshness, and cleanliness – that draws the line between a generic LLM and one that delivers reliable enterprise insights.

The oil is no longer the model

For years we witnessed an arms race: who could release the model with the most parameters, who could nudge benchmarks up by a few percentage points. A complementary narrative is now emerging. Having a powerful engine without a steady supply of quality fuel is like owning a Ferrari with an empty tank. Oxylabs, which extracts petabytes from the public web every day, sits at the center of this quiet transition. Companies that train or refine LLMs are increasingly hungry not only for GPUs, but for fresh, structured, legally collected information to feed retrieval-augmented generation (RAG), fine-tuning, and continuous updates of knowledge bases.

Beyond the black box: pipelines and sovereignty

Those choosing to keep their LLMs on-prem – for compliance, privacy, or cost control reasons – immediately feel the shockwave. A self-hosted model with no well-orchestrated data flow ages quickly, delivering answers that feel like faded photographs. The issue is not purely technical: when sensitive data must never leave the corporate perimeter, the ability to extract, clean, and index information from the open web (or proprietary sources) becomes a primary infrastructure skill, on par with managing Kubernetes clusters or quantizing models. It’s no accident that organizations with the strictest sovereignty requirements are investing in local crawlers, rotating proxies, and parsing engines running on-prem, far from cloud services.

The lesson for local deployments

Savickas’s insight, however incomplete the available fragment, highlights a truth AI-RADAR has been observing for some time: the total cost of ownership (TCO) of an on-prem AI system is not measured in GPUs and VRAM alone. It includes the cost of keeping data alive. For a team evaluating a local deployment, the checklist must grow: How often do I refresh the data? How do I handle deduplication and quality? Are there backup sources if an endpoint goes silent? These questions shift the conversation from raw compute power to the resilience of the information pipeline. In this light, data collection platforms such as Oxylabs (or equivalent open source solutions) become as strategic as the LLM serving runtime itself.
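To make the checklist concrete, here is a minimal Python sketch of the three questions: freshness, deduplication, and backup sources. It is an illustration under stated assumptions, not a description of how Oxylabs or any particular tool works. The function names (`content_key`, `deduplicate`, `is_stale`, `fetch_with_fallback`) and the document shape (a dict with a `text` field) are hypothetical, and it uses only the standard library.

```python
import hashlib
from datetime import datetime, timedelta, timezone


def content_key(text):
    """Hash of case- and whitespace-normalized text (hypothetical helper)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(docs):
    """Keep the first document seen for each normalized content hash.

    Assumes each document is a dict with a "text" field. This only catches
    exact duplicates after normalization, not near-duplicates.
    """
    seen = set()
    unique = []
    for doc in docs:
        key = content_key(doc["text"])
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique


def is_stale(fetched_at, max_age, now=None):
    """Return True if a record is older than the allowed refresh interval."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - fetched_at > max_age


def fetch_with_fallback(sources):
    """Try (name, fetch) pairs in order and return the first success.

    Each fetch is a zero-argument callable that returns a payload or raises.
    """
    errors = []
    for name, fetch in sources:
        try:
            return name, fetch()
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

In a real pipeline the fetch callables would wrap a crawler or an internal API, and `max_age` would be set per source: a price feed might need a refresh every few minutes, a documentation site once a week. Near-duplicate detection and quality scoring would need more than a content hash.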

The competitive edge

What looms on the horizon is a market where models, however advanced, will tend toward commoditization. The real differentiation will lie in the ability to shape information around the needs of a vertical domain while keeping pace with a web that changes constantly. This isn’t science fiction: it’s the direction being taken by the most mature AI teams, those that test three different models in parallel but pour equal energy into data engineering. For those working on-prem, this evolution offers a subtle opportunity: owning the data stack means owning the context, which makes enterprise AI not only more accurate but also more defensible from a regulatory standpoint.