You don’t need big-tech infrastructure to build a deep research agent that can compete with closed systems. The Natural Language Processing team at Ohio State University has just released QUEST-35B, a 35-billion-parameter LLM trained on roughly 32 H100 GPUs and a dataset of just 8,000 synthetic examples. The result is an agent capable of complex research tasks, question decomposition, and information synthesis, with every component released openly.

Under 40 GPUs and a pocket-sized dataset

Behind QUEST-35B lies a training recipe that is striking for its modesty: around 32 NVIDIA H100s. That is a meaningful but not prohibitive compute footprint, comparable to a cluster that many universities or mid-size enterprises can afford to own outright or rent for a few days in the cloud. The training data is entirely synthetic: 8,000 samples generated to teach the agent how to navigate sources, reason over multiple steps, and produce structured answers.

The team opted for complete openness. Weights, code, training configuration, and dataset are public. That means anyone can replicate the experiment, adapt the model to specific domains, or integrate it into self-hosted pipelines without relying on third-party APIs—a shift that puts control back in the hands of the user, a critical point for regulated industries or sensitive data environments.

Training cost matters, but inference matters more

For those evaluating on-premise deployment, the technical details of QUEST-35B offer an important lesson: the bulk of the compute expense is concentrated in the training phase. Once trained, a 35-billion-parameter model can run inference on much lighter hardware than the training cluster, especially when quantization is applied to shrink the memory needed for the weights. In an enterprise scenario, this means the agent can run entirely within one’s own data centers at acceptable latencies, lowering total cost of ownership (TCO) compared to pay-per-token cloud solutions.
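To make the quantization point concrete, here is a rough back-of-the-envelope sketch of how much memory the weights of a 35-billion-parameter model occupy at different precisions. The parameter count comes from the article. The list of formats and the helper name `weight_memory_gb` are illustrative assumptions, not settings published for QUEST-35B, and the estimate covers weights only.

```python
# Weight-only memory estimate for a 35-billion-parameter model.
PARAMS = 35_000_000_000  # parameter count, taken from the article

# Bits per weight for common storage formats. These are general
# conventions, not configurations published for QUEST-35B.
BITS_PER_WEIGHT = {"fp16/bf16": 16, "int8": 8, "int4": 4}


def weight_memory_gb(params: int, bits: int) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name:>9}: {weight_memory_gb(PARAMS, bits):6.1f} GB")
```

Under these assumptions the weights take about 70 GB at 16 bits, 35 GB at 8 bits, and 17.5 GB at 4 bits. Real deployments need additional memory for the KV cache, activations, and the scaling metadata that quantized formats store, so these figures are lower bounds rather than sizing guidance.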

The use of synthetic data raises another theme dear to AI-RADAR: data sovereignty. There’s no need to amass vast proprietary datasets to elicit complex emergent behaviors. An organization can generate its own examples internally, keeping intellectual property secure and meeting regulatory requirements like GDPR. QUEST-35B shows that the barrier to entry for advanced research agents is dropping dramatically.

The open-closed gap: where is the boundary now?

Reported benchmarks place QUEST-35B in direct competition with several frontier closed-source Deep Research systems. The quality gap in responses appears to be narrowing, but open questions remain about serving infrastructure and production robustness. Closed models benefit from optimized inference pipelines, global content delivery networks, and integrations with vertical tool ecosystems. For open-source agents, the challenge shifts to engineering: efficient orchestrators, retrieval over enterprise knowledge bases, and horizontal scalability.

This is where Ohio State’s work gains systemic value. By providing all the building blocks, it lets the community focus on improving reliability and building an ecosystem of interoperable tools. The next frontier for self-hosting research agents will likely be reducing end-to-end latency and enabling transparent integration with internal data sources, without sacrificing the quality we currently see only in closed systems.

QUEST-35B is more than a new model. It is a signal that on-premise can be the playing field for the next generation of AI agents, where code transparency and data control become competitive levers. For IT decision-makers, this release provides a concrete starting point to assess whether—and how quickly—to embed a self-contained research agent into their architecture.