A $28 million funding round might seem like just another data point in the AI investment carousel. But when the startup raising it tests voice agents before they speak to real people, the signal is clear: the maturity of voice models will be measured by their ability to handle the real world, not polished demos.
Coval, founded by a former Waymo engineer, announced the round led by Insight Partners with participation from other investors. The idea is as simple as it is urgent: a voice AI agent can sound flawless in a presentation, then stumble over accents, interruptions, background noise, or unexpected instructions during a real call. Systematic stress-testing is needed, just like for self-driving car software.
From driving simulators to voice
Founder Brooke Hopkins helped build safety checks for Waymo’s robotaxis, a setting where a single error can have dramatic consequences. She carried that mindset over to the voice domain: verify robustness, coherence, and edge-case handling before deployment. Coval simulates thousands of voice interactions, injecting variables ranging from regional accents to abrupt interruptions, and measures how the agent performs under each condition.
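To make the idea of injecting variables concrete, here is a minimal sketch of a scenario matrix in Python. It is not Coval’s implementation: the accent labels, noise levels, the `stub_agent` function and its failure probabilities are all hypothetical placeholders standing in for a real agent and a real audio simulator.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    accent: str       # hypothetical accent label
    noise_db: int     # hypothetical background-noise level
    interrupt: bool   # whether the simulated caller cuts in mid-reply


# Illustrative values only; a real suite would draw these from production traffic.
ACCENTS = ["us_general", "scottish", "indian_english"]
NOISE_LEVELS_DB = [0, 10, 20]
INTERRUPTS = [False, True]


def build_scenarios():
    """Cross every accent with every noise level and interruption setting."""
    return [
        Scenario(a, n, i)
        for a, n, i in itertools.product(ACCENTS, NOISE_LEVELS_DB, INTERRUPTS)
    ]


def stub_agent(scenario, rng):
    """Stand-in for a real voice agent: returns True if the call 'succeeded'.

    The success probabilities are invented for illustration.
    """
    p = 0.98 - 0.002 * scenario.noise_db
    if scenario.interrupt:
        p -= 0.10
    return rng.random() < p


def run_suite(agent, scenarios, calls_per_scenario=200, seed=0):
    """Run each scenario many times and record its pass rate."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    results = {}
    for s in scenarios:
        passed = sum(agent(s, rng) for _ in range(calls_per_scenario))
        results[s] = passed / calls_per_scenario
    return results


def worst_cases(results, k=3):
    """Return the k scenarios with the lowest pass rate."""
    return sorted(results.items(), key=lambda kv: kv[1])[:k]


if __name__ == "__main__":
    report = run_suite(stub_agent, build_scenarios())
    for scenario, rate in worst_cases(report):
        print(f"{rate:.2%}  {scenario}")
```

Sorting by pass rate surfaces the combinations that deserve attention first, which is the point of a systematic sweep: failures tend to cluster in specific corners of the matrix rather than spread evenly.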
This approach echoes the “red teaming” principle applied to language models, but with a specific focus on the real-time conversational experience. The goal is not only to test language understanding but to verify the entire pipeline: speech synthesis, turn-taking, perceived latency, and resilience to failures.
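Perceived latency is one part of that pipeline that lends itself to a simple automated check. The sketch below assumes a three-stage pipeline (speech recognition, language model, time to first synthesized audio) and an 800 ms budget at the 95th percentile; the stage breakdown, the field names and the budget are illustrative assumptions, not figures from Coval.

```python
import math
from dataclasses import dataclass


@dataclass
class TurnTiming:
    """Per-turn stage timings in milliseconds (hypothetical breakdown)."""
    asr_ms: float
    llm_ms: float
    tts_first_audio_ms: float

    @property
    def perceived_ms(self):
        # What the caller experiences: silence until the agent starts speaking.
        return self.asr_ms + self.llm_ms + self.tts_first_audio_ms


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_budget(turns, budget_ms=800.0, pct=95):
    """Return (within_budget, observed_ms) at the given percentile."""
    observed = percentile([t.perceived_ms for t in turns], pct)
    return observed <= budget_ms, observed


if __name__ == "__main__":
    # 19 fast turns and one slow outlier, all values invented.
    turns = [TurnTiming(100, 300, 100)] * 19 + [TurnTiming(200, 900, 200)]
    print(check_budget(turns))          # p95 ignores the single outlier
    print(check_budget(turns, pct=100)) # worst case does not
```

Checking a high percentile rather than the average matters for voice: a mean can look healthy while a noticeable share of callers still sit through long silences.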
On-premise: control amplifies the need for testing
For organizations considering running voice agents locally, the scenario gets more complex. An on-premise deployment, often driven by data sovereignty or compliance requirements, shifts the whole infrastructure burden onto the company. Without a cloud provider handling redundancy and auto-scaling, every pipeline glitch (synthesis latency, an inference runtime crash, loss of conversation context) falls squarely on the internal team.
Here, tools like those Coval proposes (or similar ones, if made available on-prem) become an asset for validating performance before going live. In an air-gapped environment where calls may contain sensitive data, you can’t rely solely on third-party cloud services for testing. The ability to simulate workloads and stress the agent directly on company servers reduces the risk of unpleasant surprises.
What this round signals is not just a bet on the growth of voice agents, but also a recognition that their reliability is not optional. For those who need to run large language models and voice pipelines locally, the question is no longer “Can I make them work?” but “Can I ensure they work well every time they’re needed?” The answer lies in a simple idea borrowed from self-driving: test everything before you go live.