Netris raises $15M from a16z to untangle the networking that throttles GPU clouds

Netris has closed a $15 million Series A round led by Andreessen Horowitz. The news is more than another capital injection into an AI startup: it highlights a technical bottleneck that quietly shapes the race to run GPU workloads, both in the cloud and, perhaps even more, in on-premises data centers. The California-based company automates the networking layer that stitches GPU clusters together, and it comes to this round with 800% ARR growth and more than 35 live deployments worldwide.

The real bottleneck in AI infrastructure

When people talk about hardware acceleration for LLMs and distributed training, the focus is almost always on VRAM, memory bandwidth, or the compute capability of the cards. But any multi-node cluster pays a hidden cost: GPU-to-GPU communication. In environments with hundreds or thousands of accelerators, networking becomes the factor that dictates effective throughput. Even small delays on the interconnect, whether InfiniBand, RoCE, or NVLink, translate into idle GPU time and synchronization stalls that slow the entire training or serving pipeline. Netris aims to solve exactly this problem with automated network management that reduces operational complexity and optimizes data flows.
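To put rough numbers on that hidden cost, here is a minimal back-of-the-envelope sketch. It assumes a simplified model in which each training step consists of a compute phase followed by communication time that is not overlapped with compute; the function names and the figures in the usage note are illustrative assumptions, not measurements from Netris or any real cluster.

```python
def gpu_utilization(compute_ms: float, exposed_comm_ms: float) -> float:
    """Fraction of each step spent on useful compute.

    Assumes a step is compute time plus communication time that could
    not be hidden behind compute (a deliberately simplified model).
    """
    if compute_ms <= 0 or exposed_comm_ms < 0:
        raise ValueError("compute_ms must be > 0 and exposed_comm_ms >= 0")
    return compute_ms / (compute_ms + exposed_comm_ms)


def wasted_gpu_hours(num_gpus: int, wall_clock_hours: float,
                     compute_ms: float, exposed_comm_ms: float) -> float:
    """GPU-hours spent waiting on the network over a whole run."""
    idle_fraction = 1.0 - gpu_utilization(compute_ms, exposed_comm_ms)
    return num_gpus * wall_clock_hours * idle_fraction
```

Under these assumptions, a step with 100 ms of compute and 25 ms of exposed communication leaves each GPU busy 80% of the time. On a hypothetical 1,024-GPU cluster running for 24 hours, that is roughly 4,900 GPU-hours spent waiting on the network.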

Why it matters for on-premises deployments

The announcement comes at a moment when many organizations are evaluating whether to bring model inference and fine-tuning in-house. The reasons are well known: data sovereignty, regulatory compliance (GDPR or sector-specific rules), and long-term TCO. But building an on-premises GPU cluster means facing exactly the same networking challenges that plague large cloud providers. Without a robust automation layer, the internal team risks getting bogged down in manual configuration of switches, routing, and load balancing, which erodes some of the benefits of self-hosting. Technology like Netris’s, or similar solutions, can become an indispensable piece in making local environments with tens or hundreds of GPUs sustainable to operate.

East-west traffic and hidden complexity

In GPU data centers, traffic is not only user-to-service (north-south) but above all node-to-node (east-west). Communication patterns such as all-reduce, typical of distributed training, generate packet bursts that stress network fabrics. Manual configuration of VLANs, ACLs, and QoS policies is hand-crafted work that does not scale. The automation promised by Netris covers provisioning, segmentation, and monitoring, reducing the risk of human error and freeing up engineering time. For a company migrating from a centralized cloud environment to private infrastructure, this difference can translate into months of recovered time and significantly lower operational costs.
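To see why all-reduce weighs so heavily on the fabric, consider the widely used ring variant, in which each GPU sends (and receives) 2 × (N − 1) / N times the payload size. The sketch below turns that into a bandwidth-only lower bound on the time per all-reduce. It ignores latency, congestion, and protocol overhead, and the function names and figures are illustrative assumptions.

```python
def ring_allreduce_bytes_per_gpu(payload_bytes: float, num_gpus: int) -> float:
    """Bytes each GPU must send during one ring all-reduce."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Reduce-scatter plus all-gather: each phase moves (N - 1) / N of the payload.
    return 2 * (num_gpus - 1) / num_gpus * payload_bytes


def ring_allreduce_seconds(payload_bytes: float, num_gpus: int,
                           link_gbps: float) -> float:
    """Bandwidth-only lower bound on all-reduce time (no latency term)."""
    bytes_per_second = link_gbps * 1e9 / 8  # gigabits/s -> bytes/s
    return ring_allreduce_bytes_per_gpu(payload_bytes, num_gpus) / bytes_per_second
```

With a hypothetical 1 GB gradient payload, 8 GPUs, and 400 Gb/s per link, each GPU sends 1.75 GB and the bound comes out at about 35 ms per all-reduce. Halving the usable link bandwidth doubles that figure, which is why fabric misconfiguration shows up directly as lost training throughput.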

Beyond a single vendor: what the market signals

The a16z round for Netris is not an isolated case. The entire AI infrastructure segment (networking, storage, orchestration) is attracting substantial investment, a sign that the market recognizes the gap between the raw power of GPUs and the ability to exploit it fully in production. For readers following AI-RADAR, the message is clear: when designing an on-premises LLM deployment, the choice of networking layer is not a secondary decision. In fact, it can determine the success or failure of a project that aims to keep control of its data without sacrificing performance.

Looking ahead, network automation solutions like Netris’s could become a standard element in reference architectures for self-hosting large models. The direction is toward declarative infrastructure, where networking complexity is absorbed by software, leaving technical teams free to focus on application value. It’s not science fiction: it’s the next step toward making on-premises AI truly competitive with cloud offerings.
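As a rough illustration of what "declarative" means in this context, the sketch below compares a desired port-to-VLAN mapping with the state observed on the switches and derives the changes needed to reconcile them. It is a generic, hypothetical example: the `Action` type, the `plan` function, and the port names are invented for illustration and do not reflect Netris's actual API or data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Action:
    op: str              # "add", "change", or "remove"
    port: str            # hypothetical switch port identifier
    vlan: Optional[int]  # target VLAN, or None when removing


def plan(desired: Dict[str, int], actual: Dict[str, int]) -> List[Action]:
    """Compute the actions that turn the actual state into the desired one."""
    actions: List[Action] = []
    for port in sorted(desired.keys() | actual.keys()):
        want, have = desired.get(port), actual.get(port)
        if want == have:
            continue  # already converged, nothing to do
        if have is None:
            actions.append(Action("add", port, want))
        elif want is None:
            actions.append(Action("remove", port, None))
        else:
            actions.append(Action("change", port, want))
    return actions
```

The operator edits only the desired mapping; the software works out which ports to add, change, or remove, and running `plan` again once the states match yields an empty list.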