Last week brought two European funding stories that, read together, point in a clear direction for anyone considering on-premise deployment of large language models. In the Netherlands, Nearfield Instruments closed a $380 million round for its semiconductor metrology systems; across the North Sea, the UK government committed £60 million to university AI labs with the explicit goal of making artificial intelligence cheaper. Neither announcement is about a new model or a serving framework, but each touches on one of the two pain points of any local strategy: access to performant hardware and the cost of inference.

The weak link: chip manufacturing

Nearfield Instruments builds metrology and inspection systems for process control in advanced chip manufacturing, where wafers are measured with nanometre precision. Such tools sit alongside the lithography systems of companies like ASML, the Dutch EUV leader, in the fabs that produce advanced GPUs, where they help boost yield and reduce defects. The $380 million round, among the largest in semiconductor equipment, doesn’t produce AI chips directly, but it strengthens the supply chain’s ability to turn out growing numbers of advanced processors. For anyone designing on-premise infrastructure, the availability of GPUs with enough VRAM and memory bandwidth is still the main bottleneck. Investment in chip-making technology has a downstream effect: in time, shorter lead times and potentially lower procurement costs.

Slimming the inference bill

The second piece comes from the UK. The £60 million allocated to university labs targets a pragmatic goal: cutting the cost of training and running language models. No single algorithm was named, but the research lines could range from aggressive quantization and pruning to new, more parsimonious architectures. For a self-hosted deployment, every step forward in this area means consuming less VRAM for the same accuracy, or being able to run a larger model without swapping hardware. From a Total Cost of Ownership perspective, computational efficiency is the multiplier that decides whether a local infrastructure can compete with cloud APIs.
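To put a rough number on the VRAM point, here is a back-of-the-envelope sketch in Python. It counts only the memory needed to hold the weights and ignores the KV cache, activations, and runtime overhead, so real deployments need more headroom. The 7-billion-parameter model size and the function name are illustrative assumptions, not figures from either announcement.

```python
def weight_memory_gib(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB.

    Assumption: every parameter is stored at the same precision.
    KV cache, activations, and framework overhead are not included.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Hypothetical 7B-parameter model at three common precisions.
    for bits in (16, 8, 4):
        gib = weight_memory_gib(7, bits)
        print(f"{bits:>2}-bit weights: about {gib:.1f} GiB")
```

Under these assumptions, halving the bits per weight halves the footprint, which is why a step from 16-bit to 4-bit weights can move a model from a data-centre card to a single workstation GPU, provided the accuracy holds up.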

Sovereignty and affordability: a two-move game

These two capital injections, one into the machinery of chip fabrication and the other into research on less compute-hungry models, signal that the on-premise game is played simultaneously on the hardware and software fronts. Server vendors are beginning to offer configurations optimized for local inference, but the real push arrives when unit component costs drop and when an LLM quantized to INT8 or INT4 retains enough quality for enterprise use cases. It is no coincidence that Europe is talking increasingly about “digital sovereignty” and sensitive data that must stay inside corporate firewalls: banks, public administrations, and healthcare providers require guarantees that the cloud alone struggles to provide. Cheaper chips and more efficient models lower the barrier to building local or hybrid clusters without surrendering data control.
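For readers who have not looked inside a quantizer, the sketch below shows the simplest variant: symmetric, per-tensor INT8 quantization of a handful of weights, followed by dequantization to measure the rounding error. It is a toy illustration under stated assumptions. The weight values are made up, the function names are hypothetical, and production schemes typically work per channel or per group and use calibration data.

```python
def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Symmetric per-tensor quantization to the signed range [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: list[int], scale: float) -> list[float]:
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.02, -0.51, 0.33, 1.27, -0.08]  # made-up example values
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
    print("codes:", codes)
    print("scale:", scale)
    print("max round-trip error:", max_error)
```

The round-trip error of each weight is bounded by half the scale, so the quality question in the paragraph above comes down to whether errors of that size, accumulated across billions of weights, still leave the model good enough for the task at hand.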

Beyond the headlines

Of course, a single funding round and a research programme don’t make a revolution. Yet those who watch the market closely know that the real barometer is the supply chain: when a metrology equipment maker secures hundreds of millions, it suggests that investors expect chipmakers to keep expanding production capacity. And when a government opts to fund research to make AI “cheaper”, it sees an opportunity for technological autonomy. For those evaluating an on-premise deployment, the picture remains complex: in-house skills, orchestration, and maintenance are still required. But the signals suggest that the point at which local inference becomes truly reachable for a mid-sized enterprise is approaching, perhaps faster than the headlines imply.