Starling Bank has announced it will cut around 130 jobs in a restructuring aimed at eliminating duplication and speeding up product delivery. The London-based institution – part of the challenger bank wave – plans to push artificial intelligence deeper into internal processes, from risk management to customer service automation.

The news, reported by The Next Web, only scratches the surface of a broader transformation. As digital banks chase operational efficiency, an unspoken question remains: where will the language and analytical models that replace (or assist) human work be deployed? The choice between cloud and on-premise infrastructure carries extra weight when banking data covered by rules such as GDPR and PSD2 is at stake.

The silent infrastructure battle

For a financial institution, every deployment decision must balance implementation speed and control. Cloud services provide ready-made APIs and reduce time-to-market, but they force sensitive data to travel outside the corporate perimeter. In Europe, data residency is not optional: supervisory authorities demand precise guarantees, and a compliance incident can cost more than any upfront investment.

That is why self-hosting is regaining ground. Running LLMs on in-house servers, perhaps with dedicated GPUs in an air-gapped environment, allows firms to maintain full sovereignty and prevent third parties from accessing critical information. It is not an obstacle-free path: hardware investments, fine-tuning expertise, and ongoing maintenance are required. Yet for continuous workloads – such as real-time transaction filtering – the total cost of ownership (TCO) calculation starts to favor on-premise, because recurring cloud fees are replaced by a one-time capital expenditure amortized over the hardware's lifetime.
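To make the TCO argument concrete, here is a minimal break-even sketch. It assumes a simplified model in which on-premise costs are one upfront hardware purchase plus a flat monthly running cost, while cloud costs are a flat monthly fee. The function name and every figure are hypothetical, chosen only for illustration; none of them come from Starling or any real vendor.

```python
import math


def breakeven_months(capex, onprem_monthly, cloud_monthly):
    """Return the first whole month in which cumulative on-premise cost
    is no higher than cumulative cloud cost, or None if that never happens.

    Simplified model (an assumption, not a real pricing scheme):
      on-premise cost after m months = capex + onprem_monthly * m
      cloud cost after m months      = cloud_monthly * m
    """
    monthly_saving = cloud_monthly - onprem_monthly
    if monthly_saving <= 0:
        # Cloud is cheaper (or equal) every month, so capex is never recovered.
        return None
    return math.ceil(capex / monthly_saving)


# Hypothetical figures: 240,000 upfront for GPU servers, 5,000 per month for
# power and maintenance, versus 15,000 per month in cloud inference fees.
months = breakeven_months(capex=240_000, onprem_monthly=5_000, cloud_monthly=15_000)
print(months)  # 24
```

With these made-up numbers the hardware pays for itself after two years. A workload that runs only occasionally shrinks the monthly cloud bill and pushes the break-even point further out, which is why the argument applies mainly to continuous workloads.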

Where banking AI is headed

Starling’s restructuring is not an isolated case. Many neobanks are experimenting with quantization and lighter models that run on their own infrastructure, shifting the focus from pure cloud to hybrid solutions that keep transactional data behind corporate firewalls.
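Why quantization matters for self-hosting can be shown with back-of-the-envelope arithmetic: the memory needed just to hold a model's weights is roughly the parameter count times the bits per weight. The sketch below uses a hypothetical 7-billion-parameter model and ignores activations, the KV cache, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes).

    Excludes activations, KV cache, and framework overhead.
    """
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical 7-billion-parameter model, for illustration only.
params = 7_000_000_000
print(weight_memory_gb(params, 16))  # 14.0 (16-bit weights)
print(weight_memory_gb(params, 4))   # 3.5  (4-bit quantized weights)
```

Under these assumptions, moving from 16-bit to 4-bit weights cuts the footprint by a factor of four, which can be the difference between needing several data-center GPUs and fitting on a single card.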

At AI-RADAR, we closely watch those who put data control first. The next frontier for banking AI will not be just about speed or tokens per second, but about how well an architecture combines responsiveness with the confidentiality the industry demands. Without that combination, even the most powerful model remains unusable where it is truly needed.