While global powerhouses battle for AI supremacy with ever-larger models, Portugal has taken a markedly different path. The government has released Amália, the country’s first large language model (LLM) built specifically for European Portuguese, and it did so with near-radical transparency: both the model and its training data are public. The aim is not to compete on raw power with GPT-4 or similar giants, but to fill a linguistic gap that multilingual models often neglect, and to do so with full data sovereignty.

The forgotten variant of Portuguese

European Portuguese is spoken by about 10 million people in Portugal and in scattered communities worldwide, yet it is frequently overshadowed by the far more widespread Brazilian variant. LLMs trained on global corpora tend to inherit the characteristics of Brazilian Portuguese, producing responses that sound unnatural or even incorrect to a Portuguese ear. Amália, named after the legendary fado singer Amália Rodrigues, reverses that logic: it was trained specifically on texts, speeches, and documents in European Portuguese, aiming to preserve syntactic, lexical, and cultural nuances that risk being erased by algorithmic homogenization.

Open source and direct infrastructure control

The Portuguese government’s decision to release everything as open source is not just a statement of principle, but a concrete enabler for on-premise deployment. Public administrations, enterprises, and developers in Portugal can download the model and run it on local servers without sending sensitive data to third parties. In a context where GDPR imposes strict limits on the movement of personal data, keeping inference within one’s own physical perimeter is a significant compliance advantage. Those evaluating such scenarios know that the main trade-offs concern hardware: sufficient GPU VRAM, careful quantization management, and optimized serving pipelines are all essential. AI-RADAR has long tracked these dynamics, offering analytical frameworks for those deciding between cloud, hybrid, and on-premise approaches.
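To make the VRAM and quantization trade-off concrete, here is a rough back-of-the-envelope sketch. It relies only on the general rule that weight memory is roughly the parameter count times the bytes per weight. The parameter count, the 20% overhead allowance, and the function names are illustrative assumptions, not figures published for Amália.

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def fits_in_vram(n_params: float, bits_per_weight: int, vram_gib: float,
                 overhead_fraction: float = 0.2) -> bool:
    """Rough check: weights plus an assumed allowance for KV cache and activations.

    overhead_fraction is a guess for illustration; real overhead depends on
    context length, batch size, and the serving stack.
    """
    needed = weight_memory_gib(n_params, bits_per_weight) * (1 + overhead_fraction)
    return needed <= vram_gib


# Hypothetical model size, chosen only to illustrate the arithmetic.
HYPOTHETICAL_PARAMS = 9e9

for bits in (16, 8, 4):
    gib = weight_memory_gib(HYPOTHETICAL_PARAMS, bits)
    ok = fits_in_vram(HYPOTHETICAL_PARAMS, bits, vram_gib=24)
    print(f"{bits:>2}-bit weights: ~{gib:.1f} GiB, fits on a 24 GiB GPU: {ok}")
```

Halving the bits per weight halves the weight footprint, which is why quantization often decides whether a model fits on a single GPU. The estimate ignores any quality loss from quantization, which has to be measured separately.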

Implications for the public sector and businesses

For a government agency or a Portuguese SME, Amália opens the door to self-hosted customer support, document analysis, or citizen services. There are no recurring API costs and no risk of lock-in with a foreign provider. On the other hand, the Total Cost of Ownership must be calculated carefully: the costs of dedicated hardware, maintenance, and the in-house expertise needed to run an LLM in production are not negligible. Yet for a country determined to keep control of its digital infrastructure, investing in an open-source national model may be cheaper and strategically smarter than paying per use for cloud services run by non-European companies.
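The Total Cost of Ownership comparison can be sketched as a simple break-even calculation. Everything below is a hypothetical illustration: the function name, the cost categories, and the euro amounts are assumptions made for the example, not figures from the Portuguese government or any provider.

```python
import math
from typing import Optional


def months_to_break_even(hardware_cost: float,
                         monthly_running_cost: float,
                         monthly_api_cost: float) -> Optional[int]:
    """Months until an upfront hardware purchase is repaid by avoided API fees.

    monthly_running_cost should cover power, maintenance, and staff time.
    Returns None when self-hosting never pays off, i.e. when running costs
    are at least as high as the API bill they replace.
    """
    monthly_saving = monthly_api_cost - monthly_running_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)


# Hypothetical figures in euros, for illustration only.
print(months_to_break_even(hardware_cost=30_000,
                           monthly_running_cost=1_500,
                           monthly_api_cost=4_000))  # prints 12
```

A fuller model would also account for hardware depreciation, utilization, and the cost of hiring or training staff, which is where in-house expertise enters the calculation.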

A signal for Europe

Amália is not the first national LLM (consider the models developed in France or Germany), but the combination of full open source, published training data, and a focus on an underserved language variant makes it a noteworthy case. It signals that even states with limited resources can build functional AI for their citizens, provided they abandon the race for hundreds of billions of parameters. Portugal’s bet suggests that linguistic precision and digital sovereignty matter more than benchmark scores, a lesson that many other European nations may soon start studying.