Houston News Buzz

Portugal open-sources Amália, its first national AI model, in a bet on European Portuguese

Jul 02, 2026  Twila Rosenbaum

Portugal has released Amália, the country's first large language model built specifically for European Portuguese, in a move that underscores European efforts to achieve digital sovereignty in artificial intelligence. The model, its training data, and its source code are all open and free for governments, universities, and companies to use and build upon.

Amália—an acronym for Automatic Multimodal Language Assistant with Artificial Intelligence—takes its name from Amália Rodrigues, the renowned fado singer whose voice is deeply intertwined with Portuguese identity. The choice is intentional: the model is designed to capture the unique linguistic and cultural nuances of European Portuguese.

The model is built on EuroLLM-9B, a European foundation model. A team of over 60 researchers and students extended it with European Portuguese datasets, a larger context window, stronger safety and evaluation systems, and the ability to process both text and images.

Not a Consumer Chatbot

Amália will not be released as a consumer chatbot like ChatGPT. Instead, it is designed as infrastructure that other software can call upon—an underlying layer for public-facing applications. Planned uses include an AI teaching assistant, a virtual guide for Portuguese museums and monuments, a digital assistant for citizen services, and decision-support tools for the Portuguese Navy.

This distinction explains why the Portuguese government is giving the model away rather than charging for access. Open publication ensures that the model can be audited, which is critical for a system intended to handle sensitive government services.

Funding and Development

The project has drawn initial funding of €5.5 million through Portugal’s Recovery and Resilience Plan, part of the broader NextGenerationEU initiative. The funds flow to NOVA University Lisbon, Instituto Superior Técnico, and the universities of Porto, Minho, and Coimbra, coordinated by the Foundation for Science and Technology. A test version was completed in September 2025 and presented at the PROPOR conference in Brazil.

Funding has been secured through the end of 2027, indicating a commitment to long-term development rather than a one-off launch.

Open Source as a Principle

Amália is fully open-source. Where commercial models are closed boxes accessed through APIs and paywalls, Amália ships with its weights, datasets, and code published under an open license. Anyone can inspect how it was trained, adapt the model, and run it on their own hardware. This choice is both ideological and practical: a government that plans to wire AI into citizen services and naval decision-support needs the ability to audit the system, not just trust it.

The release aligns with Europe’s growing unease about dependence on American and Chinese systems for foundational language technology. It follows the OpenEuroLLM alliance, a cross-border effort to train open models on European languages, and a wave of infrastructure investments including Nscale’s €695 million data-center push in Portugal in collaboration with Microsoft.

However, some experts question whether such projects can achieve genuine independence. Renting graphics processing units by the hour can produce the illusion of sovereignty rather than the substance of it, as TNW has argued.

The Power of Specificity

Amália’s strongest advantage is its focus on European Portuguese, which differs significantly from Brazilian Portuguese in grammar, idiom, and cultural references. The Portuguese that large commercial models learn from is overwhelmingly Brazilian, which tends to flatten these differences. A system that gets the distinctions right is far more useful for public services expected to speak to citizens in their own register rather than an approximation of it.

This specificity matters most for government applications: an AI teaching assistant that uses Brazilian Portuguese might confuse students, and a virtual guide that gets local place names wrong could alienate visitors. Amália aims to avoid these pitfalls by adapting its base model with European Portuguese data.

Adoption Challenges

For all its promise, Amália faces a significant hurdle: adoption. Publishing a model openly is one thing; getting universities, companies, and government departments to actually build on it is another. Many sovereign AI initiatives quietly run out of steam at this second step.

Portugal has funded Amália through 2027 and named the institutions meant to carry it forward. The next two years will show whether it becomes real infrastructure or remains a well-documented research project with a beautiful name.

Broader Context: Europe's AI Sovereignty Push

Amália is part of a larger European trend. The EU has invested heavily in AI through programs like Horizon Europe and the Digital Europe Programme. The OpenEuroLLM project, of which Amália is a concrete output, aims to build open-source LLMs that cover all European languages, with a focus on less-resourced languages like Portuguese, Polish, and Greek.

These efforts respond to concerns that American and Chinese AI dominance could erode European competitiveness and cultural identity. Language is a key battleground: if the only AI assistants available speak American English or Mandarin, smaller languages risk being marginalized in the digital sphere.

Portugal, with a population of just over 10 million, is a relatively small market, but its language is spoken by about 260 million people worldwide, including in Brazil and several African countries. A model that masters European Portuguese could also be adapted for other Portuguese-speaking regions, though that is not the immediate goal.

Technical Details

Amália is a multimodal model based on EuroLLM-9B, itself a transformer-based LLM trained primarily on European-language data. The research team expanded the context window to handle longer documents and added visual understanding by training on image-text pairs. Safety and evaluation systems were also strengthened to meet government standards.

The model is released under a permissive open-source license, allowing commercial and non-commercial use. Downloading and running the model requires significant computational resources, but smaller variants may be made available for lighter applications.
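To give a rough sense of what "significant computational resources" means, the sketch below estimates the memory needed just to hold the weights at common numeric precisions. It assumes roughly 9 billion parameters, read off the "9B" in EuroLLM-9B; Amália's actual parameter count is not stated here and may be higher once its image-processing components are included. The function and variable names are illustrative only, and the figures ignore activations, cache, and framework overhead.

```python
# Back-of-the-envelope estimate of weight memory for a ~9B-parameter model.
# Assumption: 9e9 parameters, taken from the "9B" in the base model's name.
# This is NOT an official figure for Amália.

PARAMS = 9e9

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {
    "float32": 4,
    "float16/bfloat16": 2,
    "int8": 1,
    "int4": 0.5,
}


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Return the memory needed for the weights alone, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


def estimate_all(num_params: float = PARAMS) -> dict:
    """Return a precision -> GiB mapping, rounded to one decimal place."""
    return {
        name: round(weight_memory_gib(num_params, nbytes), 1)
        for name, nbytes in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    for name, gib in estimate_all().items():
        print(f"{name:>17}: ~{gib} GiB")
```

Under that assumption, the weights alone come to about 17 GiB in 16-bit precision, which is more than most consumer graphics cards hold. That is why quantized or smaller variants matter for lighter applications.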

Data sources for training include European Portuguese text from government websites, academic publications, news archives, and cultural materials. The team has also released the dataset alongside the model to ensure full transparency.

The project's success will depend on whether the open-source community embraces Amália. If developers build applications that leverage its strengths—such as regional dialect recognition or specialized government terminology—it could become a template for other small European countries seeking AI independence.


Source: TNW | Artificial-Intelligence News

