“I would say that my capabilities in Luxembourgish may not be as extensive as with more widely used languages like English, French or German,” says the AI assistant Claude. This is no surprise: most large language models (LLMs) are trained mainly on English texts and fall short when it comes to analyzing or generating content in Luxembourgish.
Although English is spoken by only about a fifth of the world's population, it dominates AI language model training and makes up most online content. While humans communicate in 7,000 different languages worldwide, AI assistants typically work with just a small fraction of these, according to the Center for Democracy & Technology (CDT). Some European languages, including Luxembourgish, are at risk of being left behind by the generative AI revolution because few resources are available for training language models on them. The LLMs4EU project, coordinated by the Alliance for Language Technologies (ALT-EDIC), aims to change this by preserving linguistic and cultural diversity across the EU. Luxembourg, represented by the Luxembourg Institute of Science and Technology (LIST), the University of Luxembourg, and the Zenter fir d'Lëtzebuerger Sprooch, is playing a key role in this initiative to ensure that Luxembourgish language and culture are not left behind.
Artificial intelligence is changing how we communicate and access information, but not all languages are given equal attention. LIST, as one of Luxembourg’s official representatives in the ALT-EDIC consortium and a contributor to the LLMs4EU project, is working to close this gap. The institute is developing benchmarking tools to test and enhance AI models for Luxembourgish, ensuring the language is better represented in the digital landscape.
The overarching goal of this scientific effort is to empower European companies, especially small and medium-sized enterprises (SMEs), by providing open-access AI tools for all EU languages. These tools will help businesses develop competitive language technologies while complying with European regulations, including the AI Act and GDPR.
The LLMs4EU project, with its €40 million budget and partnerships spanning 20 European countries, exemplifies the collaborative spirit needed to tackle global challenges.
In a recent round of testing by LIST, some of the most prominent AI language models (including the most recent DeepSeek model) were put through a series of language exams. These exams, designed by the Institut national des langues Luxembourg (INLL), cover levels from A1 to B2, with plans to expand to C2. The results revealed interesting trends: all models performed adequately on the A1 and A2 exams, demonstrating a basic understanding of Luxembourgish. As the exams became more advanced, however, the differences in performance became more pronounced, and only the largest models were able to achieve a B2 level. These larger models may not be a feasible option for many SMEs because of privacy, price, access or resource (energy, memory) constraints. While the models generally performed well in vocabulary and grammar, many struggled with reading and oral comprehension.
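To make the exam-based evaluation described above more concrete, the following is a minimal sketch of how graded answers could be aggregated per level. It is not LIST's benchmarking tool: the function names, the 60% pass threshold, and the toy data are all assumptions made for illustration, and the toy data does not represent any real result.

```python
from collections import defaultdict

# Exam levels currently covered, in ascending order of difficulty.
CEFR_LEVELS = ["A1", "A2", "B1", "B2"]

# Hypothetical pass mark; the real exams may use a different rule.
PASS_THRESHOLD = 0.6


def accuracy_by_level(results):
    """Return the share of correct answers per level.

    `results` is an iterable of (level, is_correct) pairs.
    Levels without any graded question are left out of the result.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for level, is_correct in results:
        totals[level] += 1
        correct[level] += int(is_correct)
    return {
        level: correct[level] / totals[level]
        for level in CEFR_LEVELS
        if totals[level]
    }


def highest_level_passed(accuracies, threshold=PASS_THRESHOLD):
    """Return the highest level reached without failing a lower one."""
    passed = None
    for level in CEFR_LEVELS:
        if accuracies.get(level, 0.0) >= threshold:
            passed = level
        else:
            break
    return passed


# Made-up placeholder answers for one imaginary model.
toy_results = [
    ("A1", True), ("A1", True),
    ("A2", True), ("A2", False),
    ("B1", False), ("B1", False),
    ("B2", False),
]

toy_accuracies = accuracy_by_level(toy_results)
toy_level = highest_level_passed(toy_accuracies)
```

Requiring each lower level to be passed before a higher one counts is one possible design choice; it prevents a model from being credited with B2 after failing A2, but other scoring schemes are equally plausible.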
A major observation concerned the types of errors shared by all the models. Many questions were answered incorrectly by almost all models, and the wrong answers were broadly similar. Most of the mistakes stemmed from a misinterpretation of context, with numerous errors in grammar and reasoning, including arithmetic mistakes. These findings highlight the current limitations of AI models in understanding the nuances of Luxembourgish, and the challenges that lie ahead in improving AI’s linguistic capabilities.
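An analysis of shared errors like the one described above could be sketched as follows. This is an illustrative assumption about how such a check might look, not the method used in the study: the function name, the 80% cut-off for "almost all models", and the placeholder model and question names are hypothetical.

```python
def shared_errors(answers, min_share=0.8):
    """Return question IDs that most models answered incorrectly.

    `answers` maps a model name to a dict of {question_id: is_correct}.
    A question is reported when at least `min_share` of the models
    that attempted it got it wrong.
    """
    question_ids = set()
    for graded in answers.values():
        question_ids.update(graded)

    shared = []
    for qid in sorted(question_ids):
        outcomes = [graded[qid] for graded in answers.values() if qid in graded]
        wrong = sum(1 for ok in outcomes if not ok)
        if outcomes and wrong / len(outcomes) >= min_share:
            shared.append(qid)
    return shared


# Made-up placeholder grades for three imaginary models.
toy_answers = {
    "model_a": {"q1": True, "q2": False, "q3": False},
    "model_b": {"q1": True, "q2": False, "q3": True},
    "model_c": {"q1": True, "q2": False, "q3": False},
}

commonly_missed = shared_errors(toy_answers)
```

The questions returned by such a check would be natural candidates for manual review, for example to see whether the failures come from misread context, grammar, or arithmetic.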
This project reflects the promise that technology should serve everyone, strengthening the idea that diversity, whether linguistic, cultural, or societal, is Europe’s greatest strength. But there’s more at stake. AI is not just a technological race; it’s also about building open models and taking a more frugal approach. Right now, most of the biggest AI models are built far from Europe, often with priorities that don’t reflect our needs and values. That’s why projects like LLMs4EU matter. By developing open and accessible AI tools in Europe, for Europe, we’re making sure that European businesses, researchers, and citizens have AI they can trust.