Georg Kalus

European LLMs Teuken-7B-Base & Teuken-7B-Instruct aim to mitigate language bias

Teuken, Mistral and DeepSeek AI models

TL;DR:

  • Teuken-7B-Base and Teuken-7B-Instruct are two new multilingual large language models (LLMs) that support 24 official EU languages.
  • Trained on 60% non-English data, these models address existing language biases in LLMs.
  • They perform competitively on multilingual versions of established benchmarks, including ARC, HellaSwag, MMLU, and TruthfulQA.

New LLMs developed in Germany

Teuken-7B-Base and Teuken-7B-Instruct are two multilingual LLMs that have been making waves in the natural language processing (NLP) community.[^1] Built by Fraunhofer IAIS, the Lamarr Institute, TU Dresden, Forschungszentrum Jülich, the Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI), Fraunhofer IIS, and Aleph Alpha, the models are designed to support all 24 official EU languages, a significant step forward in promoting language diversity and inclusivity in AI.

One of the primary challenges in developing multilingual LLMs is the language bias baked into most of them: because they are trained on predominantly English data, they often perform noticeably worse in other languages. Teuken-7B-Base and Teuken-7B-Instruct, by contrast, were trained on 60% non-English data, which helps mitigate this bias and enables the models to understand and generate text across many languages.

Teuken-7B-Base and Teuken-7B-Instruct have been evaluated on multilingual versions of established benchmarks, including ARC, HellaSwag, MMLU, and TruthfulQA, where they perform competitively and produce coherent text across the supported languages.
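As a rough illustration of how such evaluations are run, the sketch below uses EleutherAI's lm-evaluation-harness via its Python API. The model id, task names, and few-shot setting are assumptions for the example; the Teuken authors evaluated on multilingual EU variants of these tasks rather than the English defaults shown here.

```python
# Sketch: benchmarking a Hugging Face model with lm-evaluation-harness
# (pip install lm-eval). Task names and settings are illustrative only.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args=(
        "pretrained=openGPT-X/Teuken-7B-instruct-commercial-v0.4,"
        "trust_remote_code=True"
    ),
    tasks=["arc_challenge", "hellaswag", "truthfulqa_mc2"],
    num_fewshot=0,
)

# Report the aggregate metrics for each task.
for task, metrics in results["results"].items():
    print(task, metrics)
```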

Technical Details

Both models have roughly 7 billion parameters (hence the "7B" in the name) and were trained on large amounts of multilingual text covering all 24 official EU languages. As with other LLMs, training boils down to next-token prediction: the models' parameters are optimized to minimize the difference between the predicted token distribution and the actual next token in the training data.
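To make that objective concrete, here is a minimal sketch of the next-token cross-entropy loss in PyTorch. The tensor shapes and toy vocabulary size are made up for illustration and have nothing to do with Teuken's actual configuration.

```python
# Sketch: the next-token prediction objective used to train causal LLMs.
# Shapes and the toy vocabulary size are illustrative only.
import torch
import torch.nn.functional as F

vocab_size, batch, seq_len = 1000, 2, 16
logits = torch.randn(batch, seq_len, vocab_size, requires_grad=True)  # model output
tokens = torch.randint(0, vocab_size, (batch, seq_len))               # training text

# Predict token t+1 from positions up to t: shift predictions and targets by one.
loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions for positions 0..T-2
    tokens[:, 1:].reshape(-1),               # targets are the next tokens
)
loss.backward()  # gradients from this loss drive the parameter updates
print(loss.item())
```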

The models’ architecture is based on the transformer architecture, which is a type of neural network that is well-suited for NLP tasks. The transformer architecture allows the model to attend to different parts of the input sequence simultaneously, enabling it to capture long-range dependencies and contextual relationships.
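As an illustration of that attention mechanism, below is a minimal scaled dot-product self-attention sketch in PyTorch, with a single head and no causal mask; production transformer blocks add multi-head projections, masking, feed-forward layers, and residual connections.

```python
# Sketch: scaled dot-product attention, the core of the transformer.
# Single head, no causal mask; shapes are illustrative.
import math
import torch

def attention(q, k, v):
    # Similarity of every query with every key, scaled by sqrt(d).
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    weights = torch.softmax(scores, dim=-1)  # each query attends to all keys
    return weights @ v                       # weighted mix of the values

seq_len, d_model = 8, 64
x = torch.randn(seq_len, d_model)  # token embeddings
out = attention(x, x, x)           # self-attention: q, k, v from the same sequence
print(out.shape)                   # torch.Size([8, 64])
```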

Applications and Implications

The development of Teuken-7B-Base and Teuken-7B-Instruct has significant implications for the NLP community. These models can be used for a variety of applications, including language translation, text summarization, and chatbots.

The ability of these models to support multiple languages makes them particularly useful for applications that require multilingual support. For example, they can be used to develop chatbots that can communicate with users in multiple languages, or to translate text from one language to another.
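As a rough usage sketch, the instruct variant can be loaded through the standard Hugging Face transformers API. The snippet below assumes the openGPT-X/Teuken-7B-instruct-commercial-v0.4 checkpoint described in the model card [^2]; the chat roles and generation settings are illustrative, so check the model card for the exact prompt format.

```python
# Sketch: chatting with Teuken-7B-Instruct via Hugging Face transformers.
# Model id from the model card; roles and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openGPT-X/Teuken-7B-instruct-commercial-v0.4"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory relative to float32
    trust_remote_code=True,      # the repository ships custom model code
).eval()

messages = [{"role": "User", "content": "Wie funktioniert ein Transformer?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```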

In addition to their practical applications, Teuken-7B-Base and Teuken-7B-Instruct also have implications for the development of more inclusive and diverse AI systems. By promoting language diversity and inclusivity, these models can help to reduce the biases and disparities that exist in current AI systems.

In conclusion, Teuken-7B-Base and Teuken-7B-Instruct are two significant advancements in the development of multilingual LLMs. Their ability to support 24 official EU languages and address existing language biases makes them a valuable resource for the NLP community. As AI continues to play an increasingly important role in our lives, the development of more inclusive and diverse AI systems like Teuken-7B-Base and Teuken-7B-Instruct is crucial for promoting fairness, equity, and accessibility.

References

[^1]: Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs. arXiv.org. Retrieved April 18, 2025.

[^2]: Hugging Face model card for Teuken-7B-instruct-commercial-v0.4. Retrieved April 18, 2025.
