Teuken-7B: Europe's Multilingual AI Model Launches with 7 Billion Parameters on Hugging Face

November 26, 2024
  • Teuken-7B is designed as an open-source alternative for researchers and businesses, enabling them to utilize and adapt the model for commercial projects while ensuring data security.

  • The model was trained on Germany's JUWELS supercomputer, equipped with 3,744 NVIDIA A100 GPUs, providing the computational power necessary for its development.

  • The OpenGPT-X project is a collaborative effort led by the Fraunhofer Institutes, with contributions from various institutions, showcasing a model of cooperative public research.

  • Teuken-7B is available in two versions: one for research purposes only and another under the Apache 2.0 license for commercial use, with both offering comparable performance.

  • The model aims to support AI applications across industries and has shown strong results on multilingual benchmarks covering 21 European languages.

  • The project is set to continue until March 31, 2025, allowing for ongoing enhancements and evaluations of the model.

  • Initiated in early 2022, the project aims to develop an AI model that aligns with European values, prioritizing data protection and linguistic diversity.

  • The model is hosted on Gaia-X infrastructure, ensuring compliance with strict European data protection regulations, which is essential for businesses handling sensitive information.

  • The OpenGPT-X research project has launched the 'Teuken-7B' language model, a seven-billion-parameter model trained in all 24 official languages of the European Union, now available for download on Hugging Face.

  • Stefan Wrobel, director at Fraunhofer IAIS, expressed optimism that Teuken-7B will be widely adopted across various sectors, addressing the growing demand for transparent and customizable generative AI solutions.

  • Developed over two years, the project emphasizes energy-efficient training and operation, contributing to advancements in multilingual AI models.

  • Teuken-7B stands out by incorporating approximately 50% non-English pre-training data, enhancing its performance in multilingual applications compared with predominantly English-trained models.

Summary based on 9 sources



Sources



Teuken-7B: European AI model challenges US giants

Onlineportal von IT Management • Nov 26, 2024


OpenGPT-X Unveils Multilingual Open Source Model
