Stanford CS25: V3 I No Language Left Behind: Scaling Human-Centered Machine Translation
Автор: Stanford Online
Загружено: 2023-12-15
Просмотров: 3917
Описание:
November 14, 2023
Angela Fan, Meta AI
Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today. However, such efforts have coalesced around a small subset of languages, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200 language barrier while ensuring safe, high-quality results, all while keeping ethical considerations in mind? In this talk, I introduce No Language Left Behind, an initiative to break language barriers for low-resource languages. In No Language Left Behind, we took on the low-resource language translation challenge by first contextualizing the need for translation support through exploratory interviews with native speakers. Then, we created datasets and models aimed at narrowing the performance gap between low and high-resource languages. We proposed multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically, we evaluated the performance of over 40,000 different translation directions using a human-translated benchmark, Flores-200, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art, laying important groundwork towards realizing a universal translation system in an open-source manner.
More about the course can be found here: https://web.stanford.edu/class/cs25/
View the entire CS25 Transformers United playlist: • Stanford CS25 - Transformers United
Повторяем попытку...
Доступные форматы для скачивания:
Скачать видео
-
Информация по загрузке: