
Monday, June 08, 2020

Google Reports on Recent Advances in Translate

Google reports measured improvements in Translate, but contextual challenges remain. I would be tempted to construct a risk analysis regarding context, but that too would be hard. Though impressed, I would still be cautious about autonomous use.

Recent Advances in Google Translate
Monday, June 8, 2020
Posted by Isaac Caswell and Bowen Liang, Software Engineers, Google Research

Advances in machine learning (ML) have driven improvements to automated translation, including the GNMT neural translation model introduced in Translate in 2016, that have enabled great improvements to the quality of translation for over 100 languages. Nevertheless, state-of-the-art systems lag significantly behind human performance in all but the most specific translation tasks. And while the research community has developed techniques that are successful for high-resource languages like Spanish and German, for which there exist copious amounts of training data, performance on low-resource languages, like Yoruba or Malayalam, still leaves much to be desired. Many techniques have demonstrated significant gains for low-resource languages in controlled research settings (e.g., the WMT Evaluation Campaign), however these results on smaller, publicly available datasets may not easily transition to large, web-crawled datasets.

In this post, we share some recent progress we have made in translation quality for supported languages, especially for those that are low-resource, by synthesizing and expanding a variety of recent advances, and demonstrate how they can be applied at scale to noisy, web-mined data. These techniques span improvements to model architecture and training, improved treatment of noise in datasets, increased multilingual transfer learning through M4 modeling, and use of monolingual data. The quality improvements, which averaged +5 BLEU score over all 100+ languages, are visualized below. ... " 
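
For readers unfamiliar with the metric behind the "+5 BLEU" figure, here is a minimal sketch of how a BLEU score can be computed for one sentence pair. It is not the evaluation setup used in the Google post, which the excerpt does not describe. The function name `sentence_bleu_sketch`, the whitespace tokenization, the single reference, the lack of smoothing, and the 0 to 100 scaling are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu_sketch(candidate, reference, max_n=4):
    """Illustrative single-reference BLEU on a 0-100 scale (hypothetical helper).

    Assumptions: whitespace tokenization, no smoothing, one reference.
    Returns 0.0 if any n-gram order has no match.
    """
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference,
        # so repeating a correct word does not inflate the score.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100.0 * brevity_penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    print(sentence_bleu_sketch("the cat sat on the mat", "the cat sat on the mat"))
    print(sentence_bleu_sketch("the cat sat on a mat", "the cat sat on the mat"))
```

Under these assumptions, a candidate of four or more words that is identical to its reference scores 100, and a candidate sharing no words with it scores 0. Reported figures are normally computed over a whole test corpus rather than per sentence, so this only conveys the shape of the calculation.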

See also, on issues with translation, from the ACM:

Automatic Translators are Not Really Capable of Learning
By Herbert Bruderer    ... "
