
Friday, March 20, 2020

Pattern Recognition for Dead Languages

A fascinating application of language pattern recognition, which shows that this technology can be taken into unexpected and complex places.

Dead Languages Come to Life
By Gary Anthes, ACM
Communications of the ACM, April 2020, Vol. 63 No. 4, Pages 13-15, DOI 10.1145/3381908

Driven by advanced techniques in machine learning, commercial systems for automated language translation now nearly match the performance of human linguists, and far more efficiently. Google Translate supports 105 languages, from Afrikaans to Zulu, and in addition to printed text it can translate speech, handwriting, and the text found on websites and in images.

The methods for doing those things are clever, but the key enabler lies in the huge annotated databases of writings in the various language pairs. A translation from French to English succeeds because the algorithms were trained on millions of actual translation examples. The expectation is that every word or phrase that comes into the system, with its associated rules and patterns of language structure, will have been seen and translated before.

Now researchers have developed a method that, in some cases, can automatically translate extinct languages, those for which these big parallel data sets do not exist. Jiaming Luo and Regina Barzilay at the Massachusetts Institute of Technology (MIT) and Yuan Cao at Google were able to automate the "decipherment" of Linear B—a Greek language predecessor dating to 1450 B.C.—into modern Greek. Previous translations of Linear B to Greek were only possible manually, at great effort, by language and subject-matter experts. The same automated methods were also able to translate Ugaritic, an extinct Semitic language, into Hebrew.

How It Works: .... 
