I had not thought that capturing spellings in multiple languages was important, but this piece makes the case.
Microsoft details Speller100, an AI system that checks spelling in over 100 languages
Kyle Wiggers @Kyle_L_Wiggers, February 8, 2021 9:05 AM
In a post on its AI research blog, Microsoft today detailed a new language system, Speller100, that the company claims is one of the most comprehensive ever made in terms of language coverage and accuracy. Comprising a number of machine learning models that can understand speech in over 100 languages collectively, Speller100 now powers spelling correction on Bing.
As Microsoft notes, for a language with very little web presence, it’s challenging to collect an adequate amount of data to train a model. Moreover, models can’t rely solely on training data to learn the spelling of a language. At its core, spelling correction is about building both an error model and a language model, and not all errors are the same. For example, non-word errors occur when a word isn’t in the vocabulary for a given language, while real-word errors occur when the word exists but doesn’t fit in a larger context. ..."
See: https://www.microsoft.com/en-us/research/project/speller/
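To make the distinction between non-word and real-word errors concrete for myself, here is a minimal Python sketch. It is a toy and not how Speller100 works: the vocabulary, the bigram counts, the confusion sets, and the names VOCAB, BIGRAMS, CONFUSION, context_score and classify are all made up for illustration. A word missing from the vocabulary is labeled a non-word error; a word that exists but fits its neighbors worse than a known look-alike is labeled a real-word error.

```python
# Toy illustration only: all data below is invented, not taken from Speller100.
VOCAB = {"i", "want", "a", "piece", "peace", "of", "cake", "world"}

# Hypothetical bigram counts standing in for a language model.
BIGRAMS = {
    ("a", "piece"): 45,
    ("piece", "of"): 50,
    ("of", "cake"): 40,
    ("world", "peace"): 30,
}

# Hypothetical confusion sets: valid words that are easily mistaken for each other.
CONFUSION = {"piece": {"peace"}, "peace": {"piece"}}


def context_score(prev, word, nxt):
    """Score how well `word` fits between its neighbors using bigram counts."""
    return BIGRAMS.get((prev, word), 0) + BIGRAMS.get((word, nxt), 0)


def classify(words):
    """Label each word as 'ok', 'non-word', or 'real-word'."""
    labels = []
    for i, w in enumerate(words):
        if w not in VOCAB:
            # Not in the vocabulary at all: a non-word error.
            labels.append("non-word")
            continue
        prev = words[i - 1] if i > 0 else None
        nxt = words[i + 1] if i + 1 < len(words) else None
        own = context_score(prev, w, nxt)
        # Best score among look-alike alternatives (0 if there are none).
        best_alt = max(
            (context_score(prev, alt, nxt) for alt in CONFUSION.get(w, ())),
            default=0,
        )
        # The word exists, but a look-alike fits the context better.
        labels.append("real-word" if best_alt > own else "ok")
    return labels


print(classify("i want a peice of cake".split()))
print(classify("i want a peace of cake".split()))
```

In the first sentence "peice" is flagged because it is not in the vocabulary; in the second, "peace" is a valid word but "piece" fits between "a" and "of" better, so it is flagged as a real-word error. A real system would need far richer error and language models than these hand-written counts.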