Been reading and exploring what reading comprehension really means. Here's an update from Quanta Magazine on the topic. We still have far to go when dealing with changing context, common sense, and even inferring things like the implications of cause and effect. We did lots of work with 'sentiment analysis' long ago, and it's much easier to do now, with lots of easy-to-plug-in capabilities, but the results are still statistically weak. Shows how difficult building a semi-general-purpose chatbot is. We discovered that during several efforts. Good read here at the link:
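As a rough illustration of why "easy to plug in" sentiment analysis is still statistically weak, here is a minimal sketch of the classic lexicon-counting approach. The word lists and function are purely illustrative (not from any real library), and the example shows how easily such a scorer is fooled by context:

```python
# Toy lexicon-based sentiment scorer. The word sets below are
# illustrative placeholders, not a real sentiment lexicon.
POSITIVE = {"good", "great", "nice", "excellent", "love"}
NEGATIVE = {"bad", "terrible", "mean", "awful", "hate"}

def sentiment(text: str) -> str:
    """Label text by counting positive vs. negative words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("what a great movie"))          # positive
print(sentiment("not great, actually terrible"))  # negative word wins,
# but note the scorer has no notion of negation or changing context --
# exactly the kind of weakness the post is pointing at.
```

Anything beyond bag-of-words counting (negation, sarcasm, shifting context) defeats this kind of scorer, which is the gap between simulating understanding and having it.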
Machines Beat Humans on a Reading Test. But Do They Understand?
A tool known as BERT can now beat humans on advanced reading-comprehension tests. But it's also revealed how far AI has to go.
In the fall of 2017, Sam Bowman, a computational linguist at New York University, figured that computers still weren’t very good at understanding the written word. Sure, they had become decent at simulating that understanding in certain narrow domains, like automatic translation or sentiment analysis (for example, determining if a sentence sounds “mean or nice,” he said). But Bowman wanted measurable evidence of the genuine article: bona fide, human-style reading comprehension in English. So he came up with a test.
In an April 2018 paper coauthored with collaborators from the University of Washington and DeepMind, the Google-owned artificial intelligence company, Bowman introduced a battery of nine reading-comprehension tasks for computers called GLUE (General Language Understanding Evaluation). The test was designed as “a fairly representative sample of what the research community thought were interesting challenges,” said Bowman, but also “pretty straightforward for humans.” For example, one task asks whether a sentence is true based on information offered in a preceding sentence. If you can tell that “President Trump landed in Iraq for the start of a seven-day visit” implies that “President Trump is on an overseas visit,” you’ve just passed.
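The entailment task described above can be pictured as labeled (premise, hypothesis) pairs. The sketch below uses the article's own example plus one made-up non-entailment pair for contrast; the data layout and the trivial baseline are assumptions for illustration, not the actual GLUE format:

```python
# (premise, hypothesis, label) triples in the spirit of GLUE's
# entailment tasks. The first pair is from the article; the second
# is invented here as a contrasting non-entailment case.
examples = [
    ("President Trump landed in Iraq for the start of a seven-day visit",
     "President Trump is on an overseas visit",
     "entailment"),
    ("President Trump landed in Iraq for the start of a seven-day visit",
     "President Trump is visiting Canada",
     "not_entailment"),
]

def accuracy(predict, data):
    """Fraction of pairs a predictor labels correctly."""
    return sum(predict(p, h) == label for p, h, label in data) / len(data)

# A majority-class baseline that always guesses "entailment" --
# the kind of shallow strategy benchmarks like GLUE try to expose.
baseline = lambda premise, hypothesis: "entailment"

print(accuracy(baseline, examples))  # 0.5
```

A model that truly reads would have to do better than such label-guessing baselines across all nine tasks, which is what made GLUE a meaningful test.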
The machines bombed. Even state-of-the-art neural networks scored no higher than 69 out of 100 across all nine tasks: a D-plus, in letter grade terms. Bowman and his coauthors weren’t surprised. Neural networks — layers of computational connections built in a crude approximation of how neurons communicate within mammalian brains — had shown promise in the field of “natural language processing” (NLP), but the researchers weren’t convinced that these systems were learning anything substantial about language itself. And GLUE seemed to prove it. “These early results indicate that solving GLUE is beyond the capabilities of current models and methods,” Bowman and his coauthors wrote. ...
Sunday, October 20, 2019