(Updated) after reading. See supporting images at link. ....
Meta’s AI Takes an Unsupervised Step Forward
In the quest for human-level intelligent AI, Meta is betting on self-supervised learning
By ELIZA STRICKLAND in IEEE Spectrum
Meta’s chief AI scientist, Yann LeCun, doesn’t lose sight of his far-off goal, even when talking about concrete steps in the here and now. “We want to build intelligent machines that learn like animals and humans,” LeCun tells IEEE Spectrum in an interview.
Today’s concrete step is a series of papers from Meta, the company formerly known as Facebook, on a type of self-supervised learning (SSL) for AI systems. SSL stands in contrast to supervised learning, in which an AI system learns from a labeled data set (the labels serve as the teacher who provides the correct answers when the AI system checks its work). LeCun has often spoken about his strong belief that SSL is a necessary prerequisite for AI systems that can build “world models” and can therefore begin to gain humanlike faculties such as reason, common sense, and the ability to transfer skills and knowledge from one context to another. The new papers show how a self-supervised system called a masked auto-encoder (MAE) learned to reconstruct images, video, and even audio from very patchy and incomplete data. While MAEs are not a new idea, Meta has extended the work to new domains.
By figuring out how to predict missing data, either in a static image or a video or audio sequence, the MAE system must be constructing a world model, LeCun says. “If it can predict what’s going to happen in a video, it has to understand that the world is three-dimensional, that some objects are inanimate and don’t move by themselves, that other objects are animate and harder to predict, all the way up to predicting complex behavior from animate persons,” he says. And once an AI system has an accurate world model, it can use that model to plan actions.
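To make the masking idea concrete, here is a minimal sketch of how a masked auto-encoder's training signal can be set up. It is not Meta's implementation. The helper names (`mask_patches`, `reconstruction_loss`), the 75 percent mask ratio, and the toy "patches" are all assumptions chosen for illustration. The point it shows is that the hidden patches themselves serve as the targets, so no human-provided labels are needed.

```python
import random


def mask_patches(patches, mask_ratio=0.75, seed=0):
    """Split patches into visible inputs and hidden reconstruction targets.

    Hypothetical helper: mask_ratio and seed are illustrative assumptions.
    Each returned item is an (index, patch) pair so positions are preserved.
    """
    rng = random.Random(seed)
    n_masked = int(len(patches) * mask_ratio)
    masked_idx = set(rng.sample(range(len(patches)), n_masked))
    visible = [(i, p) for i, p in enumerate(patches) if i not in masked_idx]
    targets = [(i, p) for i, p in enumerate(patches) if i in masked_idx]
    return visible, targets


def reconstruction_loss(predictions, targets):
    """Mean squared error computed over the hidden patches only."""
    total, count = 0.0, 0
    for (_, pred), (_, true) in zip(predictions, targets):
        for a, b in zip(pred, true):
            total += (a - b) ** 2
            count += 1
    return total / count


# Toy data: 16 "patches" of 4 values each, standing in for image patches.
patches = [[float(i)] * 4 for i in range(16)]
visible, targets = mask_patches(patches)

# A real model would predict the hidden patches from the visible ones.
# As a stand-in, predict the per-position mean of the visible patches.
mean_patch = [sum(p[k] for _, p in visible) / len(visible) for k in range(4)]
predictions = [(i, mean_patch) for i, _ in targets]
loss = reconstruction_loss(predictions, targets)
```

In an actual system the stand-in predictor would be replaced by an encoder and decoder network trained to drive this loss down. The data preparation step would keep the same shape: hide most of the input, then score the model on how well it fills in what was hidden.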
“Images, which are signals from the natural world, are not constructed to remove redundancy. That’s why we can compress things so well when we create JPGs.” —Ross Girshick, Meta
“The essence of intelligence is learning to predict,” LeCun says. And while he’s not claiming that Meta’s MAE system is anything close to an artificial general intelligence, he sees it as an important step.
Not everyone agrees that the Meta researchers are on the right path to human-level intelligence. Yoshua Bengio is credited, in addition to his co–Turing Award winners LeCun and Geoffrey Hinton, with the development of deep neural networks, and he sometimes engages in friendly sparring with LeCun over big ideas in AI. In an email to IEEE Spectrum, Bengio spells out both some differences and similarities in their aims.
“I really don’t think that our current approaches (self-supervised or not) are sufficient to bridge the gap to human-level intelligence,” Bengio writes. He adds that “qualitative advances” in the field will be needed to really move the state of the art anywhere closer to human-scale AI.
While he agrees with LeCun that the ability to reason about the world is a key element of intelligence, Bengio’s team isn’t focused on models that can predict, but rather those that can render knowledge in the form of natural language. Such a model “would allow us to combine these pieces of knowledge to solve new problems, perform counterfactual simulations, or examine possible futures,” he notes. Bengio’s team has developed a new neural-net framework that has a more modular nature than those favored by LeCun, whose team is working on end-to-end learning (models that learn all the steps between the initial input stage and the final output result). .....