In general, there is a need for an understanding of context, starting with the physical environment.
MIT researchers have developed a representation of spatial perception for robots that is modeled after the way humans perceive and navigate the world. The key component of the team’s new model is Kimera, an open-source library the team previously developed to simultaneously construct a 3D geometric model of an environment. Kimera builds a dense 3D semantic mesh of the environment and can track humans moving within it. The figure shows a multi-frame action sequence of a human moving through the scene. (Videos at the link.) Paper: https://roboticsconference.org/program/papers/79/
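To make "dense 3D semantic mesh" concrete, here is a minimal Python sketch of what such a representation might hold: a triangle mesh whose vertices each carry a semantic label. The class and field names are assumptions for illustration, not Kimera's actual (C++) API.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class SemanticMesh:
    """Toy stand-in for a dense 3D semantic mesh (illustrative only, not Kimera's API)."""
    vertices: np.ndarray       # (N, 3) xyz positions of mesh vertices
    faces: np.ndarray          # (M, 3) vertex indices forming triangles
    vertex_labels: np.ndarray  # (N,) semantic class id per vertex, e.g. 0=wall, 1=chair

    def vertices_with_label(self, label: int) -> np.ndarray:
        """Return the 3D positions of all vertices carrying a given semantic label."""
        return self.vertices[self.vertex_labels == label]
```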
“Alexa, go to the kitchen and fetch me a snack”
New model aims to give robots human-like perception of their physical environments.
Jennifer Chu | MIT News Office
Wouldn’t we all appreciate a little help around the house, especially if that help came in the form of a smart, adaptable, uncomplaining robot? Sure, there are the one-trick Roombas of the appliance world. But MIT engineers are envisioning robots more like home helpers, able to follow high-level, Alexa-type commands, such as “Go to the kitchen and fetch me a coffee cup.” ...
To carry out such high-level tasks, researchers believe robots will have to be able to perceive their physical environment as humans do.
“In order to make any decision in the world, you need to have a mental model of the environment around you,” says Luca Carlone, assistant professor of aeronautics and astronautics at MIT. “This is something so effortless for humans. But for robots it’s a painfully hard problem, where it’s about transforming pixel values that they see through a camera, into an understanding of the world.”
Now Carlone and his students have developed a representation of spatial perception for robots that is modeled after the way humans perceive and navigate the world.
The new model, which they call 3D Dynamic Scene Graphs, enables a robot to quickly generate a 3D map of its surroundings that also includes objects and their semantic labels (a chair versus a table, for instance), as well as people, rooms, walls, and other structures that the robot is likely seeing in its environment.
The model also allows the robot to extract relevant information from the 3D map: to query the location of objects and rooms, or the movement of people in its path. ...
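As a rough illustration of the idea behind a 3D Dynamic Scene Graph, the representation can be thought of as a layered graph whose nodes are metric-semantic entities (objects, people, places, rooms) and whose edges encode relations such as "object is in room," which is what makes queries like "where is the coffee cup?" cheap to answer. The sketch below uses assumed class and method names and is not the authors' implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    """One entity in the scene graph: an object, a person, a room, etc."""
    node_id: str
    layer: str                      # e.g. "object", "agent", "place", "room"
    label: str                      # semantic label, e.g. "coffee_cup", "kitchen"
    position: Vec3                  # representative 3D position (centroid)
    parent: Optional[str] = None    # e.g. the room that contains this object
    trajectory: List[Vec3] = field(default_factory=list)  # past positions (for people)


class DynamicSceneGraph:
    """Minimal layered scene graph supporting simple semantic queries (illustrative sketch)."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def add(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def locate(self, label: str) -> List[Tuple[Vec3, Optional[str]]]:
        """Answer 'where is X?': positions of matching objects and the room containing each."""
        return [(n.position, n.parent) for n in self.nodes.values() if n.label == label]

    def people_near(self, point: Vec3, radius: float) -> List[Node]:
        """Return tracked humans whose latest position lies within `radius` of `point`."""
        def dist(a: Vec3, b: Vec3) -> float:
            return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
        return [n for n in self.nodes.values()
                if n.layer == "agent" and n.trajectory
                and dist(n.trajectory[-1], point) <= radius]


# Hypothetical usage: a robot asking where the coffee cup is and who is nearby.
graph = DynamicSceneGraph()
graph.add(Node("room_1", "room", "kitchen", (5.0, 2.0, 0.0)))
graph.add(Node("obj_42", "object", "coffee_cup", (5.3, 1.8, 0.9), parent="room_1"))
graph.add(Node("person_7", "agent", "human", (4.0, 2.0, 0.0),
               trajectory=[(3.0, 2.0, 0.0), (4.0, 2.0, 0.0)]))
print(graph.locate("coffee_cup"))               # [((5.3, 1.8, 0.9), 'room_1')]
print(graph.people_near((5.0, 2.0, 0.0), 2.0))  # the tracked human near the kitchen
```

The layered structure is the point of the design: high-level commands only need to touch the sparse object/room layers, while the dense mesh remains available for low-level tasks like obstacle avoidance.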