Another interesting piece from Berkeley's BAIR, now looking at more complex planning: the long-range planning that machines are not necessarily good at. But humans are also not so good at problems that require planning and analysis across many options. Could this be something where machines and humans could collaborate well? I had some supply chain planning models that might have used these directions. Beyond the intro, this article is technical.
Learning State Abstractions for Long-Horizon Planning
By Scott Emmons*, Ajay Jain*, Michael Laskin*, Thanard Kurutach, Pieter Abbeel, Deepak Pathak
Many tasks that we do on a regular basis, such as navigating a city, cooking a meal, or loading a dishwasher, require planning over extended periods of time. Accomplishing these tasks may seem simple to us; however, reasoning over long time horizons remains a major challenge for today’s Reinforcement Learning (RL) algorithms. While unable to plan over long horizons, deep RL algorithms excel at learning policies for short horizon tasks, such as robotic grasping, directly from pixels. At the same time, classical planning methods such as Dijkstra’s algorithm and A∗ search can plan over long time horizons, but they require hand-specified or task-specific abstract representations of the environment as input.
To achieve the best of both worlds, state-of-the-art visual navigation methods have applied classical search methods to learned graphs. In particular, SPTM [2] and SoRB [3] use a replay buffer of observations as nodes in a graph and learn a parametric distance function to draw edges in the graph. These methods have been successfully applied to long-horizon simulated navigation tasks that were too challenging for previous methods to solve. ... "
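To make the idea of searching over a learned graph concrete, here is a minimal sketch, not the SPTM or SoRB implementation. It assumes a small replay buffer of 2-D points in place of image observations and uses plain Euclidean distance as a stand-in for the learned parametric distance function. The names `build_graph`, `shortest_path`, `toy_distance`, and the `max_dist` threshold are hypothetical and chosen for illustration only.

```python
import heapq
import math


def build_graph(buffer, distance_fn, max_dist):
    """Treat each buffer entry as a node; add edge i -> j when the
    predicted distance is at most max_dist (a hypothetical threshold)."""
    graph = {i: [] for i in range(len(buffer))}
    for i, a in enumerate(buffer):
        for j, b in enumerate(buffer):
            if i != j:
                d = distance_fn(a, b)
                if d <= max_dist:
                    graph[i].append((j, d))
    return graph


def shortest_path(graph, start, goal):
    """Dijkstra's algorithm; returns a list of node indices or None."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def toy_distance(a, b):
    """Assumption: Euclidean distance stands in for a learned distance."""
    return math.dist(a, b)


# Toy replay buffer: 2-D positions instead of image observations.
buffer = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
graph = build_graph(buffer, toy_distance, max_dist=1.0)
path = shortest_path(graph, 0, 4)
print(path)
```

In a real system the distance function would be a trained network evaluated on pairs of observations, and each step along the returned path would serve as a subgoal for a short-horizon policy. The threshold controls the trade-off between a sparse graph that may be disconnected and a dense one that trusts long, less reliable distance estimates.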