The Eponymous Pickle: AI Teaching Itself

Thursday, May 20, 2021

AI Teaching Itself

It's all about setting goals and metagoals in the right context.

Why AI That Teaches Itself to Achieve a Goal Is the Next Big Thing

by 7wData April 24, 2021 by Kathryn Hume and Matthew E. Taylor

April 21, 2021 HBR

Summary. What’s the difference between the creative power of game-playing AIs and the predictive AIs most companies seem to use? How they learn. The AIs that thrive at games like Go, creating never-before-seen strategies, use an approach called reinforcement...

Lee Sedol, a world-class Go champion, was flummoxed by the 37th move DeepMind’s AlphaGo made in the second match of the famous 2016 series. So flummoxed that it took him nearly 15 minutes to formulate a response. The move was strange to other experienced Go players as well, with one commentator suggesting it was a mistake. In fact, it was a canonical example of an artificial intelligence algorithm learning something that seemed to go beyond just pattern recognition in data: learning something strategic and even creative. Indeed, beyond just feeding the algorithm past examples of Go champions playing games, DeepMind developers trained AlphaGo by having it play many millions of matches against itself. During these matches, the system had the chance to explore new moves and strategies, and then evaluate if they improved performance. Through all this trial and error, it discovered a way to play the game that surprised even the best players in the world.

If this kind of AI with creative capabilities seems different than the chatbots and predictive models most businesses end up with when they apply machine learning, that’s because it is. Instead of machine learning that uses historical data to generate predictions, game-playing systems like AlphaGo use reinforcement learning — a mature machine learning technology that’s good at optimizing tasks. To do so, an agent takes a series of actions over time, and each action is informed by the outcome of the previous ones. Put simply, it works by trying different approaches and latching onto — reinforcing — the ones that seem to work better than the others. With enough trials, you can reinforce your way to beating your current best approach and discover a new best way to accomplish your task.
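To make the "try, then reinforce what works" idea concrete, here is a minimal sketch of an epsilon-greedy agent on a toy three-option problem. It is an illustration only, not how AlphaGo or any system named in the article works; the function name `run_bandit`, the success probabilities, and the parameter values are all assumptions chosen for the example.

```python
import random


def run_bandit(success_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error over a fixed set of options.

    success_probs are hypothetical payoff chances, hidden from the agent.
    """
    rng = random.Random(seed)
    n = len(success_probs)
    counts = [0] * n        # how often each option has been tried
    estimates = [0.0] * n   # running average reward seen for each option

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try an option at random.
            action = rng.randrange(n)
        else:
            # Exploit: pick the option that has looked best so far.
            action = max(range(n), key=lambda a: estimates[a])

        # The environment returns a reward of 1 (success) or 0 (failure).
        reward = 1.0 if rng.random() < success_probs[action] else 0.0

        # Reinforce: nudge the estimate toward the observed reward.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]

    return estimates, counts


estimates, counts = run_bandit([0.1, 0.5, 0.9])
best = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated payoffs:", [round(e, 2) for e in estimates])
print("times each option was tried:", counts)
print("option the agent now prefers:", best)
```

The agent is never told which option is correct. It only sees rewards, and over many trials its choices concentrate on the option that pays off most often. A full reinforcement learning system extends this to sequences of actions, where each choice changes the situation the agent faces next.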

Despite its demonstrated usefulness, however, reinforcement learning is mostly used in academia and niche areas like video games and robotics. Companies such as Netflix, Spotify, and Google have started using it, but most businesses lag behind. Yet opportunities are everywhere. In fact, any time you have to make decisions in sequence (what AI practitioners call sequential decision tasks), there is a chance to deploy reinforcement learning.

Consider the many real-world problems that require deciding how to act over time, where there is something to maximize (or minimize), and where you’re never explicitly given the correct solution. For example: .... "
