
Wednesday, July 15, 2020

Thinking Decentralized Dependent Decision Problems

Struck me as quite interesting: how do we make societal decisions? Can we learn the right information to support the right decisions? You can usually state subproblems in a way that lets them be optimized against goals and constraints. But it is much harder to optimize a number of dependent decisions. Doing this well, under multiple changing (and forecast) conditions, could be quite powerful. Interesting discussion here, but technically complex. Decentralized reinforcement learning? Considering.

Decentralized Reinforcement Learning:
Global Decision-Making via Local Economic Transactions
Michael Chang and Sidhant Kaushik, Berkeley, Jul 11, 2020

Many neural network architectures that underlie various artificial intelligence systems today bear an interesting similarity to the early computers a century ago. Just as early computers were specialized circuits for specific purposes like solving linear systems or cryptanalysis, so too does the trained neural network generally function as a specialized circuit for performing a specific task, with all parameters coupled together in the same global scope.

One might naturally wonder what it might take for learning systems to scale in complexity in the same way as programmed systems have. And if the history of how abstraction enabled computer science to scale gives any indication, one possible place to start would be to consider what it means to build complex learning systems at multiple levels of abstraction, where each level of learning is the emergent consequence of learning from the layer below.

This post discusses our recent paper that introduces a framework for societal decision-making, a perspective on reinforcement learning through the lens of a self-organizing society of primitive agents. We prove the optimality of an incentive mechanism for engineering the society to optimize a collective objective. Our work also provides suggestive evidence that the local credit assignment scheme of the decentralized reinforcement learning algorithms we develop to train the society facilitates more efficient transfer to new tasks.

Levels of Abstraction in Complex Learning Systems

From corporations to organisms, many large-scale systems in our world are composed of smaller individual autonomous components, whose collective function serves a larger objective than the objective of any individual component alone. A corporation, for example, optimizes for profits as if it were a single super-agent when in reality it is a society of self-interested human agents, each with concerns that may have little to do with profit. And every human is also simply an abstraction of organs, tissues, and cells individually adapting and making their own simpler decisions.

You know that everything you think and do is thought and done by you. But what's a "you"? What kinds of smaller entities cooperate inside your mind to do your work? — Marvin Minsky, The Society of Mind

At the core of building complex learning systems at multiple levels of abstraction is understanding the mechanisms that bind consecutive levels together. In the context of learning for decision-making, this means defining three ingredients: ....

Supporting paper:

https://arxiv.org/abs/2007.02382

Decentralized Reinforcement Learning: Global Decision-Making via Local Economic Transactions
Michael Chang, Sidhant Kaushik, S. Matthew Weinberg, Thomas L. Griffiths, Sergey Levine
This paper seeks to establish a framework for directing a society of simple, specialized, self-interested agents to solve what traditionally are posed as monolithic single-agent sequential decision problems. What makes it challenging to use a decentralized approach to collectively optimize a central objective is the difficulty in characterizing the equilibrium strategy profile of non-cooperative games. To overcome this challenge, we design a mechanism for defining the learning environment of each agent for which we know that the optimal solution for the global objective coincides with a Nash equilibrium strategy profile of the agents optimizing their own local objectives. The society functions as an economy of agents that learn the credit assignment process itself by buying and selling to each other the right to operate on the environment state. We derive a class of decentralized reinforcement learning algorithms that are broadly applicable not only to standard reinforcement learning but also for selecting options in semi-MDPs and dynamically composing computation graphs. Lastly, we demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
Comments: 17 pages, 12 figures, accepted to the International Conference on Machine Learning (ICML) 2020
Subjects: Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT); Multiagent Systems (cs.MA); Neural and Evolutionary Computing (cs.NE); Machine Learning (stat.ML)
Cite as: arXiv:2007.02382 [cs.LG]
  (or arXiv:2007.02382v1 [cs.LG] for this version)
...
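
To make the "buying and selling the right to operate on the environment state" idea a bit more concrete, here is a minimal toy sketch of my own, not the authors' code. Everything specific in it is an assumption for illustration: the five-state chain environment, the two primitive agents ("left" and "right"), the second-price sealed-bid auction, the resale rule (the winner is credited with the discounted top bid at the next state), the bid noise used for exploration, and the tabular update. Each agent only ever updates its own bid from its own transaction, yet the society as a whole should end up walking to the goal.

```python
import random

N_STATES = 5                        # chain of states 0..4; reaching state 4 pays reward 1
GAMMA, ALPHA = 0.9, 0.5             # discount factor and learning rate (illustrative values)
MOVES = {"left": -1, "right": +1}   # one hypothetical primitive agent per action


def step(state, move):
    """Toy deterministic chain environment (an assumption, not from the paper)."""
    nxt = min(max(state + move, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def auction(bids, state, rng, noise=0.1):
    """Second-price sealed-bid auction for the right to act on `state`.

    Small uniform noise on each bid is an assumed exploration device.
    Returns the winner and the price it pays (the runner-up's offer).
    """
    offers = {name: table[state] + rng.uniform(0.0, noise)
              for name, table in bids.items()}
    ranked = sorted(offers, key=offers.get, reverse=True)
    return ranked[0], offers[ranked[1]]


def train(episodes=200, max_steps=50, seed=0):
    rng = random.Random(seed)
    bids = {name: [0.0] * N_STATES for name in MOVES}   # each agent's bid per state
    ledger = {name: 0.0 for name in MOVES}              # running profit per agent
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            winner, price = auction(bids, state, rng)
            nxt, reward, done = step(state, MOVES[winner])
            # Assumed resale rule: the winner "sells" the next state and is
            # credited with the highest learned bid for it.
            resale = 0.0 if done else max(table[nxt] for table in bids.values())
            value = reward + GAMMA * resale
            ledger[winner] += value - price
            # Local credit assignment: only the winner's own bid is updated,
            # moving toward what owning this state turned out to be worth.
            bids[winner][state] += ALPHA * (value - bids[winner][state])
            state = nxt
            if done:
                break
    return bids, ledger


def greedy_path(bids, max_steps=20):
    """Run the trained society without bid noise and record visited states."""
    state, path = 0, [0]
    for _ in range(max_steps):
        winner = max(bids, key=lambda name: bids[name][state])
        state, _, done = step(state, MOVES[winner])
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    learned_bids, profits = train()
    print(greedy_path(learned_bids))
    print(learned_bids)
```

The second-price rule is chosen here because, in a second-price auction, bidding one's true valuation is a dominant strategy, so the sketch simply lets each bid track the agent's estimated value of owning the state. With only the winner updating, the bids play the role that action values play in ordinary tabular Q-learning, which is what makes the local transactions add up to a sensible global policy in this toy case.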
