Interesting example of the use of reinforcement learning.
Using reinforcement learning to identify high-risk states and treatments in healthcare
Published February 2, 2022
By Mehdi Fatemi , Senior Researcher Taylor Killian , PhD student Marzyeh Ghassemi , Assistant Professor by Microsoft Research.
As the pandemic overburdens medical facilities and clinicians become increasingly overworked, the ability to make quick decisions on providing the best possible treatment is even more critical. In urgent health situations, such decisions can mean life or death. However, certain treatment protocols can pose a considerable risk to patients who have serious medical conditions and can potentially contribute to unintended outcomes.
In this research project, we built a machine learning (ML) model that works with scenarios where data is limited, such as healthcare. This model was developed to recognize treatment protocols that could contribute to negative outcomes and to alert clinicians when a patient’s health could decline to a dangerous level. You can explore the details of this research project in our research paper, “Medical Dead-ends and Learning to Identify High-risk States and Treatments,” which was presented at the 2021 Conference on Neural Information Processing Systems (NeurIPS 2021).
Reinforcement learning for healthcare
To build our model, we decided to use reinforcement learning—an ML framework that’s uniquely well-suited for advancing safety-critical domains such as healthcare. This is because at its core, healthcare is a sequential decision-making domain, and reinforcement learning is the formal paradigm for modeling and solving problems in such domains. In healthcare, clinicians base their treatment decisions on an overall understanding of a patient’s health; they observe how the patient responds to this treatment, and the process repeats. Likewise, in reinforcement learning, an algorithm, or agent, interprets the state of its environment and takes an action, which, coupled with the internal dynamics of the environment, causes it to transition to a new state, as shown in Figure 1. A reward signal is then assigned to account for the immediate impact of this change. For example, in a healthcare scenario, if a patient recovers or is discharged from the intensive care unit (ICU), the agent may receive a positive reward. However, if the patient does not survive, the agent receives a negative reward, or penalty. .... '
No comments:
Post a Comment