Ultimately this will be a very important problem. We worked on many examples where we could build very complex analytics or AI-style solutions, but then had to fit those solutions into real-world decisions. What form does the mixing take: assistant, simulation, testing, enhancement, collaboration? The form of the mixing will be key.
The Effects of Mixing Machine Learning and Human Judgment
Collaboration between humans and machines does not necessarily lead to better outcomes.
Michelle Vaccaro and Jim Waldo in Queue
In 1997 IBM's Deep Blue software beat the World Chess Champion Garry Kasparov in a six-game match. Since then, other programs have beaten human players in games ranging from Jeopardy to Go. Inspired by his loss, Kasparov decided in 2005 to test the success of Human+AI pairs in an online chess tournament.2 He found that the Human+AI team bested the solo human. More surprisingly, he also found that the Human+AI team bested the solo computer, even though the machine on its own outperformed humans.
Researchers explain this phenomenon by emphasizing that humans and machines excel in different dimensions of intelligence.9 Human chess players do well with long-term strategy, but they perform poorly at assessing the millions of possible configurations of pieces; the opposite holds for machines. Because of these differences, combining human and machine intelligence produces better outcomes than either achieves alone. People also view this form of human-machine collaboration as a possible way to mitigate the problem of bias in machine learning, an issue that has taken center stage in recent months.12
We decided to investigate this type of collaboration between humans and machines using risk-assessment algorithms as a case study. In particular, we looked at the COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) algorithm, a well-known (perhaps infamous) risk-prediction system, and its effect on human decisions about risk. Many state courts use algorithms such as COMPAS to predict defendants' risk of recidivism, and these results inform bail, sentencing, and parole decisions.
Prior work on risk-assessment algorithms has focused on their accuracy and fairness, but it has not addressed their interactions with the human decision makers who serve as the final arbiters. In one study from 2018, Julia Dressel and Hany Farid compared risk assessments from the COMPAS software and from Amazon Mechanical Turk workers, and found that the algorithm and the humans achieved similar levels of accuracy and fairness.6 This study signals an important shift in the literature on risk-assessment instruments by incorporating human subjects to contextualize the accuracy and fairness of the algorithms. Dressel and Farid's study, however, divorces the human decision makers from the algorithm when, in fact, the current model has humans and algorithms working in tandem. ...
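As a concrete illustration of what "accuracy and fairness" mean in this setting, here is a minimal Python sketch, using invented toy data, that scores two sets of binary recidivism predictions on overall accuracy and on the gap in false-positive rates between groups, one common fairness criterion in the COMPAS debate. The data, group labels, and function names are hypothetical; Dressel and Farid's actual analysis differs in its details.

    import numpy as np

    def accuracy(y_true, y_pred):
        # Fraction of risk labels that match observed recidivism.
        return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

    def false_positive_rate(y_true, y_pred):
        # Among people who did not recidivate (y_true == 0), the share
        # incorrectly labeled high risk (y_pred == 1).
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        negatives = y_true == 0
        return float(np.mean(y_pred[negatives]))

    def fpr_gap(y_true, y_pred, group):
        # Difference between the highest and lowest group-wise false
        # positive rates; zero would indicate parity on this criterion.
        y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
        rates = [false_positive_rate(y_true[group == g], y_pred[group == g])
                 for g in np.unique(group)]
        return max(rates) - min(rates)

    # Hypothetical toy data: 1 = recidivated / predicted high risk.
    y_true     = [0, 1, 0, 1, 0, 0, 1, 1]
    algo_pred  = [0, 1, 1, 1, 0, 0, 0, 1]   # stand-in for COMPAS labels
    human_pred = [0, 1, 0, 1, 1, 0, 0, 1]   # stand-in for Turk workers
    group      = ["a", "a", "a", "a", "b", "b", "b", "b"]

    for name, pred in [("algorithm", algo_pred), ("humans", human_pred)]:
        print(f"{name}: accuracy={accuracy(y_true, pred):.2f}, "
              f"FPR gap={fpr_gap(y_true, pred, group):.2f}")

A large false-positive-rate gap means one group's non-recidivists are disproportionately labeled high risk, even when overall accuracy looks similar across predictors, which is why both measures matter in this debate.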
Wednesday, October 02, 2019