Predicting human deliberative judgments with machine learning

Bibliographic details
Main authors: Evans, O; Stuhlmüller, A; Cundy, C; Carey, R; Kenton, Z; McGrath, T; Schreiber, A
Format: Report
Language: English
Published: Future of Humanity Institute, 2018
Description
Summary: Machine Learning (ML) has been successful in automating a range of cognitive tasks that humans solve effortlessly and quickly. Yet many real-world tasks are difficult and slow: people solve them by an extended process that involves analytical reasoning, gathering external information, and discussing with collaborators. Examples include giving medical advice, judging a criminal trial, and providing personalized recommendations for rich content such as books or academic papers. There is great demand for automating tasks that require deliberative judgment. Current ML approaches can be unreliable: this is partly because such tasks are intrinsically difficult (even AI-complete) and partly because assembling datasets of deliberative judgments is expensive (each label might take hours of human work). We consider addressing this data problem by collecting fast judgments and using them to help predict deliberative (slow) judgments. Instead of having a human spend hours on a task, we might instead collect their judgment after 30 seconds or 10 minutes. These fast judgments are combined with a smaller quantity of slow judgments to provide training data. The resulting prediction problem is related to semi-supervised learning and collaborative filtering. We designed two tasks for the purpose of testing ML algorithms on predicting human deliberative judgments. One task involves Fermi estimation (back-of-the-envelope estimation) and the other involves judging the veracity of political statements. We collected a dataset of 25,000 judgments from more than 800 people. We define an ML prediction task for predicting deliberative judgments given a training set that also contains fast judgments. We tested a variety of baseline algorithms on this task. Unfortunately, our dataset has serious limitations. Additional work is required to create a good testbed for predicting human deliberative judgments.
This technical report explains the motivation for our project (which might be built on in future work) and describes how further work can avoid our mistakes. Our dataset and code are available at https://github.com/oughtinc/psj.