Growing a list

It is easy to find expert knowledge on the Internet on almost any topic, but obtaining a complete overview of a given topic is not always easy: information can be scattered across many sources and must be aggregated to be useful. We introduce a method for intelligently growing a list of relevant ite...


Bibliographic Details
Main Authors: Letham, Benjamin; Rudin, Cynthia; Heller, Katherine A.
Other Authors: Massachusetts Institute of Technology. Operations Research Center
Format: Article
Language: en_US
Published: Springer-Verlag 2015
Online Access: http://hdl.handle.net/1721.1/99125
collection MIT
description It is easy to find expert knowledge on the Internet on almost any topic, but obtaining a complete overview of a given topic is not always easy: information can be scattered across many sources and must be aggregated to be useful. We introduce a method for intelligently growing a list of relevant items, starting from a small seed of examples. Our algorithm takes advantage of the wisdom of the crowd, in the sense that there are many experts who post lists of things on the Internet. We use a collection of simple machine learning components to find these experts and aggregate their lists to produce a single complete and meaningful list. We use experiments with gold standards and open-ended experiments without gold standards to show that our method significantly outperforms the state of the art. Our method uses the ranking algorithm Bayesian Sets even when its underlying independence assumption is violated, and we provide a theoretical generalization bound to motivate its use.
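The description names Bayesian Sets as the ranking algorithm. As a rough illustration only, the sketch below scores items against a seed using the standard Bayesian Sets log score for binary features with a Beta-Bernoulli model. It is not taken from the article, which may use a modified score. The function name `bayesian_sets_scores`, the feature matrix layout, and the prior scale `c` are assumptions made for this example.

```python
import math


def bayesian_sets_scores(items, seed_indices, c=2.0):
    """Log Bayesian Sets score of every item against the seed (higher = closer).

    items        : list of equal-length lists of 0/1 features (assumed layout)
    seed_indices : row indices of the seed items
    c            : prior scale; alpha_j = c * mean_j, beta_j = c * (1 - mean_j)
                   (an assumed, commonly used choice of hyperparameters)
    """
    n_items = len(items)
    n_features = len(items[0])
    n_seed = len(seed_indices)
    scores = [0.0] * n_items

    for j in range(n_features):
        mean_j = sum(row[j] for row in items) / n_items
        if mean_j == 0.0 or mean_j == 1.0:
            # Constant feature: the prior is degenerate, so skip it (assumption).
            continue
        alpha = c * mean_j
        beta = c * (1.0 - mean_j)

        # Posterior Beta parameters after observing the seed items.
        seed_count = sum(items[i][j] for i in seed_indices)
        alpha_post = alpha + seed_count
        beta_post = beta + n_seed - seed_count

        # The log score is linear in the features: constant + weight * x_j.
        const_j = (math.log(alpha + beta) - math.log(alpha + beta + n_seed)
                   + math.log(beta_post) - math.log(beta))
        weight_j = (math.log(alpha_post) - math.log(alpha)
                    - math.log(beta_post) + math.log(beta))

        for i, row in enumerate(items):
            scores[i] += const_j + weight_j * row[j]

    return scores
```

Sorting items by the returned score in descending order gives a ranked list of candidates to add to the seed. Because the score is a sum of independent per-feature terms, it relies on the independence assumption that the description says may be violated in practice.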
first_indexed 2024-09-23T09:47:00Z
format Article
id mit-1721.1/99125
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T09:47:00Z
publishDate 2015
publisher Springer-Verlag
record_format dspace
spelling mit-1721.1/99125
2022-09-30T16:46:35Z
Growing a list
Letham, Benjamin
Rudin, Cynthia
Heller, Katherine A.
Massachusetts Institute of Technology. Operations Research Center
Sloan School of Management
Letham, Benjamin
Rudin, Cynthia
Lincoln Laboratory
National Science Foundation (U.S.) (IIS-1053407)
2015-10-02T12:12:18Z
2015-10-02T12:12:18Z
2013-07
2012-12
Article
http://purl.org/eprint/type/JournalArticle
1384-5810
1573-756X
http://hdl.handle.net/1721.1/99125
Letham, Benjamin, Cynthia Rudin, and Katherine A. Heller. “Growing a List.” Data Mining and Knowledge Discovery 27, no. 3 (November 2013): 372–95.
en_US
http://dx.doi.org/10.1007/s10618-013-0329-7
Data Mining and Knowledge Discovery
Creative Commons Attribution-Noncommercial-Share Alike
http://creativecommons.org/licenses/by-nc-sa/4.0/
application/pdf
Springer-Verlag
MIT web domain