Processing and visualizing the data in tweets

Bibliographic Details
Main Authors: Marcus, Adam; Bernstein, Michael S.; Badar, Osama; Karger, David R.; Madden, Samuel R.; Miller, Robert C.
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: Association for Computing Machinery 2013
Online Access: http://hdl.handle.net/1721.1/79351
https://orcid.org/0000-0002-7470-3265
https://orcid.org/0000-0002-0024-5847
https://orcid.org/0000-0002-0442-691X
Description
Summary: Microblogs such as Twitter provide a valuable stream of diverse user-generated data. While the data extracted from Twitter is generally timely and accurate, the process by which developers extract structured data from the tweet stream is ad-hoc and requires reimplementation of common data manipulation primitives. In this paper, we present two systems for querying and extracting structure from Twitter-embedded data. The first, TweeQL, provides a streaming SQL-like interface to the Twitter API, making common tweet processing tasks simpler. The second, TwitInfo, shows how end-users can interact with and understand aggregated data from the tweet stream, in addition to showcasing the power of the TweeQL language. Together these systems show the richness of content that can be extracted from Twitter.