Forecasting Twitter topic popularity using Bass diffusion model and machine learning
Thesis: S.M. in Transportation, Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2015.
Main Author: | Shen, Yingzhen |
---|---|
Other Authors: | David Simchi-Levi |
Format: | Thesis |
Language: | eng |
Published: | Massachusetts Institute of Technology, 2015 |
Subjects: | Civil and Environmental Engineering |
Online Access: | http://hdl.handle.net/1721.1/99575 |
_version_ | 1826203492340989952 |
---|---|
author | Shen, Yingzhen |
author2 | David Simchi-Levi. |
author_facet | David Simchi-Levi. Shen, Yingzhen |
author_sort | Shen, Yingzhen |
collection | MIT |
description | Thesis: S.M. in Transportation, Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2015. |
first_indexed | 2024-09-23T12:37:54Z |
format | Thesis |
id | mit-1721.1/99575 |
institution | Massachusetts Institute of Technology |
language | eng |
last_indexed | 2024-09-23T12:37:54Z |
publishDate | 2015 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/99575 2019-04-10T12:38:42Z Forecasting Twitter topic popularity using Bass diffusion model and machine learning Shen, Yingzhen David Simchi-Levi. Massachusetts Institute of Technology. Department of Civil and Environmental Engineering. Civil and Environmental Engineering. Thesis: S.M. in Transportation, Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2015. Cataloged from PDF version of thesis. Includes bibliographical references (pages 91-93). Today, social network websites like Twitter are important information sources for a company's marketing, logistics and supply chain. Sometimes a topic about a product will "explode" on a "peak day," suddenly being talked about by a large number of users. Predicting the diffusion process of a Twitter topic helps a company forecast demand and plan ahead to dispatch its products. In this study, we collected Twitter data on 220 topics covering a wide range of fields. We created 12 features for each topic at each time stage, e.g. the number of tweets mentioning the topic per hour, the number of followers of users who have already mentioned the topic, and the percentage of root tweets among all tweets. The task in this study is to predict the total mention count over the whole time horizon, 180 days, as early and as accurately as possible. To complete this task, we applied two approaches: fitting the curve that denotes topic popularity (the mention count curve) with the Bass diffusion model, and using machine learning models, including K-nearest-neighbor, linear regression, bagged tree, and ensemble, to learn topic popularity as a function of the features we created. The results of this study reveal that the basic Bass model captures the underlying mechanism of the Twitter topic development process, so the adoption of a Twitter topic can be treated as analogous to the diffusion of a new product. Using only mention count, over the whole time horizon, the Bass model has much better predictive accuracy than machine learning models with extra features. However, even with the best model (the Bass model) and focusing on the subset of topics with better predictability, predictive accuracy is still not good enough before the "explosion day." This is because an "explosion" is usually triggered by news outside Twitter and is therefore hard to predict without information from outside Twitter. by Yingzhen Shen. S.M. in Transportation 2015-10-30T18:56:57Z 2015-10-30T18:56:57Z 2015 2015 Thesis http://hdl.handle.net/1721.1/99575 924831552 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 108 pages application/pdf Massachusetts Institute of Technology |
spellingShingle | Civil and Environmental Engineering. Shen, Yingzhen Forecasting Twitter topic popularity using Bass diffusion model and machine learning |
title | Forecasting Twitter topic popularity using Bass diffusion model and machine learning |
title_full | Forecasting Twitter topic popularity using Bass diffusion model and machine learning |
title_fullStr | Forecasting Twitter topic popularity using Bass diffusion model and machine learning |
title_full_unstemmed | Forecasting Twitter topic popularity using Bass diffusion model and machine learning |
title_short | Forecasting Twitter topic popularity using Bass diffusion model and machine learning |
title_sort | forecasting twitter topic popularity using bass diffusion model and machine learning |
topic | Civil and Environmental Engineering. |
url | http://hdl.handle.net/1721.1/99575 |
work_keys_str_mv | AT shenyingzhen forecastingtwittertopicpopularityusingbassdiffusionmodelandmachinelearning |
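
The abstract describes fitting a topic's mention count curve with the Bass diffusion model in order to predict its total mention count. This record does not say how the thesis estimated the model, so the following is only a minimal sketch under stated assumptions: it uses the classic least-squares regression on the discrete Bass recursion, with p as the innovation coefficient, q as the imitation coefficient, and m as the eventual total mention count. All function names and the parameter values used to generate data are hypothetical and chosen for illustration.

```python
import math


def bass_cumulative(t, p, q, m):
    """Closed-form cumulative count N(t) of the continuous-time Bass model."""
    decay = math.exp(-(p + q) * t)
    return m * (1.0 - decay) / (1.0 + (q / p) * decay)


def simulate_discrete_bass(p, q, m, periods):
    """Per-period counts from the recursion n_t = (p + q*N/m) * (m - N).

    N is the cumulative count before period t. Synthetic data only.
    """
    counts, cumulative = [], 0.0
    for _ in range(periods):
        new = (p + q * cumulative / m) * (m - cumulative)
        counts.append(new)
        cumulative += new
    return counts


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_bass(counts):
    """Estimate (p, q, m) by least squares on n_t = a + b*N + c*N^2.

    Here a = p*m, b = q - p, c = -q/m, and N is the cumulative count before
    period t. Assumes the fitted discriminant is non-negative, which can fail
    on very noisy data.
    """
    prev_cum, running = [], 0.0
    for n in counts:
        prev_cum.append(running)
        running += n
    # Rescale N to [0, 1] so the normal equations stay well conditioned.
    scale = max(prev_cum) or 1.0
    xs = [value / scale for value in prev_cum]
    powers = [sum(x ** k for x in xs) for k in range(5)]
    gram = [[powers[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(n * x ** i for n, x in zip(counts, xs)) for i in range(3)]
    a, b_scaled, c_scaled = solve_linear(gram, rhs)
    b, c = b_scaled / scale, c_scaled / scale ** 2
    # m is the positive root of c*m^2 + b*m + a = 0.
    m = (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * c)
    return a / m, -c * m, m
```

Calling `fit_bass` on a list of per-period mention counts returns estimates of p, q and m, where m plays the role of the forecast total mention count; `bass_cumulative` can then trace the fitted cumulative curve over time. On real, noisy Twitter data a nonlinear least-squares fit of the closed-form curve may behave better than this regression, which is shown here mainly because it is short and self-contained.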