Crowdsourcing step-by-step information extraction to enhance existing how-to videos


Bibliographic Details
Main Authors: Nguyen, Phu Tran, Weir, Sarah, Guo, Philip J., Miller, Robert C., Gajos, Krzysztof Z., Kim, Juho
Other Authors: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format: Article
Language: en_US
Published: Association for Computing Machinery (ACM) 2014
Online Access: http://hdl.handle.net/1721.1/90410
https://orcid.org/0000-0001-6348-4127
https://orcid.org/0000-0002-0442-691X
Description: Millions of learners today use how-to videos to master new skills in a variety of domains. But browsing such videos is often tedious and inefficient because video player interfaces are not optimized for the unique step-by-step structure of such videos. This research aims to improve the learning experience of existing how-to videos with step-by-step annotations. We first performed a formative study to verify that annotations are actually useful to learners. We created ToolScape, an interactive video player that displays step descriptions and intermediate result thumbnails in the video timeline. Learners in our study performed better and gained more self-efficacy using ToolScape versus a traditional video player. To add the needed step annotations to existing how-to videos at scale, we introduce a novel crowdsourcing workflow. It extracts step-by-step structure from an existing video, including step times, descriptions, and before and after images. We introduce the Find-Verify-Expand design pattern for temporal and visual annotation, which applies clustering, text processing, and visual analysis algorithms to merge crowd output. The workflow does not rely on domain-specific customization, works on top of existing videos, and recruits untrained crowd workers. We evaluated the workflow with Mechanical Turk, using 75 cooking, makeup, and Photoshop videos on YouTube. Results show that our workflow can extract steps with a quality comparable to that of trained annotators across all three domains with 77% precision and 81% recall.
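The merge step the description mentions — clustering noisy crowd-submitted step timestamps into agreed-on steps — can be sketched as follows. This is an illustration, not the paper's implementation: the `merge_step_times` helper, the 10-second merge window, and the minimum-vote threshold are all assumptions for the example.

```python
# Sketch: merge crowd-submitted step timestamps (seconds) by simple
# 1-D clustering, keeping only clusters supported by enough workers.

def merge_step_times(timestamps, window=10.0, min_votes=2):
    """Cluster sorted timestamps; return the mean time of each
    cluster that received at least min_votes submissions."""
    clusters = []
    for t in sorted(timestamps):
        if clusters and t - clusters[-1][-1] <= window:
            clusters[-1].append(t)   # close enough: join the last cluster
        else:
            clusters.append([t])     # too far apart: start a new candidate step
    return [sum(c) / len(c) for c in clusters if len(c) >= min_votes]

crowd = [12.1, 13.0, 14.2, 55.7, 56.1, 120.3]
print(merge_step_times(crowd))
# → two agreed-on step times near 13.1 s and 55.9 s; the lone 120.3 s vote is dropped
```

Real crowd pipelines would tune the window per domain and weight workers by reliability; this sketch only shows the shape of the clustering idea.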
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Type: Conference Paper (http://purl.org/eprint/type/ConferencePaper)
ISBN: 9781450324731
Citation: Juho Kim, Phu Tran Nguyen, Sarah Weir, Philip J. Guo, Robert C. Miller, and Krzysztof Z. Gajos. 2014. Crowdsourcing step-by-step information extraction to enhance existing how-to videos. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '14). ACM, New York, NY, USA, 4017-4026.
DOI: http://dx.doi.org/10.1145/2556288.2556986
Appears in: Proceedings of the 32nd Annual ACM Conference on Human Factors in Computing Systems (CHI '14)
License: Creative Commons Attribution-NonCommercial-ShareAlike (http://creativecommons.org/licenses/by-nc-sa/4.0/)
File Format: application/pdf
Date Issued: 2014-04
Date Accessioned: 2014-09-26
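The reported 77% precision and 81% recall compare extracted steps against trained-annotator ground truth. A standard way to compute such figures for time-stamped steps is to count an extracted step as correct when it falls within a tolerance window of a still-unmatched ground-truth step. The sketch below is an assumption about the general technique, not the paper's actual evaluation protocol; the greedy matcher and the 5-second tolerance are illustrative choices.

```python
# Sketch: precision/recall for extracted step times vs. ground truth,
# matching each extracted step to the nearest unmatched ground-truth
# step within a tolerance window (tol, in seconds).

def precision_recall(extracted, ground_truth, tol=5.0):
    matched = set()
    tp = 0
    for t in extracted:
        # greedily match to the nearest still-unmatched ground-truth step
        best = min((g for g in ground_truth if g not in matched),
                   key=lambda g: abs(g - t), default=None)
        if best is not None and abs(best - t) <= tol:
            matched.add(best)
            tp += 1
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return precision, recall

p, r = precision_recall([10.0, 31.0, 90.0, 200.0, 300.0],
                        [9.0, 30.0, 95.0, 150.0, 198.0])
print(f"precision={p:.2f} recall={r:.2f}")  # prints precision=0.80 recall=0.80
```

Here the spurious step at 300 s costs precision, while the missed ground-truth step at 150 s costs recall, mirroring how the two metrics trade off in the paper's evaluation.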