Multimodal generative models for storytelling

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2021

Bibliographic Details
Main Author:	Bensaid, Eden.
Other Authors:	Jacob Andreas and Hendrik Strobelt.
Format:	Thesis
Language:	eng
Published:	Massachusetts Institute of Technology 2021
Subjects:	Electrical Engineering and Computer Science.
Online Access:	https://hdl.handle.net/1721.1/130680

_version_	1826215913523773440
author	Bensaid, Eden.
author2	Jacob Andreas and Hendrik Strobelt.
author_facet	Jacob Andreas and Hendrik Strobelt. Bensaid, Eden.
author_sort	Bensaid, Eden.
collection	MIT
description	Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2021
first_indexed	2024-09-23T16:39:01Z
format	Thesis
id	mit-1721.1/130680
institution	Massachusetts Institute of Technology
language	eng
last_indexed	2024-09-23T16:39:01Z
publishDate	2021
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/130680 2021-05-25T03:02:00Z Multimodal generative models for storytelling Bensaid, Eden. Jacob Andreas and Hendrik Strobelt. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2021. Cataloged from the official PDF of thesis. Includes bibliographical references (pages 41-45).
Storytelling is an open-ended task that entails creative thinking and requires a constant flow of ideas. Generative models have recently gained momentum thanks to their ability to identify the inner structure of complex data and learn efficiently from unlabeled data [34]. Natural language generation (NLG) for storytelling is especially challenging because the generated text must follow an overall theme while remaining creative and diverse enough to engage the reader [26]. Competitive story generation models still suffer from repetition [19], are unable to consistently condition on a theme [51], and struggle to produce a grounded, evolving storyboard [43]. Published story visualization architectures that generate images require descriptive text depicting the scene to be illustrated [30]. Therefore, it seems promising to evaluate an interactive multimodal generative platform that collaborates with writers to face the complex story-generation task. With co-creation, writers contribute their creative thinking, while generative models contribute to their constant workflow. In this work, we introduce a system and a web-based demo, FairyTailor, for machine-in-the-loop visual story co-creation. Users can create a cohesive children's story by weaving generated texts and retrieved images with their input. FairyTailor adds another modality and modifies the text generation process to produce a coherent and creative sequence of text and images. To our knowledge, this is the first dynamic tool for multimodal story generation that allows interactive co-creation of both texts and images. It allows users to give feedback on co-created stories and share their results. We release the demo source code for other researchers' use.
by Eden Bensaid. M. Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science 2021-05-24T19:40:13Z 2021 Thesis https://hdl.handle.net/1721.1/130680 1251773235 eng MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582 45 pages application/pdf Massachusetts Institute of Technology
spellingShingle	Electrical Engineering and Computer Science. Bensaid, Eden. Multimodal generative models for storytelling
title	Multimodal generative models for storytelling
title_sort	multimodal generative models for storytelling
topic	Electrical Engineering and Computer Science.
url	https://hdl.handle.net/1721.1/130680
work_keys_str_mv	AT bensaideden multimodalgenerativemodelsforstorytelling

Similar Items