Natural Language Interfaces for Data Analytics

As more processes become data-driven, anyone should be able to gather insights into databases without needing to develop complex computer skills typically required for data analytics software. We propose to design new paradigms in which users rely on their own natural language to analyze and visuali...

Bibliographic Details
Main Author: Wellens, Quentin
Other Authors: Kraska, Tim
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/139449
author Wellens, Quentin
author2 Kraska, Tim
author_facet Kraska, Tim
Wellens, Quentin
author_sort Wellens, Quentin
collection MIT
description As more processes become data-driven, anyone should be able to gather insights into databases without needing to develop complex computer skills typically required for data analytics software. We propose to design new paradigms in which users rely on their own natural language to analyze and visualize data. To that end, we develop three different approaches (unsupervised, rule-based, and supervised) to infer formal specifications from natural language utterances. Contrary to most other work, we developed these approaches in a low-resource environment using synthetically generated training sets, rather than expensive and labor-intensive expert annotations or crowd-sourced examples. Finally, we conducted a study to compare our proposed paradigm to drag-and-drop mechanisms. Not only does our best-performing model, Alcurve, achieve an 86.3% test accuracy on real user input, it also enables users to be 30% more productive when solving analytical tasks, which further highlights the important improvements in usability language-based interfaces can provide.
first_indexed 2024-09-23T09:04:09Z
format Thesis
id mit-1721.1/139449
institution Massachusetts Institute of Technology
last_indexed 2024-09-23T09:04:09Z
publishDate 2022
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/139449 2022-01-15T04:01:58Z Natural Language Interfaces for Data Analytics Wellens, Quentin Kraska, Tim Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science As more processes become data-driven, anyone should be able to gather insights into databases without needing to develop complex computer skills typically required for data analytics software. We propose to design new paradigms in which users rely on their own natural language to analyze and visualize data. To that end, we develop three different approaches (unsupervised, rule-based, and supervised) to infer formal specifications from natural language utterances. Contrary to most other work, we developed these approaches in a low-resource environment using synthetically generated training sets, rather than expensive and labor-intensive expert annotations or crowd-sourced examples. Finally, we conducted a study to compare our proposed paradigm to drag-and-drop mechanisms. Not only does our best-performing model, Alcurve, achieve an 86.3% test accuracy on real user input, it also enables users to be 30% more productive when solving analytical tasks, which further highlights the important improvements in usability language-based interfaces can provide. M.Eng. 2022-01-14T15:12:08Z 2022-01-14T15:12:08Z 2021-06 2021-06-17T20:14:47.221Z Thesis https://hdl.handle.net/1721.1/139449 In Copyright - Educational Use Permitted Copyright MIT http://rightsstatements.org/page/InC-EDU/1.0/ application/pdf Massachusetts Institute of Technology
spellingShingle Wellens, Quentin
Natural Language Interfaces for Data Analytics
title Natural Language Interfaces for Data Analytics
url https://hdl.handle.net/1721.1/139449