API2CAN: a dataset & service for canonical utterance generation for REST APIs

Abstract Objectives Recently natural language interfaces (e.g., chatbots) have gained enormous attention. Such interfaces execute underlying application programming interfaces (APIs) based on the user's utterances to perform tasks (e.g., reporting weather). Supervised approaches for building su...


Bibliographic Details
Main Authors:	Mohammad-Ali Yaghoub-Zadeh-Fard, Boualem Benatallah
Format:	Article
Language:	English
Published:	BMC 2021-09-01
Series:	BMC Research Notes
Subjects:	Chatbots Bot development Natural language interfaces
Online Access:	https://doi.org/10.1186/s13104-021-05593-w

author	Mohammad-Ali Yaghoub-Zadeh-Fard Boualem Benatallah
author_facet	Mohammad-Ali Yaghoub-Zadeh-Fard Boualem Benatallah
author_sort	Mohammad-Ali Yaghoub-Zadeh-Fard
collection	DOAJ
description	Abstract Objectives Recently, natural language interfaces (e.g., chatbots) have gained enormous attention. Such interfaces execute underlying application programming interfaces (APIs) based on the user's utterances to perform tasks (e.g., reporting weather). Supervised approaches for building such interfaces rely upon a large set of user utterances paired with APIs. Collecting such pairs typically starts with obtaining initial utterances for a given API method. Generating initial utterances can be considered a machine translation task in which an API method is translated into an utterance. However, the key challenge is the lack of training samples for training domain-independent translation models. In this paper, we propose a dataset for training supervised models to generate initial utterances for APIs. Data description The dataset contains 14,370 pairs of API methods and utterances. It is built automatically by converting method descriptions of a large number of APIs to user utterances, and it is cleaned manually to ensure quality. The dataset is also accompanied by a set of microservices (e.g., translating API methods to utterances) which can facilitate the process of collecting training samples for building natural language interfaces.
first_indexed	2024-12-14T18:41:50Z
format	Article
id	doaj.art-18519cab09a34f8690eb5649436f0357
institution	Directory Open Access Journal
issn	1756-0500
language	English
last_indexed	2024-12-14T18:41:50Z
publishDate	2021-09-01
publisher	BMC
record_format	Article
series	BMC Research Notes
spelling	doaj.art-18519cab09a34f8690eb5649436f0357 2022-12-21T22:51:28Z eng BMC BMC Research Notes 1756-0500 2021-09-01 14113 10.1186/s13104-021-05593-w API2CAN: a dataset & service for canonical utterance generation for REST APIs Mohammad-Ali Yaghoub-Zadeh-Fard (UNSW Sydney) Boualem Benatallah (UNSW Sydney)
title	API2CAN: a dataset & service for canonical utterance generation for REST APIs
topic	Chatbots Bot development Natural language interfaces
url	https://doi.org/10.1186/s13104-021-05593-w


Similar Items