Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers

Chatbots that can answer user questions hold great promise for helping humans work more productively. Question-answering (QA) chatbots can be implemented using machine learning (ML) or rules. ML chatbots are preferable to rule-based chatbots because they can be extended through continuous tr...

Full description

Bibliographic Details
Main Author: Palasundram, Kulothunkan
Format: Thesis
Language:English
Published: 2021
Subjects:
Online Access:http://psasir.upm.edu.my/id/eprint/104064/1/FSKTM%202022%2011%20IR.pdf
_version_ 1825938965382823936
author Palasundram, Kulothunkan
author_facet Palasundram, Kulothunkan
author_sort Palasundram, Kulothunkan
collection UPM
description Chatbots that can answer user questions hold great promise for helping humans work more productively. Question-answering (QA) chatbots can be implemented using machine learning (ML) or rules. ML chatbots are preferable to rule-based chatbots because they can be extended through continuous training. Since its inception in 2014 for the machine-learning-based translation problem domain, the sequence-to-sequence (Seq2Seq) training approach has shown remarkable progress in developing chatbots. Nevertheless, Seq2Seq chatbots have a weakness: they tend to produce irrelevant, non-meaningful responses, which may reduce chatbot acceptance. The flaw is caused by three factors: “Language Model Influence”, “Question Encoding Overfitting”, and “Answer Generation Overfitting”. In addition, many chatbots are developed using the single-task learning (“STL”) method, which performs only the response generation task. Recent works utilize multi-task learning (MTL) to overcome the weakness, but they still produce generic answers that are not consistent with the questions. Therefore, this research presents “SEQ2SEQ++”, a Seq2Seq MTL method which comprises four components (“Multi-Functional Encoder” (MFE), “Answer Decoder”, “Answer Encoder”, and “Ternary-Classifier” (TC)) and is trained using a “Dynamic Weights” algorithm and a “Comprehensive Attention Mechanism” (CAM). All of these methods and mechanisms are novel approaches proposed in this work. Experiments were conducted on two publicly available academic datasets (SQuAD and NarrativeQA) to measure the performance of the proposed method against two existing MTL methods, “MTL-BC” and “MTL-LTS”. “MTL-BC” performs response generation and binary question-response classification in parallel, while “MTL-LTS” performs first-word generation followed by response generation sequentially.
Experiment outcomes show that “SEQ2SEQ++” outperforms the benchmark works on all assessment metrics used in this study. On the “BLEU” metric, “SEQ2SEQ++” outperformed “MTL-BC” by 44.42% on NarrativeQA and by 17.31% on SQuAD. On “WER”, it improved on “MTL-LTS” by 58.83% on NarrativeQA and on “MTL-BC” by 37.26% on SQuAD. On “Distinct-2”, it outperformed “MTL-BC” by 0.73% on NarrativeQA and “MTL-LTS” by 0.21% on SQuAD.
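Of the evaluation metrics named in the abstract, “Distinct-2” is the simplest to make concrete: it is commonly defined as the ratio of unique bigrams to total bigrams across all generated responses, so that repetitive, generic answers score low. The thesis's own implementation is not shown here; the sketch below illustrates only that standard definition, and the function name and example sentences are illustrative assumptions.

```python
from collections import Counter

def distinct_n(responses, n=2):
    """Ratio of unique n-grams to total n-grams across generated responses.

    Higher values indicate more diverse (less generic) answers; n=2 gives
    the Distinct-2 metric referenced in the abstract.
    """
    ngrams = Counter()
    for response in responses:
        tokens = response.split()
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0

# A generic answer repeated twice lowers Distinct-2:
generic = ["i do not know", "i do not know"]
diverse = ["the castle stood on a hill", "rain fell through the night"]
print(distinct_n(generic))   # 3 unique bigrams / 6 total = 0.5
print(distinct_n(diverse))   # 9 unique bigrams / 9 total = 1.0
```

BLEU and WER comparisons between systems follow the same corpus-level pattern, but require reference answers and are usually computed with established libraries rather than from scratch.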
first_indexed 2024-03-06T11:20:08Z
format Thesis
id upm.eprints-104064
institution Universiti Putra Malaysia
language English
last_indexed 2024-03-06T11:20:08Z
publishDate 2021
record_format dspace
spelling upm.eprints-1040642023-07-07T02:28:30Z http://psasir.upm.edu.my/id/eprint/104064/ Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers Palasundram, Kulothunkan 2021-12 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/104064/1/FSKTM%202022%2011%20IR.pdf Palasundram, Kulothunkan (2021) Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers. Doctoral thesis, Universiti Putra Malaysia. Chatbots Computing platforms
spellingShingle Chatbots
Computing platforms
Palasundram, Kulothunkan
Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_full Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_fullStr Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_full_unstemmed Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_short Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_sort auxiliary based extension of multi tasking sequence to sequence model for chatbot answers
topic Chatbots
Computing platforms
url http://psasir.upm.edu.my/id/eprint/104064/1/FSKTM%202022%2011%20IR.pdf
work_keys_str_mv AT palasundramkulothunkan auxiliarybasedextensionofmultitaskingsequencetosequencemodelforchatbotanswers