Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers

Chatbots that can answer user questions hold great promise for helping humans work more productively. Question-answering (QA) chatbots can be implemented using machine learning (ML) or rules. ML chatbots are preferable to rule-based chatbots because they can be extended through continuous tr...

Full description

Bibliographic Details
Main Author: Palasundram, Kulothunkan
Format: Thesis
Language:English
Published: 2021
Subjects:
Online Access:http://psasir.upm.edu.my/id/eprint/104064/1/FSKTM%202022%2011%20IR.pdf
_version_ 1825938965382823936
author Palasundram, Kulothunkan
author_facet Palasundram, Kulothunkan
author_sort Palasundram, Kulothunkan
collection UPM
description Chatbots that can answer user questions hold great promise for helping humans work more productively. Question-answering (QA) chatbots can be implemented using machine learning (ML) or rules. ML chatbots are preferable to rule-based chatbots because they can be extended through continuous training. Since its inception in 2014 for the machine-learning-based translation problem domain, the sequence-to-sequence (Seq2Seq) training approach has shown remarkable progress in developing chatbots. Nevertheless, Seq2Seq chatbots have a weakness: they tend to produce irrelevant, non-meaningful responses, which may reduce chatbot acceptance. The flaw is caused by three factors: “Language Model Influence”, “Question Encoding Overfitting”, and “Answer Generation Overfitting”. In addition, many chatbots are developed using the single-task learning (“STL”) method, which performs only the response generation task. Recent works utilize multi-task learning (MTL) to overcome the weakness, but they still produce generic answers that are not consistent with the questions. Therefore, this research presents “SEQ2SEQ++”, a Seq2Seq MTL method which comprises four components (“Multi-Functional Encoder” (MFE), “Answer Decoder”, “Answer Encoder”, and “Ternary-Classifier” (TC)) and is trained using a “Dynamic Weights” algorithm and a “Comprehensive Attention Mechanism” (CAM). All of these methods and mechanisms are novel approaches proposed in this work. Experiments were conducted on two publicly available academic datasets (SQuAD and NarrativeQA) to measure the performance of the proposed method against two existing MTL methods, “MTL-BC” and “MTL-LTS”. “MTL-BC” performs response generation and binary question-response classification in parallel, while “MTL-LTS” performs first-word generation followed by response generation sequentially.
Experiment outcomes show that “SEQ2SEQ++” outperforms the benchmark works on all assessment metrics used in this study. On the “BLEU” metric, “SEQ2SEQ++” outperformed “MTL-BC” by 44.42% on NarrativeQA and by 17.31% on SQuAD. On “WER”, it improved on “MTL-LTS” by 58.83% on NarrativeQA and on “MTL-BC” by 37.26% on SQuAD. On “Distinct-2”, it outperformed “MTL-BC” by 0.73% on NarrativeQA and “MTL-LTS” by 0.21% on SQuAD.
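Of the evaluation metrics named in the abstract, “Distinct-2” is the simplest to make concrete: it is commonly defined as the ratio of unique bigrams to total bigrams across all generated responses, so that repetitive, generic answers score low. The thesis's own implementation is not shown here; the sketch below illustrates only that standard definition, and the function name and example sentences are illustrative assumptions.

```python
from collections import Counter

def distinct_n(responses, n=2):
    """Ratio of unique n-grams to total n-grams across generated responses.

    Higher values indicate more diverse (less generic) answers; n=2 gives
    the Distinct-2 metric referenced in the abstract.
    """
    ngrams = Counter()
    for response in responses:
        tokens = response.split()
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0

# A generic answer repeated twice lowers Distinct-2:
generic = ["i do not know", "i do not know"]
diverse = ["the castle stood on a hill", "rain fell through the night"]
print(distinct_n(generic))   # 3 unique bigrams / 6 total = 0.5
print(distinct_n(diverse))   # 9 unique bigrams / 9 total = 1.0
```

BLEU and WER comparisons between systems follow the same corpus-level pattern, but require reference answers and are usually computed with established libraries rather than from scratch.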
first_indexed 2024-03-06T11:20:08Z
format Thesis
id upm.eprints-104064
institution Universiti Putra Malaysia
language English
last_indexed 2024-03-06T11:20:08Z
publishDate 2021
record_format dspace
spelling upm.eprints-1040642023-07-07T02:28:30Z http://psasir.upm.edu.my/id/eprint/104064/ Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers Palasundram, Kulothunkan 2021-12 Thesis NonPeerReviewed text en http://psasir.upm.edu.my/id/eprint/104064/1/FSKTM%202022%2011%20IR.pdf Palasundram, Kulothunkan (2021) Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers. Doctoral thesis, Universiti Putra Malaysia. Chatbots Computing platforms
spellingShingle Chatbots
Computing platforms
Palasundram, Kulothunkan
Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_full Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_fullStr Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_full_unstemmed Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_short Auxiliary-based extension of multi-tasking sequence-to-sequence model for chatbot answers
title_sort auxiliary based extension of multi tasking sequence to sequence model for chatbot answers
topic Chatbots
Computing platforms
url http://psasir.upm.edu.my/id/eprint/104064/1/FSKTM%202022%2011%20IR.pdf
work_keys_str_mv AT palasundramkulothunkan auxiliarybasedextensionofmultitaskingsequencetosequencemodelforchatbotanswers