HSM-QA: Question Answering System Based on Hierarchical Semantic Matching

Bibliographic Details
Main Authors: Jinlu Zhang, Jing He, Yiyi Zhou, Xiaoshuai Sun, Xiao Yu
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10187143/
Description
Summary: In recent years, Question Answering (QA) systems have gained popularity as a means of acquiring knowledge. However, the prevalent approach of matching question-answer pairs still suffers from low precision and efficiency due to the inherent ambiguity of natural language descriptions. To address these issues, we propose a novel QA approach based on hierarchical semantic matching, termed HSM-QA. Specifically, HSM-QA is decomposed into two main steps: query-question matching and query-answer matching. For query-question matching, a Siamese network calculates the similarity of each query-question pair, and the most similar questions and their corresponding answers are recalled as candidates. For query-answer matching, we adopt the idea of the pairwise algorithm and propose a single-stream structure to calculate the relevance between the query and each answer, based on which the best-matching candidates are ranked and returned. After training, these two steps are combined into an efficient QA scheme for different languages, e.g., English and Chinese. Furthermore, to address the lack of Chinese QA datasets, we collect a massive amount of text data from Chinese social media and generate a new dataset via a pre-trained language model. Extensive experiments are conducted on six QA datasets to validate HSM-QA. The experimental results demonstrate that our method achieves superior performance and efficiency compared with a set of compared methods.
ISSN: 2169-3536
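
Illustration: the two-step flow described in the summary (recall candidates by query-question similarity, then rank them by query-answer relevance) can be sketched roughly as follows. This is a minimal toy sketch and not the authors' implementation. The FAQ entries, the function names, the bag-of-words cosine similarity (standing in for the Siamese network), and the token-overlap score (standing in for the single-stream relevance model) are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical FAQ store of (question, answer) pairs, invented for this sketch.
FAQ = [
    ("how do i reset my password", "Open settings and choose reset password."),
    ("how do i change my email address", "Edit the email field on your profile page."),
    ("what payment methods are accepted", "We accept credit cards and bank transfer."),
]


def embed(text):
    # Assumption: a bag-of-words count vector stands in for a learned encoder.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_candidates(query, faq, k=2):
    # Step 1 (query-question matching): encode both sides with the same
    # function, as a Siamese setup would, and keep the k most similar questions.
    q_vec = embed(query)
    ranked = sorted(faq, key=lambda qa: cosine(q_vec, embed(qa[0])), reverse=True)
    return ranked[:k]


def relevance(query, answer):
    # Step 2 stand-in (query-answer matching): score the query and answer
    # jointly. Here: the fraction of query tokens that also occur in the answer.
    q_tokens = set(query.lower().split())
    a_tokens = set(answer.lower().replace(".", "").split())
    return len(q_tokens & a_tokens) / len(q_tokens) if q_tokens else 0.0


def answer_query(query, faq=FAQ, k=2):
    # Combine both steps: recall candidates, then rank their answers.
    candidates = recall_candidates(query, faq, k)
    ranked = sorted(candidates, key=lambda qa: relevance(query, qa[1]), reverse=True)
    return [answer for _, answer in ranked]


if __name__ == "__main__":
    for answer in answer_query("how do i reset my password"):
        print(answer)
```

The split reflects the design idea in the summary: the first step only has to compare the query against stored questions, so it can cheaply narrow the pool, while the second step spends its effort on scoring the few recalled answers against the query.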