DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering
The outbreak of COVID-19 has prompted an increased focus on self-care, and more and more people hope to obtain disease knowledge from the Internet. In response to this demand, medical question answering and question generation have become important tasks in natural language processing (NLP). However, samples of medical questions and answers are limited, and existing question generation systems cannot fully meet non-professionals' needs for medical questions. In this research, we propose a medically pretrained BERT model that uses GPT-2 for question augmentation and T5-Small for topic extraction, calculates the cosine similarity of the extracted topics, and uses XGBoost for prediction. With GPT-2 augmentation, the prediction accuracy of our model exceeds that of the state-of-the-art (SOTA) models. Our experimental results demonstrate the outstanding performance of our model on medical question answering and question generation tasks and its great potential for solving other biomedical question answering challenges.
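To make the pipeline described in the abstract more concrete, the sketch below illustrates the scoring step: candidate answers are ranked by the cosine similarity of topic embeddings, and an XGBoost classifier makes the final prediction. This is a minimal illustration under assumed inputs (random vectors standing in for BERT/T5-Small embeddings and an assumed three-feature layout), not the authors' released implementation.

```python
# Minimal sketch (assumed, not the authors' code): rank candidate answers by the
# cosine similarity of topic embeddings, then let an XGBoost classifier predict
# which candidate is correct. Random vectors stand in for BERT/T5-Small embeddings.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from xgboost import XGBClassifier

def topic_similarity(question_vec: np.ndarray, answer_vecs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one question-topic embedding and each candidate."""
    return cosine_similarity(question_vec.reshape(1, -1), answer_vecs).ravel()

rng = np.random.default_rng(0)
q_vec = rng.normal(size=768)                      # stand-in for a T5-Small/BERT topic embedding
a_vecs = rng.normal(size=(5, 768))                # five candidate answer embeddings

sims = topic_similarity(q_vec, a_vecs)            # shape (5,)

# Train XGBoost on per-candidate features; the feature layout here
# (similarity plus two placeholder features) is an illustrative assumption.
X_train = rng.normal(size=(200, 3))
y_train = rng.integers(0, 2, size=200)            # 1 = correct answer, 0 = distractor
clf = XGBClassifier(n_estimators=50, max_depth=3, eval_metric="logloss")
clf.fit(X_train, y_train)

X_test = np.column_stack([sims, rng.normal(size=5), rng.normal(size=5)])
print(clf.predict_proba(X_test)[:, 1])            # probability each candidate is the answer
```

In the paper's setting, the feature vectors would come from the pretrained BERT and T5-Small models rather than random numbers; the cosine-similarity feature is the one the abstract explicitly names.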
Main Authors: | Shuohua Zhou, Yanping Zhang |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-11-01 |
Series: | Applied Sciences |
Subjects: | BERT; GPT-2; XGBoost; T5-Small; medical question answering; transfer learning |
Online Access: | https://www.mdpi.com/2076-3417/11/23/11251 |
_version_ | 1797508096674758656 |
author | Shuohua Zhou; Yanping Zhang |
author_facet | Shuohua Zhou; Yanping Zhang |
author_sort | Shuohua Zhou |
collection | DOAJ |
description | The outbreak of COVID-19 has prompted an increased focus on self-care, and more and more people hope to obtain disease knowledge from the Internet. In response to this demand, medical question answering and question generation have become important tasks in natural language processing (NLP). However, samples of medical questions and answers are limited, and existing question generation systems cannot fully meet non-professionals' needs for medical questions. In this research, we propose a medically pretrained BERT model that uses GPT-2 for question augmentation and T5-Small for topic extraction, calculates the cosine similarity of the extracted topics, and uses XGBoost for prediction. With GPT-2 augmentation, the prediction accuracy of our model exceeds that of the state-of-the-art (SOTA) models. Our experimental results demonstrate the outstanding performance of our model on medical question answering and question generation tasks and its great potential for solving other biomedical question answering challenges. |
first_indexed | 2024-03-10T04:57:29Z |
format | Article |
id | doaj.art-a6aafa3cc21845c6a0ff50ddffdacb8c |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T04:57:29Z |
publishDate | 2021-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-a6aafa3cc21845c6a0ff50ddffdacb8c 2023-11-23T02:05:02Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2021-11-01; 11(23):11251; DOI 10.3390/app112311251; DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering; Shuohua Zhou (Department of Informatics, King’s College London, Strand, London WC2R 2LS, UK); Yanping Zhang (Department of Computer Science, School of Engineering and Applied Science, Gonzaga University, Spokane, WA 99258, USA); [abstract as in the description field]; https://www.mdpi.com/2076-3417/11/23/11251; BERT; GPT-2; XGBoost; T5-Small; medical question answering; transfer learning |
spellingShingle | Shuohua Zhou; Yanping Zhang; DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering; Applied Sciences; BERT; GPT-2; XGBoost; T5-Small; medical question answering; transfer learning |
title | DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering |
title_full | DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering |
title_fullStr | DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering |
title_full_unstemmed | DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering |
title_short | DATLMedQA: A Data Augmentation and Transfer Learning Based Solution for Medical Question Answering |
title_sort | datlmedqa a data augmentation and transfer learning based solution for medical question answering |
topic | BERT; GPT-2; XGBoost; T5-Small; medical question answering; transfer learning |
url | https://www.mdpi.com/2076-3417/11/23/11251 |
work_keys_str_mv | AT shuohuazhou datlmedqaadataaugmentationandtransferlearningbasedsolutionformedicalquestionanswering AT yanpingzhang datlmedqaadataaugmentationandtransferlearningbasedsolutionformedicalquestionanswering |