An adaptive term proximity based rocchio’s model for clinical decision support retrieval
Abstract Background To better help doctors make decisions in the clinical setting, research is needed to connect electronic health records (EHR) with the biomedical literature. Pseudo Relevance Feedback (PRF) is a classical query modification technique that has been shown to be effectiv...
Main Authors: | Min Pan, Yue Zhang, Qiang Zhu, Bo Sun, Tingting He, Xingpeng Jiang |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2019-12-01 |
Series: | BMC Medical Informatics and Decision Making |
Subjects: | Clinical retrieval; Term proximity; Query expansion; Pseudo relevance feedback |
Online Access: | https://doi.org/10.1186/s12911-019-0986-6 |
_version_ | 1818329806212694016 |
---|---|
author | Min Pan, Yue Zhang, Qiang Zhu, Bo Sun, Tingting He, Xingpeng Jiang |
author_sort | Min Pan |
collection | DOAJ |
description | Abstract Background To better help doctors make decisions in the clinical setting, research is needed to connect electronic health records (EHR) with the biomedical literature. Pseudo Relevance Feedback (PRF) is a classical query modification technique that has been shown to be effective in many retrieval models and is thus suitable for handling the terse language and clinical jargon in EHR. Previous work has introduced a set of constraints (axioms) for the traditional PRF model. However, most methods do not consider both the importance of a candidate term in the feedback document and the co-occurrence relationship between a candidate term and a query term. Intuitively, terms that have a higher degree of co-occurrence with a query term are more likely to be related to the query topic. Methods In this paper, we incorporate the original HAL model into Rocchio’s model and propose a new concept of term proximity feedback weight. A HAL-based Rocchio’s model for query expansion, called HRoc, is proposed. Meanwhile, we design three normalization methods to better incorporate proximity information into query expansion. Finally, we introduce an adaptive parameter to replace the sliding window length of the HAL model, so that the window size can be selected according to document length. Results On the 2016 TREC Clinical Support medicine dataset, experimental results demonstrate that the proposed HRoc and HRoc_AP models are superior to other advanced models, such as PRoc2 and TF-PRF, on various evaluation metrics. Compared with the PRoc2 and TF-PRF models, the MAP of our model increases by 8.5% and 12.24% respectively, and the F1 score increases by 7.86% and 9.88% respectively. Conclusions The proposed HRoc model can effectively enhance the precision and recall of information retrieval and produces more precise results than other models. Furthermore, after introducing the self-adaptive parameter, the advanced HRoc_AP model uses fewer hyper-parameters than other models while achieving equivalent performance, which greatly improves the efficiency and applicability of the model and thus helps clinicians retrieve clinical support documents effectively. |
first_indexed | 2024-12-13T12:53:54Z |
format | Article |
id | doaj.art-2b12f3a4ece34d639476059ccf0f1ebd |
institution | Directory Open Access Journal |
issn | 1472-6947 |
language | English |
last_indexed | 2024-12-13T12:53:54Z |
publishDate | 2019-12-01 |
publisher | BMC |
record_format | Article |
series | BMC Medical Informatics and Decision Making |
spelling | doaj.art-2b12f3a4ece34d639476059ccf0f1ebd; 2022-12-21T23:45:14Z; eng; BMC; BMC Medical Informatics and Decision Making; 1472-6947; 2019-12-01; 19; S9; 1; 11; 10.1186/s12911-019-0986-6; An adaptive term proximity based rocchio’s model for clinical decision support retrieval; Min Pan (0), Yue Zhang (1), Qiang Zhu (2), Bo Sun (3), Tingting He (4), Xingpeng Jiang (5); (0) National Engineering Research Center for E-Learning, Central China Normal University; (1) School of Computer, Central China Normal University; (2) School of Computer, Central China Normal University; (3) National Engineering Research Center for E-Learning, Central China Normal University; (4) School of Computer, Central China Normal University; (5) School of Computer, Central China Normal University; https://doi.org/10.1186/s12911-019-0986-6; Clinical retrieval; Term proximity; Query expansion; Pseudo relevance feedback |
title | An adaptive term proximity based rocchio’s model for clinical decision support retrieval |
topic | Clinical retrieval; Term proximity; Query expansion; Pseudo relevance feedback |
url | https://doi.org/10.1186/s12911-019-0986-6 |
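
The abstract describes weighting candidate expansion terms by their proximity to query terms (HAL), folding that weight into Rocchio-style feedback, and choosing the window size from document length. This record does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' HRoc or HRoc_AP. The function names, the max-normalisation, the `ratio` rule for the window, and the `alpha`/`beta` defaults are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def hal_proximity(tokens, query_terms, window):
    """HAL-style weight: a candidate term within `window` positions of a
    query term scores (window - distance + 1), so closer pairs score higher."""
    scores = defaultdict(float)
    for i, tok in enumerate(tokens):
        if tok not in query_terms:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] not in query_terms:
                scores[tokens[j]] += window - abs(i - j) + 1
    return scores


def adaptive_window(doc_len, ratio=0.1, min_window=2):
    """Hypothetical rule (assumption): the window grows with document length."""
    return max(min_window, int(doc_len * ratio))


def expand_query(query, feedback_docs, alpha=1.0, beta=0.75, top_k=3):
    """Rocchio-style update: alpha * original query + beta * feedback centroid,
    where each feedback weight is term frequency scaled by proximity."""
    query_terms = set(query)
    new_weights = Counter({t: alpha * c for t, c in Counter(query).items()})
    feedback = defaultdict(float)
    for doc in feedback_docs:
        window = adaptive_window(len(doc))
        prox = hal_proximity(doc, query_terms, window)
        tf = Counter(doc)
        norm = max(prox.values(), default=0.0)
        for term, p in prox.items():
            # Relative term frequency scaled by max-normalised proximity (assumed scheme).
            feedback[term] += (tf[term] / len(doc)) * (p / norm)
    # Keep the top_k candidates; ties are broken alphabetically.
    best = sorted(feedback.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    for term, weight in best:
        new_weights[term] += beta * weight / len(feedback_docs)
    return dict(new_weights)


query = ["chest", "pain"]
docs = [
    ["acute", "chest", "pain", "suggests", "angina"],
    ["chest", "pain", "with", "dyspnea", "and", "angina"],
]
expanded = expand_query(query, docs)
```

In this toy run the terms nearest to the query words are added to the query. A real system would also remove stop words and combine the proximity weight with a collection-level statistic such as IDF.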