Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines

Bibliographic Details
Main Authors: Sameh Frihat, Catharina Lena Beckmann, Eva Maria Hartmann, Norbert Fuhr
Format: Article
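Language: English
Published: MDPI AG 2023-09-01
Series: Applied Sciences
Subjects: personalized information retrieval; medical practitioners; readability assessment; medical literature aspects
Online Access: https://www.mdpi.com/2076-3417/13/19/10612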
_version_ 1827722868509638656
author Sameh Frihat
Catharina Lena Beckmann
Eva Maria Hartmann
Norbert Fuhr
author_facet Sameh Frihat
Catharina Lena Beckmann
Eva Maria Hartmann
Norbert Fuhr
author_sort Sameh Frihat
collection DOAJ
description Timely and relevant information enables clinicians to make informed decisions about patient care outcomes. However, discovering related and understandable information from the vast medical literature is challenging. To address this problem, we aim to enable the development of search engines that meet the needs of medical practitioners by incorporating text difficulty features. We collected a dataset of 209 scientific research abstracts from different medical fields, available in both English and German. To determine the difficulty aspects of readability and technical level of each abstract, 216 medical experts annotated the dataset. We used a pre-trained BERT model, fine-tuned to our dataset, to develop a regression model predicting those difficulty features of abstracts. To highlight the strength of this approach, the model was compared to readability formulas currently in use. Analysis of the dataset revealed that German abstracts are more technically complex and less readable than their English counterparts. Our baseline model showed greater efficacy than current readability formulas in predicting domain-specific readability aspects. Conclusion: Incorporating these text difficulty aspects into the search engine will provide healthcare professionals with reliable and efficient information retrieval tools. Additionally, the dataset can serve as a starting point for future research.
first_indexed 2024-03-10T21:49:21Z
format Article
id doaj.art-2f568d62a50940349fe556f5f0a8021f
institution Directory of Open Access Journals
issn 2076-3417
language English
last_indexed 2024-03-10T21:49:21Z
publishDate 2023-09-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-2f568d62a50940349fe556f5f0a8021f
2023-11-19T14:01:47Z
eng
MDPI AG
Applied Sciences
2076-3417
2023-09-01
13
19
10612
10.3390/app131910612
Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines
Sameh Frihat
Catharina Lena Beckmann
Eva Maria Hartmann
Norbert Fuhr
Department of Information Engineering, University of Duisburg-Essen, 47057 Duisburg, Germany
Department of Computer Science, University of Applied Sciences and Arts Dortmund, 44227 Dortmund, Germany
Department of Computer Science, University of Applied Sciences and Arts Dortmund, 44227 Dortmund, Germany
Department of Information Engineering, University of Duisburg-Essen, 47057 Duisburg, Germany
Timely and relevant information enables clinicians to make informed decisions about patient care outcomes. However, discovering related and understandable information from the vast medical literature is challenging. To address this problem, we aim to enable the development of search engines that meet the needs of medical practitioners by incorporating text difficulty features. We collected a dataset of 209 scientific research abstracts from different medical fields, available in both English and German. To determine the difficulty aspects of readability and technical level of each abstract, 216 medical experts annotated the dataset. We used a pre-trained BERT model, fine-tuned to our dataset, to develop a regression model predicting those difficulty features of abstracts. To highlight the strength of this approach, the model was compared to readability formulas currently in use. Analysis of the dataset revealed that German abstracts are more technically complex and less readable than their English counterparts. Our baseline model showed greater efficacy than current readability formulas in predicting domain-specific readability aspects. Conclusion: Incorporating these text difficulty aspects into the search engine will provide healthcare professionals with reliable and efficient information retrieval tools. Additionally, the dataset can serve as a starting point for future research.
https://www.mdpi.com/2076-3417/13/19/10612
personalized information retrieval
medical practitioners
readability assessment
medical literature aspects
spellingShingle Sameh Frihat
Catharina Lena Beckmann
Eva Maria Hartmann
Norbert Fuhr
Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines
Applied Sciences
personalized information retrieval
medical practitioners
readability assessment
medical literature aspects
title Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines
title_full Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines
title_fullStr Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines
title_full_unstemmed Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines
title_short Document Difficulty Aspects for Medical Practitioners: Enhancing Information Retrieval in Personalized Search Engines
title_sort document difficulty aspects for medical practitioners enhancing information retrieval in personalized search engines
topic personalized information retrieval
medical practitioners
readability assessment
medical literature aspects
url https://www.mdpi.com/2076-3417/13/19/10612
work_keys_str_mv AT samehfrihat documentdifficultyaspectsformedicalpractitionersenhancinginformationretrievalinpersonalizedsearchengines
AT catharinalenabeckmann documentdifficultyaspectsformedicalpractitionersenhancinginformationretrievalinpersonalizedsearchengines
AT evamariahartmann documentdifficultyaspectsformedicalpractitionersenhancinginformationretrievalinpersonalizedsearchengines
AT norbertfuhr documentdifficultyaspectsformedicalpractitionersenhancinginformationretrievalinpersonalizedsearchengines