Unsupervised title and abstract screening for systematic review: a retrospective case-study using topic modelling methodology

Abstract. Background: The importance of systematic reviews in collating and summarising available research output on a particular topic cannot be over-emphasized. However, initial screening of retrieved literature is highly time- and labour-intensive. Attempts at automating parts of the systemat...

Bibliographic Details
Main Authors: Agnes Natukunda, Leacky K. Muchene
Format: Article
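Language: English
Published: BMC 2023-01-01
Series: Systematic Reviews
Subjects: Automated systematic review; Abstract screening; Latent Dirichlet Allocation; Topic modelling; Unsupervised learning
Online Access: https://doi.org/10.1186/s13643-022-02163-4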
_version_ 1797958788562550784
author Agnes Natukunda
Leacky K. Muchene
author_facet Agnes Natukunda
Leacky K. Muchene
author_sort Agnes Natukunda
collection DOAJ
description Abstract. Background: The importance of systematic reviews in collating and summarising available research output on a particular topic cannot be over-emphasized. However, initial screening of retrieved literature is highly time- and labour-intensive. Attempts at automating parts of the systematic review process have been made with varying degrees of success, partly because they are domain-specific or require vendor-specific software or manually labelled training data. Our primary objective was to develop statistical methodology for performing automated title and abstract screening for systematic reviews. Secondary objectives were (1) to retrospectively apply the automated screening methodology to previously manually screened systematic reviews and (2) to characterize the performance of the methodology's scoring algorithm in a simulation study. Methods: We implemented a Latent Dirichlet Allocation-based topic model to derive representative topics from the titles and abstracts of the retrieved documents. The second step involves defining a score threshold for classifying the documents as relevant for full-text review or not. The score is derived from a set of search keywords (often the database retrieval search terms). Two systematic review studies were used retrospectively to illustrate the methodology. Results: In one case study (helminth dataset), 69.83% sensitivity compared to manual title and abstract screening was achieved, against a false positive rate of 22.63%. For the second case study (Wilson disease dataset), a sensitivity of 54.02% and a specificity of 67.03% were achieved. Conclusions: Unsupervised title and abstract screening has the potential to reduce the workload involved in conducting systematic reviews. While the sensitivity of the methodology on the tested data is low, approximately 70% specificity was achieved. Users ought to keep in mind that low sensitivity might occur. One approach to mitigate this might be to incorporate additional targeted search keywords, such as the indexing databases' terms, into the search term corpora. Moreover, automated screening can be used as an additional screener alongside the manual screeners.
first_indexed 2024-04-11T00:23:44Z
format Article
id doaj.art-1c7ad7cb809f4510a960ffef62adce2a
institution Directory Open Access Journal
issn 2046-4053
language English
last_indexed 2024-04-11T00:23:44Z
publishDate 2023-01-01
publisher BMC
record_format Article
series Systematic Reviews
spelling doaj.art-1c7ad7cb809f4510a960ffef62adce2a
2023-01-08T12:05:56Z
eng
BMC
Systematic Reviews
2046-4053
2023-01-01
121116
10.1186/s13643-022-02163-4
Unsupervised title and abstract screening for systematic review: a retrospective case-study using topic modelling methodology
Agnes Natukunda (0)
Leacky K. Muchene (1)
Immunomodulation and Vaccines Programme, MRC/UVRI and LSHTM Uganda Research Unit
StatsDecide Analytics and Consulting Limited
Abstract. Background: The importance of systematic reviews in collating and summarising available research output on a particular topic cannot be over-emphasized. However, initial screening of retrieved literature is highly time- and labour-intensive. Attempts at automating parts of the systematic review process have been made with varying degrees of success, partly because they are domain-specific or require vendor-specific software or manually labelled training data. Our primary objective was to develop statistical methodology for performing automated title and abstract screening for systematic reviews. Secondary objectives were (1) to retrospectively apply the automated screening methodology to previously manually screened systematic reviews and (2) to characterize the performance of the methodology's scoring algorithm in a simulation study. Methods: We implemented a Latent Dirichlet Allocation-based topic model to derive representative topics from the titles and abstracts of the retrieved documents. The second step involves defining a score threshold for classifying the documents as relevant for full-text review or not. The score is derived from a set of search keywords (often the database retrieval search terms). Two systematic review studies were used retrospectively to illustrate the methodology. Results: In one case study (helminth dataset), 69.83% sensitivity compared to manual title and abstract screening was achieved, against a false positive rate of 22.63%. For the second case study (Wilson disease dataset), a sensitivity of 54.02% and a specificity of 67.03% were achieved. Conclusions: Unsupervised title and abstract screening has the potential to reduce the workload involved in conducting systematic reviews. While the sensitivity of the methodology on the tested data is low, approximately 70% specificity was achieved. Users ought to keep in mind that low sensitivity might occur. One approach to mitigate this might be to incorporate additional targeted search keywords, such as the indexing databases' terms, into the search term corpora. Moreover, automated screening can be used as an additional screener alongside the manual screeners.
https://doi.org/10.1186/s13643-022-02163-4
Automated systematic review
Abstract screening
Latent Dirichlet Allocation
Topic modelling
Unsupervised learning
spellingShingle Agnes Natukunda
Leacky K. Muchene
Unsupervised title and abstract screening for systematic review: a retrospective case-study using topic modelling methodology
Systematic Reviews
Automated systematic review
Abstract screening
Latent Dirichlet Allocation
Topic modelling
Unsupervised learning
title Unsupervised title and abstract screening for systematic review: a retrospective case-study using topic modelling methodology
title_sort unsupervised title and abstract screening for systematic review a retrospective case study using topic modelling methodology
topic Automated systematic review
Abstract screening
Latent Dirichlet Allocation
Topic modelling
Unsupervised learning
url https://doi.org/10.1186/s13643-022-02163-4
work_keys_str_mv AT agnesnatukunda unsupervisedtitleandabstractscreeningforsystematicreviewaretrospectivecasestudyusingtopicmodellingmethodology
AT leackykmuchene unsupervisedtitleandabstractscreeningforsystematicreviewaretrospectivecasestudyusingtopicmodellingmethodology
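
Illustrative note: the abstract describes a two-step procedure (a topic model, then a keyword-based score with a threshold) and reports sensitivity and specificity against manual screening. The record does not give the scoring formula, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the topic model has already been fitted and takes its outputs as given: `theta` (per-document topic proportions) and `phi` (per-topic word probabilities). The score used here, the probability of drawing a search keyword under a document's topic mixture, is an assumption, as are all names, toy values and the threshold.

```python
from typing import Dict, List, Sequence, Set, Tuple


def keyword_score(theta_d: Sequence[float],
                  phi: Sequence[Dict[str, float]],
                  keywords: Set[str]) -> float:
    """Assumed score: sum_k theta_d[k] * sum_{w in keywords} phi[k][w]."""
    return sum(
        weight * sum(topic.get(word, 0.0) for word in keywords)
        for weight, topic in zip(theta_d, phi)
    )


def screen(theta: Sequence[Sequence[float]],
           phi: Sequence[Dict[str, float]],
           keywords: Set[str],
           threshold: float) -> List[bool]:
    """Flag a document for full-text review when its score reaches the threshold."""
    return [keyword_score(t, phi, keywords) >= threshold for t in theta]


def sensitivity_specificity(predicted: Sequence[bool],
                            manual: Sequence[bool]) -> Tuple[float, float]:
    """Compare automated decisions with manual screening decisions."""
    tp = sum(1 for p, m in zip(predicted, manual) if p and m)
    tn = sum(1 for p, m in zip(predicted, manual) if not p and not m)
    fp = sum(1 for p, m in zip(predicted, manual) if p and not m)
    fn = sum(1 for p, m in zip(predicted, manual) if not p and m)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical toy inputs (not data from the study).
phi = [
    {"helminth": 0.4, "vaccine": 0.3, "trial": 0.3},   # topic 0
    {"protein": 0.5, "cell": 0.4, "vaccine": 0.1},     # topic 1
]
theta = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]
keywords = {"helminth", "vaccine"}
manual = [True, False, False, False]  # hypothetical manual screening labels

predicted = screen(theta, phi, keywords, threshold=0.4)
sens, spec = sensitivity_specificity(predicted, manual)
print(predicted, sens, spec)
```

The false positive rate is 1 minus specificity, so the 22.63% false positive rate reported for the helminth dataset corresponds to a specificity of 77.37%. Raising the threshold in a sketch like this trades sensitivity for specificity, which is the trade-off the Conclusions caution about.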