Pre-trained Language Models for Clinical Systematic Literature Reviews

Although systematic literature reviews play a critical role in clinical decision making, manual information extraction can be prohibitively time-consuming. In this work, we first describe the construction of datasets in two distinct clinical domains containing randomized trials and observational studies. We then use these two datasets to benchmark the performance of Pre-trained Language Model (PLM)-based entity and relation extraction models, as well as the effect of domain-specific pre-training prior to fine-tuning. Our results show evidence of the effectiveness of pre-training with masked language modeling (MLM), a sentence-level proxy task, in boosting the performance of fine-tuned models on both inter- and intra-sentence information extraction tasks.
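For readers unfamiliar with the two-stage recipe the abstract describes, here is a minimal sketch, assuming the Hugging Face transformers and datasets libraries: (1) continued MLM pre-training on in-domain text, then (2) fine-tuning the adapted encoder for entity (token-level) extraction. The base checkpoint, the corpus file "clinical_abstracts.txt", the label count, and all hyperparameters are illustrative assumptions, not the thesis's actual configuration.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

BASE_MODEL = "bert-base-uncased"  # assumption: any PLM checkpoint would do
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)

# --- Stage 1: domain-specific MLM pre-training --------------------------
# "clinical_abstracts.txt" is a hypothetical file of in-domain sentences.
raw = load_dataset("text", data_files={"train": "clinical_abstracts.txt"})
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

mlm_model = AutoModelForMaskedLM.from_pretrained(BASE_MODEL)
# Randomly mask 15% of tokens; the model learns to reconstruct them.
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="mlm-adapted", num_train_epochs=1),
    train_dataset=tokenized["train"],
    data_collator=collator,
).train()
mlm_model.save_pretrained("mlm-adapted")
tokenizer.save_pretrained("mlm-adapted")

# --- Stage 2: fine-tune the adapted encoder for entity extraction -------
# The encoder weights carry over; only the classification head is new.
ner_model = AutoModelForTokenClassification.from_pretrained(
    "mlm-adapted", num_labels=5  # label count is dataset-specific
)
# ...fine-tune ner_model on the annotated trial / observational-study data.
```

The design point the sketch illustrates is that the MLM stage changes no architecture: the same encoder weights are saved and reloaded, and only the freshly initialized task head differs between the two stages.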


Bibliographic Details
Main Author: Ortiz, Juan M. Ochoa
Other Authors: Barzilay, Regina
Format: Thesis (M.Eng.)
Department: Electrical Engineering and Computer Science
Published: Massachusetts Institute of Technology, 2022
Online Access: https://hdl.handle.net/1721.1/143177
Rights: In Copyright - Educational Use Permitted (http://rightsstatements.org/page/InC-EDU/1.0/)