Pre-trained Language Models for Clinical Systematic Literature Reviews
Although systematic literature reviews play a critical role in clinical decision making, manual methods for information extraction can take prohibitively long. In this work, we first describe the construction of datasets in two distinct clinical domains containing randomized trials and observational studies. We then use these two datasets to benchmark the performance of Pre-trained Language Model (PLM) based entity and relation extraction models, as well as the effect of domain-specific pre-training prior to fine-tuning. Our results show evidence for the effectiveness of pre-training with masked language modeling (MLM), a sentence-level proxy task, in boosting the performance of fine-tuned models on both inter- and intra-sentence information extraction tasks.
Main Author: | Ortiz, Juan M. Ochoa |
---|---|
Other Authors: | Barzilay, Regina |
Department: | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
Degree: | M.Eng. |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2022 |
Rights: | In Copyright - Educational Use Permitted; Copyright MIT (http://rightsstatements.org/page/InC-EDU/1.0/) |
Online Access: | https://hdl.handle.net/1721.1/143177 |
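The two-stage approach the abstract describes, continued masked language modeling on in-domain clinical text followed by fine-tuning the adapted encoder for entity extraction, can be sketched with the Hugging Face `transformers` library. This is a minimal illustration, not the thesis's actual code: the base checkpoint, corpus file path, hyperparameters, and entity label set are all placeholder assumptions.

```python
# Sketch of domain-specific MLM pre-training followed by NER fine-tuning.
# Checkpoint name, file paths, and the label set are illustrative placeholders.
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

BASE = "bert-base-uncased"  # a biomedical checkpoint could be substituted here
tokenizer = AutoTokenizer.from_pretrained(BASE)

# --- Stage 1: domain-specific pre-training with masked language modeling ---
corpus = load_dataset("text", data_files={"train": "clinical_abstracts.txt"})
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],  # drop raw strings so only tensors reach the model
)
mlm_model = AutoModelForMaskedLM.from_pretrained(BASE)
mlm_trainer = Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="mlm-adapted", num_train_epochs=1),
    train_dataset=tokenized["train"],
    # Randomly masks 15% of tokens; the model learns to reconstruct them.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
mlm_trainer.train()
mlm_trainer.save_model("mlm-adapted")

# --- Stage 2: fine-tune the adapted encoder for entity extraction ---
# Hypothetical BIO label set for a clinical extraction task.
labels = ["O", "B-INTERVENTION", "I-INTERVENTION", "B-OUTCOME", "I-OUTCOME"]
ner_model = AutoModelForTokenClassification.from_pretrained(
    "mlm-adapted", num_labels=len(labels)
)
# ...fine-tune ner_model on the annotated entity/relation dataset from here.
```

Loading the MLM-adapted weights into `AutoModelForTokenClassification` keeps the domain-adapted encoder and attaches a freshly initialized classification head, which is the standard way to carry pre-training gains into the downstream extraction task.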