Event detection for biomedical text

Bibliographic Details
Main Author: Pham, Nguyen Minh Thu
Other Authors: Hui Siu Cheung
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2022
Online Access: https://hdl.handle.net/10356/156520
description In the last decade, text mining in the biomedical domain has received significant research attention, and many studies have been devoted to extending state-of-the-art natural language processing (NLP) techniques to biomedical text. Event detection is the primary step in the event extraction task; its objective is to detect events via trigger mentions that signify the occurrence of events of a particular type. Essentially, event detection requires constructing a semantic relationship between input text representations and the set of predefined event type labels. However, existing methods tend to focus on learning the input text representations and simply use one-hot vectors for event type labels, which overlooks the importance of understanding what the type labels mean. In this research, we propose a novel Label-Pivoting Biomedical Event Detection (LPBED) model, which builds on the pretrained PubMedBERT language model and exploits the semantic meaning of the type label set. More specifically, our proposed model uses the underlying semantic meaning of type labels to pivot event types as clues for detecting trigger candidates. Our model benefits significantly from the domain-specific knowledge that the pretrained PubMedBERT model has acquired from widely used biomedical data sources. We conduct experiments on the benchmark GENIA Event 2011 (GE11) dataset. Without using any external knowledge bases or syntactic tools, the experimental results show that our model performs robustly under scenarios of limited data availability. In addition, our proposed LPBED model outperforms the baseline BERT-CRF model used for the MAVEN dataset in the general domain. These results demonstrate that our proposed model achieves competitive performance for event detection in biomedical text, which provides the potential for further investigation of the event extraction task.
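The contrast the abstract draws between one-hot type labels and labels that carry semantic meaning can be illustrated with a small sketch. This is not the LPBED model itself, whose architecture the record does not describe; it only shows the general idea of scoring each token against a vector for each event type label. The function names, the similarity threshold, and the hand-written three-dimensional vectors are all assumptions made for illustration; in practice the vectors would come from a pretrained encoder.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def detect_triggers(token_vecs, label_vecs, threshold=0.8):
    """Tag each token with its most similar event type, or None.

    token_vecs: list of (token, vector) pairs, standing in for encoder output.
    label_vecs: dict mapping an event type name to a vector, standing in for
                an embedding of the label text rather than a one-hot index.
    threshold:  hypothetical cut-off below which a token is not a trigger.
    """
    results = []
    for token, vec in token_vecs:
        best_label, best_score = max(
            ((label, cosine(vec, lvec)) for label, lvec in label_vecs.items()),
            key=lambda pair: pair[1],
        )
        results.append((token, best_label if best_score >= threshold else None))
    return results


# Toy vectors written by hand (assumption); not taken from any real model.
label_vecs = {
    "Phosphorylation": [1.0, 0.0, 0.0],
    "Gene_expression": [0.0, 1.0, 0.0],
}
token_vecs = [
    ("TRAF2", [0.1, 0.1, 1.0]),
    ("phosphorylates", [0.9, 0.1, 0.1]),
    ("expression", [0.1, 0.95, 0.2]),
]

tagged = detect_triggers(token_vecs, label_vecs)
for token, label in tagged:
    print(token, label)
```

With one-hot labels, every event type is equally distant from every other and from every token, so the label itself carries no information. Giving each label its own vector lets tokens be compared against the label directly, which is the kind of semantic relationship the abstract refers to.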
spelling ntu-10356/156520; 2022-04-19T06:33:09Z; Event detection for biomedical text; Pham, Nguyen Minh Thu; Hui Siu Cheung; School of Computer Science and Engineering; ASSCHUI@ntu.edu.sg; Engineering::Computer science and engineering::Computing methodologies::Document and text processing; Bachelor of Engineering (Computer Science); 2022-04-19T06:33:09Z; 2022; Final Year Project (FYP); Pham, N. M. T. (2022). Event detection for biomedical text. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156520; en; SCSE21-0342; application/pdf; Nanyang Technological University
topic Engineering::Computer science and engineering::Computing methodologies::Document and text processing