Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011

Abstract: We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). Th...

Bibliographic Details
Main Authors: Pyysalo Sampo, Ohta Tomoko, Rak Rafal, Sullivan Dan, Mao Chunhong, Wang Chunxia, Sobral Bruno, Tsujii Jun'ichi, Ananiadou Sophia
Format: Article
Language: English
Published: BMC 2012-06-01
Series: BMC Bioinformatics
Online Access: http://2011.bionlp-st.org
author Pyysalo Sampo
Ohta Tomoko
Rak Rafal
Sullivan Dan
Mao Chunhong
Wang Chunxia
Sobral Bruno
Tsujii Jun'ichi
Ananiadou Sophia
collection DOAJ
description Abstract: We present the preparation, resources, results and analysis of three tasks of the BioNLP Shared Task 2011: the main tasks on Infectious Diseases (ID) and Epigenetics and Post-translational Modifications (EPI), and the supporting task on Entity Relations (REL). The two main tasks extend the event extraction model introduced in the BioNLP Shared Task 2009 (ST'09) to two new areas of biomedical scientific literature, each motivated by the needs of specific biocuration tasks. The ID task concerns the molecular mechanisms of infection, virulence and resistance, focusing in particular on the functions of a class of signaling systems that are ubiquitous in bacteria. The EPI task is dedicated to the extraction of statements regarding chemical modifications of DNA and proteins, with particular emphasis on changes relating to the epigenetic control of gene expression. In contrast to these two application-oriented main tasks, the REL task seeks to support extraction in general by separating challenges relating to part-of relations into a subproblem that can be addressed by independent systems. Seven groups participated in each of the two main tasks and four groups in the supporting task. The participating systems indicated advances in the capability of event extraction methods and demonstrated generalization in many aspects: from abstracts to full texts, from previously considered subdomains to new ones, and from the ST'09 extraction targets to other entities and events. The highest performance achieved in the supporting task REL, 58% F-score, is broadly comparable with levels reported for other relation extraction tasks. For the ID task, the highest-performing system achieved 56% F-score, comparable to the state-of-the-art performance at the established ST'09 task. In the EPI task, the best result was 53% F-score for the full set of extraction targets and 69% F-score for a reduced set of core extraction targets, approaching a level of performance sufficient for user-facing applications. In this study, we extend previously reported results and perform further analyses of the outputs of the participating systems. We place specific emphasis on aspects of system performance relating to real-world applicability, considering alternate evaluation metrics and performing additional manual analysis of system outputs. We further demonstrate that the strengths of extraction systems can be combined to improve on the performance achieved by any system in isolation. The manually annotated corpora, supporting resources, and evaluation tools for all tasks are available from http://www.bionlp-st.org, and the tasks continue as open challenges for all interested parties.
first_indexed 2024-12-11T12:24:04Z
format Article
id doaj.art-1ed1f1b2648140deb04f6e96ea358264
institution Directory of Open Access Journals
issn 1471-2105
language English
last_indexed 2024-12-11T12:24:04Z
publishDate 2012-06-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-1ed1f1b2648140deb04f6e96ea358264 | 2022-12-22T01:07:26Z | eng | BMC | BMC Bioinformatics | 1471-2105 | 2012-06-01 | 13 | Suppl 11 | S2 | 10.1186/1471-2105-13-S11-S2 | Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011 | Pyysalo Sampo; Ohta Tomoko; Rak Rafal; Sullivan Dan; Mao Chunhong; Wang Chunxia; Sobral Bruno; Tsujii Jun'ichi; Ananiadou Sophia | http://2011.bionlp-st.org
title Overview of the ID, EPI and REL tasks of BioNLP Shared Task 2011
url http://2011.bionlp-st.org