COST-SENSITIVE STRUCTURED PERCEPTRON INCORPORATING CATEGORY HIERARCHY FOR NAMED ENTITY RECOGNITION

Named Entity Recognition (NER) is a fundamental natural language processing task for the identification and classification of expressions into predefined categories, such as person and organization. Existing NER systems usually target about 10 categories and do not incorporate analysis of categor...


Bibliographic Details
Main Authors: Shohei Higashiyama, Blondel Mathieu, Kazuhiro Seki, Kuniaki Uehara
Format: Article
Language: English
Published: UUM Press 2015-04-01
Series: Journal of ICT
Subjects: Named entity recognition; category hierarchy; cost-sensitive learning; biomedical text mining
Online Access: https://e-journal.uum.edu.my/index.php/jict/article/view/8153
collection DOAJ
description Named Entity Recognition (NER) is a fundamental natural language processing task for the identification and classification of expressions into predefined categories, such as person and organization. Existing NER systems usually target about 10 categories and do not incorporate analysis of category relations. However, categories often belong naturally to some predefined hierarchy. In such cases, the distance between categories in the hierarchy becomes a rich source of information that can be exploited. This is intuitively useful, particularly when the categories are numerous. On that account, this paper proposes an NER approach that can leverage category hierarchy information by introducing, in the structured perceptron framework, a cost function that more strongly penalizes category predictions that are more distant from the correct category in the hierarchy. Experimental results on the GENIA biomedical text corpus indicate the effectiveness of the proposed approach compared with the case where no cost function is used. In addition, the proposed approach demonstrates superior performance over a representative work using multi-class support vector machines on the same corpus. A possible direction for further improving the proposed approach is to investigate more elaborate cost functions than the simple additive cost adopted in this work.
first_indexed 2024-04-13T11:17:20Z
format Article
id doaj.art-150724e75d4c42ca8ad371e088a8caca
institution Directory Open Access Journal
issn 1675-414X
2180-3862
language English
last_indexed 2024-04-13T11:17:20Z
publishDate 2015-04-01
publisher UUM Press
record_format Article
series Journal of ICT
spelling doaj.art-150724e75d4c42ca8ad371e088a8caca 2022-12-22T02:48:56Z eng UUM Press Journal of ICT 1675-414X 2180-3862 2015-04-01 14
Shohei Higashiyama (0), Blondel Mathieu (1), Kazuhiro Seki (2), Kuniaki Uehara (3)
(0) Graduate School of System Informatics Kobe University, Japan
(1) NTT Communication Science Laboratories, Kobe University, Japan
(2) Faculty of Intelligence and Informatics, Konan University, Japan
(3) Faculty of Intelligence and Informatics, Konan University, Japan
topic Named entity recognition
category hierarchy
cost-sensitive learning
biomedical text mining
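The description above mentions a cost function, introduced in the structured perceptron framework, that penalizes predictions more strongly the farther they are from the correct category in the hierarchy, using a simple additive cost. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a toy hierarchy (`PARENT`), a toy label set (`LABELS`), word/tag and tag/tag indicator features, and cost-augmented Viterbi decoding as one common way to bring a cost into perceptron training; all of these names and choices are hypothetical and do not come from the paper.

```python
from collections import defaultdict

# Hypothetical toy hierarchy (child -> parent); NOT the GENIA ontology.
PARENT = {"protein": "amino_acid", "peptide": "amino_acid",
          "DNA": "nucleic_acid", "RNA": "nucleic_acid",
          "amino_acid": "ROOT", "nucleic_acid": "ROOT", "O": "ROOT"}
LABELS = ["O", "protein", "peptide", "DNA", "RNA"]  # assumed label set

def ancestors(c):
    """Path from category c up to the root, inclusive."""
    path = [c]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def tree_distance(a, b):
    """Number of edges between categories a and b in the hierarchy."""
    pa, pb = ancestors(a), ancestors(b)
    common = next(x for x in pa if x in pb)  # lowest common ancestor
    return pa.index(common) + pb.index(common)

def sequence_cost(gold, pred):
    """Additive cost: sum of per-token hierarchy distances."""
    return sum(tree_distance(g, p) for g, p in zip(gold, pred))

def features(words, tags):
    """Indicator features: (word, tag) emissions and (prev tag, tag) transitions."""
    feats = defaultdict(float)
    prev = "<s>"
    for word, tag in zip(words, tags):
        feats[("emit", word, tag)] += 1
        feats[("trans", prev, tag)] += 1
        prev = tag
    return feats

def decode(w, words, gold=None):
    """Viterbi decoding. If gold is given, each token's hierarchy distance
    to the gold tag is added to the score (cost-augmented decoding)."""
    best = {"<s>": (0.0, [])}  # last tag -> (score, best path ending in it)
    for i, word in enumerate(words):
        nxt = {}
        for t in LABELS:
            local = w[("emit", word, t)]
            if gold is not None:
                local += tree_distance(gold[i], t)
            score, path = max(
                ((s + w[("trans", p, t)], pth) for p, (s, pth) in best.items()),
                key=lambda x: x[0])
            nxt[t] = (score + local, path + [t])
        best = nxt
    return max(best.values(), key=lambda x: x[0])[1]

def train(data, epochs=5):
    """Perceptron updates against the cost-augmented prediction.
    data is a list of (words, gold_tags) pairs, both given as lists."""
    w = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = decode(w, words, gold)
            if pred != gold:
                for k, v in features(words, gold).items():
                    w[k] += v
                for k, v in features(words, pred).items():
                    w[k] -= v
    return w
```

In this sketch, confusing `protein` with its sibling `peptide` costs 2, while confusing it with `DNA` in a different branch costs 4, so training pushes distant mistakes away with a larger margin. Calling `train` with `(words, gold_tags)` pairs returns a weight dictionary that `decode(w, words)` can then use without a cost term at prediction time.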