Span-based model for overlapping entity recognition and multi-relations classification in the food domain

Information extraction (IE) is an important part of the knowledge graph lifecycle. In the food domain, extracting information such as ingredients and cooking methods from Chinese recipes is crucial to safety risk analysis and ingredient identification. In comparison with English, due to the...

Bibliographic Details
Main Authors: Mengqi Zhang, Lei Ma, Yanzhao Ren, Ganggang Zhang, Xinliang Liu
Format: Article
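Language: English
Published: AIMS Press 2022-03-01
Series: Mathematical Biosciences and Engineering
Subjects: information extraction; span-based approach; overlapping entity recognition; category marker; multi-relations classification; entity attributes
Online Access: https://www.aimspress.com/article/doi/10.3934/mbe.2022240?viewType=HTML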
collection DOAJ
description Information extraction (IE) is an important part of the knowledge graph lifecycle. In the food domain, extracting information such as ingredients and cooking methods from Chinese recipes is crucial to safety risk analysis and ingredient identification. Compared with English, Chinese IE is much more challenging because of its complex structure, the richness of information carried by word combinations, and the lack of tense. This difficulty is particularly prominent in the food domain, where text has high knowledge density and imprecise syntactic structure. However, existing IE methods focus only on the features of entities within a sentence, such as context and position, and ignore the features of the entity itself and the influence of its attributes on the prediction of inter-entity relations. To address overlapping entity recognition and multi-relations classification in the food domain, we propose a span-based IE model called SpIE. SpIE builds a span representation for each candidate entity to capture span-level features, which turns named entity recognition (NER) into a classification task. In addition, SpIE feeds extra information about each entity into the relation classification (RC) model by considering the effect of the entity's attributes (both the entity mention and the entity type) on the relation between entity pairs. We apply SpIE to two datasets and observe that it significantly outperforms previous neural approaches because it captures the features of overlapping entities and entity attributes, and it remains very competitive in general IE.
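The span-based idea in the description can be illustrated with a minimal sketch. This is not the authors' SpIE implementation: the function names `enumerate_spans` and `spans_overlap`, the `max_width` parameter, and the English example tokens are all assumptions made for illustration. The sketch only shows why enumerating candidate spans lets overlapping (nested) entities coexist, since each span would be classified on its own.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def enumerate_spans(tokens: List[str], max_width: int) -> List[Span]:
    """Return every candidate span of at most max_width tokens.

    Hypothetical helper: a span-based NER model would score each of
    these spans independently as an entity type or as "not an entity".
    """
    spans: List[Span] = []
    for start in range(len(tokens)):
        last_end = min(start + max_width, len(tokens))
        for end in range(start + 1, last_end + 1):
            spans.append((start, end))
    return spans


def spans_overlap(a: Span, b: Span) -> bool:
    """True if two half-open spans share at least one token."""
    return a[0] < b[1] and b[0] < a[1]


# Hypothetical recipe phrase, not taken from the paper's datasets.
tokens = ["braised", "beef", "noodles"]
candidates = enumerate_spans(tokens, max_width=3)

# "beef" (an ingredient) is nested inside "braised beef noodles" (a dish);
# both appear as separate candidates, so both can receive a label.
ingredient = (1, 2)
dish = (0, 3)
print(candidates)
print(spans_overlap(ingredient, dish))
```

A sequence-labelling tagger assigns one label per token and so cannot mark both spans at once; treating each span as its own classification instance avoids that limitation, at the cost of scoring a number of candidates that grows with sentence length and `max_width`.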
first_indexed 2024-04-12T22:42:29Z
id doaj.art-460cab2f6efb47d5b18130ca2b003f76
institution Directory Open Access Journal
issn 1551-0018
last_indexed 2024-04-12T22:42:29Z
spelling doaj.art-460cab2f6efb47d5b18130ca2b003f76 2022-12-22T03:13:40Z eng
citation AIMS Press, Mathematical Biosciences and Engineering, ISSN 1551-0018, 2022-03-01, vol. 19, no. 5, pp. 5134-5152, doi:10.3934/mbe.2022240
authors Mengqi Zhang (1, 2), Lei Ma (1, 2), Yanzhao Ren (3), Ganggang Zhang (4), Xinliang Liu (1, 2)
affiliations 1. School of E-business and Logistics, Beijing Technology and Business University, Beijing 100048, China
2. National Engineering Laboratory for Agri-product Quality Traceability, Beijing Technology and Business University, Beijing 100048, China
3. School of Computer Science and Engineering, Beijing Technology and Business University, Beijing 100048, China
4. Digital Campus Construction Center, Capital Normal University, Beijing 100048, China
title Span-based model for overlapping entity recognition and multi-relations classification in the food domain
topic information extraction
span-based approach
overlapping entity recognition
category marker
multi-relations classification
entity attributes