Leveraging Concept-Enhanced Pre-Training Model and Masked-Entity Language Model for Named Entity Disambiguation

Named Entity Disambiguation (NED) refers to the task of resolving multiple named entity mentions in an input-text sequence to their correct references in a knowledge graph. We tackle the NED problem by leveraging two novel objectives for a pre-training framework, and propose a novel pre-training NED model...

Bibliographic Details
Main Authors:	Zizheng Ji, Lin Dai, Jin Pang, Tingting Shen
Format:	Article
Language:	English
Published:	IEEE 2020-01-01
Series:	IEEE Access
Subjects:	Named entity disambiguation, pre-training, lexical knowledge
Online Access:	https://ieeexplore.ieee.org/document/9091850/

_version_	1818379816929329152
author	Zizheng Ji, Lin Dai, Jin Pang, Tingting Shen
author_facet	Zizheng Ji, Lin Dai, Jin Pang, Tingting Shen
author_sort	Zizheng Ji
collection	DOAJ
description	Named Entity Disambiguation (NED) refers to the task of resolving multiple named entity mentions in an input-text sequence to their correct references in a knowledge graph. We tackle the NED problem by leveraging two novel objectives for a pre-training framework, and propose a novel pre-training NED model. Specifically, the proposed pre-training NED model consists of: (i) concept-enhanced pre-training, which aims to identify valid lexical semantic relations under the concept semantic constraints derived from the external resource Probase; and (ii) a masked entity language model, which aims to train contextualized embeddings by predicting randomly masked entities based on the words and non-masked entities in the given input text. The proposed pre-training NED model can therefore combine the advantage of the pre-training mechanism for generating contextualized embeddings with the strength of lexical knowledge (e.g., the concept knowledge emphasized here) for understanding language semantics. We conduct experiments on the CoNLL dataset, the TAC dataset, and various datasets provided by the GERBIL platform. The experimental results demonstrate that the proposed model achieves significantly higher performance than previous models.
first_indexed	2024-12-14T02:08:48Z
format	Article
id	doaj.art-176fbfa4ba0845e99129abc4f7c98e97
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-12-14T02:08:48Z
publishDate	2020-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-176fbfa4ba0845e99129abc4f7c98e97 | 2022-12-21T23:20:49Z | eng | IEEE | IEEE Access | 2169-3536 | 2020-01-01 | 8 | 100469 | 100484 | 10.1109/ACCESS.2020.2994247 | 9091850 | Leveraging Concept-Enhanced Pre-Training Model and Masked-Entity Language Model for Named Entity Disambiguation | Zizheng Ji | 0 | https://orcid.org/0000-0002-5749-4780 | Lin Dai | 1 | https://orcid.org/0000-0002-0093-137X | Jin Pang | 2 | https://orcid.org/0000-0001-8252-135X | Tingting Shen | 3 | https://orcid.org/0000-0002-8446-9978 | School of Computer, Beijing Institute of Technology, Beijing, China | School of Computer, Beijing Institute of Technology, Beijing, China | State Grid Corporation of China, Beijing, China | School of Computer, Beijing Institute of Technology, Beijing, China | https://ieeexplore.ieee.org/document/9091850/ | Named entity disambiguation | pre-training | lexical knowledge
title	Leveraging Concept-Enhanced Pre-Training Model and Masked-Entity Language Model for Named Entity Disambiguation
title_sort	leveraging concept enhanced pre training model and masked entity language model for named entity disambiguation
topic	Named entity disambiguation, pre-training, lexical knowledge
url	https://ieeexplore.ieee.org/document/9091850/
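
Illustrative sketch (not part of the catalogued record): the description field says the masked entity language model is trained by predicting randomly masked entities from the words and non-masked entities in the input text. The snippet below shows only the data-preparation side of such an objective, under stated assumptions. The `[MASK]` symbol, the `mask_entities` function, the `mask_prob` parameter, and the example entity identifiers are all hypothetical names chosen for illustration and are not taken from the paper.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not taken from the paper


def mask_entities(tokens, entity_spans, mask_prob=0.3, seed=0):
    """Replace randomly chosen entity mentions with a single MASK token.

    tokens:       list of word strings
    entity_spans: list of (start, end, entity_id) with end exclusive;
                  spans are assumed not to overlap
    Returns (masked_tokens, targets), where targets maps the position of
    each MASK in masked_tokens to the entity id a model would predict.
    """
    rng = random.Random(seed)  # local generator keeps runs reproducible
    masked, targets = [], {}
    cursor = 0
    for start, end, entity_id in sorted(entity_spans):
        # Copy the plain words that precede this mention.
        masked.extend(tokens[cursor:start])
        if rng.random() < mask_prob:
            # Masked mention: remember which entity belongs here.
            targets[len(masked)] = entity_id
            masked.append(MASK)
        else:
            # Non-masked mention: keep its surface words as context.
            masked.extend(tokens[start:end])
        cursor = end
    masked.extend(tokens[cursor:])  # trailing words after the last mention
    return masked, targets


if __name__ == "__main__":
    words = ["Jordan", "played", "for", "the", "Chicago", "Bulls"]
    spans = [(0, 1, "Michael_Jordan"), (4, 6, "Chicago_Bulls")]
    print(mask_entities(words, spans, mask_prob=1.0))
```

A model trained on such pairs would receive `masked_tokens` as input and be scored on recovering the entity ids in `targets`; how the actual model encodes words and entities is not stated in this record.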

Similar Items