A reexamination of MRD-based word sense disambiguation

This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 d...

Bibliographic Details
Main Authors: Baldwin, Timothy, Kim, Su Nam, Bond, Francis, Fujita, Sanae, Martinez, David, Tanaka, Takaaki
Other Authors: School of Humanities and Social Sciences
Format: Journal Article
Language: English
Published: 2011
Subjects: DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics
Online Access: https://hdl.handle.net/10356/79580
http://hdl.handle.net/10220/6834
author Baldwin, Timothy
Kim, Su Nam
Bond, Francis
Fujita, Sanae
Martinez, David
Tanaka, Takaaki
author2 School of Humanities and Social Sciences
collection NTU
description This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 dictionary task, we demonstrate that sense-sensitive definition extension over hyponyms, hypernyms and synonyms, combined with definition extension and word tokenisation leads to WSD accuracy above both unsupervised and supervised baselines. In doing so, we demonstrate the utility of ontology induction and establish new opportunities for the development of baseline unsupervised WSD methods.
first_indexed 2024-10-01T06:59:03Z
format Journal Article
id ntu-10356/79580
institution Nanyang Technological University
language English
last_indexed 2024-10-01T06:59:03Z
publishDate 2011
record_format dspace
spelling ntu-10356/79580
2020-04-27T10:05:33Z
A reexamination of MRD-based word sense disambiguation
Baldwin, Timothy
Kim, Su Nam
Bond, Francis
Fujita, Sanae
Martinez, David
Tanaka, Takaaki
School of Humanities and Social Sciences
DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics
This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 dictionary task, we demonstrate that sense-sensitive definition extension over hyponyms, hypernyms and synonyms, combined with definition extension and word tokenisation leads to WSD accuracy above both unsupervised and supervised baselines. In doing so, we demonstrate the utility of ontology induction and establish new opportunities for the development of baseline unsupervised WSD methods.
Accepted version
2011-06-29T09:08:14Z
2019-12-06T13:28:38Z
2011-06-29T09:08:14Z
2019-12-06T13:28:38Z
2010
2010
Journal Article
1530-0226
https://hdl.handle.net/10356/79580
http://hdl.handle.net/10220/6834
10.1145/1731035.1731039
155494
en
ACM transactions on Asian language information processing
© 2010 Association for Computing Machinery. This is the author created version of a work that has been peer reviewed and accepted for publication by ACM Transactions on Asian Language Information Processing, Association for Computing Machinery. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [DOI: http://dx.doi.org/10.1145/1731035.1731039].
21 p.
application/pdf
title A reexamination of MRD-based word sense disambiguation
topic DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics
url https://hdl.handle.net/10356/79580
http://hdl.handle.net/10220/6834