A reexamination of MRD-based word sense disambiguation
This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 d...
Main Authors: | Baldwin, Timothy; Kim, Su Nam; Bond, Francis; Fujita, Sanae; Martinez, David; Tanaka, Takaaki |
---|---|
Other Authors: | School of Humanities and Social Sciences |
Format: | Journal Article |
Language: | English |
Published: | 2011 |
Subjects: | DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics |
Online Access: | https://hdl.handle.net/10356/79580 http://hdl.handle.net/10220/6834 |
_version_ | 1824453082329645056 |
---|---|
author | Baldwin, Timothy; Kim, Su Nam; Bond, Francis; Fujita, Sanae; Martinez, David; Tanaka, Takaaki |
author2 | School of Humanities and Social Sciences |
collection | NTU |
description | This paper reconsiders the task of MRD-based word sense disambiguation, in extending the basic Lesk algorithm to investigate the impact on WSD performance of different tokenisation schemes and methods of definition extension. In experimentation over the Hinoki Sensebank and the Japanese Senseval-2 dictionary task, we demonstrate that sense-sensitive definition extension over hyponyms, hypernyms and synonyms, combined with definition extension and word tokenisation leads to WSD accuracy above both unsupervised and supervised baselines. In doing so, we demonstrate the utility of ontology induction and establish new opportunities for the development of baseline unsupervised WSD methods. |
first_indexed | 2024-10-01T06:59:03Z |
format | Journal Article |
id | ntu-10356/79580 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T06:59:03Z |
publishDate | 2011 |
record_format | dspace |
spelling | ntu-10356/79580 2020-04-27T10:05:33Z; DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics; Accepted version; 2011-06-29T09:08:14Z; 2019-12-06T13:28:38Z; 2010; Journal Article; 1530-0226; https://hdl.handle.net/10356/79580; http://hdl.handle.net/10220/6834; 10.1145/1731035.1731039; 155494; en; ACM transactions on Asian language information processing; © 2010 Association for Computing Machinery. This is the author created version of a work that has been peer reviewed and accepted for publication by ACM Transactions on Asian Language Information Processing, Association for Computing Machinery. It incorporates referee’s comments but changes resulting from the publishing process, such as copyediting, structural formatting, may not be reflected in this document. The published version is available at: [DOI: http://dx.doi.org/10.1145/1731035.1731039]; 21 p.; application/pdf |
title | A reexamination of MRD-based word sense disambiguation |
topic | DRNTU::Humanities::Linguistics::Sociolinguistics::Computational linguistics |
url | https://hdl.handle.net/10356/79580 http://hdl.handle.net/10220/6834 |