From Word Alignment to Word Senses, via Multilingual Wordnets

Most of the successful commercial applications in language processing (text and/or speech) dispense with any explicit concern for semantics, the usual motivation being the high computational cost of dealing with semantics in the case of large volumes of data. With recent advance...

Bibliographic Details
Main Author:	Dan Tufis
Format:	Article
Language:	English
Published:	Vladimir Andrunachievici Institute of Mathematics and Computer Science 2006-05-01
Series:	Computer Science Journal of Moldova
Online Access:	http://www.math.md/files/csjm/v14-n1/v14-n1-(pp3-33).pdf

author	Dan Tufis
collection	DOAJ
description	Most of the successful commercial applications in language processing (text and/or speech) dispense with any explicit concern for semantics, the usual motivation being the high computational cost of dealing with semantics in the case of large volumes of data. With recent advances in corpus linguistics and statistical methods in NLP, revealing useful semantic features of linguistic data is becoming steadily cheaper, and the accuracy of this process is steadily improving. Lately, there seems to be a growing acceptance of the idea that multilingual lexical ontologies might be the key to aligning different views on the atomic semantic units to be used in characterizing the general meaning of various multilingual documents. Depending on the granularity at which semantic distinctions are necessary, the accuracy of basic semantic processing (such as word sense disambiguation) can be very high at relatively low computational complexity. The paper substantiates this statement by presenting a statistical-based system for word alignment and word sense disambiguation in parallel corpora. We describe a word alignment platform which ensures the text pre-processing (tokenization, POS-tagging, lemmatization, chunking, sentence and word alignment) required for accurate word sense disambiguation.
first_indexed	2024-04-11T18:01:53Z
format	Article
id	doaj.art-5fa66b4f736e49d28dc5f9f4c3bdd061
institution	Directory of Open Access Journals
issn	1561-4042
language	English
last_indexed	2024-04-11T18:01:53Z
publishDate	2006-05-01
publisher	Vladimir Andrunachievici Institute of Mathematics and Computer Science
record_format	Article
series	Computer Science Journal of Moldova
spelling	Dan Tufis, Institute for Artificial Intelligence, 13, "13 Septembrie", 050711, Bucharest 5, Romania. Computer Science Journal of Moldova, vol. 14, no. 1(40), pp. 3-33, 2006-05-01.
title	From Word Alignment to Word Senses, via Multilingual Wordnets
url	http://www.math.md/files/csjm/v14-n1/v14-n1-(pp3-33).pdf

Similar Items