Toward the Soundness of Sense Structure Definitions in Thesaurus-Dictionaries. Parsing Problems and Solutions

In this paper we point out some difficult problems of thesaurus-dictionary entry parsing, relying on the parsing technology of SCD (Segmentation-Cohesion-Dependency) configurations, which has been successfully applied to six of the largest thesauri: Romanian (2), French, German (2), and Russian. Challenging Pro...

Bibliographic Details
Main Authors: Neculai Curteanu, Alex Moruz
Format: Article
Language: English
Published: Vladimir Andrunachievici Institute of Mathematics and Computer Science 2012-10-01
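Series: Computer Science Journal of Moldova
Subjects: dictionary entry parsing; parsing method of SCD configurations; recursive lexicographic segments; recursive calls of sense markers; Enumeration Closing Condition; soundness of sense structure definitions
Online Access: http://www.math.md/files/csjm/v20-n3/v20-n3-(pp275-303).pdf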
author Neculai Curteanu
Alex Moruz
collection DOAJ
description In this paper we point out some difficult problems of thesaurus-dictionary entry parsing, relying on the parsing technology of SCD (Segmentation-Cohesion-Dependency) configurations, which has been successfully applied to six of the largest thesauri: Romanian (2), French, German (2), and Russian. Challenging Problems: (a) Intricate and/or recursive structures of the lexicographic segments found in the entries of certain thesauri; (b) Cyclic (recursive) calls of some sense marker classes on marker sequences; (c) Establishing the hypergraph-driven dependencies between all the atomic and non-atomic sense definitions. The classical approach to solving these parsing problems is hard mainly because of the depth-first search of sense definitions and markers, the substantial complexity of the entries, and the dynamic construction of the sense tree embodied within these parsers. SCD-based Parsing Solutions: (a) The SCD parsing method is a procedural tool, completely free of formal grammars, that handles the recursive structure of the lexicographic segments by procedural non-recursive calls performed on the SCD parsing configurations of the entry structure. (b) To deal with cyclic (recursive) calls between secondary sense markers and the sense enumeration markers, we proposed the Enumeration Closing Condition, sometimes coupled with New_Paragraphs typographic markers transformed into numeral sense enumeration. (c) These problems, their lexicographic modeling, and their parsing solutions are addressed both to dictionary parser programmers, who can experience the SCD-based parsing method, and to lexicographers and thesauri designers, for tailoring balanced lexical-semantics granularities and sounder sense tree definitions of the dictionary entries.
first_indexed 2024-04-11T10:58:48Z
format Article
id doaj.art-7d32b9b4d1ab4d33b7296d6ee6bb7680
institution Directory Open Access Journal
issn 1561-4042
language English
last_indexed 2024-04-11T10:58:48Z
publishDate 2012-10-01
publisher Vladimir Andrunachievici Institute of Mathematics and Computer Science
record_format Article
series Computer Science Journal of Moldova
spelling doaj.art-7d32b9b4d1ab4d33b7296d6ee6bb7680 2022-12-22T04:28:41Z eng
Vladimir Andrunachievici Institute of Mathematics and Computer Science
Computer Science Journal of Moldova, ISSN 1561-4042, 2012-10-01, vol. 20, no. 3(60), pp. 275-303
Toward the Soundness of Sense Structure Definitions in Thesaurus-Dictionaries. Parsing Problems and Solutions
Neculai Curteanu (0): Institute of Computer Science, Romanian Academy, Iasi Branch, Str. Gh. Asachi, Nr. 3, 700483 Iasi, Romania
Alex Moruz (1): Institute of Computer Science, Romanian Academy, Iasi Branch; Faculty of Computer Science, "Al. I. Cuza" University of Iasi
http://www.math.md/files/csjm/v20-n3/v20-n3-(pp275-303).pdf
dictionary entry parsing; parsing method of SCD configurations; recursive lexicographic segments; recursive calls of sense markers; Enumeration Closing Condition; soundness of sense structure definitions
title Toward the Soundness of Sense Structure Definitions in Thesaurus-Dictionaries. Parsing Problems and Solutions
topic dictionary entry parsing; parsing method of SCD configurations
recursive lexicographic segments
recursive calls of sense markers
Enumeration Closing Condition
soundness of sense structure definitions
url http://www.math.md/files/csjm/v20-n3/v20-n3-(pp275-303).pdf