Automatic Identification of Domain Terms: An Approach for Italian
The problem of creating a fully automated specific-domain thesaurus is very topical. The paper presents a novel method to address this problem in the Italian language. The main feature of this approach is the integration of different methods: machine learning classification methods working on the se...
Main Authors: | Maria Teresa Artese, Isabella Gagliardi |
---|---|
Format: | Article |
Language: | English |
Published: | Bulgarian Academy of Sciences, Institute of Mathematics and Informatics, 2020-09-01 |
Series: | Digital Presentation and Preservation of Cultural and Scientific Heritage |
Subjects: | Classification Methods; Word Embedding Models; Probability; Food; Italian Language |
Online Access: | https://dipp.math.bas.bg/dipp/article/view/121 |
_version_ | 1811257559352344576 |
---|---|
author | Maria Teresa Artese Isabella Gagliardi |
author_facet | Maria Teresa Artese Isabella Gagliardi |
author_sort | Maria Teresa Artese |
collection | DOAJ |
description | The problem of creating a fully automated specific-domain thesaurus is very topical. The paper presents a novel method to address this problem in the Italian language. The main feature of this approach is the integration of different methods: machine learning classification methods working on the semantic representation of candidate terms, word embeddings models, able to capture the semantics of words, and a computation of the degree of specialization of a term. The work is in progress and results obtained so far are promising. |
first_indexed | 2024-04-12T17:59:22Z |
format | Article |
id | doaj.art-85f9a4e49d2d456f8878e7251b469494 |
institution | Directory Open Access Journal |
issn | 1314-4006 2535-0366 |
language | English |
last_indexed | 2024-04-12T17:59:22Z |
publishDate | 2020-09-01 |
publisher | Bulgarian Academy of Sciences, Institute of Mathematics and Informatics |
record_format | Article |
series | Digital Presentation and Preservation of Cultural and Scientific Heritage |
spelling | doaj.art-85f9a4e49d2d456f8878e7251b4694942022-12-22T03:22:14ZengBulgarian Academy of Sciences, Institute of Mathematics and InformaticsDigital Presentation and Preservation of Cultural and Scientific Heritage1314-40062535-03662020-09-011010.55630/dipp.2020.10.21Automatic Identification of Domain Terms: An Approach for ItalianMaria Teresa Artese0Isabella Gagliardi1IMATI – CNR, Via Bassini 15, 20133, Milan, ItalyIMATI – CNR, Via Bassini 15, 20133, Milan, ItalyThe problem of creating a fully automated specific-domain thesaurus is very topical. The paper presents a novel method to address this problem in the Italian language. The main feature of this approach is the integration of different methods: machine learning classification methods working on the semantic representation of candidate terms, word embeddings models, able to capture the semantics of words, and a computation of the degree of specialization of a term. The work is in progress and results obtained so far are promising.https://dipp.math.bas.bg/dipp/article/view/121Classification MethodsWord Embedding ModelsProbabilityFoodItalian Language |
spellingShingle | Maria Teresa Artese Isabella Gagliardi Automatic Identification of Domain Terms: An Approach for Italian Digital Presentation and Preservation of Cultural and Scientific Heritage Classification Methods Word Embedding Models Probability Food Italian Language |
title | Automatic Identification of Domain Terms: An Approach for Italian |
title_full | Automatic Identification of Domain Terms: An Approach for Italian |
title_fullStr | Automatic Identification of Domain Terms: An Approach for Italian |
title_full_unstemmed | Automatic Identification of Domain Terms: An Approach for Italian |
title_short | Automatic Identification of Domain Terms: An Approach for Italian |
title_sort | automatic identification of domain terms an approach for italian |
topic | Classification Methods Word Embedding Models Probability Food Italian Language |
url | https://dipp.math.bas.bg/dipp/article/view/121 |
work_keys_str_mv | AT mariateresaartese automaticidentificationofdomaintermsanapproachforitalian AT isabellagagliardi automaticidentificationofdomaintermsanapproachforitalian |