Automatic Identification of Domain Terms: An Approach for Italian

Bibliographic Details
Main Authors: Maria Teresa Artese, Isabella Gagliardi
Format: Article
Language: English
Published: Bulgarian Academy of Sciences, Institute of Mathematics and Informatics 2020-09-01
Series: Digital Presentation and Preservation of Cultural and Scientific Heritage
Online Access: https://dipp.math.bas.bg/dipp/article/view/121
Description
Summary: Building a domain-specific thesaurus fully automatically is a topical problem. The paper presents a novel method that addresses it for the Italian language. The main feature of the approach is the integration of three components: machine learning classifiers that operate on the semantic representation of candidate terms, word embedding models that capture the semantics of words, and a measure of each term's degree of specialization. The work is in progress, and the results obtained so far are promising.
ISSN: 1314-4006, 2535-0366