Automatic Identification of Domain Terms: An Approach for Italian
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Bulgarian Academy of Sciences, Institute of Mathematics and Informatics, 2020-09-01 |
Series: | Digital Presentation and Preservation of Cultural and Scientific Heritage |
Subjects: | |
Online Access: | https://dipp.math.bas.bg/dipp/article/view/121 |
Summary: | Creating a fully automated domain-specific thesaurus is a topical problem. The paper presents a novel method that addresses it for the Italian language. The main feature of the approach is the integration of different methods: machine learning classifiers that work on the semantic representation of candidate terms, word embedding models that capture the semantics of words, and a computation of the degree of specialization of a term. The work is in progress, and the results obtained so far are promising. |
ISSN: | 1314-4006 2535-0366 |
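
The summary mentions a "degree of specialization of a term" but this record does not say how it is computed. As a rough illustration only, the sketch below assumes one common choice: the log ratio of a term's smoothed relative frequency in a domain corpus to its frequency in a general corpus. The function name `specialization`, the `smoothing` parameter, and the two toy Italian corpora are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def specialization(term, domain_counts, general_counts, smoothing=1.0):
    """Log ratio of smoothed relative frequencies (domain vs. general).

    Assumed measure, not the paper's: positive values suggest the term is
    more typical of the domain corpus, negative values of the general one.
    """
    # Shared vocabulary size, used for add-k smoothing in both corpora.
    vocab_size = len(set(domain_counts) | set(general_counts))
    domain_total = sum(domain_counts.values())
    general_total = sum(general_counts.values())

    p_domain = (domain_counts[term] + smoothing) / (
        domain_total + smoothing * vocab_size
    )
    p_general = (general_counts[term] + smoothing) / (
        general_total + smoothing * vocab_size
    )
    return math.log(p_domain / p_general)


# Toy corpora (hypothetical): a cultural-heritage sentence and a generic one.
domain_counts = Counter(
    "il restauro del dipinto richiede il consolidamento del supporto".split()
)
general_counts = Counter(
    "il gatto dorme sul divano e il cane gioca".split()
)

# Rank the domain vocabulary from most to least specialized.
ranked = sorted(
    domain_counts,
    key=lambda t: specialization(t, domain_counts, general_counts),
    reverse=True,
)
print(ranked)
```

In a pipeline like the one the summary outlines, a score of this kind would be one signal among several, alongside the classifier output and the word embedding features; how the paper actually combines them is not stated here.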