RESEARCH ON THE SPECIFIC FEATURES OF DETERMINING THE SEMANTIC SIMILARITY OF ARBITRARY-LENGTH TEXT CONTENT USING MULTILINGUAL TRANSFORMER-BASED MODELS

Bibliographic Details
Main Authors:	Serhii Olizarenko, Vladimir Argunov
Format:	Article
Language:	English
Published:	National Technical University "Kharkiv Polytechnic Institute" 2020-10-01
Series:	Сучасні інформаційні системи
Subjects:	Natural Language Processing; BERT; semantic similarities; news content; Deep Learning; multilingual text content
Online Access:	http://ais.khpi.edu.ua/article/view/213331

Description
Summary:	The article investigates how to determine the semantic similarity of multilingual text content of arbitrary length from vector representations produced by different multilingual models based on the Transformer architecture. The Transformer models are compared to select the one best suited to this class of problems. Two new approaches to determining the semantic similarity of multilingual text content are also developed for use in the HIPSTO Open AI Information Discovery Platform, where the main challenge is supporting arbitrary text length. Experimental evidence is offered in support of the new approaches as a solution to the semantic similarity problem.
ISSN:	2522-9052
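
Illustration: this record does not describe the two approaches developed in the article, so the sketch below is not the authors' method. It shows one generic baseline for comparing texts longer than an encoder's input limit: split each text into fixed-size chunks, embed every chunk, average the chunk vectors, and compare the averages by cosine similarity. The names `toy_embed`, `split_into_chunks`, `embed_long_text` and the `max_tokens` parameter are hypothetical. `toy_embed` is a hashed bag-of-words stand-in used only so the example runs without external libraries; in practice it would be replaced by a multilingual Transformer encoder.

```python
import hashlib
import math


def toy_embed(chunk, dim=64):
    """Hypothetical stand-in for a Transformer encoder: hashed bag of words."""
    vec = [0.0] * dim
    for token in chunk.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        # Map each token to one of `dim` buckets and count occurrences.
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return vec


def split_into_chunks(text, max_tokens=8):
    """Split text into consecutive chunks of at most `max_tokens` whitespace tokens."""
    tokens = text.split()
    chunks = [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]
    return chunks or [""]  # keep one (empty) chunk for empty input


def embed_long_text(text, embed=toy_embed, max_tokens=8):
    """Embed each chunk, then mean-pool the chunk vectors into one text vector."""
    vectors = [embed(chunk) for chunk in split_into_chunks(text, max_tokens)]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc_a = "the central bank raised interest rates again this week " * 3
    doc_b = "interest rates were raised by the central bank " * 3
    score = cosine_similarity(embed_long_text(doc_a), embed_long_text(doc_b))
    print(f"similarity: {score:.3f}")
```

Mean pooling is only one possible way to aggregate chunk vectors. It discards chunk order and weights short trailing chunks the same as full ones, so it should be read as a starting point for experiments rather than a recommended design.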

Similar Items