Survey of Neural Text Representation Models

In natural language processing, text must be transformed into a machine-readable representation before any processing can take place, and the quality of downstream tasks depends heavily on the quality of these representations. In this survey, we systematize and analyze 50 neural models from the last decade. The models are grouped by neural network architecture into shallow, recurrent, recursive, convolutional, and attention models, and further categorized by representation level, input level, model type, and model supervision. We focus on task-independent representation models, discuss their advantages and drawbacks, and identify promising directions for future neural text representation models. We describe the evaluation datasets and tasks used in the papers that introduced the models and compare the models on the basis of relevant evaluations. The quality of a representation model can be evaluated by its ability to generalize to multiple unrelated tasks. Benchmark standardization is evident among recent models, and the number of tasks on which models are evaluated is increasing.
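The abstract's opening point, that text must be transformed into a machine-readable representation before any processing, can be illustrated with a toy sketch. This example is ours, not from the survey, and shows the simplest non-neural baseline (a bag-of-words count vector) rather than any of the 50 neural models discussed:

```python
# Toy illustration (not from the survey): mapping text to a fixed-size
# machine-readable vector using a bag-of-words representation.
from collections import Counter

def bag_of_words(text, vocabulary):
    """Map a text to a vector of word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

vocab = ["neural", "text", "representation", "model"]
print(bag_of_words("neural text representation with a neural model", vocab))
# -> [2, 1, 1, 1]
```

Neural representation models replace such sparse count vectors with dense, learned embeddings, which is precisely the family of models the survey categorizes.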


Bibliographic Details
Main Authors: Karlo Babić, Sanda Martinčić-Ipšić, Ana Meštrović
Format: Article
Language: English
Published: MDPI AG, 2020-10-01
Series: Information
Subjects: deep learning; embedding; neural language model; neural networks; NLP; text representation
Online Access: https://www.mdpi.com/2078-2489/11/11/511
ISSN: 2078-2489
DOI: 10.3390/info11110511
Author Affiliations: Center for Artificial Intelligence and Cybersecurity and Department of Informatics, University of Rijeka, 51000 Rijeka, Croatia (all three authors)