End-to-End Transformer-Based Models in Textual-Based NLP
Transformer architectures are highly expressive because they use self-attention mechanisms to encode long-range dependencies in the input sequences. In this paper, we present a literature review on Transformer-based (TB) models, providing a detailed overview of each model in comparison to the Transformer’s standard architecture. This survey focuses on TB models used in the field of Natural Language Processing (NLP) for textual-based tasks. We begin with an overview of the fundamental concepts at the heart of the success of these models. Then, we classify them based on their architecture and training mode. We compare the advantages and disadvantages of popular techniques in terms of architectural design and experimental value. Finally, we discuss open research directions and potential future work to help solve current TB application challenges in NLP.
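As context for the abstract's opening claim, the sketch below shows single-head scaled dot-product self-attention, the mechanism that lets every token attend directly to every other token regardless of their distance in the sequence. It is a minimal illustrative NumPy implementation, not code from the reviewed paper; the function name, weight matrices, and toy sizes are assumptions.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over X of shape (seq_len, d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # project tokens to queries, keys, values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # affinity between every pair of positions
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ V                            # each output row mixes all positions

# Toy usage (hypothetical sizes): 5 tokens, model width 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = [rng.normal(size=(8, 8)) for _ in range(3)]
print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 8)
```

Because the attention weights couple all pairs of positions in a single step, dependencies between distant tokens need no recurrence or deep convolutional stacks to propagate, which is the expressiveness the abstract refers to.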
| Main Authors: | Abir Rahali; Moulay A. Akhloufi |
|---|---|
| Author Affiliations: | Perception, Robotics, and Intelligent Machines Research Group (PRIME), Department of Computer Science, Université de Moncton, Moncton, NB E1A 3E9, Canada (both authors) |
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2023-01-01 |
| Series: | AI |
| Volume/Pages: | Vol. 4, No. 1, pp. 54–110 |
| ISSN: | 2673-2688 |
| DOI: | 10.3390/ai4010004 |
| Subjects: | Transformers; deep learning; natural language processing; transfer learning |
| Online Access: | https://www.mdpi.com/2673-2688/4/1/4 |