End-to-End Transformer-Based Models in Textual-Based NLP

Transformer architectures are highly expressive because they use self-attention mechanisms to encode long-range dependencies in the input sequences. In this paper, we present a literature review on Transformer-based (TB) models, providing a detailed overview of each model in comparison to the Transformer's standard architecture. This survey focuses on TB models used in the field of Natural Language Processing (NLP) for textual-based tasks. We begin with an overview of the fundamental concepts at the heart of the success of these models. Then, we classify them based on their architecture and training mode. We compare the advantages and disadvantages of popular techniques in terms of architectural design and experimental value. Finally, we discuss open research directions and potential future work to help solve current TB application challenges in NLP.
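The mechanism named in the abstract, self-attention mixing information across all positions of a sequence, can be sketched minimally. This is a toy single-head version in NumPy without learned projections, illustrating the general idea rather than any specific model the survey reviews:

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over token embeddings X of
    shape (seq_len, d_model). Single head, no learned Q/K/V projections:
    a sketch of the mechanism, not a production implementation."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # pairwise similarities, (seq_len, seq_len)
    # numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ X  # each output position is a mixture of ALL positions

# Because every output row attends to every input row, position 0 can
# depend on position seq_len-1 in a single step -- the direct
# "long-range dependency" path the abstract refers to.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
out = self_attention(X)
print(out.shape)  # (5, 8)
```

Note that this direct all-to-all connectivity is what distinguishes self-attention from recurrent models, where information between distant positions must pass through every intermediate step.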

Bibliographic Details

Main Authors: Abir Rahali, Moulay A. Akhloufi
Format: Article
Language: English
Published: MDPI AG, 2023-01-01
Series: AI
ISSN: 2673-2688
Subjects: Transformers; deep learning; natural language processing; transfer learning
Online Access: https://www.mdpi.com/2673-2688/4/1/4
DOI: 10.3390/ai4010004 (AI, Vol. 4, Issue 1, pp. 54–110)
Author Affiliations: Perception, Robotics, and Intelligent Machines Research Group (PRIME), Department of Computer Science, Université de Moncton, Moncton, NB E1A 3E9, Canada (both authors)