End-to-End Transformer-Based Models in Textual-Based NLP
Transformer architectures are highly expressive because they use self-attention mechanisms to encode long-range dependencies in the input sequences. In this paper, we present a literature review on Transformer-based (TB) models, providing a detailed overview of each model in comparison to the Transformer’s standard architecture. This survey focuses on TB models used in the field of Natural Language Processing (NLP) for textual-based tasks. We begin with an overview of the fundamental concepts at the heart of the success of these models. Then, we classify them based on their architecture and training mode. We compare the advantages and disadvantages of popular techniques in terms of architectural design and experimental value. Finally, we discuss open research directions and potential future work to help solve current TB application challenges in NLP.
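As context for the abstract's opening claim, the sketch below shows single-head scaled dot-product self-attention, the mechanism that lets every token attend directly to every other token regardless of their distance in the sequence. It is a minimal illustrative NumPy implementation, not code from the reviewed paper; the function name, weight matrices, and toy sizes are assumptions.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over X of shape (seq_len, d_model)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # project tokens to queries, keys, values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # affinity between every pair of positions
    scores -= scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ V                            # each output row mixes all positions

# Toy usage (hypothetical sizes): 5 tokens, model width 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
Wq, Wk, Wv = [rng.normal(size=(8, 8)) for _ in range(3)]
print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 8)
```

Because the attention weights couple all pairs of positions in a single step, dependencies between distant tokens need no recurrence or deep convolutional stacks to propagate, which is the expressiveness the abstract refers to.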
| Main Authors: | Abir Rahali; Moulay A. Akhloufi |
|---|---|
| Author Affiliations: | Perception, Robotics, and Intelligent Machines Research Group (PRIME), Department of Computer Science, Université de Moncton, Moncton, NB E1A 3E9, Canada (both authors) |
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2023-01-01 |
| Series: | AI |
| Volume/Pages: | Vol. 4, No. 1, pp. 54–110 |
| ISSN: | 2673-2688 |
| DOI: | 10.3390/ai4010004 |
| Subjects: | Transformers; deep learning; natural language processing; transfer learning |
| Online Access: | https://www.mdpi.com/2673-2688/4/1/4 |