A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning

Bibliographic Details
Main Authors: Evans Kotei, Ramkumar Thirunavukarasu
Format: Article
Language: English
Published: MDPI AG, 2023-03-01
Series: Information
Subjects: transformer network; transfer learning; pretraining; natural language processing; language models
Online Access: https://www.mdpi.com/2078-2489/14/3/187
_version_ 1797611076402020352
author Evans Kotei
Ramkumar Thirunavukarasu
author_facet Evans Kotei
Ramkumar Thirunavukarasu
author_sort Evans Kotei
collection DOAJ
description Transfer learning is a technique used in deep learning applications to transfer knowledge learned in one domain to a different target domain. The approach mainly addresses the problem of limited training data, which leads to model overfitting and degrades model performance. The study was carried out on publications retrieved from digital libraries such as SCOPUS, ScienceDirect, IEEE Xplore, ACM Digital Library, and Google Scholar, which formed the primary studies. Secondary studies were retrieved from the primary articles using the backward and forward snowballing approach. Relevant publications were then selected for review based on defined inclusion and exclusion criteria. The study focused on transfer learning with pretrained NLP models based on the deep transformer network. BERT and GPT are the two leading pretrained models, trained to capture global and local representations from large unlabeled text datasets through self-supervised learning. Pretrained transformer models offer numerous advantages to natural language processing, such as knowledge transfer to downstream tasks, which mitigates the drawbacks of training a model from scratch. This review gives a comprehensive view of the transformer architecture, self-supervised learning and pretraining concepts in language models, and their adaptation to downstream tasks. Finally, we present future directions for further improving pretrained transformer-based language models.
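A minimal sketch of the workflow the abstract describes: adapting a transformer pretrained through self-supervised learning (here BERT) to a downstream classification task instead of training from scratch. The reviewed article does not prescribe a library; the Hugging Face transformers and PyTorch calls, the bert-base-uncased checkpoint, the two-label head, and the learning rate below are illustrative assumptions, not the authors' method.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Checkpoint pretrained on large unlabeled text via self-supervised (masked-language) learning.
checkpoint = "bert-base-uncased"  # illustrative choice, not mandated by the reviewed article
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Reuse the pretrained encoder and attach a fresh head for the downstream task (2 labels here).
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# One labeled example standing in for a small downstream dataset.
inputs = tokenizer("Transfer learning avoids training from scratch.", return_tensors="pt")
labels = torch.tensor([1])

# A single fine-tuning step: the small labeled set only nudges weights that already
# encode knowledge transferred from large-scale self-supervised pretraining.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()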
first_indexed 2024-03-11T06:23:45Z
format Article
id doaj.art-1f56bc11529648ea87049bcc497958eb
institution Directory Open Access Journal
issn 2078-2489
language English
last_indexed 2024-03-11T06:23:45Z
publishDate 2023-03-01
publisher MDPI AG
record_format Article
series Information
spelling doaj.art-1f56bc11529648ea87049bcc497958eb | 2023-11-17T11:44:18Z | eng | MDPI AG | Information | 2078-2489 | 2023-03-01 | 14(3):187 | 10.3390/info14030187 | A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning | Evans Kotei; Ramkumar Thirunavukarasu (both: School of Information Technology and Engineering, Vellore Institute of Technology, Vellore 632014, India) | https://www.mdpi.com/2078-2489/14/3/187 | transformer network; transfer learning; pretraining; natural language processing; language models
spellingShingle Evans Kotei
Ramkumar Thirunavukarasu
A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning
Information
transformer network
transfer learning
pretraining
natural language processing
language models
title A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning
title_full A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning
title_fullStr A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning
title_full_unstemmed A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning
title_short A Systematic Review of Transformer-Based Pre-Trained Language Models through Self-Supervised Learning
title_sort systematic review of transformer based pre trained language models through self supervised learning
topic transformer network
transfer learning
pretraining
natural language processing
language models
url https://www.mdpi.com/2078-2489/14/3/187
work_keys_str_mv AT evanskotei asystematicreviewoftransformerbasedpretrainedlanguagemodelsthroughselfsupervisedlearning
AT ramkumarthirunavukarasu asystematicreviewoftransformerbasedpretrainedlanguagemodelsthroughselfsupervisedlearning
AT evanskotei systematicreviewoftransformerbasedpretrainedlanguagemodelsthroughselfsupervisedlearning
AT ramkumarthirunavukarasu systematicreviewoftransformerbasedpretrainedlanguagemodelsthroughselfsupervisedlearning