Improving text classification with transformers and layer normalization

More than 25,000 injuries and 25 fatalities occur each year due to unstable furniture tip-over incidents. Classifying these furniture tip-over incidents is an essential task for understanding incident patterns and building safer products. For example, this classification can help standards developme...

Bibliographic Details
Main Authors: Ben Rodrawangpai, Witawat Daungjaiboon
Format: Article
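Language: English
Published: Elsevier 2022-12-01
Series: Machine Learning with Applications
Subjects: Computational modeling; Machine learning; Natural language processing; Pattern classification; Semantics; Text analysis
Online Access: http://www.sciencedirect.com/science/article/pii/S2666827022000792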
_version_ 1811296477433036800
author Ben Rodrawangpai
Witawat Daungjaiboon
author_facet Ben Rodrawangpai
Witawat Daungjaiboon
author_sort Ben Rodrawangpai
collection DOAJ
description More than 25,000 injuries and 25 fatalities occur each year due to unstable furniture tip-over incidents. Classifying these furniture tip-over incidents is an essential task for understanding incident patterns and building safer products. For example, this classification can help standards development organizations (SDOs) and policy makers discover hidden insights, which can be used to develop standards and regulations that help improve furniture and make homes safer. Since 2000, the U.S. Consumer Product Safety Commission (CPSC) has published data related to consumer product injuries. The amount of data has grown rapidly, and the process of manually reviewing and classifying individual incidents has correspondingly become very resource intensive. This paper proposes an improved method that employs a combination of natural language processing (NLP) techniques and machine learning (ML) algorithms to classify textual data. Machine learning models can help reduce time and effort by streamlining incident narrative classification for determining whether incidents are related to furniture tip-overs. Challenges often presented by real-world data sets (such as the CPSC data used in our experiment) include imbalanced target classes and narratives requiring domain knowledge, since the data sets contain abbreviations and jargon. Using out-of-the-box, default classification models such as bidirectional encoder representations from transformers (BERT) might not yield adequate results. Our proposed method adds layer normalization and dropout layers to a transformer-based language model, which achieves better classification results than using a transformer-based language model alone with imbalanced classes. We carefully measure the impact of hidden layers in order to fine-tune the model.
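The description says layer normalization and dropout layers are added to a transformer-based language model, but this record does not say where they sit or how they are configured. The sketch below is only an illustration under stated assumptions: the two operations are applied to a pooled encoder output just before a linear two-class classifier, and they are written in plain Python rather than with any deep learning library. The names `layer_norm`, `dropout`, `classify`, `pooled`, `weights`, and `bias`, along with all numeric values, are hypothetical and not taken from the paper.

```python
import math
import random


def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance, then scale and shift."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    gamma = gamma or [1.0] * n  # learnable scale in a real model; fixed here
    beta = beta or [0.0] * n    # learnable shift in a real model; fixed here
    return [g * (v - mean) / math.sqrt(var + eps) + b
            for v, g, b in zip(x, gamma, beta)]


def dropout(x, p, training, rng):
    """Inverted dropout: zero each feature with probability p and rescale the survivors."""
    if not training or p == 0.0:
        return list(x)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def classify(pooled, weights, bias, p=0.1, training=False, rng=None):
    """Assumed head layout: LayerNorm -> Dropout -> Linear -> softmax."""
    rng = rng or random.Random(0)
    hidden = dropout(layer_norm(pooled), p, training, rng)
    logits = [sum(w * v for w, v in zip(row, hidden)) + b
              for row, b in zip(weights, bias)]
    top = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical stand-in for an encoder's pooled output (4 features, 2 classes).
pooled = [0.5, -1.2, 3.0, 0.7]
weights = [[0.2, -0.1, 0.4, 0.0], [-0.3, 0.1, -0.2, 0.5]]
bias = [0.0, 0.1]
probs = classify(pooled, weights, bias)
```

With `training=False` dropout is a no-op, so `probs` depends only on the normalized features; with `training=True` a random subset of features is zeroed on each call, which is the regularizing effect the added dropout layers are meant to provide.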
first_indexed 2024-04-13T05:49:08Z
format Article
id doaj.art-07c099c5cfa0438882e00e03837ba837
institution Directory Open Access Journal
issn 2666-8270
language English
last_indexed 2024-04-13T05:49:08Z
publishDate 2022-12-01
publisher Elsevier
record_format Article
series Machine Learning with Applications
spelling doaj.art-07c099c5cfa0438882e00e03837ba837 | 2022-12-22T02:59:50Z | eng | Elsevier | Machine Learning with Applications | 2666-8270 | 2022-12-01 | volume 10, article 100403 | Improving text classification with transformers and layer normalization | Ben Rodrawangpai; Witawat Daungjaiboon | Underwriters Laboratories, 333 Pfingsten Road, Northbrook, IL 60062, United States (both authors; one is marked as corresponding author) | http://www.sciencedirect.com/science/article/pii/S2666827022000792 | Computational modeling; Machine learning; Natural language processing; Pattern classification; Semantics; Text analysis
title Improving text classification with transformers and layer normalization
topic Computational modeling
Machine learning
Natural language processing
Pattern classification
Semantics
Text analysis
url http://www.sciencedirect.com/science/article/pii/S2666827022000792