Improving text classification with transformers and layer normalization
More than 25,000 injuries and 25 fatalities occur each year due to unstable furniture tip-over incidents. Classifying these furniture tip-over incidents is an essential task for understanding incident patterns and building safer products. For example, this classification can help standards developme...
Main Authors: | Ben Rodrawangpai, Witawat Daungjaiboon |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2022-12-01 |
Series: | Machine Learning with Applications |
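Subjects: | Computational modeling; Machine learning; Natural language processing; Pattern classification; Semantics; Text analysis |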
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666827022000792 |
_version_ | 1811296477433036800 |
---|---|
author | Ben Rodrawangpai; Witawat Daungjaiboon |
author_facet | Ben Rodrawangpai; Witawat Daungjaiboon |
author_sort | Ben Rodrawangpai |
collection | DOAJ |
description | More than 25,000 injuries and 25 fatalities occur each year due to unstable furniture tip-over incidents. Classifying these furniture tip-over incidents is an essential task for understanding incident patterns and building safer products. For example, this classification can help standards development organizations (SDOs) and policy makers discover hidden insights, which can be used to develop standards and regulations that help improve furniture and make homes safer. Since 2000, the U.S. Consumer Product Safety Commission (CPSC) has published data related to consumer product injuries. The amount of data has grown rapidly, and the process of manually reviewing and classifying individual incidents has correspondingly become very resource intensive. This paper proposes an improved method that employs a combination of natural language processing (NLP) techniques and machine learning (ML) algorithms to classify textual data. Machine learning models can help reduce time and effort by streamlining incident narrative classification for determining whether incidents are related to furniture tip-overs. Challenges often presented by real-world data sets (such as the CPSC data used in our experiment) include imbalanced target classes and narratives requiring domain knowledge, since the data sets contain abbreviations and jargon. Using out-of-the-box, default classification models such as bidirectional encoder representations from transformers (BERT) might not yield adequate results. Our proposed method adds layer normalization and dropout layers to a transformer-based language model, which achieves better classification results than using a transformer-based language model alone with imbalanced classes. We carefully measure the impact of hidden layers in order to fine-tune the model. |
first_indexed | 2024-04-13T05:49:08Z |
format | Article |
id | doaj.art-07c099c5cfa0438882e00e03837ba837 |
institution | Directory of Open Access Journals |
issn | 2666-8270 |
language | English |
last_indexed | 2024-04-13T05:49:08Z |
publishDate | 2022-12-01 |
publisher | Elsevier |
record_format | Article |
series | Machine Learning with Applications |
spelling | doaj.art-07c099c5cfa0438882e00e03837ba837 2022-12-22T02:59:50Z eng Elsevier Machine Learning with Applications 2666-8270 2022-12-01 10 100403 Improving text classification with transformers and layer normalization Ben Rodrawangpai 0 Witawat Daungjaiboon 1 Underwriters Laboratories, 333 Pfingsten Road, Northbrook, IL 60062, United States Corresponding author.; Underwriters Laboratories, 333 Pfingsten Road, Northbrook, IL 60062, United States More than 25,000 injuries and 25 fatalities occur each year due to unstable furniture tip-over incidents. Classifying these furniture tip-over incidents is an essential task for understanding incident patterns and building safer products. For example, this classification can help standards development organizations (SDOs) and policy makers discover hidden insights, which can be used to develop standards and regulations that help improve furniture and make homes safer. Since 2000, the U.S. Consumer Product Safety Commission (CPSC) has published data related to consumer product injuries. The amount of data has grown rapidly, and the process of manually reviewing and classifying individual incidents has correspondingly become very resource intensive. This paper proposes an improved method that employs a combination of natural language processing (NLP) techniques and machine learning (ML) algorithms to classify textual data. Machine learning models can help reduce time and effort by streamlining incident narrative classification for determining whether incidents are related to furniture tip-overs. Challenges often presented by real-world data sets (such as the CPSC data used in our experiment) include imbalanced target classes and narratives requiring domain knowledge, since the data sets contain abbreviations and jargon. Using out-of-the-box, default classification models such as bidirectional encoder representations from transformers (BERT) might not yield adequate results. Our proposed method adds layer normalization and dropout layers to a transformer-based language model, which achieves better classification results than using a transformer-based language model alone with imbalanced classes. We carefully measure the impact of hidden layers in order to fine-tune the model. http://www.sciencedirect.com/science/article/pii/S2666827022000792 Computational modeling; Machine learning; Natural language processing; Pattern classification; Semantics; Text analysis |
title | Improving text classification with transformers and layer normalization |
topic | Computational modeling; Machine learning; Natural language processing; Pattern classification; Semantics; Text analysis |
url | http://www.sciencedirect.com/science/article/pii/S2666827022000792 |
work_keys_str_mv | AT benrodrawangpai improvingtextclassificationwithtransformersandlayernormalization AT witawatdaungjaiboon improvingtextclassificationwithtransformersandlayernormalization |
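
The description above says the proposed method adds layer normalization and dropout layers to a transformer-based language model. The sketch below illustrates that general idea in plain Python on a single pooled sentence vector, which stands in for the encoder output (for example, a BERT sentence representation). This record does not give the paper's layer order, hidden sizes, or dropout rate, so everything here is an assumption made for illustration: the function names, the `demo_params` values, and the order layer normalization, then dropout, then a linear classifier. It is not the authors' implementation.

```python
import math
import random


def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance, then scale and shift."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]


def dropout(x, p, training, rng):
    """Inverted dropout: zero each feature with probability p, during training only."""
    if not training or p == 0.0:
        return list(x)
    keep = 1.0 - p
    # Surviving features are rescaled so the expected activation is unchanged.
    return [v / keep if rng.random() < keep else 0.0 for v in x]


def linear(x, weights, bias):
    """Dense layer: one output per row of weights."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(pooled, params, training=False, rng=None):
    """Hypothetical head: layer norm -> dropout -> linear -> softmax."""
    rng = rng or random.Random(0)
    h = layer_norm(pooled, params["gamma"], params["beta"])
    h = dropout(h, params["p_drop"], training, rng)
    return softmax(linear(h, params["W"], params["b"]))


# Illustrative values only: a 4-dimensional "pooled" vector and two classes
# (tip-over related vs. not). Real encoders produce much larger vectors.
demo_pooled = [0.5, -1.2, 3.0, 0.1]
demo_params = {
    "gamma": [1.0, 1.0, 1.0, 1.0],
    "beta": [0.0, 0.0, 0.0, 0.0],
    "p_drop": 0.5,  # assumed rate, not taken from the paper
    "W": [[0.2, -0.1, 0.4, 0.0], [-0.3, 0.5, -0.2, 0.1]],
    "b": [0.0, 0.0],
}

print(classify(demo_pooled, demo_params))
```

In practice the same head would be built from a deep learning framework's layer normalization, dropout, and linear modules and trained together with the encoder. The plain-Python form is used here only so that each step is visible.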