Developing an Advanced Software Requirements Classification Model Using BERT: An Empirical Evaluation Study on Newly Generated Turkish Data

Requirements Engineering (RE) is an important step in the software development lifecycle. A central problem in RE is classifying software requirements as functional (FR) or non-functional (NFR). Proper and early identification of these requirements is vital for the entire developm...


Bibliographic Details
Main Author: Fatih Yucalar
Format: Article
Language: English
Published: MDPI AG 2023-10-01
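Series: Applied Sciences
Subjects: software requirements classification; transformer learning; deep neural networks; machine learning; functional requirements; non-functional requirements
Online Access: https://www.mdpi.com/2076-3417/13/20/11127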
author Fatih Yucalar
collection DOAJ
description Requirements Engineering (RE) is an important step in the software development lifecycle. A central problem in RE is classifying software requirements as functional (FR) or non-functional (NFR). Proper and early identification of these requirements is vital for the entire development cycle. However, manual classification is time-consuming and needs to be automated. Machine learning (ML) approaches are commonly applied to address this problem. In this study, twenty ML algorithms, including Naïve Bayes, Rotation Forests, Convolutional Neural Networks, and transformers such as BERT, were used to predict FR and NFR. Any ML algorithm requires a dataset for training. To this end, we generated a unique Turkish dataset of 4600 samples by collecting requirements from real-world software projects. The dataset was used to assess the performance of three groups of ML algorithms in terms of F-score and related statistical metrics. Of the 20 ML algorithms, BERTurk was the most successful at discriminating FR from NFR, achieving an F-score of 95%. For the FR and NFR identification problem, transformer algorithms show significantly better performance.
first_indexed 2024-03-10T21:29:05Z
format Article
id doaj.art-1d58fd96ab6049a3bd2c69026e74dd01
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T21:29:05Z
publishDate 2023-10-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-1d58fd96ab6049a3bd2c69026e74dd01; 2023-11-19T15:28:54Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-10-01; 13(20), 11127; DOI 10.3390/app132011127; Fatih Yucalar, Department of Software Engineering, Manisa Celal Bayar University, Manisa 45400, Turkey; https://www.mdpi.com/2076-3417/13/20/11127
title Developing an Advanced Software Requirements Classification Model Using BERT: An Empirical Evaluation Study on Newly Generated Turkish Data
topic software requirements classification
transformer learning
deep neural networks
machine learning
functional requirements
non-functional requirements
url https://www.mdpi.com/2076-3417/13/20/11127