Review of Existing Datasets Used for Software Effort Estimation

The Software Effort Estimation (SEE) tool calculates an estimate of the amount of work needed to complete a project successfully. Managers usually want to know in advance how demanding a new project will be so that they can allocate their limited resources appropriately. In fact, it is common...

Bibliographic Details
Main Authors: Mizanur, Rahman; Gonçalves, Teresa; Sarwar, Hasan
Format: Article
Language: English
Published: The Science and Information Organization 2023
Subjects: QA75 Electronic computers. Computer science
Online Access: http://umpir.ump.edu.my/id/eprint/39640/1/Review%20of%20Existing%20Datasets%20Used%20for%20Software%20Effort%20Estimation.pdf
_version_ 1796996098294808576
author Mizanur, Rahman
Gonçalves, Teresa
Sarwar, Hasan
collection UMP
description The Software Effort Estimation (SEE) tool calculates an estimate of the amount of work needed to complete a project successfully. Managers usually want to know in advance how demanding a new project will be so that they can allocate their limited resources appropriately. In fact, it is common to use effort datasets to train a model that predicts how much work a project will take. Training a good estimator requires enough data, but most data owners are reluctant to share effort data from their closed-source projects because of privacy concerns, so only a small amount of effort data is available. The purpose of this research was to evaluate the quality of 15 datasets that have been widely used in studies of software project estimation. The analysis shows that most of the selected studies use artificial neural networks (ANN) as ML models, NASA datasets, and the mean magnitude of relative error (MMRE) as the accuracy measure. More often than not, ANN and support vector machine (SVM) models outperformed other ML techniques.
first_indexed 2024-03-06T13:11:57Z
format Article
id UMPir39640
institution Universiti Malaysia Pahang
language English
last_indexed 2024-03-06T13:11:57Z
publishDate 2023
publisher The Science and Information Organization
record_format dspace
spelling UMPir39640 2023-12-13T07:28:31Z http://umpir.ump.edu.my/id/eprint/39640/ Article PeerReviewed pdf en cc_by_4 Mizanur, Rahman and Gonçalves, Teresa and Sarwar, Hasan (2023) Review of Existing Datasets Used for Software Effort Estimation. International Journal of Advanced Computer Science and Applications, 14 (7). ISSN 2158-107X (Print); 2156-5570 (Online). (Published) https://dx.doi.org/10.14569/IJACSA.2023.01407100
title Review of Existing Datasets Used for Software Effort Estimation
topic QA75 Electronic computers. Computer science
url http://umpir.ump.edu.my/id/eprint/39640/1/Review%20of%20Existing%20Datasets%20Used%20for%20Software%20Effort%20Estimation.pdf