Review of Existing Datasets Used for Software Effort Estimation

Bibliographic Details
Main Authors: Mizanur, Rahman; Gonçalves, Teresa; Sarwar, Hasan
Format: Article
Language: English
Published: The Science and Information Organization 2023
Online Access: http://umpir.ump.edu.my/id/eprint/39640/1/Review%20of%20Existing%20Datasets%20Used%20for%20Software%20Effort%20Estimation.pdf
Description
Summary: A Software Effort Estimation (SEE) tool calculates an estimate of the amount of work needed to complete a project successfully. Managers usually want to know in advance how demanding a new project will be so that they can allocate their limited resources fairly. It is common to use effort datasets to train a model that predicts how much work a project will take. Training a good estimator requires enough data, but most data owners are unwilling to share effort data from their closed-source projects because of privacy concerns, so only a small amount of effort data is available. The purpose of this research was to evaluate the quality of 15 datasets that have been widely used in studies of software project estimation. The analysis shows that most of the selected studies use artificial neural networks (ANN) as the machine learning (ML) model, NASA datasets as the data source, and the mean magnitude of relative error (MMRE) as the measure of accuracy. ANN and support vector machines (SVM) outperformed other ML techniques more often than not.
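
For readers unfamiliar with the accuracy measure named in the summary, the following is a minimal sketch of MMRE in its commonly used form: the average over all projects of |actual - predicted| / actual. The function name `mmre` and the effort values are hypothetical and chosen only for illustration; they are not taken from the article or from any of the datasets it reviews.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error over paired effort values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    # Magnitude of relative error for one project: |actual - predicted| / actual
    # (assumes every actual effort value is non-zero).
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical effort values (e.g. person-months), for illustration only.
actual_effort = [100.0, 200.0, 50.0]
predicted_effort = [90.0, 250.0, 50.0]
score = mmre(actual_effort, predicted_effort)
print(f"MMRE = {score:.3f}")
```

Lower values indicate estimates that are closer to the actual effort; a value of 0 would mean every prediction matched exactly.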