Multifunctional optimized group method data handling for software effort estimation


Bibliographic Details
Main Author: Arbain, Siti Hajar
Format: Thesis
Language: English
Published: 2022
Subjects:
Online Access: http://eprints.utm.my/101491/1/SitiHajarArbainPSC2022.pdf.pdf
Description
Summary: Accurate software effort estimation is in growing demand. Stakeholders need effective and efficient software development processes supported by estimates that remain accurate across all data types, yet finding an effort estimation model with good accuracy for this purpose is difficult. Group Method of Data Handling (GMDH) algorithms have been widely used for modelling and identifying complex systems and can potentially be applied to software effort estimation. However, few studies have addressed how to determine the best architecture and the optimal weight coefficients of the transfer function for a GMDH model. This study proposes a hybrid multifunctional GMDH with Artificial Bee Colony (GMDH-ABC) based on a combination of four individual GMDH models, namely GMDH-Polynomial, GMDH-Sigmoid, GMDH-Radial Basis Function, and GMDH-Tangent. The best GMDH architecture is determined using an L9 Taguchi orthogonal array. Five datasets (Cocomo, Desharnais, Albrecht, Kemerer, and ISBSG) were used to validate the proposed models. Missing values in the datasets are imputed with the developed MissForest Multiple Imputation (MFMI) method. The Mean Absolute Percentage Error (MAPE) was used as the performance measure. The results showed that the GMDH-ABC model outperformed the individual conventional GMDH models and the benchmark ANN model on all datasets, with a reported improvement of more than 50%. On the Cocomo dataset, accuracy improved by 49% compared with the conventional GMDH-LSM. Improvements in accuracy of 71%, 63%, 67%, and 82% were obtained for the Desharnais, Albrecht, Kemerer, and ISBSG datasets, respectively, compared with the conventional GMDH-LSM. These results indicate that the proposed GMDH-ABC model can achieve higher accuracy in software effort estimation.
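The summary reports results as MAPE and as percentage improvements over a baseline model. As a minimal sketch of how such figures are commonly computed, the Python below implements the standard MAPE definition and a relative-improvement calculation. The function names, the assumption that "improvement" means the relative reduction in MAPE, and the sample effort values are illustrative only and are not taken from the thesis.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, expressed in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_improvement(baseline_mape, proposed_mape):
    """Percentage reduction in MAPE relative to a baseline model.

    Assumption: this is one common way to express "improvement";
    the thesis summary does not state its exact formula.
    """
    if baseline_mape == 0:
        raise ValueError("baseline MAPE must be non-zero")
    return 100.0 * (baseline_mape - proposed_mape) / baseline_mape


# Hypothetical effort values (e.g. person-hours), not from any real dataset.
actual_effort = [100.0, 200.0, 400.0]
baseline_pred = [150.0, 100.0, 600.0]   # stand-in for a baseline model
proposed_pred = [110.0, 180.0, 440.0]   # stand-in for a proposed model

baseline_score = mape(actual_effort, baseline_pred)
proposed_score = mape(actual_effort, proposed_pred)
improvement = relative_improvement(baseline_score, proposed_score)

print(f"baseline MAPE: {baseline_score:.1f}%")
print(f"proposed MAPE: {proposed_score:.1f}%")
print(f"relative improvement: {improvement:.1f}%")
```

A lower MAPE indicates a more accurate model, so a positive relative improvement means the proposed model's error is smaller than the baseline's.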