Summary: | Spectrum-Based Fault Localization (SBFL) uses metrics called risk evaluation formulae to guide debugging and pinpoint faults. The accuracy of a given SBFL method may be limited by the formulae and program spectra it uses. However, it has recently been demonstrated that Genetic Programming (GP) can be used to design formulae automatically, directly from the program spectra. This article therefore presents a Genetic Programming approach, with the inclusion of radicals, for evolving risk evaluation formulae (suspiciousness metrics) directly from the program spectra. 92 faults from Unix utilities in the SIR repository and 357 real faults from the Defects4J repository were used. The approach combines these data sets into a hybrid data set, using 25% of the total faults (113) to evolve the formulae and the remaining 75% (336) to validate the effectiveness of the generated metrics. Genetic Programming is then run for 30 evolution runs to produce 30 different metrics. The GP-generated metrics consistently outperformed all the classic formulae on both single and multiple faults; in particular, they outperformed OP2 by an average of 2.25% on single faults and 3.42% on multiple faults. The experimental results show that combining the hybrid data set with radicals is an effective technique for evolving formulae for spectrum-based fault localization.
|
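To make the notion of a risk evaluation formula concrete, the sketch below scores statements from their program spectra using two classic formulae from the SBFL literature: Op2, and Ochiai, which already contains a radical. It is only an illustration under stated assumptions. The spectrum counts, the statement names, and the helper `rank` are hypothetical, and the GP-evolved formulae from the article are not reproduced here because the summary does not give them.

```python
import math

# A program spectrum for one statement is assumed to be a tuple
# (ef, ep, nf, np_):
#   ef  = failing tests that execute the statement
#   ep  = passing tests that execute the statement
#   nf  = failing tests that do not execute it
#   np_ = passing tests that do not execute it

def op2(ef, ep, nf, np_):
    """Classic Op2 formula: ef - ep / (total passing tests + 1)."""
    return ef - ep / (ep + np_ + 1)

def ochiai(ef, ep, nf, np_):
    """Classic Ochiai formula, an example of a metric with a radical."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def rank(spectra, formula):
    """Hypothetical helper: order statements from most to least suspicious."""
    scores = {stmt: formula(*counts) for stmt, counts in spectra.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Made-up spectra for three statements (2 failing and 4 passing tests).
spectra = {
    "s1": (1, 4, 1, 0),
    "s2": (2, 1, 0, 3),
    "s3": (0, 3, 2, 1),
}

if __name__ == "__main__":
    print("Op2 ranking:   ", rank(spectra, op2))
    print("Ochiai ranking:", rank(spectra, ochiai))
```

A GP approach of the kind summarized above would search over expressions built from the same four counts and a set of operators (including a square root), keeping the expressions that rank faulty statements highest on the training faults.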