Keywords
|
Software defect prediction, Feature selection, Binary gray wolf optimizer, Machine learning, Module classification
|
Abstract
|
Context
Software defect prediction identifies defect-prone modules before testing begins, which reduces testing cost and time. Machine learning methods can give developers useful models for classifying faulty software modules.
Problem
An inherent problem in this classification is the large number of features in the training dataset, which reduces the accuracy and precision of the results. Selecting the most effective features of the training dataset is an NP-hard problem that can be addressed with heuristic algorithms.
Method
In this study, a binary version of the gray wolf optimizer (bGWO) was developed to select the most effective features of the training dataset. Selecting the features that most influence classification can increase the precision and accuracy of software module classifiers.
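As a rough illustration of the idea, the following is a minimal sketch of a binary gray wolf optimizer for feature selection. It is not the implementation used in this study. The sigmoid transfer function, the population size, the iteration count, and the names `bgwo_select`, `toy_fitness`, and `TARGET` are all assumptions made for the example. In practice, the fitness function would typically score a classifier trained on the selected features.

```python
import math
import random


def sigmoid(x):
    """Squash a continuous position into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def bgwo_select(fitness, n_features, n_wolves=8, n_iters=30, seed=0):
    """Sketch of a binary GWO: return (best_mask, best_score), maximizing fitness.

    Each wolf is a 0/1 mask over features. The sigmoid binarization used here
    is one common choice and is an assumption, not the study's exact scheme.
    """
    rng = random.Random(seed)
    wolves = [[rng.randint(0, 1) for _ in range(n_features)]
              for _ in range(n_wolves)]
    best_mask, best_score = None, float("-inf")

    for t in range(n_iters):
        # The three fittest wolves (alpha, beta, delta) lead the pack.
        ranked = sorted(wolves, key=fitness, reverse=True)
        alpha, beta, delta = ranked[0], ranked[1], ranked[2]
        alpha_score = fitness(alpha)
        if alpha_score > best_score:
            best_mask, best_score = alpha[:], alpha_score

        a = 2.0 * (1.0 - t / n_iters)  # decreases linearly from 2 toward 0
        new_wolves = []
        for wolf in wolves:
            new_wolf = []
            for d in range(n_features):
                # Average the continuous moves toward the three leaders.
                x = 0.0
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - wolf[d])
                    x += leader[d] - A * D
                x /= 3.0
                # Binarize: keep the feature with probability sigmoid(...).
                prob = sigmoid(10.0 * (x - 0.5))
                new_wolf.append(1 if rng.random() < prob else 0)
            new_wolves.append(new_wolf)
        wolves = new_wolves

    # Score the final population as well.
    for wolf in wolves:
        score = fitness(wolf)
        if score > best_score:
            best_mask, best_score = wolf[:], score
    return best_mask, best_score


# Hypothetical toy objective: reward three "useful" features, penalize extras.
TARGET = {0, 3, 4}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in TARGET)
    extras = sum(1 for i, bit in enumerate(mask) if bit and i not in TARGET)
    return hits - 0.5 * extras


mask, score = bgwo_select(toy_fitness, n_features=8)
```

The returned `mask` marks which features to keep. Because the search is stochastic, the fixed `seed` only makes this sketch reproducible and does not guarantee that the best possible subset is found.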
Contribution
The main contributions of this study are a binary version of the gray wolf optimization algorithm that selects the most effective features and an effective defect predictor. To evaluate the proposed method, five standard real-world datasets were used for training and testing the classifier.
Results
The results indicate that, among the 21 features of the training datasets, basic complexity, sum of operators and operands, lines of code, number of lines containing code and comments, and sum of operands have the greatest effect on predicting software defects. Combining the bGWO method with machine learning algorithms considerably increased the accuracy, precision, recall, and F1 scores.
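For reference, the four criteria named above can be computed from a classifier's binary predictions as in the sketch below. The function name `classification_metrics` and the sample labels are hypothetical and are not data from this study. Label 1 is assumed to mean a defective module.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (1 = defective)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for eight modules, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Defect datasets are often imbalanced, so precision, recall, and F1 are usually more informative than accuracy alone.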
|