Cross-Project Pooling of Defects for Handling Class Imbalance


J. M. Catherine, S. Djodilatchoumy


Vol. 22  No. 10  pp. 11-16


Applying predictive analytics to software defect prediction has improved overall software quality and reduced maintenance costs. Many supervised and unsupervised learning algorithms have been used for defect prediction on publicly available datasets, most of which suffer from an imbalance in the output classes. We study the impact of this class imbalance on the predictive performance of defect prediction models and propose a Cross-Project Pooling (CPP) method for handling it. The methods are evaluated using the Matthews Correlation Coefficient (MCC), Recall, and Accuracy. The proposed sampling technique significantly improves the classifier's performance in predicting defects.


Software Defect, Random Forest, Matthews Correlation Coefficient, Accuracy, Dataset Imbalance Handling
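
The evaluation measures named in the abstract (MCC, Recall, Accuracy) can be computed from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function names and the 90/10 label split are hypothetical and chosen only to illustrate why Accuracy alone can be misleading on an imbalanced defect dataset.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = defective."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

def recall(tp, tn, fp, fn):
    positives = tp + fn
    return tp / positives if positives else 0.0

def mcc(tp, tn, fp, fn):
    # Matthews Correlation Coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Made-up illustration: 10 defective and 90 clean modules,
# scored by a trivial classifier that labels every module clean.
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100
counts = confusion_counts(y_true, y_pred)

print("accuracy:", accuracy(*counts))
print("recall:  ", recall(*counts))
print("mcc:     ", mcc(*counts))
```

On this made-up data the trivial classifier scores high Accuracy while finding no defects at all, which Recall and MCC both expose.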