Cross-Project Defect Prediction Using Random Forest
Abstract
This study develops a software defect prediction model using the Random Forest algorithm with a many-to-one Cross-Project Defect Prediction (CPDP) approach. The model is trained on the AEEEM dataset and tested on the PROMISE dataset as the target. Both datasets comprise multiple software projects described by features such as code size, code churn, complexity, and other metrics that help predict software defects. Performance is evaluated with Accuracy, AUC, Recall, Precision, and F1-Score to measure the model’s ability to detect defects across different datasets. The results show that the Random Forest model performs very well, with accuracy above 94% and AUC greater than 0.92 on most datasets. The model also balances Recall and Precision effectively, yielding more accurate and reliable predictions. Furthermore, the study applies ensemble techniques such as stacking and voting to combine predictions from multiple models, which significantly improves the stability and accuracy of the predictions. With this approach, the study demonstrates that using the Random Forest algorithm in CPDP can improve the accuracy and efficiency of cross-project software defect prediction.
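As a rough illustration of the many-to-one setup described in the abstract, the sketch below pools several source projects into one training set, fits a Random Forest, and reports the same five metrics on a single held-out target project. It is not the authors' pipeline: the data are synthetic stand-ins rather than AEEEM or PROMISE, the helper names `make_project` and `many_to_one_cpdp` are hypothetical, the hyperparameters are arbitrary, and the sketch assumes that source and target projects already share the same feature columns.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)


def make_project(n_modules, shift, rng):
    """Build one synthetic 'project': 5 metric columns plus a defect label.

    Hypothetical stand-in for a real project; `shift` moves the feature
    distribution so that projects differ from one another.
    """
    X = rng.normal(loc=shift, scale=1.0, size=(n_modules, 5))
    # Assumed rule: modules scoring high on the first two metrics are defect-prone.
    score = X[:, 0] + X[:, 1] - 2 * shift
    y = (score + rng.normal(scale=0.5, size=n_modules) > 1.0).astype(int)
    return X, y


def many_to_one_cpdp(sources, target, seed=0):
    """Pool all source projects for training, then evaluate on the target."""
    X_train = np.vstack([X for X, _ in sources])
    y_train = np.concatenate([y for _, y in sources])
    X_test, y_test = target

    # Hyperparameters are illustrative, not taken from the study.
    model = RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=seed
    )
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)[:, 1]  # probability of the defect class
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_prob),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "f1": f1_score(y_test, y_pred, zero_division=0),
    }


rng = np.random.default_rng(42)
sources = [make_project(300, shift, rng) for shift in (0.0, 0.2, -0.2)]
target = make_project(200, 0.1, rng)
metrics = many_to_one_cpdp(sources, target)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC is computed from predicted probabilities rather than hard labels, so it reflects ranking quality independently of the 0.5 decision threshold used for the other four metrics.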
Published
2025-12-22
How to Cite
DARUSMAN, Darusman; SUBEKTI, Agus.
Cross Project Defect Prediction Menggunakan Random Forest.
INFORMATICS FOR EDUCATORS AND PROFESSIONAL : Journal of Informatics, [S.l.], v. 10, n. 2, p. 1 - 10, dec. 2025.
ISSN 2548-3412.
Available at: <https://ejournal-binainsani.ac.id/index.php/ITBI/article/view/3706>. Date accessed: 05 july 2026.
doi: https://doi.org/10.51211/itbi.v10i2.3706.








