Using Cross-Entropy Value of Code for Better Defect Prediction

Volume 14, Number 9, September 2018, pp. 2105-2115
DOI: 10.23940/ijpe.18.09.p19.21052115

Xian Zhang, Kerong Ben, and Jie Zeng

Department of Computer and Data Engineering, Naval University of Engineering, Wuhan, 430033, China

(Submitted on May 15, 2018; Revised on July 22, 2018; Accepted on August 11, 2018)

Abstract:

Defect prediction is valuable because it assists software inspection by predicting the locations of defective code, which in turn improves software reliability. Many software features have been designed to help defect prediction models identify potential bugs, but no single feature set yet performs well in most cases. To improve defect prediction, this paper proposes a new code feature, the cross-entropy value of the sequence of the code’s abstract syntax tree nodes (CE-AST), and develops a neural language model to measure it. To evaluate the effectiveness of CE-AST, we first investigate how well it discriminates defect-proneness. Experiments on 12 Java projects show that CE-AST is more discriminative than 45% of twenty widely used traditional features. We then investigate CE-AST’s contribution to defect prediction. When combined with different traditional feature suites as input to prediction models, CE-AST yields average performance improvements of 4.7% in Precision, 2.5% in Recall, and 3.5% in F1.
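To make the idea of a cross-entropy value over an AST node sequence concrete, the following is a minimal sketch and not the authors' implementation. The paper measures CE-AST with a neural language model; here an add-one-smoothed bigram model stands in for it purely for illustration. The function names, the `<s>` start marker, and the toy node sequences are all assumptions. The quantity computed is the average negative log2-probability per node, so a sequence the model finds unusual receives a higher value.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count contexts and bigrams over AST-node sequences (hypothetical stand-in model)."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for seq in sequences:
        tokens = ["<s>"] + list(seq)  # "<s>" is an assumed start-of-sequence marker
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    # One extra vocabulary slot reserves probability mass for unseen node types.
    return contexts, bigrams, len(vocab) + 1

def cross_entropy(seq, model):
    """Average negative log2-probability per node under add-one smoothing."""
    if not seq:
        raise ValueError("sequence must contain at least one node")
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + list(seq)
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)
        total -= math.log2(p)
    return total / len(seq)

# Toy training corpus of AST node-type sequences (illustrative data only).
corpus = (
    [["MethodDeclaration", "IfStatement", "ReturnStatement"]] * 3
    + [["MethodDeclaration", "ForStatement", "MethodInvocation", "ReturnStatement"]] * 2
)
model = train_bigram(corpus)

typical = ["MethodDeclaration", "IfStatement", "ReturnStatement"]
unusual = ["ReturnStatement", "ForStatement", "IfStatement"]

print("typical:", cross_entropy(typical, model))
print("unusual:", cross_entropy(unusual, model))
```

In this sketch the sequence that follows patterns seen in training scores lower than the shuffled one. That per-file value is the kind of quantity the abstract describes using as an extra feature alongside traditional metrics.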

 


                               
