Int J Performability Eng, 2018, Vol. 14, Issue (8): 1695-1704. doi: 10.23940/ijpe.18.08.p5.16951704

• Original articles •

Data Complexity Analysis for Software Defect Detection

Ying Ma (a, b, *), Yichang Li (a), Junwen Lu (a), Peng Sun (c), Yu Sun (d), and Xiatian Zhu (e)

  (a) Xiamen University of Technology, Xiamen, 361024, China
  (b) Engineering Research Center for Software Testing and Evaluation of Fujian Province, Xiamen, 361024, China
  (c) University of Electronic Science and Technology of China, Chengdu, 610054, China
  (d) Xiamen Institute of Software Technology, Xiamen, 361000, China
  (e) Queen Mary, University of London, London, E1 4NS, UK

Abstract:

Most research on software defect detection assumes that the training data and future test data share the same feature space and the same distribution. In practical applications, however, data sets come from different domains and follow different distributions. Local data in the target project are sometimes limited, and the data are often affected by noise. In these cases, the performance of a software defect detection model is uncertain. To address this, we first introduce the concept of data complexity from the data mining field into software engineering. Second, we examine data complexity measures on public software data sets to determine which complexity metrics are appropriate for defect detection. Finally, we analyze the relationship between the complexity metrics and model performance to gain insight into how data complexity affects defect detection. We are optimistic that our method can provide decision-making support for the management and design of detection models.


Submitted on May 11, 2018; Revised on June 20, 2018; Accepted on July 26, 2018
References: 19