Test Set Augmentation Technique for Deep Learning Image Classifiers

Volume 15, Number 7, July 2019, pp. 1998-2007
DOI: 10.23940/ijpe.19.07.p27.19982007

Qiang Chen, Zhanwei Hui, and Jialuo Liu

Command and Control Engineering College, Army Engineering University of PLA, Nanjing, 210007, China


(Submitted on May 16, 2019; Revised on June 30, 2019; Accepted on July 20, 2019)


Deep learning (DL) is widely applied across many fields and is becoming a key driving force in industry. Although it has achieved great success in artificial intelligence tasks, DL software, like traditional software, contains defects, and its failures can cause unpredictable accidents and losses. Ensuring the quality of DL software therefore requires adequate testing. In this paper, we propose a test set augmentation technique for image classification deep neural networks (DNNs) that is based on an adversarial example generation algorithm. It can generate a large number of useful test cases, which is especially valuable when the existing test cases are insufficient. We briefly introduce the adversarial example generation algorithm and implement a framework for our method. We conduct experiments on classic DNN models and datasets, and we further evaluate the augmented test set using a coverage metric based on the states of the DNN.
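
The general idea described in the abstract can be illustrated with a minimal sketch. The abstract names neither the adversarial algorithm nor the exact coverage metric, so everything below is an assumption: a single FGSM-style gradient-sign step stands in for the adversarial example generator, the fraction of hidden neurons activated by at least one input stands in for the state-based coverage metric, and the tiny randomly initialized network, the function names, and the parameters `eps` and `threshold` are hypothetical.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x, W1, b1, W2, b2):
    """Toy one-hidden-layer ReLU classifier (hypothetical stand-in for a DNN)."""
    h = np.maximum(0.0, W1 @ x + b1)
    return h, softmax(W2 @ h + b2)

def fgsm_example(x, label, params, eps=0.1):
    """One FGSM-style step: perturb x along the sign of the loss gradient."""
    W1, b1, W2, b2 = params
    h, p = forward(x, *params)
    dz2 = p.copy()
    dz2[label] -= 1.0                    # gradient of cross-entropy w.r.t. logits
    dz1 = (W2.T @ dz2) * (h > 0)         # backpropagate through the ReLU
    grad_x = W1.T @ dz1                  # gradient w.r.t. the input pixels
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

def neuron_coverage(inputs, params, threshold=0.0):
    """Fraction of hidden neurons activated above threshold by any input."""
    covered = np.zeros(params[0].shape[0], dtype=bool)
    for x in inputs:
        h, _ = forward(x, *params)
        covered |= h > threshold
    return float(covered.mean())

def augment(test_set, params, eps=0.1):
    """Original inputs plus one perturbed variant of each (labels unchanged)."""
    extra = [fgsm_example(x, y, params, eps) for x, y in test_set]
    return [x for x, _ in test_set] + extra

# Hypothetical data: 5 random 16-pixel "images", 8 hidden neurons, 3 classes.
rng = np.random.default_rng(0)
params = (rng.normal(size=(8, 16)), rng.normal(size=8),
          rng.normal(size=(3, 8)), rng.normal(size=3))
test_set = [(rng.random(16), int(rng.integers(3))) for _ in range(5)]

original = [x for x, _ in test_set]
augmented = augment(test_set, params)
print("coverage before:", neuron_coverage(original, params))
print("coverage after: ", neuron_coverage(augmented, params))
```

Because the augmented set contains every original input, its coverage under this metric can never be lower than that of the original set. Whether it is actually higher depends on the model and on the size of the perturbation.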



