A Classification Algorithm of CART Decision Tree based on MapReduce Attribute Weights

Volume 14, Number 1, January 2018, pp. 17-25
DOI: 10.23940/ijpe.18.01.p3.1725

Fubao Zhu, Mengmeng Tang, Lijie Xie, Haodong Zhu

School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan, 450002, China

(Submitted on July 24, 2017; Revised on September 25, 2017; Accepted on November 29, 2017)


This paper proposes a CART decision tree algorithm based on attribute weights to address the problems of existing CART decision trees: complex classification, poor accuracy, low efficiency, and heavy memory consumption. The algorithm is combined with the MapReduce parallel computing model. It applies the theory of attribute weights: the decision tree is built from the sum of weights, where each weight is determined by the degree to which the attribute affects the decision. This improves the classification accuracy of the decision tree. Parallel sorting for the CART decision tree on massive data is implemented with the MapReduce programming technology of cloud computing. Theoretical analysis and experimental comparison both show that weighting attributes through MapReduce is important: the classification accuracy on large sample data sets improves significantly, the classification efficiency of the decision tree improves, and the training time is significantly reduced.
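The abstract does not give the weighting formula or the MapReduce job layout, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each partition is counted independently (map), that the partial counts are merged (reduce), and that the standard CART Gini gain of a candidate split is then scaled by a per-attribute weight. The names `map_counts`, `reduce_counts`, `weighted_gini_gain`, the toy data, and the weight values are all hypothetical.

```python
from collections import Counter
from functools import reduce


def map_counts(partition, n_attrs):
    """Map step: count (attribute index, value, class label) triples in one partition."""
    counts = Counter()
    for *features, label in partition:
        for i in range(n_attrs):
            counts[(i, features[i], label)] += 1
    return counts


def reduce_counts(a, b):
    """Reduce step: merge the partial counts of two partitions."""
    return a + b


def gini(class_counts):
    """Gini impurity of a class-count table."""
    total = sum(class_counts.values())
    return 1.0 - sum((c / total) ** 2 for c in class_counts.values())


def weighted_gini_gain(counts, attr, value, weight):
    """Gini gain of the binary split `attr == value` vs. the rest,
    scaled by an attribute weight (the scaling is an assumption)."""
    left, right = Counter(), Counter()
    for (i, v, label), c in counts.items():
        if i == attr:
            (left if v == value else right)[label] += c
    parent = left + right
    n, nl, nr = sum(parent.values()), sum(left.values()), sum(right.values())
    if nl == 0 or nr == 0:
        return 0.0  # degenerate split: one side is empty
    split = (nl / n) * gini(left) + (nr / n) * gini(right)
    return weight * (gini(parent) - split)


# Toy data: two partitions of (attribute 0, attribute 1, class label) rows.
partitions = [
    [("sunny", "high", "no"), ("sunny", "low", "yes")],
    [("rain", "high", "yes"), ("rain", "low", "yes")],
]
weights = {0: 1.0, 1: 0.5}  # hypothetical attribute weights

counts = reduce(reduce_counts, (map_counts(p, 2) for p in partitions))
candidates = {(i, v) for (i, v, _label) in counts}
best_attr, best_value = max(
    candidates, key=lambda s: weighted_gini_gain(counts, s[0], s[1], weights[s[0]])
)
print("best split attribute:", best_attr)
```

Because only the merged counts are needed to score a split, the raw rows never have to be gathered on one machine, which is the property a MapReduce formulation relies on.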



