

Coarse-Grained Parallel AP Clustering Algorithm based on Intra-Class and Inter-Class Distance

Volume 14, Number 12, December 2018, pp. 3174-3183
DOI: 10.23940/ijpe.18.12.p27.31743183

Suzhi Zhang, Rui Yang, and Yanan Zhao

School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450002, China

(Submitted on July 12, 2018; Revised on August 14, 2018; Accepted on September 13, 2018)


Affinity Propagation (AP) clustering is an algorithm based on message passing between data points; it forms clusters from the pairwise similarities between the data. Unlike traditional clustering methods, AP does not require the number of clusters to be specified in advance, and it is therefore fast and efficient. However, it has limitations when dealing with high-dimensional, complex datasets. To improve the efficiency and accuracy of AP clustering, a coarse-grained parallel AP clustering algorithm based on intra-class and inter-class distances, IOCAP, is proposed. First, the idea of granularity is introduced to divide the initial dataset into multiple subsets. Second, the similarity matrix of each subset is improved by combining the intra-class and inter-class distances. Finally, the improved parallel AP clustering is implemented on the MapReduce model. Experiments on the Iris, Diabetes, and MNIST datasets show that IOCAP adapts well to large datasets and effectively improves accuracy while maintaining the AP clustering effect.
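
The three steps in the abstract (partition, per-subset AP, parallel execution) can be illustrated with a minimal Python sketch. It is not the IOCAP implementation. The abstract does not give the formula that combines intra-class and inter-class distances, so the sketch falls back on the plain negative squared Euclidean similarity with a median preference, as in standard AP (Frey and Dueck, 2007). The random partitioning, the second AP pass that merges the local exemplars, and all function names (`affinity_propagation`, `similarity`, `local_exemplars`, `coarse_grained_ap`) are assumptions made for illustration, and an ordinary loop stands in for the MapReduce mappers.

```python
import numpy as np

def affinity_propagation(S, damping=0.9, n_iter=200):
    """Plain AP on a similarity matrix S whose diagonal holds the preferences.
    Returns the indices of the exemplars."""
    n = S.shape[0]
    if n == 1:
        return np.array([0])
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    rows = np.arange(n)
    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new[rows, rows].copy()  # a(k,k) is not clipped at 0
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = self_avail
        A = damping * A + (1 - damping) * A_new
    score = np.diag(A + R)
    exemplars = np.flatnonzero(score > 0)
    if exemplars.size == 0:  # fallback (assumption): keep the best-scoring point
        exemplars = np.array([score.argmax()])
    return exemplars

def similarity(X):
    """Negative squared Euclidean distance with a median preference.
    Placeholder for the paper's improved similarity, which is not reproduced."""
    S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    n = len(X)
    if n > 1:
        np.fill_diagonal(S, np.median(S[~np.eye(n, dtype=bool)]))
    return S

def local_exemplars(X, idx):
    """'Map' step: run AP on one subset and return exemplars as global indices."""
    return idx[affinity_propagation(similarity(X[idx]))]

def coarse_grained_ap(X, n_parts=2, seed=0):
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(len(X)), n_parts)
    # In a MapReduce job, each subset would be handled by its own mapper.
    candidates = np.concatenate([local_exemplars(X, p) for p in parts if len(p)])
    # 'Reduce' step (assumed merge rule): cluster the candidate exemplars again.
    final = local_exemplars(X, candidates)
    # Assign every point to its nearest final exemplar.
    d2 = ((X[:, None, :] - X[final][None, :, :]) ** 2).sum(axis=2)
    return final, final[d2.argmin(axis=1)]

# Two synthetic, well-separated groups of points (illustrative data only).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(50, 1, (20, 2))])
exemplars, labels = coarse_grained_ap(X)
print(len(exemplars), "exemplars found")
```

Each entry of `labels` is the index of the exemplar that the corresponding point was assigned to. Because every subset is clustered independently, the per-subset similarity matrices are much smaller than the full one, which is the property a coarse-grained parallel scheme relies on.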



