
Text Feature Selection based on Feature Dispersion Degree and Feature Concentration Degree

Volume 13, Number 7, November 2017 - Paper 19  - pp. 1159-1164
DOI: 10.23940/ijpe.17.07.p19.11591164

Zhifeng Zhang (a), Yuhua Li (a), Haodong Zhu (b,*)

(a) School of Software, Zhengzhou University of Light Industry, Zhengzhou, Henan, 450002, P. R. China
(b) School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, Henan, 450002, P. R. China

(Submitted on July 17, 2017; First Revised on October 7, 2017; Second Revised on October 15, 2017; Accepted on October 17, 2017)


Text feature selection is one of the key steps in text classification and therefore affects classification performance. In this paper, the feature dispersion degree of between-class documents is first put forward to measure the dispersion of a feature between categories (the greater its value, the greater the influence of the feature). The feature concentration degree of within-class documents is then proposed to measure the concentration of a feature in the texts of a category (the greater its value, the greater the influence of the feature). Subsequently, a text feature selection method is presented that combines both proposed degrees to measure the importance of features. Experimental comparisons show that the proposed feature selection method often yields more representative feature subsets and improves text classification performance.
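The abstract does not state the formulas for the two degrees, so the following Python sketch is only an illustration of the general idea, not the authors' method. It rests on assumptions that are not in the paper: the between-class dispersion of a term is taken to be the population variance of its per-class document-frequency rates, the within-class concentration is taken to be its highest per-class rate, and the two are combined by a simple product. The names `score_features` and `select_features` are hypothetical.

```python
from collections import defaultdict
from statistics import pvariance


def score_features(docs, labels):
    """Score each term from tokenized documents and their class labels.

    ASSUMPTION: the definitions below are illustrative stand-ins, since
    the abstract gives no formulas.
    """
    class_sizes = defaultdict(int)                     # class -> number of documents
    doc_freq = defaultdict(lambda: defaultdict(int))   # term -> class -> documents containing term
    for tokens, label in zip(docs, labels):
        class_sizes[label] += 1
        for term in set(tokens):                       # count each term once per document
            doc_freq[term][label] += 1

    scores = {}
    for term, per_class in doc_freq.items():
        # Fraction of each class's documents that contain the term.
        rates = [per_class.get(c, 0) / n for c, n in class_sizes.items()]
        dispersion = pvariance(rates)    # assumed between-class dispersion
        concentration = max(rates)       # assumed within-class concentration
        scores[term] = dispersion * concentration      # assumed combination rule
    return scores


def select_features(docs, labels, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    scores = score_features(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


docs = [
    ["goal", "match", "the"],
    ["goal", "team", "the"],
    ["stock", "market", "the"],
    ["stock", "bank", "the"],
]
labels = ["sport", "sport", "finance", "finance"]
top_terms = select_features(docs, labels, 2)
```

Under these assumed definitions, a term such as "the" that appears in every document of every class has zero dispersion and is ranked last, while a term that appears in all documents of exactly one class scores highest.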



