

Chinese Word Segmentation based on Bidirectional GRU-CRF Model

Volume 14, Number 12, December 2018, pp. 3066-3075
DOI: 10.23940/ijpe.18.12.p16.30663075

Jinli Che, Liwei Tang, Shijie Deng, and Xujun Su

Department of Artillery Engineering, Army Engineering University, Shijiazhuang, 050003, China

(Submitted on September 13, 2018; Revised on October 18, 2018; Accepted on November 14, 2018)


As an effective model for processing time-series data, the recurrent neural network has been widely applied to sequence tagging tasks. To address Chinese word segmentation, a typical sequence tagging task, in this paper we propose an improved bidirectional gated recurrent unit conditional random field (BI-GRU-CRF) model built on the gated recurrent unit (GRU) neural network, which is easier to train than the LSTM network. The method not only exploits text information in both directions through bidirectional gated recurrent units, but also obtains the globally optimal tag sequence by modeling the correlation between neighboring tags with the conditional random field. Experiments are carried out on the common evaluation sets (PKU, MSRA, CTB) with the four-tag set and the six-tag set respectively. The results show that the BI-GRU-CRF model achieves high performance on Chinese word segmentation, and that the six-tag set further improves the performance of the network.
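The decoding step described above can be illustrated with a small sketch. In the standard four-tag scheme each character is labeled B (begin), M (middle), E (end), or S (single-character word); the bidirectional GRU produces per-character tag scores, and the CRF layer picks the globally optimal tag sequence via Viterbi decoding while respecting neighbor-tag constraints (e.g., B can only be followed by M or E). The toy emission scores below stand in for BI-GRU outputs; the transition scores and example sentence are illustrative assumptions, not values from the paper.

```python
import numpy as np

TAGS = ["B", "M", "E", "S"]  # four-tag set: Begin, Middle, End, Single

# Transition scores between neighboring tags; illegal transitions
# (e.g. B -> B) get a large negative score so Viterbi avoids them.
NEG = -1e4
trans = np.full((4, 4), NEG)
legal = {"B": ["M", "E"], "M": ["M", "E"], "E": ["B", "S"], "S": ["B", "S"]}
for i, t in enumerate(TAGS):
    for nxt in legal[t]:
        trans[i, TAGS.index(nxt)] = 0.0

def viterbi(emissions, trans):
    """Return the globally optimal tag sequence for per-character
    emission scores of shape (n_chars, n_tags)."""
    n, k = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((n, k), dtype=int)
    for t in range(1, n):
        # total[i, j] = best score ending in tag i at t-1, moving to tag j
        total = score[:, None] + trans + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    path = [int(score.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [TAGS[i] for i in reversed(path)]

def segment(chars, tags):
    """Cut the character sequence into words at E and S tags."""
    words, cur = [], ""
    for c, t in zip(chars, tags):
        cur += c
        if t in ("E", "S"):
            words.append(cur)
            cur = ""
    return words

# Toy emission scores standing in for BI-GRU outputs on "我爱北京"
chars = list("我爱北京")
emis = np.array([
    [0.1, 0.0, 0.0, 0.9],   # 我 -> S
    [0.1, 0.0, 0.0, 0.9],   # 爱 -> S
    [0.9, 0.0, 0.1, 0.0],   # 北 -> B
    [0.0, 0.1, 0.9, 0.0],   # 京 -> E
])
tags = viterbi(emis, trans)
print(tags)                   # ['S', 'S', 'B', 'E']
print(segment(chars, tags))   # ['我', '爱', '北京']
```

In a trained model the emission scores come from the bidirectional GRU and the transition matrix is learned jointly with it; the hard constraints here simply make the role of the CRF layer visible: even if a single character's emission preferred an illegal tag, the decoder would still return a structurally valid segmentation.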


