Can Machine Automatically Discover Text Image from Overall Perspective

Volume 15, Number 1, January 2019, pp. 281-287
DOI: 10.23940/ijpe.19.01.p28.281287

Wei Jiang (a), Jiayi Wu (a), and Chao Yao (b)

(a) School of Software, North China University of Water Resources and Electric Power, Zhengzhou, 450045, China
(b) School of Automation, Northwestern Polytechnic University, Xi’an, 710071, China

(Submitted on October 12, 2018; Revised on November 11, 2018; Accepted on December 23, 2018)


Recently, a growing number of researchers have focused on the problem of automatically distinguishing text images from non-text images. Most previous work relies on local features, which are computationally expensive to extract and usually require a GPU. To address this problem, we propose a simple yet effective scheme that treats the image from an overall perspective. In the proposed scheme, a holistic feature is first extracted from the Fourier spectrum; it describes the characteristics of the image or sub-image as a whole, without any local feature extraction. Random forests are then used to classify images as text or non-text. Experimental results on several public datasets demonstrate that this scheme is both efficient and effective.
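As a rough illustration of the first stage of such a scheme, the following sketch computes one possible holistic descriptor from the Fourier spectrum of a grayscale image. The abstract does not specify the exact feature, so everything here is an assumption made for illustration: the function name `spectrum_feature`, the log-magnitude scaling, and the pooling into radial and angular bins are hypothetical choices, not the authors' method.

```python
import numpy as np

def spectrum_feature(image, n_radial=8, n_angular=8):
    """Pool the centered log-magnitude Fourier spectrum into
    n_radial x n_angular bins (a hypothetical holistic descriptor)."""
    img = np.asarray(image, dtype=np.float64)
    img = img - img.mean()  # remove the DC term so brightness does not dominate
    spec = np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(img))))
    h, w = spec.shape

    # Frequency coordinates relative to the spectrum center, scaled to [-1, 1).
    y = (np.arange(h) - h // 2) / (h / 2.0)
    x = (np.arange(w) - w // 2) / (w / 2.0)
    yy, xx = np.meshgrid(y, x, indexing="ij")
    radius = np.sqrt(xx ** 2 + yy ** 2)
    # The spectrum of a real image is point-symmetric, so fold angles into [0, pi).
    angle = np.mod(np.arctan2(yy, xx), np.pi)

    # Assign every frequency sample to a (radial, angular) bin.
    r_bin = np.minimum((radius / np.sqrt(2.0) * n_radial).astype(int), n_radial - 1)
    a_bin = np.minimum((angle / np.pi * n_angular).astype(int), n_angular - 1)
    flat = (r_bin * n_angular + a_bin).ravel()

    # Average the log-magnitude inside each bin.
    n_bins = n_radial * n_angular
    sums = np.bincount(flat, weights=spec.ravel(), minlength=n_bins)
    counts = np.bincount(flat, minlength=n_bins)
    feat = sums / np.maximum(counts, 1)

    # L2-normalize so the descriptor does not depend on overall contrast.
    norm = np.linalg.norm(feat)
    return feat / norm if norm > 0 else feat
```

A descriptor of this kind has a fixed length regardless of image size, so vectors from whole images or sub-images could be stacked into a matrix and passed to any random forest implementation (for example, scikit-learn's `RandomForestClassifier`) for the text versus non-text decision.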


