A Robust Symmetry-Based Method for Scene/Video Text Detection through Neural Network

摘要

Text detection in video/scene images has gained a significant attention in the field of image processing and document analysis due to the inherent challenges caused by variations in contrast, orientation, background, text type, font type, non-uniform illumination and so on. In this paper, we propose a novel text detection method to explore symmetry property and appearance features of text for improved accuracy and robustness. First, the proposed method explores Extremal Regions (ER) for detecting text candidates in images. Then we propose a novel feature named as Multi-domain Strokes Symmetry Histogram (MSSH) for each text candidate, which describes the inherent symmetry property of stroke pixel pairs in gray, gradient and frequency domains. Furthermore, deep convolutional features are extracted to describe the appearance for each text candidate. We further fuse them by Auto-Encoder network to define a more discriminative text descriptor for classification. Finally, the proposed method constructs text lines based on the classification results. We demonstrate the effectiveness and robustness detection results of our proposed method by testing on four different benchmark databases.

巫义锐
巫义锐
青年教授, CCF 高级会员

My research interests include Computer Vision, Artifical Intelligence, Multimedia Computing and Intelligent Water Conservancy.