PolarText: Single-stage Scene Text Detection with Polar Representation

Abstract

Although deep learning has recently achieved great success in object detection, scene text detection remains challenging because text is inherently difficult to locate in complex scenes. Many approaches draw on segmentation methods to detect arbitrarily shaped scene text. However, most segmentation-based methods are computationally expensive and require substantial refinement to produce accurate results. To address this issue, we propose PolarText, a novel single-stage method that detects text regions by generating contour points in polar coordinates. PolarText reduces computation cost by directly regressing contour points instead of pixels, and its center-and-contour representation better matches the intrinsic characteristics of text instances, mitigating the boundary mislabeling caused by pixel-level annotation. The network introduces a Polar IoU loss and polar centerness to adapt effective paradigms from box representation to polar representation. Additionally, we incorporate a bounding box branch to handle text detection, as most text instances are approximately rectangular. Experimental results on the CTW 1500 and ICDAR 2015 datasets show that PolarText achieves superior accuracy and efficiency compared to existing methods.
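The polar representation mentioned in the abstract can be illustrated with a short sketch. The abstract does not give formulas, so the code below assumes the formulations commonly used in earlier polar-contour work (ray-wise min/max ratio for Polar IoU, and the square root of the shortest-to-longest ray ratio for polar centerness); PolarText's actual definitions may differ. All function names and the uniform angular sampling are illustrative assumptions, not the authors' implementation.

```python
import math


def polar_to_contour(cx, cy, distances):
    """Turn ray lengths into (x, y) contour points around a center.

    Assumption: rays are sampled at uniform angles over a full circle.
    """
    n = len(distances)
    points = []
    for i, d in enumerate(distances):
        theta = 2.0 * math.pi * i / n
        points.append((cx + d * math.cos(theta), cy + d * math.sin(theta)))
    return points


def polar_iou(pred, target):
    """Approximate IoU of two contours sharing a center, ray by ray.

    Assumed form: sum of per-ray minima divided by sum of per-ray maxima.
    """
    overlap = sum(min(p, t) for p, t in zip(pred, target))
    union = sum(max(p, t) for p, t in zip(pred, target))
    return overlap / union


def polar_iou_loss(pred, target):
    """Negative log of the polar IoU (assumed form); zero for a perfect match."""
    return -math.log(polar_iou(pred, target))


def polar_centerness(distances):
    """Score how central a point is: 1.0 when all rays have equal length.

    Assumed form: sqrt(shortest ray / longest ray).
    """
    return math.sqrt(min(distances) / max(distances))


if __name__ == "__main__":
    target = [2.0, 2.0, 2.0, 2.0]   # hypothetical ground-truth ray lengths
    pred = [1.0, 2.0, 2.0, 2.0]     # hypothetical predicted ray lengths
    print(polar_to_contour(0.0, 0.0, pred))
    print(polar_iou(pred, target), polar_iou_loss(pred, target))
    print(polar_centerness(pred))
```

Under these assumptions, a contour is described by one center and a fixed number of ray lengths, so the regression target has a fixed size per text instance rather than scaling with the number of pixels.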

Publication
2021 IEEE 19th International Conference on Embedded and Ubiquitous Computing (EUC)
巫义锐
Young Professor, CCF Senior Member

My research interests include Computer Vision, Artificial Intelligence, Multimedia Computing, and Intelligent Water Conservancy.