Summary: | In traditional text detection, text regions with small receptive fields in video images are easily overlooked, few features can be extracted from them, and the computational cost is high. These problems hinder the recognition of text information. This paper proposes a lightweight network structure based on the EAST algorithm that incorporates the Convolutional Block Attention Module (CBAM), a hybrid spatial and channel attention module suited to text feature extraction from natural scene video images. The improved structure can obtain deep network features of text while reducing the computation required for text feature extraction. Additionally, a hybrid feature pyramid + BLSTM network is designed to improve attention to small-receptive-field text regions and to the text sequence features of those regions. Test results on the ICDAR2015 dataset demonstrate that the improved structure effectively boosts attention to small-receptive-field text regions and improves sequence feature detection accuracy for long text regions with small receptive fields, without significantly increasing computation. The proposed network structures also outperform the traditional EAST algorithm and other improved algorithms in precision (P), recall (R), and F-value.
|