A Neural Network Model for Text Detection in Chinese Drug Package Insert

Text information in medical photocopies is of great significance to the construction of medical digital platforms. Text region detection, the first step in extracting information from medical photocopies, detects text areas or locates text instances in a sample. Researchers hav...

Bibliographic Details
Main Authors: Haiwen Wu, Ri-Gui Zhou, Yaochong Li
Format: Article
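Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Chinese medical photocopying; text detection; YOLOv4; text line construction algorithm; convolutional neural network
Online Access: https://ieeexplore.ieee.org/document/9373364/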
author Haiwen Wu
Ri-Gui Zhou
Yaochong Li
collection DOAJ
description Text information in medical photocopies is of great significance to the construction of medical digital platforms. Text region detection, the first step in extracting information from medical photocopies, detects text areas or locates text instances in a sample. Researchers have done a great deal of work on text area detection in natural scenes, yet few have paid attention to the medical photocopy scenario, which urgently needs to be addressed. In this paper, a text line area detection dataset based on Chinese medical photocopies (CMPTD) is created, and a fine-grained text line region detection model based on multi-scale feature extraction and fusion is proposed. The detection model consists of three parts. The first part is the feature extraction module: CSPDarknet53 from You Only Look Once version 4 (YOLOv4) is used as the backbone network, and a spatial pyramid pooling strategy is used to extract multi-scale features and enhance the robustness of the model. The second part is the feature fusion module: following the PANet structure, the three effective feature layers from the feature extraction module are fused repeatedly. The last part is the prediction module: following the CTPN structure, the network outputs a series of fine-grained text proposals, which are connected into text lines by a text line construction algorithm. Experiments demonstrate the effectiveness of the detection model, which achieves a precision of 92.46% and a recall of 91.74% on the text detection task of the CMPTD dataset.
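The description says that fine-grained text proposals are connected into text lines by a text line construction algorithm, but the record gives no details of that step. The following Python sketch only illustrates the general idea under stated assumptions: a simplified, CTPN-style greedy grouping, not the authors' algorithm. The function names, the (x1, y1, x2, y2) box format, and the thresholds max_gap and min_overlap are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels, top-left origin.
Box = Tuple[float, float, float, float]


def vertical_overlap(a: Box, b: Box) -> float:
    """Overlap of the two y-ranges divided by the height of the shorter box."""
    inter = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(inter, 0.0) / shorter if shorter > 0 else 0.0


def build_text_lines(
    proposals: List[Box],
    max_gap: float = 20.0,      # assumed maximum horizontal gap between neighbours
    min_overlap: float = 0.7,   # assumed minimum vertical overlap ratio
) -> List[Box]:
    """Greedily chain narrow proposals into lines and return one box per line."""
    lines: List[List[Box]] = []
    # Visit proposals left to right so each one can extend an existing line.
    for box in sorted(proposals, key=lambda b: b[0]):
        for line in lines:
            last = line[-1]
            close_enough = box[0] - last[2] <= max_gap
            if close_enough and vertical_overlap(last, box) >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line accepted this proposal, so it starts a new one.
            lines.append([box])
    # The bounding box of each group is the detected text line.
    return [
        (
            min(b[0] for b in group),
            min(b[1] for b in group),
            max(b[2] for b in group),
            max(b[3] for b in group),
        )
        for group in lines
    ]


if __name__ == "__main__":
    demo = [(0, 0, 16, 20), (16, 1, 32, 21), (32, 0, 48, 20),
            (0, 40, 16, 60), (16, 41, 32, 61)]
    print(build_text_lines(demo))
```

In this sketch, two proposals join the same line only when they are horizontally adjacent and share most of their vertical extent, so proposals stacked on different rows of an insert stay in separate lines.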
first_indexed 2024-12-17T22:17:31Z
format Article
id doaj.art-0429b84fc00d48039a4ed86c9cb5ca0f
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-17T22:17:31Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-0429b84fc00d48039a4ed86c9cb5ca0f
2022-12-21T21:30:34Z
eng
IEEE
IEEE Access
2169-3536
2021-01-01
Volume 9, pages 39781-39791
DOI: 10.1109/ACCESS.2021.3064564
IEEE Xplore document number: 9373364
A Neural Network Model for Text Detection in Chinese Drug Package Insert
Haiwen Wu (https://orcid.org/0000-0003-3961-2429), College of Information Engineering, Shanghai Maritime University, Shanghai, China
Ri-Gui Zhou (https://orcid.org/0000-0002-8894-8108), College of Information Engineering, Shanghai Maritime University, Shanghai, China
Yaochong Li (https://orcid.org/0000-0002-1474-9800), College of Information Engineering, Shanghai Maritime University, Shanghai, China
https://ieeexplore.ieee.org/document/9373364/
Chinese medical photocopying
text detection
YOLOv4
text line construction algorithm
convolutional neural network
title A Neural Network Model for Text Detection in Chinese Drug Package Insert
topic Chinese medical photocopying
text detection
YOLOv4
text line construction algorithm
convolutional neural network
url https://ieeexplore.ieee.org/document/9373364/