Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images

Convolutional neural networks (CNNs) have been widely used to binarize document images, but these methods struggle with degraded historical documents. This paper proposes using CNNs to identify foreground pixels from novel multichannel images generated from the input...

Bibliographic Details
Main Authors: Younes Akbari, Somaya Al-Maadeed, Kalthoum Adam
Format: Article
Language: English
Published: IEEE 2020-01-01
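Series: IEEE Access
Subjects: Document image binarization; wavelet-based multichannel images; single and multiple CNNs; SegNet; U-net; DeepLabv3+
Online Access: https://ieeexplore.ieee.org/document/9171243/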
author Younes Akbari
Somaya Al-Maadeed
Kalthoum Adam
collection DOAJ
description Convolutional neural networks (CNNs) have been widely used to binarize document images, but these methods struggle with degraded historical documents. This paper proposes using CNNs to identify foreground pixels from novel multichannel images generated from the input. To create the images, the original source image is decomposed into wavelet subbands. The original image is then approximated from each subband separately, and the multichannel image is formed by placing the original grayscale image in the first channel and the per-subband approximations in the remaining channels. Two scenarios are considered, two-channel and four-channel images, each fed into two types of CNN architecture: single-stream and multiple-stream. To investigate the effect of the proposed multichannel inputs, three popular networks are used within these architectures: U-net, SegNet, and DeepLabv3+. The experimental results show that the method outperforms the same three CNNs trained on the original source images and achieves competitive performance compared with state-of-the-art results on the DIBCO database.
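The channel construction described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not name the wavelet, the decomposition level, or which subbands go into the two-channel and four-channel images, so a single-level Haar transform and the subband selections below are assumptions, and the function names (haar_dwt2, haar_idwt2, multichannel_image) and subband labels are hypothetical. Only NumPy is used.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level 2D orthonormal Haar transform (assumed wavelet).

    img must have even height and width. Returns four half-size subbands.
    """
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    return {
        "LL": (a + b + c + d) / 2.0,  # local average (approximation)
        "LH": (a + b - c - d) / 2.0,  # difference between rows
        "HL": (a - b + c - d) / 2.0,  # difference between columns
        "HH": (a - b - c + d) / 2.0,  # diagonal difference
    }


def haar_idwt2(bands):
    """Inverse of haar_dwt2: rebuild a full-size image from four subbands."""
    ll, lh, hl, hh = bands["LL"], bands["LH"], bands["HL"], bands["HH"]
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out


def multichannel_image(gray, subbands=("LL", "LH", "HL")):
    """Stack the grayscale image with one approximation per chosen subband.

    Each approximation is the inverse transform with every other subband
    set to zero. The default subband choice is an assumption.
    """
    gray = np.asarray(gray, dtype=float)
    bands = haar_dwt2(gray)
    channels = [gray]  # original grayscale image is the first channel
    for name in subbands:
        only = {k: (v if k == name else np.zeros_like(v))
                for k, v in bands.items()}
        channels.append(haar_idwt2(only))
    return np.stack(channels, axis=-1)  # shape: (height, width, channels)


# Usage on a synthetic 8x8 "page"
rng = np.random.default_rng(0)
page = rng.random((8, 8))
four_channel = multichannel_image(page)                   # original + 3 subbands
two_channel = multichannel_image(page, subbands=("LL",))  # original + 1 subband
```

Because the transform is linear, the approximations from all four subbands add up to the original image, which is a convenient sanity check. The resulting array could be fed to a segmentation network whose first layer accepts the matching number of input channels.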
first_indexed 2024-12-20T00:36:14Z
format Article
id doaj.art-f2684625063e46a99bc57efdc7ac78b5
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-20T00:36:14Z
publishDate 2020-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-f2684625063e46a99bc57efdc7ac78b5; 2022-12-21T19:59:45Z; eng
IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8; pp. 153517-153534
DOI 10.1109/ACCESS.2020.3017783; 9171243
Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images
Younes Akbari (https://orcid.org/0000-0001-7175-4326), Department of Computer Science and Engineering, Qatar University, Doha, Qatar
Somaya Al-Maadeed (https://orcid.org/0000-0002-0241-2899), Department of Computer Science and Engineering, Qatar University, Doha, Qatar
Kalthoum Adam, Department of Computer Science and Engineering, Qatar University, Doha, Qatar
https://ieeexplore.ieee.org/document/9171243/
Document image binarization; wavelet-based multichannel images; single and multiple CNNs; SegNet; U-net; DeepLabv3+
title Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images
topic Document image binarization
wavelet-based multichannel images
single and multiple CNNs
SegNet
U-net
DeepLabv3+
url https://ieeexplore.ieee.org/document/9171243/