Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images
Convolutional neural networks (CNNs) have been widely used to binarize document images, but these methods struggle with degraded historical documents. This paper proposes using CNNs to identify foreground pixels from novel multichannel images generated from the input...
Main Authors: | Younes Akbari, Somaya Al-Maadeed, Kalthoum Adam |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
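Subjects: | Document image binarization; wavelet-based multichannel images; single and multiple CNNs; SegNet; U-net; DeepLabv3+ |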
Online Access: | https://ieeexplore.ieee.org/document/9171243/ |
_version_ | 1818917574882099200 |
---|---|
author | Younes Akbari; Somaya Al-Maadeed; Kalthoum Adam |
author_sort | Younes Akbari |
collection | DOAJ |
description | Convolutional neural networks (CNNs) have been widely used to binarize document images, but these methods struggle with degraded historical documents. This paper proposes using CNNs to identify foreground pixels from novel multichannel images generated from the input. To create these images, the original source image is decomposed into wavelet subbands. The original image is then approximated from each subband separately, and the multichannel image is formed by placing the original grayscale source image in the first channel and the approximations from the individual subbands in the remaining channels. Two scenarios are considered, two-channel and four-channel images, which are fed into two types of CNN architecture: single-stream and multiple-stream. To investigate the effect of the proposed multichannel inputs, three popular networks are used within these architectures: U-net, SegNet, and DeepLabv3+. The experimental results show that the method outperforms the same three CNNs trained on the original source images and achieves performance competitive with state-of-the-art results on the DIBCO database. |
first_indexed | 2024-12-20T00:36:14Z |
format | Article |
id | doaj.art-f2684625063e46a99bc57efdc7ac78b5 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-20T00:36:14Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-f2684625063e46a99bc57efdc7ac78b5; 2022-12-21T19:59:45Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 153517-153534; DOI 10.1109/ACCESS.2020.3017783; IEEE document 9171243; Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images; Younes Akbari (https://orcid.org/0000-0001-7175-4326), Somaya Al-Maadeed (https://orcid.org/0000-0002-0241-2899), Kalthoum Adam; all authors: Department of Computer Science and Engineering, Qatar University, Doha, Qatar; https://ieeexplore.ieee.org/document/9171243/; keywords: Document image binarization, wavelet-based multichannel images, single and multiple CNNs, SegNet, U-net, DeepLabv3+ |
title | Binarization of Degraded Document Images Using Convolutional Neural Networks and Wavelet-Based Multichannel Images |
topic | Document image binarization; wavelet-based multichannel images; single and multiple CNNs; SegNet; U-net; DeepLabv3+ |
url | https://ieeexplore.ieee.org/document/9171243/ |
work_keys_str_mv | AT younesakbari binarizationofdegradeddocumentimagesusingconvolutionalneuralnetworksandwaveletbasedmultichannelimages AT somayaalmaadeed binarizationofdegradeddocumentimagesusingconvolutionalneuralnetworksandwaveletbasedmultichannelimages AT kalthoumadam binarizationofdegradeddocumentimagesusingconvolutionalneuralnetworksandwaveletbasedmultichannelimages |
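
The description field above outlines how the multichannel input is built: decompose the grayscale image into wavelet subbands, approximate the image from each subband on its own, and stack those approximations behind the original. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a single-level Haar wavelet (the record does not name the wavelet family or the decomposition level), an image with even height and width, and a common but not universal `LL`/`LH`/`HL`/`HH` naming convention. The function names are hypothetical, and because the record does not say which subbands make up the two-channel and four-channel variants, the choice is left to the caller.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level orthonormal 2D Haar transform (assumed wavelet).
    img must have even height and width."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    return {
        "LL": (a + b + c + d) / 2,  # low-pass in both directions
        "LH": (a + b - c - d) / 2,  # detail across rows
        "HL": (a - b + c - d) / 2,  # detail across columns
        "HH": (a - b - c + d) / 2,  # diagonal detail
    }

def haar_idwt2(bands):
    """Inverse of haar_dwt2."""
    ll, lh, hl, hh = bands["LL"], bands["LH"], bands["HL"], bands["HH"]
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2
    out[0::2, 1::2] = (ll + lh - hl - hh) / 2
    out[1::2, 0::2] = (ll - lh + hl - hh) / 2
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2
    return out

def approximate_from_subband(bands, keep):
    """Reconstruct the image from one subband, with all others set to zero."""
    zeroed = {k: (v if k == keep else np.zeros_like(v)) for k, v in bands.items()}
    return haar_idwt2(zeroed)

def multichannel_image(gray, subbands=("LL",)):
    """Stack the grayscale image (channel 0) with one approximation per
    requested subband. Which subbands to request is an assumption."""
    gray = np.asarray(gray, dtype=float)
    bands = haar_dwt2(gray)
    channels = [gray] + [approximate_from_subband(bands, s) for s in subbands]
    return np.stack(channels, axis=-1)

# Usage on a synthetic 8x8 "page"; a real input would be a scanned document.
page = np.random.default_rng(0).random((8, 8))
two_channel = multichannel_image(page, ("LL",))
four_channel = multichannel_image(page, ("LH", "HL", "HH"))
print(two_channel.shape, four_channel.shape)
```

Because the transform is linear, the approximations from all four subbands sum back to the original image, which is a convenient sanity check on any implementation of this step. The resulting array can be passed to a segmentation network whose first layer accepts the corresponding number of input channels.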