3D Convolutional Neural Networks Initialized from Pretrained 2D Convolutional Neural Networks for Classification of Industrial Parts

Deep learning methods have been successfully applied to image processing, mainly using 2D vision sensors. Recently, the rise of depth cameras and other similar 3D sensors has opened the field to new perception techniques. Nevertheless, 3D convolutional neural networks perform slightly worse than other 3D deep learning methods, and even worse than their 2D counterparts. In this paper, we propose to improve 3D deep learning results by transferring the pretrained weights learned in 2D networks to their corresponding 3D versions. Using an industrial object recognition context, we analyzed different combinations of 3D convolutional networks (VGG16, ResNet, Inception ResNet, and EfficientNet), comparing their recognition accuracy. The highest accuracy, 0.9217, is obtained with EfficientNetB0 using extrusion, which is comparable to state-of-the-art methods. We also observed that the transfer approach improved the accuracy of the 3D version of Inception ResNet by up to 18% with respect to the 3D approach alone.
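
The core idea in the abstract, initializing a 3D convolutional network from pretrained 2D weights, can be illustrated with a minimal sketch. The snippet below (Python/PyTorch) shows one common form of such a transfer, I3D-style extrusion of each 2D kernel along a new depth axis; the helper name inflate_conv2d_to_3d and the stand-in layer are assumptions for illustration, not the authors' exact procedure.

# Minimal sketch of 2D-to-3D weight transfer by "extrusion" (I3D-style inflation).
# The helper name and the stand-in layer are assumptions, not the paper's implementation.
import torch
import torch.nn as nn

def inflate_conv2d_to_3d(conv2d: nn.Conv2d, depth: int = 3) -> nn.Conv3d:
    """Create a Conv3d whose kernels are the pretrained 2D kernels repeated along depth."""
    conv3d = nn.Conv3d(
        in_channels=conv2d.in_channels,
        out_channels=conv2d.out_channels,
        kernel_size=(depth, *conv2d.kernel_size),
        stride=(1, *conv2d.stride),
        padding=(depth // 2, *conv2d.padding),
        bias=conv2d.bias is not None,
    )
    with torch.no_grad():
        # (out, in, kH, kW) -> (out, in, depth, kH, kW); dividing by depth keeps the
        # summed response over the new axis close to the original 2D activation.
        weight_3d = conv2d.weight.unsqueeze(2).repeat(1, 1, depth, 1, 1) / depth
        conv3d.weight.copy_(weight_3d)
        if conv2d.bias is not None:
            conv3d.bias.copy_(conv2d.bias)
    return conv3d

# Usage with a stand-in layer (in practice the 2D layer would come from a pretrained backbone).
layer_2d = nn.Conv2d(3, 64, kernel_size=3, padding=1)
layer_3d = inflate_conv2d_to_3d(layer_2d, depth=3)
print(layer_3d.weight.shape)  # torch.Size([64, 3, 3, 3, 3])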

Bibliographic Details
Main Authors: Ibon Merino, Jon Azpiazu, Anthony Remazeilles, Basilio Sierra
Format: Article
Language: English
Published: MDPI AG, 2021-02-01
Series: Sensors, vol. 21, no. 4, art. 1078
ISSN: 1424-8220
DOI: 10.3390/s21041078
Subjects: computer vision, deep learning, transfer learning, object recognition
Online Access: https://www.mdpi.com/1424-8220/21/4/1078
Author Affiliations: TECNALIA, Basque Research and Technology Alliance (BRTA), Mikeletegi Pasealekua 7, 20009 Donostia-San Sebastián, Spain (Ibon Merino, Jon Azpiazu, Anthony Remazeilles); Robotics and Autonomous Systems Group, Universidad del País Vasco/Euskal Herriko Unibertsitatea, 48940 Basque, Spain (Basilio Sierra)