Deep Learning-Based Object Classification and Position Estimation Pipeline for Potential Use in Robotized Pick-and-Place Operations

Accurate object classification and position estimation is a crucial part of executing autonomous pick-and-place operations by a robot and can be realized using RGB-D sensors becoming increasingly available for use in industrial applications. In this paper, we present a novel unified framework for ob...

Bibliographic Details
Main Authors:	Sergey Soltan, Artemiy Oleinikov, M. Fatih Demirci, Almas Shintemirov
Format:	Article
Language:	English
Published:	MDPI AG 2020-08-01
Series:	Robotics
Subjects:	object classification; object position estimation; deep learning; neural network; RGB-D image processing; 3D point cloud processing
Online Access:	https://www.mdpi.com/2218-6581/9/3/63

collection	DOAJ
description	Accurate object classification and position estimation is a crucial part of executing autonomous pick-and-place operations by a robot and can be realized using RGB-D sensors becoming increasingly available for use in industrial applications. In this paper, we present a novel unified framework for object detection and classification using a combination of point cloud processing and deep learning techniques. The proposed model uses two streams that recognize objects from RGB and depth data separately and combines the two in later stages to classify objects. Experimental evaluation of the proposed model, including a comparison of classification accuracy with previous works, demonstrates its effectiveness and efficiency, making the model suitable for real-time applications. In particular, the experiments performed on the Washington RGB-D object dataset show that the proposed framework has 97.5% and 95% fewer parameters compared to the previous state-of-the-art multimodel neural networks Fus-CNN, CNN Features and VGG3D, respectively, at the cost of an approximately 5% drop in classification accuracy. Moreover, inference with the proposed framework takes 66.11%, 32.65%, and 28.77% less time on GPU and 86.91%, 51.12%, and 50.15% less time on CPU in comparison to VGG3D, Fus-CNN, and CNN Features, respectively. The potential applicability of the developed object classification and position estimation framework was then demonstrated on an experimental robot-manipulation setup realizing a simplified object pick-and-place scenario. In approximately 95% of test trials, the system was able to accurately position the robot over the detected objects of interest in automatic mode, ensuring stable cyclic execution with no time delays.
first_indexed	2024-03-10T17:17:34Z
id	doaj.art-c9f78bd2f96444d288a5652a84d3f635
institution	Directory Open Access Journal
issn	2218-6581
last_indexed	2024-03-10T17:17:34Z
doi	10.3390/robotics9030063
volume	9
issue	3
affiliation	School of Engineering and Digital Sciences, Nazarbayev University, Nur-Sultan Z05H0P9, Kazakhstan (all four authors)

Similar Items