Occlusion-Robust Pallet Pose Estimation for Warehouse Automation
Accurate detection and estimation of pallet poses from color and depth data (RGB-D) are integral components of many advanced intelligent warehouse systems. State-of-the-art object pose estimation methods follow a two-stage process, relying on off-the-shelf segmentation or object detection in the ini...
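Main Authors: | Van-Duc Vu, Dinh-Dai Hoang, Phan Xuan Tan, Van-Thiep Nguyen, Thu-Uyen Nguyen, Ngoc-Anh Hoang, Khanh-Toan Phan, Duc-Thanh Tran, Duy-Quang Vu, Phuc-Quan Ngo, Quang-Tri Duong, Anh-Nhat Nguyen, Dinh-Cuong Hoang |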
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
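Subjects: | Pose estimation, robot vision systems, intelligent systems, deep learning, supervised learning, machine vision |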
Online Access: | https://ieeexplore.ieee.org/document/10378693/ |
_version_ | 1797361442249244672 |
---|---|
author | Van-Duc Vu; Dinh-Dai Hoang; Phan Xuan Tan; Van-Thiep Nguyen; Thu-Uyen Nguyen; Ngoc-Anh Hoang; Khanh-Toan Phan; Duc-Thanh Tran; Duy-Quang Vu; Phuc-Quan Ngo; Quang-Tri Duong; Anh-Nhat Nguyen; Dinh-Cuong Hoang |
author_facet | Van-Duc Vu; Dinh-Dai Hoang; Phan Xuan Tan; Van-Thiep Nguyen; Thu-Uyen Nguyen; Ngoc-Anh Hoang; Khanh-Toan Phan; Duc-Thanh Tran; Duy-Quang Vu; Phuc-Quan Ngo; Quang-Tri Duong; Anh-Nhat Nguyen; Dinh-Cuong Hoang |
author_sort | Van-Duc Vu |
collection | DOAJ |
description | Accurate detection and estimation of pallet poses from color and depth data (RGB-D) are integral components of many advanced intelligent warehouse systems. State-of-the-art object pose estimation methods follow a two-stage process, relying on off-the-shelf segmentation or object detection in the initial stage and subsequently predicting the pose of objects using cropped images. The cropped patches may include both the target object and irrelevant information, such as background or other objects, leading to challenges in handling pallets in warehouse settings with heavy occlusions from loaded objects. In this study, we propose an innovative deep learning-based approach to address the occlusion problem in pallet pose estimation from RGB-D images. Inspired by the selective attention mechanism in human perception, our developed model learns to identify and attenuate the significance of features in occluded regions, focusing on the visible and informative areas for accurate pose estimation. Instead of directly estimating pallet poses from cropped patches as in existing methods, we introduce two feature map re-weighting modules with cross-modal attention. These modules effectively filter out features from occluded regions and background, enhancing pose estimation accuracy. Furthermore, we introduce a large-scale annotated pallet dataset specifically designed to capture occlusion scenarios in warehouse environments, facilitating comprehensive training and evaluation. Experimental results on the newly collected pallet dataset show that our proposed method increases accuracy by 13.5% compared to state-of-the-art methods. |
first_indexed | 2024-03-08T15:53:45Z |
format | Article |
id | doaj.art-85818809aecf417c88c5ae9dd093fb5d |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-08T15:53:45Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-85818809aecf417c88c5ae9dd093fb5d; 2024-01-09T00:04:59Z; eng; IEEE; IEEE Access; 2169-3536; 2024-01-01; 12; 1927; 1942; 10.1109/ACCESS.2023.3348781; 10378693; Occlusion-Robust Pallet Pose Estimation for Warehouse Automation; Van-Duc Vu (0); Dinh-Dai Hoang (1); Phan Xuan Tan (2), https://orcid.org/0000-0002-9592-0226; Van-Thiep Nguyen (3); Thu-Uyen Nguyen (4); Ngoc-Anh Hoang (5); Khanh-Toan Phan (6); Duc-Thanh Tran (7); Duy-Quang Vu (8), https://orcid.org/0009-0008-8349-454X; Phuc-Quan Ngo (9); Quang-Tri Duong (10); Anh-Nhat Nguyen (11); Dinh-Cuong Hoang (12), https://orcid.org/0000-0001-6058-2426; (0) ICT Department, FPT University, Hanoi, Vietnam; (1) Toyohashi University of Technology, Toyohashi, Japan; (2) College of Engineering, Shibaura Institute of Technology, Tokyo, Japan; (3)-(12) ICT Department, FPT University, Hanoi, Vietnam; https://ieeexplore.ieee.org/document/10378693/; Pose estimation; robot vision systems; intelligent systems; deep learning; supervised learning; machine vision |
title | Occlusion-Robust Pallet Pose Estimation for Warehouse Automation |
title_sort | occlusion robust pallet pose estimation for warehouse automation |
topic | Pose estimation; robot vision systems; intelligent systems; deep learning; supervised learning; machine vision |
url | https://ieeexplore.ieee.org/document/10378693/ |
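
The description above mentions two feature map re-weighting modules with cross-modal attention, but this record gives no architectural detail. The following is only a minimal illustrative sketch in plain Python of one common form of such gating: each modality's features are scaled, per spatial location, by a sigmoid gate computed from the other modality. The function names (`spatial_gate`, `reweight`, `cross_modal_reweight`), the linear gate, and the weight values are assumptions made for illustration and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def spatial_gate(guide, weights, bias=0.0):
    """Compute one gate value in (0, 1) per spatial location.

    guide   : list of C channels, each a list of N location values
    weights : list of C per-channel weights (hypothetical learned parameters)
    """
    n_locations = len(guide[0])
    return [
        sigmoid(sum(w * channel[i] for w, channel in zip(weights, guide)) + bias)
        for i in range(n_locations)
    ]


def reweight(features, gate):
    """Scale every channel of `features` by the per-location gate."""
    return [[v * g for v, g in zip(channel, gate)] for channel in features]


def cross_modal_reweight(rgb, depth, w_rgb, w_depth):
    """Gate RGB features with depth, and depth features with RGB."""
    rgb_out = reweight(rgb, spatial_gate(depth, w_depth))
    depth_out = reweight(depth, spatial_gate(rgb, w_rgb))
    return rgb_out, depth_out


# Toy input: 2 RGB channels and 1 depth channel over 2 locations.
# The depth response is strongly positive at location 0 and strongly
# negative at location 1, standing in for "visible" vs "occluded".
rgb = [[1.0, 1.0], [2.0, 2.0]]
depth = [[5.0, -5.0]]
rgb_out, depth_out = cross_modal_reweight(rgb, depth, w_rgb=[0.0, 0.0], w_depth=[1.0])
print(rgb_out)    # RGB features at location 1 are strongly attenuated
print(depth_out)  # zero RGB weights give a uniform gate of 0.5
```

In this toy setup the gate plays the role of a soft occlusion mask: locations where the guiding modality responds weakly contribute little to the re-weighted features. A trained model would learn the gate parameters rather than fix them by hand.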