DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds

Indoor object detection has emerged as one of the key technologies for the success of numerous indoor system applications, such as autonomous navigation, accurate modeling of indoor environments, digital twins and terahertz (THz) communications. This paper first proposes a flexible and inter-operat...

Bibliographic Details
Main Authors: Zhenxin Zhang, Dixiang Xu, P. Takis Mathiopoulos, Qiang Wang, Liqiang Zhang, Zhihua Xu, Jincheng Jiang, Zhen Li
Format: Article
Language: English
Published: Elsevier 2023-08-01
Series: International Journal of Applied Earth Observations and Geoinformation
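Subjects: Indoor point cloud; Object detection; Multi-head attention mechanism; Deep multi-scale contextual feature; Deep learning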
Online Access: http://www.sciencedirect.com/science/article/pii/S1569843223002789
author Zhenxin Zhang
Dixiang Xu
P. Takis Mathiopoulos
Qiang Wang
Liqiang Zhang
Zhihua Xu
Jincheng Jiang
Zhen Li
author_sort Zhenxin Zhang
collection DOAJ
description Indoor object detection has emerged as one of the key technologies for the success of numerous indoor system applications, such as autonomous navigation, accurate modeling of indoor environments, digital twins and terahertz (THz) communications. This paper first proposes a flexible and interoperable detection module, termed the deep multi-scale context (DMSC) module, aimed at developing efficient indoor object detection techniques using point clouds. More specifically, a novel deep multi-scale contextual feature is designed by combining the deep contextual information of indoor objects with multi-scale features. Furthermore, we introduce the decoder part of the vision transformer into indoor object proposal generation, by means of a multi-head attention (MHA) module, to accurately extract object proposals from a three-dimensional (3D) point cloud and generate high-quality bounding boxes. Extensive experiments have shown that the effective interoperability of the proposed DMSC module with three object detection networks, namely VoteNet, GroupFree 3D and RBGNet, leads to improvements in their mAP@0.25 of 6.5%, 0.9% and 0.4%, respectively, on the ScanNetV2 dataset. The proposed end-to-end network, termed DMSC-Net, consists of an indoor point cloud feature learning backbone (FLB) unit and three modules, namely the DMSC module, a voting decision (VD) module, and an MHA module. Extensive experiments have also shown that DMSC-Net outperforms other advanced indoor 3D detection networks, such as RBGNet, by 1.1% and 0.9% in mAP@0.25 on the ScanNet and SUN RGB-D datasets, respectively. The developed code is publicly available at: https://github.com/CNU-DLandCV-lab/MHA_DMSC.
first_indexed 2024-03-12T13:37:11Z
format Article
id doaj.art-41097bcd1fe64827b7f8f87cafe5e6c3
institution Directory of Open Access Journals
issn 1569-8432
language English
last_indexed 2024-03-12T13:37:11Z
publishDate 2023-08-01
publisher Elsevier
record_format Article
series International Journal of Applied Earth Observations and Geoinformation
spelling doaj.art-41097bcd1fe64827b7f8f87cafe5e6c3
2023-08-24T04:34:24Z
eng
Elsevier
International Journal of Applied Earth Observations and Geoinformation
1569-8432
2023-08-01
122
103454
DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds
Zhenxin Zhang [0], Dixiang Xu [1], P. Takis Mathiopoulos [2], Qiang Wang [3], Liqiang Zhang [4], Zhihua Xu [5], Jincheng Jiang [6], Zhen Li [7]
[0] Key Laboratory of 3D Information Acquisition and Application, MOE, Capital Normal University, Beijing 100048, China; College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
[1] Key Laboratory of 3D Information Acquisition and Application, MOE, Capital Normal University, Beijing 100048, China; College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China; Corresponding authors at: College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China and Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
[2] Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens 15784, Greece
[3] School of Geographic and Environmental Sciences, Tianjin Normal University, Tianjin 300387, China
[4] State Key Laboratory of Remote Sensing Science, Department of Geographical Science, Beijing Normal University, Beijing 100875, China
[5] College of Geoscience and Surveying Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China
[6] Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Corresponding authors at: College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China and Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
[7] Key Laboratory of 3D Information Acquisition and Application, MOE, Capital Normal University, Beijing 100048, China; College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
title DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds
topic Indoor point cloud
Object detection
Multi-head attention mechanism
Deep multi-scale contextual feature
Deep learning
url http://www.sciencedirect.com/science/article/pii/S1569843223002789