DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds
Indoor object detection has emerged as one of the key technologies for the success of numerous indoor system applications, such as autonomous navigation, accurate modeling of indoor environments, digital twins and terahertz (THz) communications. This paper first proposes a flexible and inter-operat...
Main Authors: | Zhenxin Zhang, Dixiang Xu, P. Takis Mathiopoulos, Qiang Wang, Liqiang Zhang, Zhihua Xu, Jincheng Jiang, Zhen Li |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2023-08-01 |
Series: | International Journal of Applied Earth Observations and Geoinformation |
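Subjects: | Indoor point cloud, Object detection, Multi-head attention mechanism, Deep multi-scale contextual feature, Deep learning |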
Online Access: | http://www.sciencedirect.com/science/article/pii/S1569843223002789 |
_version_ | 1797737973653962752 |
---|---|
author | Zhenxin Zhang Dixiang Xu P. Takis Mathiopoulos Qiang Wang Liqiang Zhang Zhihua Xu Jincheng Jiang Zhen Li |
author_facet | Zhenxin Zhang Dixiang Xu P. Takis Mathiopoulos Qiang Wang Liqiang Zhang Zhihua Xu Jincheng Jiang Zhen Li |
author_sort | Zhenxin Zhang |
collection | DOAJ |
description | Indoor object detection has emerged as one of the key technologies for the success of numerous indoor system applications, such as autonomous navigation, accurate modeling of indoor environments, digital twins and terahertz (THz) communications. This paper first proposes a flexible and interoperable detection module, termed the deep multi-scale context (DMSC) module, aimed at developing efficient indoor object detection techniques for point clouds. More specifically, by combining the deep contextual information of indoor objects with multi-scale features, a novel deep multi-scale contextual feature is designed. Furthermore, we introduce the decoder part of the vision transformer into indoor object proposal generation by means of a multi-head attention (MHA) module, which accurately extracts object proposals from a three-dimensional (3D) point cloud and generates high-quality bounding boxes. Extensive experiments have shown that the proposed DMSC module interoperates effectively with three object detection networks, namely VoteNet, GroupFree 3D and RBGNet, improving their mAP@0.25 by 6.5%, 0.9% and 0.4%, respectively, on the ScanNetV2 dataset. The proposed end-to-end network, termed DMSC-Net, consists of an indoor point cloud feature learning backbone (FLB) unit and three modules, namely the DMSC module, a voting decision (VD) module, and an MHA module. Extensive experiments have shown that DMSC-Net outperforms other advanced indoor 3D detection networks, such as RBGNet, by 1.1% and 0.9% in mAP@0.25 on the ScanNet and SUN RGB-D datasets, respectively. The developed code is publicly available at: https://github.com/CNU-DLandCV-lab/MHA_DMSC. |
first_indexed | 2024-03-12T13:37:11Z |
format | Article |
id | doaj.art-41097bcd1fe64827b7f8f87cafe5e6c3 |
institution | Directory Open Access Journal |
issn | 1569-8432 |
language | English |
last_indexed | 2024-03-12T13:37:11Z |
publishDate | 2023-08-01 |
publisher | Elsevier |
record_format | Article |
series | International Journal of Applied Earth Observations and Geoinformation |
spelling | doaj.art-41097bcd1fe64827b7f8f87cafe5e6c3 · 2023-08-24T04:34:24Z · eng · Elsevier · International Journal of Applied Earth Observations and Geoinformation · 1569-8432 · 2023-08-01 · 122 · 103454 · DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds · Zhenxin Zhang [0], Dixiang Xu [1], P. Takis Mathiopoulos [2], Qiang Wang [3], Liqiang Zhang [4], Zhihua Xu [5], Jincheng Jiang [6], Zhen Li [7] · Key Laboratory of 3D Information Acquisition and Application, MOE, Capital Normal University, Beijing 100048, China; College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China · Key Laboratory of 3D Information Acquisition and Application, MOE, Capital Normal University, Beijing 100048, China; College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China; Corresponding authors at: College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China and Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China. · Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens 15784, Greece · School of Geographic and Environmental Sciences, Tianjin Normal University, Tianjin 300387, China · State Key Laboratory of Remote Sensing Science, Department of Geographical Science, Beijing Normal University, Beijing 100875, China · College of Geoscience and Surveying Engineering, China University of Mining and Technology (Beijing), Beijing 100083, China · Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Corresponding authors at: College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China and Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China. · Key Laboratory of 3D Information Acquisition and Application, MOE, Capital Normal University, Beijing 100048, China; College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China · Indoor object detection has emerged as one of the key technologies for the success of numerous indoor system applications, such as autonomous navigation, accurate modeling of indoor environments, digital twins and terahertz (THz) communications. This paper first proposes a flexible and interoperable detection module, termed the deep multi-scale context (DMSC) module, aimed at developing efficient indoor object detection techniques for point clouds. More specifically, by combining the deep contextual information of indoor objects with multi-scale features, a novel deep multi-scale contextual feature is designed. Furthermore, we introduce the decoder part of the vision transformer into indoor object proposal generation by means of a multi-head attention (MHA) module, which accurately extracts object proposals from a three-dimensional (3D) point cloud and generates high-quality bounding boxes. Extensive experiments have shown that the proposed DMSC module interoperates effectively with three object detection networks, namely VoteNet, GroupFree 3D and RBGNet, improving their mAP@0.25 by 6.5%, 0.9% and 0.4%, respectively, on the ScanNetV2 dataset. The proposed end-to-end network, termed DMSC-Net, consists of an indoor point cloud feature learning backbone (FLB) unit and three modules, namely the DMSC module, a voting decision (VD) module, and an MHA module. Extensive experiments have shown that DMSC-Net outperforms other advanced indoor 3D detection networks, such as RBGNet, by 1.1% and 0.9% in mAP@0.25 on the ScanNet and SUN RGB-D datasets, respectively. The developed code is publicly available at: https://github.com/CNU-DLandCV-lab/MHA_DMSC. · http://www.sciencedirect.com/science/article/pii/S1569843223002789 · Indoor point cloud · Object detection · Multi-head attention mechanism · Deep multi-scale contextual feature · Deep learning |
spellingShingle | Zhenxin Zhang Dixiang Xu P. Takis Mathiopoulos Qiang Wang Liqiang Zhang Zhihua Xu Jincheng Jiang Zhen Li DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds International Journal of Applied Earth Observations and Geoinformation Indoor point cloud Object detection Multi-head attention mechanism Deep multi-scale contextual feature Deep learning |
title | DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds |
title_full | DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds |
title_fullStr | DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds |
title_full_unstemmed | DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds |
title_short | DMSC-Net: A deep Multi-Scale context network for 3D object detection of indoor point clouds |
title_sort | dmsc net a deep multi scale context network for 3d object detection of indoor point clouds |
topic | Indoor point cloud Object detection Multi-head attention mechanism Deep multi-scale contextual feature Deep learning |
url | http://www.sciencedirect.com/science/article/pii/S1569843223002789 |
work_keys_str_mv | AT zhenxinzhang dmscnetadeepmultiscalecontextnetworkfor3dobjectdetectionofindoorpointclouds AT dixiangxu dmscnetadeepmultiscalecontextnetworkfor3dobjectdetectionofindoorpointclouds AT ptakismathiopoulos dmscnetadeepmultiscalecontextnetworkfor3dobjectdetectionofindoorpointclouds AT qiangwang dmscnetadeepmultiscalecontextnetworkfor3dobjectdetectionofindoorpointclouds AT liqiangzhang dmscnetadeepmultiscalecontextnetworkfor3dobjectdetectionofindoorpointclouds AT zhihuaxu dmscnetadeepmultiscalecontextnetworkfor3dobjectdetectionofindoorpointclouds AT jinchengjiang dmscnetadeepmultiscalecontextnetworkfor3dobjectdetectionofindoorpointclouds AT zhenli dmscnetadeepmultiscalecontextnetworkfor3dobjectdetectionofindoorpointclouds |