Overall Understanding of Indoor Scenes by Fusing Multiframe Local RGB-D Data Based on Conditional Random Fields
Indoor mobile robots normally cannot capture the whole information of a scene by a single frame of perceptive data due to the limited sensor scope. The category of the current scene may be misjudged by robotics due to incomplete scene information, which leads to operation error. To address this prob...
Main Authors: | Haotian Chen, Longfei Su, Biao Zhang, Fengchi Sun, Jing Yuan, Jie Liu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
Subjects: | Conditional random fields, multiframe image fusion, scene configuration map, scene understanding |
Online Access: | https://ieeexplore.ieee.org/document/9055402/ |
_version_ | 1818935332011245568 |
---|---|
author | Haotian Chen Longfei Su Biao Zhang Fengchi Sun Jing Yuan Jie Liu |
author_facet | Haotian Chen Longfei Su Biao Zhang Fengchi Sun Jing Yuan Jie Liu |
author_sort | Haotian Chen |
collection | DOAJ |
description | Indoor mobile robots normally cannot capture the whole information of a scene by a single frame of perceptive data due to the limited sensor scope. The category of the current scene may be misjudged by robotics due to incomplete scene information, which leads to operation error. To address this problem, we propose an approach that leverages conditional random fields (CRFs) to fuse multiframe RGB and depth (RGB-D) visual data corresponding to the same scene. This method takes full advantage of prior knowledge that object categories significantly relate to the scene attributes. As a new image arrives, we incrementally integrate the current object detection results to update scene understanding by identifying duplicate objects between images, ranking available objects in terms of their relevance to the scene, and fusing new information with the existing CRF. With this approach, scene classification can be solved with higher precision based on multiview images than on single image frames sampled in the same places. Additionally, a configuration map of a scene is incrementally built into the above framework. The map includes identities of the recognized objects and various relations between them. This kind of map would not only benefit common robotic tasks but also offer a novel channel for human-robot interaction. We test the efficiency of our method on image sequences extracted from the NYU v2 dataset. The results show that our approach achieves the best performance against state-of-the-art baselines. |
first_indexed | 2024-12-20T05:18:29Z |
format | Article |
id | doaj.art-f02f67954a6b4058be5fad0cab2307d3 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-20T05:18:29Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-f02f67954a6b4058be5fad0cab2307d3; 2022-12-21T19:52:05Z; eng; IEEE; IEEE Access; 2169-3536; 2020-01-01; 8; 65035; 65045; 10.1109/ACCESS.2020.2985227; 9055402; Overall Understanding of Indoor Scenes by Fusing Multiframe Local RGB-D Data Based on Conditional Random Fields; Haotian Chen (0) https://orcid.org/0000-0001-6329-4091; Longfei Su (1) https://orcid.org/0000-0002-9914-2185; Biao Zhang (2) https://orcid.org/0000-0001-7867-445X; Fengchi Sun (3) https://orcid.org/0000-0001-6656-3664; Jing Yuan (4); Jie Liu (5); (0) College of Computer Science, Nankai University, Tianjin, China; (1) College of Software, Nankai University, Tianjin, China; (2) College of Computer Science, Nankai University, Tianjin, China; (3) College of Software, Nankai University, Tianjin, China; (4) College of Artificial Intelligence, Nankai University, Tianjin, China; (5) College of Artificial Intelligence, Nankai University, Tianjin, China; Indoor mobile robots normally cannot capture the whole information of a scene by a single frame of perceptive data due to the limited sensor scope. The category of the current scene may be misjudged by robotics due to incomplete scene information, which leads to operation error. To address this problem, we propose an approach that leverages conditional random fields (CRFs) to fuse multiframe RGB and depth (RGB-D) visual data corresponding to the same scene. This method takes full advantage of prior knowledge that object categories significantly relate to the scene attributes. As a new image arrives, we incrementally integrate the current object detection results to update scene understanding by identifying duplicate objects between images, ranking available objects in terms of their relevance to the scene, and fusing new information with the existing CRF. With this approach, scene classification can be solved with higher precision based on multiview images than on single image frames sampled in the same places. Additionally, a configuration map of a scene is incrementally built into the above framework. The map includes identities of the recognized objects and various relations between them. This kind of map would not only benefit common robotic tasks but also offer a novel channel for human-robot interaction. We test the efficiency of our method on image sequences extracted from the NYU v2 dataset. The results show that our approach achieves the best performance against state-of-the-art baselines.; https://ieeexplore.ieee.org/document/9055402/; Conditional random fields; multiframe image fusion; scene configuration map; scene understanding |
spellingShingle | Haotian Chen Longfei Su Biao Zhang Fengchi Sun Jing Yuan Jie Liu Overall Understanding of Indoor Scenes by Fusing Multiframe Local RGB-D Data Based on Conditional Random Fields IEEE Access Conditional random fields multiframe image fusion scene configuration map scene understanding |
title | Overall Understanding of Indoor Scenes by Fusing Multiframe Local RGB-D Data Based on Conditional Random Fields |
title_full | Overall Understanding of Indoor Scenes by Fusing Multiframe Local RGB-D Data Based on Conditional Random Fields |
title_fullStr | Overall Understanding of Indoor Scenes by Fusing Multiframe Local RGB-D Data Based on Conditional Random Fields |
title_full_unstemmed | Overall Understanding of Indoor Scenes by Fusing Multiframe Local RGB-D Data Based on Conditional Random Fields |
title_short | Overall Understanding of Indoor Scenes by Fusing Multiframe Local RGB-D Data Based on Conditional Random Fields |
title_sort | overall understanding of indoor scenes by fusing multiframe local rgb d data based on conditional random fields |
topic | Conditional random fields multiframe image fusion scene configuration map scene understanding |
url | https://ieeexplore.ieee.org/document/9055402/ |
work_keys_str_mv | AT haotianchen overallunderstandingofindoorscenesbyfusingmultiframelocalrgbddatabasedonconditionalrandomfields AT longfeisu overallunderstandingofindoorscenesbyfusingmultiframelocalrgbddatabasedonconditionalrandomfields AT biaozhang overallunderstandingofindoorscenesbyfusingmultiframelocalrgbddatabasedonconditionalrandomfields AT fengchisun overallunderstandingofindoorscenesbyfusingmultiframelocalrgbddatabasedonconditionalrandomfields AT jingyuan overallunderstandingofindoorscenesbyfusingmultiframelocalrgbddatabasedonconditionalrandomfields AT jieliu overallunderstandingofindoorscenesbyfusingmultiframelocalrgbddatabasedonconditionalrandomfields |
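
The incremental fusion loop summarized in the description (drop duplicate objects across frames, rank objects by relevance to the scene, update the scene estimate) can be illustrated with a small sketch. This is not the authors' model: it keeps only one scene variable with object-scene terms, which is a much simpler stand-in for their CRF, and every name and number here (`LOG_COMPAT`, `SceneFusion`, `merge_dist`, the compatibility scores, the distance-based duplicate test) is a hypothetical choice made for illustration.

```python
import math

# Hypothetical object-scene compatibility scores (log-potentials).
# The values are invented for illustration, not taken from the paper.
LOG_COMPAT = {
    "bedroom": {"bed": 2.0, "lamp": 0.5, "sink": -1.5, "table": 0.0},
    "kitchen": {"bed": -2.0, "lamp": 0.0, "sink": 2.0, "table": 0.7},
}


class SceneFusion:
    """Accumulates object detections over frames and scores scene categories."""

    def __init__(self, compat, merge_dist=0.5):
        self.compat = compat          # scene -> {object label -> log-potential}
        self.merge_dist = merge_dist  # assumed duplicate threshold, in metres
        self.objects = []             # fused (label, (x, y, z)) entries

    def _is_duplicate(self, label, pos):
        # Assumption: same label and nearby 3D position means the same object.
        return any(
            known == label and math.dist(p, pos) < self.merge_dist
            for known, p in self.objects
        )

    def add_frame(self, detections):
        """Fuse one frame of (label, position) detections into the object set."""
        for label, pos in detections:
            if not self._is_duplicate(label, pos):
                self.objects.append((label, pos))

    def scene_posterior(self):
        """Softmax over the summed log-potentials of all fused objects."""
        scores = {
            scene: sum(table.get(label, 0.0) for label, _ in self.objects)
            for scene, table in self.compat.items()
        }
        top = max(scores.values())
        norm = sum(math.exp(v - top) for v in scores.values())
        return {scene: math.exp(v - top) / norm for scene, v in scores.items()}

    def ranked_objects(self, scene):
        """Distinct object labels, most compatible with `scene` first."""
        labels = {label for label, _ in self.objects}
        return sorted(labels, key=lambda l: self.compat[scene].get(l, 0.0), reverse=True)


fusion = SceneFusion(LOG_COMPAT)
fusion.add_frame([("table", (0.0, 0.0, 0.0)), ("lamp", (1.0, 0.0, 0.0))])
single_frame = fusion.scene_posterior()
# The second view sees the same table again and a new object.
fusion.add_frame([("table", (0.1, 0.0, 0.0)), ("bed", (3.0, 0.0, 0.0))])
multi_frame = fusion.scene_posterior()
print(single_frame, multi_frame, fusion.ranked_objects("bedroom"))
```

With these invented scores, the first frame alone slightly favors "kitchen", and the estimate moves to "bedroom" once the second view contributes the bed, which mirrors the motivation stated in the description. A fuller model would also carry detection confidences and object-object relations, both omitted here.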