Interactive Semantic Map Representation for Skill-Based Visual Object Navigation

Visual object navigation is one of the key tasks in mobile robotics. One of the most important components of this task is an accurate semantic representation of the scene, which is needed to determine and reach a goal object. This paper introduces a new representation of a scene semantic map formed during the embodied agent's interaction with the indoor environment. It is based on a neural network method that adjusts the weights of the segmentation model by backpropagating predicted fusion loss values during inference on a regular (backward) or delayed (forward) image sequence. We implement this representation in a full-fledged navigation approach called SkillTron, which can select robot skills from end-to-end policies based on reinforcement learning and from classic map-based planning methods. The proposed approach makes it possible to form both intermediate goals for robot exploration and the final goal for object navigation. We conduct intensive experiments with the proposed approach in the Habitat environment, demonstrating its significant superiority over state-of-the-art approaches in terms of navigation quality metrics. The developed code and custom datasets are publicly available at github.com/AIRI-Institute/skill-fusion.

Bibliographic Details
Main Authors: Tatiana Zemskova, Aleksei Staroverov, Kirill Muravyev, Dmitry A. Yudin, Aleksandr I. Panov
Format: Article
Language: English
Published: IEEE, 2024-01-01
Series: IEEE Access
DOI: 10.1109/ACCESS.2024.3380450
ISSN: 2169-3536
Subjects: Semantic map; navigation; robotics; reinforcement learning; frontier-based exploration
Online Access: https://ieeexplore.ieee.org/document/10477345/