Summary: | This work studies indoor object goal navigation, a widely studied task that requires an agent to navigate to an instance of a given object category in unseen indoor environments. Previous state-of-the-art methods for this task include map-free, end-to-end learning-based methods and methods that maintain and plan with spatial maps, but both struggle to perform well. Experiments show that the primary causes of failure are poor exploration, the agent getting trapped, and inaccurate object identification. For exploration, we show that previous map-based methods fail to use semantic clues effectively, and we present a semantic-agnostic exploration strategy that performs much better. For object identification, we show that accumulating information across multiple frames leads to higher accuracy. We additionally present methods for reducing the agent’s chance of getting stuck. The combination of these contributions leads to the winning entry on the leaderboard of the CVPR Habitat ObjectNav challenge.
|