Summary: | Many indoor robots operate in environments designed to support human activities. Understanding probable human actions in such surroundings is crucial for facilitating better human-robot interaction. This article presents an innovative approach to mapping unseen human actions in indoor environments by leveraging spatial affordances learned from geometric features extracted from point clouds captured by 3D cameras. Instead of directly observing real people to understand human context, the method uses virtual human models and their interactions with the environment to uncover hidden human affordances. This approach proves efficient for learning the affordance map, even when dealing with highly imbalanced datasets. To achieve this, we employ a supervised learning model based on the Structured-SVM (S-SVM) architecture and optimized for the F1 score. We conducted experiments with real 3D scenes, evaluating various affordance types both qualitatively and quantitatively. The results show that the proposed S-SVM-based method outperforms other models, demonstrating its effectiveness in efficiently mapping human context in indoor environments.
|