Multimodal Egocentric Analysis of Focused Interactions
Continuous detection of social interactions from wearable sensor data streams has a range of potential applications in domains including health and social care, security, and assistive technology. We contribute an annotated, multimodal data set capturing such interactions using video, audio, GPS, and inertial sensing. We present methods for automatic detection and temporal segmentation of focused interactions using support vector machines and recurrent neural networks, with features extracted from both the audio and video streams. A focused interaction occurs when co-present individuals, sharing a mutual focus of attention, interact by establishing face-to-face engagement and direct conversation. We describe an evaluation protocol including framewise, extended framewise, and event-based measures, and provide empirical evidence that fusing visual face-track scores with audio voice-activity scores is an effective combination. The methods, contributed data set, and protocol together provide a benchmark for future research on this problem.
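The audio-visual fusion described in the abstract can be pictured with a minimal sketch. The snippet below is a hypothetical illustration rather than the authors' implementation: it assumes per-frame face-track scores and voice-activity scores are already available as aligned arrays in [0, 1], and the function name `fuse_scores`, the weighting scheme, and the threshold value are all placeholders.

```python
import numpy as np

def fuse_scores(face_track_scores, voice_activity_scores,
                visual_weight=0.5, threshold=0.5):
    """Late fusion of per-frame visual and audio scores (illustrative only).

    face_track_scores, voice_activity_scores: 1-D arrays of per-frame
    confidence scores in [0, 1], aligned to a common frame rate.
    Returns a boolean array marking frames as focused interaction.
    The weight and threshold are placeholder values, not the paper's.
    """
    face = np.asarray(face_track_scores, dtype=float)
    voice = np.asarray(voice_activity_scores, dtype=float)
    fused = visual_weight * face + (1.0 - visual_weight) * voice
    return fused >= threshold

# Example: six frames of aligned scores.
face = [0.9, 0.8, 0.2, 0.1, 0.7, 0.6]
voice = [0.8, 0.9, 0.1, 0.2, 0.6, 0.9]
print(fuse_scores(face, voice))  # [ True  True False False  True  True]
```

A weighted sum is the simplest possible late-fusion rule; the SVM and recurrent-network models named in the abstract would instead learn the combination from data.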
Main Authors: | Sophia Bano, Tamas Suveges, Jianguo Zhang, Stephen J. Mckenna |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2018-01-01 |
Series: | IEEE Access |
Subjects: | Social interaction; egocentric sensing; multimodal analysis; temporal segmentation |
Online Access: | https://ieeexplore.ieee.org/document/8395274/ |
author | Sophia Bano; Tamas Suveges; Jianguo Zhang; Stephen J. Mckenna
collection | DOAJ |
description | Continuous detection of social interactions from wearable sensor data streams has a range of potential applications in domains including health and social care, security, and assistive technology. We contribute an annotated, multimodal data set capturing such interactions using video, audio, GPS, and inertial sensing. We present methods for automatic detection and temporal segmentation of focused interactions using support vector machines and recurrent neural networks, with features extracted from both the audio and video streams. A focused interaction occurs when co-present individuals, sharing a mutual focus of attention, interact by establishing face-to-face engagement and direct conversation. We describe an evaluation protocol including framewise, extended framewise, and event-based measures, and provide empirical evidence that fusing visual face-track scores with audio voice-activity scores is an effective combination. The methods, contributed data set, and protocol together provide a benchmark for future research on this problem.
format | Article |
id | doaj.art-831129b0688e48c3a08fb1d7f4db67cf |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
publishDate | 2018-01-01 |
publisher | IEEE |
series | IEEE Access |
doi | 10.1109/ACCESS.2018.2850284
volume | 6
pages | 37493-37505
article_number | 8395274
author_orcid | Sophia Bano: https://orcid.org/0000-0003-1329-4565; Stephen J. Mckenna: https://orcid.org/0000-0003-0530-2035
affiliation | Computer Vision and Image Processing Group, School of Science and Engineering, Queen Mother Building, University of Dundee, Dundee, U.K. (all four authors)
title | Multimodal Egocentric Analysis of Focused Interactions |
topic | Social interaction; egocentric sensing; multimodal analysis; temporal segmentation
url | https://ieeexplore.ieee.org/document/8395274/ |
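To make the evaluation protocol named in the abstract (framewise, extended framewise, and event-based measures) concrete, here is an assumption-laden sketch of a framewise F1 score and a crude event-level hit count over binary frame labels. The overlap rule for matching events (any temporal overlap counts as a detection) is a simplification, not the paper's actual criterion, and all function names are invented for illustration.

```python
import numpy as np

def framewise_f1(pred, truth):
    """F1 over per-frame binary labels (1 = focused interaction)."""
    pred, truth = np.asarray(pred, bool), np.asarray(truth, bool)
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def events(labels):
    """Extract (start, end) index pairs of contiguous positive runs."""
    labels = np.asarray(labels, bool).astype(int)
    diff = np.diff(np.concatenate(([0], labels, [0])))
    starts, ends = np.where(diff == 1)[0], np.where(diff == -1)[0]
    return list(zip(starts, ends))

def event_hits(pred, truth):
    """Count ground-truth events overlapped by at least one predicted event."""
    pred_ev, truth_ev = events(pred), events(truth)
    return sum(any(ps < te and pe > ts for ps, pe in pred_ev)
               for ts, te in truth_ev)

truth = [0, 1, 1, 1, 0, 0, 1, 1, 0]
pred  = [0, 1, 1, 0, 0, 0, 0, 1, 1]
print(framewise_f1(pred, truth))                              # 0.666...
print(event_hits(pred, truth), "of", len(events(truth)), "events detected")
```

Framewise scores reward per-frame agreement, while event-based scores ask whether each interaction episode was found at all; the contrast between the two is exactly why the abstract reports both families of measures.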