Time-dependent bag of words on manifolds for geodesic-based classification of video activities towards assisted living and healthcare

Abstract In this paper, we address the problem of classifying activities of daily living (ADL) in video. The basic idea of the proposed method is to treat each human activity in the video as a temporal sequence of points on a Riemannian manifold and classify such time series with a geodesic-based ke...

Bibliographic Details
Main Authors: Yixiao Yun, Irene Yu-Hua Gu
Format: Article
Language: English
Published: SpringerOpen 2017-11-01
Series: EURASIP Journal on Image and Video Processing
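Subjects: Activity of daily living (ADL); Riemannian manifolds; Time-dependent bag-of-words (BoW+T) model; Assisted living; Healthcare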
Online Access: http://link.springer.com/article/10.1186/s13640-017-0220-3
_version_ 1811218114189197312
author Yixiao Yun
Irene Yu-Hua Gu
author_facet Yixiao Yun
Irene Yu-Hua Gu
author_sort Yixiao Yun
collection DOAJ
description Abstract In this paper, we address the problem of classifying activities of daily living (ADL) in video. The basic idea of the proposed method is to treat each human activity in the video as a temporal sequence of points on a Riemannian manifold and classify such time series with a geodesic-based kernel. The main novelties of this paper are summarized as follows: (a) for each frame of a video, low-level features of body pose and human-object interaction are unified by a covariance matrix, i.e., a manifold point in the space of symmetric positive definite (SPD) matrices $Sym_{+}^{d}$; (b) a time-dependent bag-of-words (BoW+T) model is built, whose codebook is generated by clustering per-frame covariance matrices on $Sym_{+}^{d}$; (c) for each video, high-level BoW+T features are extracted from its corresponding sequence of per-frame covariance matrices; and (d) for activity classification, a positive definite kernel is formulated, taking into account the underlying geometry of our BoW+T features, i.e., the unit n-sphere. Experiments were conducted on two video datasets. The first dataset contains 8 activity classes with a total of 943 videos, and the second one contains 7 activity classes with a total of 224 videos. The proposed method achieved high accuracy (average 89.66%) and a low false alarm rate (average 1.43%) on the first dataset. Comparison with six existing methods on the second dataset provided further evidence of the effectiveness of the proposed method.
first_indexed 2024-04-12T07:05:09Z
format Article
id doaj.art-a7098220d60347c491d46b9c83d6484a
institution Directory Open Access Journal
issn 1687-5281
language English
last_indexed 2024-04-12T07:05:09Z
publishDate 2017-11-01
publisher SpringerOpen
record_format Article
series EURASIP Journal on Image and Video Processing
spelling doaj.art-a7098220d60347c491d46b9c83d6484a 2022-12-22T03:42:52Z eng SpringerOpen EURASIP Journal on Image and Video Processing 1687-5281 2017-11-01 20171113 10.1186/s13640-017-0220-3 Time-dependent bag of words on manifolds for geodesic-based classification of video activities towards assisted living and healthcare Yixiao Yun 0 Irene Yu-Hua Gu 1 Department of Electrical Engineering, Chalmers University of Technology Department of Electrical Engineering, Chalmers University of Technology Abstract In this paper, we address the problem of classifying activities of daily living (ADL) in video. The basic idea of the proposed method is to treat each human activity in the video as a temporal sequence of points on a Riemannian manifold and classify such time series with a geodesic-based kernel. The main novelties of this paper are summarized as follows: (a) for each frame of a video, low-level features of body pose and human-object interaction are unified by a covariance matrix, i.e., a manifold point in the space of symmetric positive definite (SPD) matrices $Sym_{+}^{d}$; (b) a time-dependent bag-of-words (BoW+T) model is built, whose codebook is generated by clustering per-frame covariance matrices on $Sym_{+}^{d}$; (c) for each video, high-level BoW+T features are extracted from its corresponding sequence of per-frame covariance matrices; and (d) for activity classification, a positive definite kernel is formulated, taking into account the underlying geometry of our BoW+T features, i.e., the unit n-sphere. Experiments were conducted on two video datasets. The first dataset contains 8 activity classes with a total of 943 videos, and the second one contains 7 activity classes with a total of 224 videos. The proposed method achieved high accuracy (average 89.66%) and a low false alarm rate (average 1.43%) on the first dataset. Comparison with six existing methods on the second dataset provided further evidence of the effectiveness of the proposed method. http://link.springer.com/article/10.1186/s13640-017-0220-3 Activity of daily living (ADL) Riemannian manifolds Time-dependent bag-of-words (BoW+T) model Assisted living Healthcare
spellingShingle Yixiao Yun
Irene Yu-Hua Gu
Time-dependent bag of words on manifolds for geodesic-based classification of video activities towards assisted living and healthcare
EURASIP Journal on Image and Video Processing
Activity of daily living (ADL)
Riemannian manifolds
Time-dependent bag-of-words (BoW+T) model
Assisted living
Healthcare
title Time-dependent bag of words on manifolds for geodesic-based classification of video activities towards assisted living and healthcare
title_full Time-dependent bag of words on manifolds for geodesic-based classification of video activities towards assisted living and healthcare
title_fullStr Time-dependent bag of words on manifolds for geodesic-based classification of video activities towards assisted living and healthcare
title_full_unstemmed Time-dependent bag of words on manifolds for geodesic-based classification of video activities towards assisted living and healthcare
title_short Time-dependent bag of words on manifolds for geodesic-based classification of video activities towards assisted living and healthcare
title_sort time dependent bag of words on manifolds for geodesic based classification of video activities towards assisted living and healthcare
topic Activity of daily living (ADL)
Riemannian manifolds
Time-dependent bag-of-words (BoW+T) model
Assisted living
Healthcare
url http://link.springer.com/article/10.1186/s13640-017-0220-3
work_keys_str_mv AT yixiaoyun timedependentbagofwordsonmanifoldsforgeodesicbasedclassificationofvideoactivitiestowardsassistedlivingandhealthcare
AT ireneyuhuagu timedependentbagofwordsonmanifoldsforgeodesicbasedclassificationofvideoactivitiestowardsassistedlivingandhealthcare