Selfie Segmentation in Video Using N-Frames Ensemble

Bibliographic Details
Main Authors: Yong-Woon Kim, Yung-Cheol Byun, Addapalli V. N. Krishna, Balachandran Krishnan
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Deep learning, ensemble, image segmentation, multi-frames, neural network, selfie
Online Access: https://ieeexplore.ieee.org/document/9638657/
author Yong-Woon Kim
Yung-Cheol Byun
Addapalli V. N. Krishna
Balachandran Krishnan
collection DOAJ
description Many camera apps and online video conferencing solutions support instant selfie segmentation or a virtual background function for entertainment, aesthetic, privacy, and security reasons. Many studies show that a deep-learning-based segmentation model (DSM) is a reasonable choice for selfie segmentation and that an ensemble of multiple DSMs can improve the precision of the segmentation result. However, these approaches do not work well when applied directly to image segmentation in video. This paper proposes an N-Frames (NF) ensemble approach for selfie segmentation in video that uses an ensemble of multiple DSMs to achieve high-performance automatic segmentation. Unlike the N-Models (NM) ensemble, which executes multiple DSMs at once for every video frame, the proposed NF ensemble executes only one DSM on the current video frame and combines it with the segmentation results of previous frames to produce the final result. For the experiment, we use four state-of-the-art image segmentation models to build the ensembles. We evaluated the proposed approach on a dataset of 81 single-person videos collected from publicly available websites. To measure the performance of the segmentation models, we considered Intersection over Union (IoU), IoU standard deviation, false prediction rate, Memory Efficiency Rate, and Computing power Efficiency Rate. The average IoU values of the Two-Models NM ensemble, Two-Frames NF ensemble, Three-Models NM ensemble, and Three-Frames NF ensemble were 95.1868%, 95.1253%, 95.3667%, and 95.1734%, respectively, whereas the average IoU of the single models was 92.9653%. The results show that the proposed NF ensemble improves the accuracy of selfie segmentation by more than 2 percentage points on average. The cost-efficiency measurements show that the proposed method consumes less computing power than the NM ensemble, at a level comparable to a single model.
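The contrast between the NM and NF ensembles described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the per-frame results are combined or how a DSM is chosen for each frame, so the sketch assumes that each model is a callable returning a per-pixel foreground probability map, that models are picked in round-robin order, and that the last N maps are averaged and thresholded at 0.5. The names `NFramesEnsemble`, `n_models_ensemble`, and `iou` are hypothetical.

```python
from collections import deque
from itertools import cycle

import numpy as np


def iou(pred, truth):
    """Intersection over Union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(np.logical_and(pred, truth).sum() / union)


def n_models_ensemble(models, frame, threshold=0.5):
    """NM ensemble: run every model on the same frame, then average."""
    probs = [np.asarray(m(frame), dtype=np.float64) for m in models]
    return np.mean(np.stack(probs), axis=0) >= threshold


class NFramesEnsemble:
    """NF ensemble: run one model per frame and reuse earlier results."""

    def __init__(self, models, n_frames=2, threshold=0.5):
        # Assumption: models take turns in round-robin order.
        self._models = cycle(models)
        # Keep only the probability maps of the last n_frames frames.
        self._history = deque(maxlen=n_frames)
        self._threshold = threshold

    def segment(self, frame):
        model = next(self._models)  # exactly one inference per frame
        prob = np.asarray(model(frame), dtype=np.float64)
        self._history.append(prob)
        # Assumption: combine by averaging the stored probability maps.
        mean_prob = np.mean(np.stack(list(self._history)), axis=0)
        return mean_prob >= self._threshold
```

In this sketch `segment` is called once per incoming frame, so the per-frame inference cost stays at one model call regardless of N, whereas `n_models_ensemble` calls every model on every frame. Averaging past frames implicitly assumes that the subject moves little between consecutive frames.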
first_indexed 2024-12-22T00:07:34Z
format Article
id doaj.art-1b25365ee4534fae95beb74659001a19
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-22T00:07:34Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-1b25365ee4534fae95beb74659001a19
timestamp: 2022-12-21T18:45:32Z
language: eng
publisher: IEEE
journal: IEEE Access
ISSN: 2169-3536
published: 2021-01-01
volume: 9
pages: 163348-163362
DOI: 10.1109/ACCESS.2021.3133276
article number: 9638657
title: Selfie Segmentation in Video Using N-Frames Ensemble
authors:
(0) Yong-Woon Kim, https://orcid.org/0000-0002-4759-0138
(1) Yung-Cheol Byun, https://orcid.org/0000-0003-1107-9941
(2) Addapalli V. N. Krishna, https://orcid.org/0000-0002-3835-511X
(3) Balachandran Krishnan, https://orcid.org/0000-0002-9051-8801
affiliations:
(0) Centre for Digital Innovation, CHRIST University (Deemed to be University), Bengaluru, Karnataka, India
(1) Department of Computer Engineering, Jeju National University, Jeju-si, South Korea
(2) Department of Computer Science and Engineering, CHRIST University (Deemed to be University), Bengaluru, Karnataka, India
(3) Department of Computer Science and Engineering, CHRIST University (Deemed to be University), Bengaluru, Karnataka, India
url: https://ieeexplore.ieee.org/document/9638657/
keywords: Deep learning, ensemble, image segmentation, multi-frames, neural network, selfie
title Selfie Segmentation in Video Using N-Frames Ensemble
topic Deep learning
ensemble
image segmentation
multi-frames
neural network
selfie
url https://ieeexplore.ieee.org/document/9638657/