Multi-Modal Visual Features-Based Video Shot Boundary Detection

Bibliographic Details
Main Authors: Sawitchaya Tippaya, Suchada Sitjongsataporn, Tele Tan, Masood Mehmood Khan, Kosin Chamnongthai
Format: Article
Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/7954599/
Collection: DOAJ (Directory of Open Access Journals)
Description: Video shot boundary detection (SBD) is an essential pre-processing step in semantic video analysis: it segments a sequence of video frames into shots. Many SBD systems based on supervised learning have been proposed over the years; however, the training process remains their principal limitation. This paper presents a multi-modal visual features-based SBD framework that analyzes the behavior of visual representations in terms of the discontinuity signal. Candidate segment selection requires no threshold calculation; instead, it uses the cumulative moving average of the discontinuity signal to locate shot boundaries and discard non-boundary frames. Transition detection then classifies each candidate segment as either a cut transition or a gradual transition, the latter including fade in/out and logo occurrence. The framework is evaluated on golf video clips and the TREC2001 documentary video data set. Results show that the proposed SBD framework achieves good accuracy on both types of video data compared with other SBD methods.
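The description mentions selecting candidate segments from the cumulative moving average of the discontinuity signal but does not give the decision rule. The following is only a minimal sketch under stated assumptions: the function names, the toy signal, and the rule "a frame is a candidate when its discontinuity value exceeds the running average up to that frame" are all hypothetical illustrations, not the authors' method.

```python
def cumulative_moving_average(signal):
    """Return the running mean of signal[0..i] for every index i."""
    averages = []
    total = 0.0
    for count, value in enumerate(signal, start=1):
        total += value
        averages.append(total / count)
    return averages


def candidate_frames(signal):
    """Return indices whose value exceeds the running mean at that index.

    Assumption for illustration only: the record does not state how the
    cumulative moving average is compared with the discontinuity signal.
    """
    averages = cumulative_moving_average(signal)
    return [i for i, (value, avg) in enumerate(zip(signal, averages)) if value > avg]


# Toy discontinuity signal (hypothetical values): two spikes among flat frames.
toy_signal = [0.125, 0.125, 0.125, 0.875, 0.125, 0.125, 0.75, 0.125]
print(candidate_frames(toy_signal))  # indices of frames kept as candidates
```

Because the running average adapts to the signal itself, no fixed threshold has to be chosen in advance, which matches the motivation stated in the description; any real system would still need the paper's actual rule and features.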
ISSN: 2169-3536
Citation: IEEE Access, vol. 5, pp. 12563-12575, 2017
DOI: 10.1109/ACCESS.2017.2717998
Author Affiliations:
Sawitchaya Tippaya: Department of Electronics and Telecommunication Engineering, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
Suchada Sitjongsataporn: Department of Electronic Engineering, Mahanakorn University of Technology, Bangkok, Thailand
Tele Tan: Department of Mechanical Engineering, Faculty of Science and Engineering, Curtin University, Bentley Campus, Perth, WA, Australia
Masood Mehmood Khan: Department of Mechanical Engineering, Faculty of Science and Engineering, Curtin University, Bentley Campus, Perth, WA, Australia
Kosin Chamnongthai: Department of Electronics and Telecommunication Engineering, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
ORCID listed in record: https://orcid.org/0000-0003-1509-5754
Keywords: Cut transition detection; gradual transition detection; golf video analysis; logo transition detection; transition pattern analysis; video shot boundary detection