Automated Event Detection and Classification in Soccer: The Potential of Using Multiple Modalities

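Detecting events in videos is a complex task, and many different approaches, aimed at a large variety of use-cases, have been proposed in the literature. Most approaches, however, are unimodal and only consider the visual information in the videos. This paper presents and evaluates different approaches based on neural networks where we combine visual features with audio features to detect (spot) and classify events in soccer videos. We employ model fusion to combine different modalities such as video and audio, and test these combinations against different state-of-the-art models on the SoccerNet dataset. The results show that a multimodal approach is beneficial. We also analyze how the tolerance for delays in classification and spotting time, and the tolerance for prediction accuracy, influence the results. Our experiments show that using multiple modalities improves event detection performance for certain types of events.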

Bibliographic Details
Main Authors: Olav Andre Nergård Rongved, Markus Stige, Steven Alexander Hicks, Vajira Lasantha Thambawita, Cise Midoglu, Evi Zouganeli, Dag Johansen, Michael Alexander Riegler, Pål Halvorsen
Format: Article
Language: English
Published: MDPI AG 2021-12-01
Series: Machine Learning and Knowledge Extraction
Subjects: audio, video, multimodality, event classification, event detection, machine learning
Online Access: https://www.mdpi.com/2504-4990/3/4/51
collection DOAJ
description Detecting events in videos is a complex task, and many different approaches, aimed at a large variety of use-cases, have been proposed in the literature. Most approaches, however, are unimodal and only consider the visual information in the videos. This paper presents and evaluates different approaches based on neural networks where we combine visual features with audio features to detect (spot) and classify events in soccer videos. We employ model fusion to combine different modalities such as video and audio, and test these combinations against different state-of-the-art models on the SoccerNet dataset. The results show that a multimodal approach is beneficial. We also analyze how the tolerance for delays in classification and spotting time, and the tolerance for prediction accuracy, influence the results. Our experiments show that using multiple modalities improves event detection performance for certain types of events.
first_indexed 2024-03-10T03:42:02Z
format Article
id doaj.art-71af2bc23ba04fd6a1a886b225bd4a89
institution Directory Open Access Journal
issn 2504-4990
language English
last_indexed 2024-03-10T03:42:02Z
publishDate 2021-12-01
publisher MDPI AG
record_format Article
series Machine Learning and Knowledge Extraction
spelling doaj.art-71af2bc23ba04fd6a1a886b225bd4a89
2023-11-23T09:17:44Z
Language: eng
Publisher: MDPI AG
Journal: Machine Learning and Knowledge Extraction
ISSN: 2504-4990
Published: 2021-12-01
Volume 3, Issue 4, pages 1030-1054
DOI: 10.3390/make3040051
Title: Automated Event Detection and Classification in Soccer: The Potential of Using Multiple Modalities
Authors and affiliations:
Olav Andre Nergård Rongved (Department of Computer Science, Oslo Metropolitan University, 0167 Oslo, Norway)
Markus Stige (Department of Informatics, University of Oslo, 0373 Oslo, Norway)
Steven Alexander Hicks (Department of Computer Science, Oslo Metropolitan University, 0167 Oslo, Norway)
Vajira Lasantha Thambawita (Department of Computer Science, Oslo Metropolitan University, 0167 Oslo, Norway)
Cise Midoglu (SimulaMet, 0167 Oslo, Norway)
Evi Zouganeli (Department of Computer Science, Oslo Metropolitan University, 0167 Oslo, Norway)
Dag Johansen (Department of Computer Science, UIT The Arctic University of Norway, 9037 Tromsø, Norway)
Michael Alexander Riegler (SimulaMet, 0167 Oslo, Norway)
Pål Halvorsen (SimulaMet, 0167 Oslo, Norway)
URL: https://www.mdpi.com/2504-4990/3/4/51
Keywords: audio, video, multimodality, event classification, event detection, machine learning
title Automated Event Detection and Classification in Soccer: The Potential of Using Multiple Modalities
topic audio
video
multimodality
event classification
event detection
machine learning
url https://www.mdpi.com/2504-4990/3/4/51