Automated Event Detection and Classification in Soccer: The Potential of Using Multiple Modalities
Detecting events in videos is a complex task, and many different approaches, aimed at a large variety of use-cases, have been proposed in the literature. Most approaches, however, are unimodal and only consider the visual information in the videos. This paper presents and evaluates different approac...
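Main Authors: | Olav Andre Nergård Rongved, Markus Stige, Steven Alexander Hicks, Vajira Lasantha Thambawita, Cise Midoglu, Evi Zouganeli, Dag Johansen, Michael Alexander Riegler, Pål Halvorsen |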
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-12-01 |
Series: | Machine Learning and Knowledge Extraction |
Subjects: | audio, video, multimodality, event classification, event detection, machine learning |
Online Access: | https://www.mdpi.com/2504-4990/3/4/51 |
_version_ | 1797502850624913408 |
---|---|
author | Olav Andre Nergård Rongved, Markus Stige, Steven Alexander Hicks, Vajira Lasantha Thambawita, Cise Midoglu, Evi Zouganeli, Dag Johansen, Michael Alexander Riegler, Pål Halvorsen |
author_sort | Olav Andre Nergård Rongved |
collection | DOAJ |
description | Detecting events in videos is a complex task, and many different approaches, aimed at a large variety of use-cases, have been proposed in the literature. Most approaches, however, are unimodal and only consider the visual information in the videos. This paper presents and evaluates different approaches based on neural networks where we combine visual features with audio features to detect (spot) and classify events in soccer videos. We employ model fusion to combine different modalities such as video and audio, and test these combinations against different state-of-the-art models on the SoccerNet dataset. The results show that a multimodal approach is beneficial. We also analyze how the tolerance for delays in classification and spotting time, and the tolerance for prediction accuracy, influence the results. Our experiments show that using multiple modalities improves event detection performance for certain types of events. |
first_indexed | 2024-03-10T03:42:02Z |
format | Article |
id | doaj.art-71af2bc23ba04fd6a1a886b225bd4a89 |
institution | Directory Open Access Journal |
issn | 2504-4990 |
language | English |
last_indexed | 2024-03-10T03:42:02Z |
publishDate | 2021-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Machine Learning and Knowledge Extraction |
spelling | doaj.art-71af2bc23ba04fd6a1a886b225bd4a89; 2023-11-23T09:17:44Z; eng; MDPI AG; Machine Learning and Knowledge Extraction; ISSN 2504-4990; 2021-12-01; vol. 3, no. 4, pp. 1030–1054; DOI 10.3390/make3040051; Automated Event Detection and Classification in Soccer: The Potential of Using Multiple Modalities; Olav Andre Nergård Rongved (Department of Computer Science, Oslo Metropolitan University, 0167 Oslo, Norway); Markus Stige (Department of Informatics, University of Oslo, 0373 Oslo, Norway); Steven Alexander Hicks (Department of Computer Science, Oslo Metropolitan University, 0167 Oslo, Norway); Vajira Lasantha Thambawita (Department of Computer Science, Oslo Metropolitan University, 0167 Oslo, Norway); Cise Midoglu (SimulaMet, 0167 Oslo, Norway); Evi Zouganeli (Department of Computer Science, Oslo Metropolitan University, 0167 Oslo, Norway); Dag Johansen (Department of Computer Science, UIT The Arctic University of Norway, 9037 Tromsø, Norway); Michael Alexander Riegler (SimulaMet, 0167 Oslo, Norway); Pål Halvorsen (SimulaMet, 0167 Oslo, Norway); https://www.mdpi.com/2504-4990/3/4/51; audio, video, multimodality, event classification, event detection, machine learning |
title | Automated Event Detection and Classification in Soccer: The Potential of Using Multiple Modalities |
topic | audio, video, multimodality, event classification, event detection, machine learning |
url | https://www.mdpi.com/2504-4990/3/4/51 |