An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies

Evaluation campaigns provide a common framework with which the progress of speech technologies can be effectively measured. The aim of this paper is to present a detailed overview of the IberSpeech-RTVE 2022 Challenges, which were organized as part of the IberSpeech 2022 conference under the ongoing...

Bibliographic Details
Main Authors: Eduardo Lleida, Luis Javier Rodriguez-Fuentes, Javier Tejedor, Alfonso Ortega, Antonio Miguel, Virginia Bazán, Carmen Pérez, Alberto de Prada, Mikel Penagarikano, Amparo Varona, Germán Bordel, Doroteo Torre-Toledano, Aitor Álvarez, Haritz Arzelus
Format: Article
Language: English
Published: MDPI AG 2023-07-01
Series: Applied Sciences
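Subjects: IberSpeech Challenge; RTVE2022 database; Albayzin evaluations; speech-to-text transcription; speaker diarization and identity assignment; text and speech alignment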
Online Access: https://www.mdpi.com/2076-3417/13/15/8577
collection DOAJ
description Evaluation campaigns provide a common framework with which the progress of speech technologies can be effectively measured. The aim of this paper is to present a detailed overview of the IberSpeech-RTVE 2022 Challenges, which were organized as part of the IberSpeech 2022 conference under the ongoing series of Albayzin evaluation campaigns. In the 2022 edition, four challenges were launched: (1) speech-to-text transcription; (2) speaker diarization and identity assignment; (3) text and speech alignment; and (4) search on speech. Different databases that cover different domains (e.g., broadcast news, conference talks, parliament sessions) were released for those challenges. The submitted systems also cover a wide range of speech processing methods, which include hidden Markov model-based approaches, end-to-end neural network-based methods, hybrid approaches, etc. This paper describes the databases, the tasks and the performance metrics used in the four challenges. It also provides the most relevant features of the submitted systems and briefly presents and discusses the obtained results. Despite employing state-of-the-art technology, the relatively poor performance attained in some of the challenges reveals that there is still room for improvement. This encourages us to carry on with the Albayzin evaluation campaigns in the coming years.
first_indexed 2024-03-11T00:32:20Z
format Article
id doaj.art-75d032e326114ebca601ea06a1a3cd04
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-11T00:32:20Z
publishDate 2023-07-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-75d032e326114ebca601ea06a1a3cd04
2023-11-18T22:34:49Z
eng
MDPI AG
Applied Sciences
2076-3417
2023-07-01
Volume 13, Issue 15, Article 8577
DOI 10.3390/app13158577
An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies
Eduardo Lleida: Vivolab, Aragon Institute for Engineering Research (I3A), University of Zaragoza, 50018 Zaragoza, Spain
Luis Javier Rodriguez-Fuentes: Department of Electricity and Electronics, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), Barrio Sarriena, 48940 Leioa, Spain
Javier Tejedor: Institute of Technology, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, 28668 Boadilla del Monte, Spain
Alfonso Ortega: Vivolab, Aragon Institute for Engineering Research (I3A), University of Zaragoza, 50018 Zaragoza, Spain
Antonio Miguel: Vivolab, Aragon Institute for Engineering Research (I3A), University of Zaragoza, 50018 Zaragoza, Spain
Virginia Bazán: Corporación Radiotelevisión Española, 28223 Madrid, Spain
Carmen Pérez: Corporación Radiotelevisión Española, 28223 Madrid, Spain
Alberto de Prada: Corporación Radiotelevisión Española, 28223 Madrid, Spain
Mikel Penagarikano: Department of Electricity and Electronics, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), Barrio Sarriena, 48940 Leioa, Spain
Amparo Varona: Department of Electricity and Electronics, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), Barrio Sarriena, 48940 Leioa, Spain
Germán Bordel: Department of Electricity and Electronics, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), Barrio Sarriena, 48940 Leioa, Spain
Doroteo Torre-Toledano: AUDIAS, Electronic and Communication Technology Department, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Av. Francisco Tomás y Valiente, 11, 28049 Madrid, Spain
Aitor Álvarez: Fundación Vicomtech, Basque Research and Technology Alliance (BRTA), Mikeletegi 57, 20009 Donostia-San Sebastián, Spain
Haritz Arzelus: Fundación Vicomtech, Basque Research and Technology Alliance (BRTA), Mikeletegi 57, 20009 Donostia-San Sebastián, Spain
title An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies
topic IberSpeech Challenge
RTVE2022 database
Albayzin evaluations
speech-to-text transcription
speaker diarization and identity assignment
text and speech alignment
url https://www.mdpi.com/2076-3417/13/15/8577