An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies
Evaluation campaigns provide a common framework with which the progress of speech technologies can be effectively measured. The aim of this paper is to present a detailed overview of the IberSpeech-RTVE 2022 Challenges, which were organized as part of the IberSpeech 2022 conference under the ongoing...
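Main Authors: | Eduardo Lleida, Luis Javier Rodriguez-Fuentes, Javier Tejedor, Alfonso Ortega, Antonio Miguel, Virginia Bazán, Carmen Pérez, Alberto de Prada, Mikel Penagarikano, Amparo Varona, Germán Bordel, Doroteo Torre-Toledano, Aitor Álvarez, Haritz Arzelus |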
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-07-01 |
Series: | Applied Sciences |
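Subjects: | IberSpeech Challenge; RTVE2022 database; Albayzin evaluations; speech-to-text transcription; speaker diarization and identity assignment; text and speech alignment |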
Online Access: | https://www.mdpi.com/2076-3417/13/15/8577 |
_version_ | 1797587092701708288 |
---|---|
author | Eduardo Lleida; Luis Javier Rodriguez-Fuentes; Javier Tejedor; Alfonso Ortega; Antonio Miguel; Virginia Bazán; Carmen Pérez; Alberto de Prada; Mikel Penagarikano; Amparo Varona; Germán Bordel; Doroteo Torre-Toledano; Aitor Álvarez; Haritz Arzelus |
author_sort | Eduardo Lleida |
collection | DOAJ |
description | Evaluation campaigns provide a common framework with which the progress of speech technologies can be effectively measured. The aim of this paper is to present a detailed overview of the IberSpeech-RTVE 2022 Challenges, which were organized as part of the IberSpeech 2022 conference under the ongoing series of Albayzin evaluation campaigns. In the 2022 edition, four challenges were launched: (1) speech-to-text transcription; (2) speaker diarization and identity assignment; (3) text and speech alignment; and (4) search on speech. Different databases that cover different domains (e.g., broadcast news, conference talks, parliament sessions) were released for those challenges. The submitted systems also cover a wide range of speech processing methods, which include hidden Markov model-based approaches, end-to-end neural network-based methods, hybrid approaches, etc. This paper describes the databases, the tasks and the performance metrics used in the four challenges. It also provides the most relevant features of the submitted systems and briefly presents and discusses the obtained results. Despite employing state-of-the-art technology, the relatively poor performance attained in some of the challenges reveals that there is still room for improvement. This encourages us to carry on with the Albayzin evaluation campaigns in the coming years. |
first_indexed | 2024-03-11T00:32:20Z |
format | Article |
id | doaj.art-75d032e326114ebca601ea06a1a3cd04 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-11T00:32:20Z |
publishDate | 2023-07-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-75d032e326114ebca601ea06a1a3cd04; 2023-11-18T22:34:49Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-07-01; vol. 13, no. 15, art. 8577; doi:10.3390/app13158577; An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies; Authors: Eduardo Lleida [0], Luis Javier Rodriguez-Fuentes [1], Javier Tejedor [2], Alfonso Ortega [3], Antonio Miguel [4], Virginia Bazán [5], Carmen Pérez [6], Alberto de Prada [7], Mikel Penagarikano [8], Amparo Varona [9], Germán Bordel [10], Doroteo Torre-Toledano [11], Aitor Álvarez [12], Haritz Arzelus [13]; Affiliations: [0, 3, 4] Vivolab, Aragon Institute for Engineering Research (I3A), University of Zaragoza, 50018 Zaragoza, Spain; [1, 8, 9, 10] Department of Electricity and Electronics, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), Barrio Sarriena, 48940 Leioa, Spain; [2] Institute of Technology, Universidad San Pablo-CEU, CEU Universities, Urbanización Montepríncipe, 28668 Boadilla del Monte, Spain; [5, 6, 7] Corporación Radiotelevisión Española, 28223 Madrid, Spain; [11] AUDIAS, Electronic and Communication Technology Department, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Av. Francisco Tomás y Valiente, 11, 28049 Madrid, Spain; [12, 13] Fundación Vicomtech, Basque Research and Technology Alliance (BRTA), Mikeletegi 57, 20009 Donostia-San Sebastián, Spain; https://www.mdpi.com/2076-3417/13/15/8577; Keywords: IberSpeech Challenge; RTVE2022 database; Albayzin evaluations; speech-to-text transcription; speaker diarization and identity assignment; text and speech alignment |
title | An Overview of the IberSpeech-RTVE 2022 Challenges on Speech Technologies |
title_sort | overview of the iberspeech rtve 2022 challenges on speech technologies |
topic | IberSpeech Challenge; RTVE2022 database; Albayzin evaluations; speech-to-text transcription; speaker diarization and identity assignment; text and speech alignment |
url | https://www.mdpi.com/2076-3417/13/15/8577 |