Automatic Speech-to-Phoneme Alignment of the Spoken British National Corpus

This poster describes the creation of an automatic word and phoneme alignment between the audio recordings of the Spoken British National Corpus (BNC) and their corresponding word-level transcriptions. The work presented here is part of the “Mining a Year of Speech” project, whose aim is to produce a...


Bibliographic details
Main Authors: Grau Puerto, S, Baghai-Ravary, L, Kochanski, G, Coleman, J
Format: Conference item
Published: 2011
collection OXFORD
description This poster describes the creation of an automatic word and phoneme alignment between the audio recordings of the Spoken British National Corpus (BNC) and their corresponding word-level transcriptions. The work presented here is part of the “Mining a Year of Speech” project, whose aim is to produce automatic speech-to-phoneme alignments for approximately one year of audio recordings. The Spoken BNC recordings consist of unscripted, spontaneous conversations captured under varied recording conditions, with a range of accents and background noise. Topics range from radio programmes and family conversations to council meetings and chemistry courses. The Spoken BNC was originally recorded on analogue cassette tapes between 1991 and 1994. These tapes have recently been digitised by the British Library. The resulting dataset comprises approximately 2,000 digital audio files, with an average duration of 45 minutes, and their associated word-level transcriptions. This poster describes the dataset, the automatic alignment process, the results obtained and the difficulties encountered.
first_indexed 2024-03-06T20:47:15Z
format Conference item
id oxford-uuid:3652bbeb-24a6-4771-83cf-3f3aba260031
institution University of Oxford
last_indexed 2024-03-06T20:47:15Z
publishDate 2011
record_format dspace