A Situational Analysis of Current Speech-Synthesis Systems for Child Voices: A Scoping Review of Qualitative and Quantitative Evidence
(1) Background: Speech synthesis has customarily focused on adult speech, but with the rapid development of speech-synthesis technology, it is now possible to create child voices with a limited amount of child-speech data. This scoping review summarises the evidence base related t...
Main Authors: | Camryn Terblanche, Michal Harty, Michelle Pascoe, Benjamin V. Tucker |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-06-01 |
Series: | Applied Sciences |
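Subjects: | augmentative and alternative communication (AAC); children; complex communication needs; neural networks; speech synthesis |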
Online Access: | https://www.mdpi.com/2076-3417/12/11/5623 |
_version_ | 1797494135400169472 |
---|---|
author | Camryn Terblanche, Michal Harty, Michelle Pascoe, Benjamin V. Tucker |
author_sort | Camryn Terblanche |
collection | DOAJ |
description | (1) Background: Speech synthesis has customarily focused on adult speech, but with the rapid development of speech-synthesis technology, it is now possible to create child voices with a limited amount of child-speech data. This scoping review summarises the evidence base related to developing synthesised speech for children. (2) Method: Studies were included if they (a) were published between 2006 and 2021 and (b) included child participants or voices of children aged 2 to 16 years. (3) Results: 58 studies were identified. They were discussed based on the languages used, the speech-synthesis systems and/or methods used, the speech data used, the intelligibility of the speech and the ages of the voices. Based on the reviewed studies, developing child-speech synthesis is notably more challenging than adult-speech synthesis. Child speech often presents with acoustic variability and articulatory errors. To account for this, researchers have most often attempted to adapt adult-speech models, using a variety of adaptation techniques. (4) Conclusions: Adapting adult speech has proven successful in child-speech synthesis. It appears that the resulting quality can be improved by training on a large amount of pre-selected speech data, aided by a neural-network classifier, to better match the children’s speech. We encourage future research on individualised synthetic speech for children with complex communication needs (CCN), with special attention to children who use low-resource languages. |
first_indexed | 2024-03-10T01:29:59Z |
format | Article |
id | doaj.art-af0fb8703a6848d7ac0b80abfad22424 |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T01:29:59Z |
publishDate | 2022-06-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-af0fb8703a6848d7ac0b80abfad22424; 2023-11-23T13:45:00Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2022-06-01; vol. 12, no. 11, art. 5623; DOI 10.3390/app12115623; Camryn Terblanche, Michal Harty and Michelle Pascoe (Department of Speech and Language Pathology, University of Cape Town, Cape Town 7700, South Africa); Benjamin V. Tucker (Department of Linguistics, University of Alberta, Edmonton, AB T6G 2R3, Canada) |
title | A Situational Analysis of Current Speech-Synthesis Systems for Child Voices: A Scoping Review of Qualitative and Quantitative Evidence |
topic | augmentative and alternative communication (AAC); children; complex communication needs; neural networks; speech synthesis |
url | https://www.mdpi.com/2076-3417/12/11/5623 |