Punctuation-generation-inspired linguistic features for Mandarin prosody generation

Abstract: This paper proposes two novel linguistic features extracted from text input for prosody generation in a Mandarin text-to-speech system. The first feature is the punctuation confidence (PC), which measures the likelihood that a major punctuation mark (MPM) can be inserted at a word boundary....

Bibliographic Details
Main Authors:	Chen-Yu Chiang, Yu-Ping Hung, Han-Yun Yeh, I-Bin Liao, Chen-Ming Pan
Format:	Article
Language:	English
Published:	SpringerOpen 2019-02-01
Series:	EURASIP Journal on Audio, Speech, and Music Processing
Subjects:	Conditional random field; Multilayer perceptron; Text-to-speech system; Prosody generation; Linguistic feature; Speech synthesis
Online Access:	http://link.springer.com/article/10.1186/s13636-019-0147-y

author	Chen-Yu Chiang, Yu-Ping Hung, Han-Yun Yeh, I-Bin Liao, Chen-Ming Pan
collection	DOAJ
description	This paper proposes two novel linguistic features extracted from text input for prosody generation in a Mandarin text-to-speech system. The first feature is the punctuation confidence (PC), which measures the likelihood that a major punctuation mark (MPM) can be inserted at a word boundary. The second feature is the quotation confidence (QC), which measures the likelihood that a word string is quoted as a meaningful or emphasized unit. The proposed PC and QC features are influenced by the properties of automatic Chinese punctuation generation and the linguistic characteristics of the Chinese punctuation system. Because MPMs are highly correlated with prosodic–acoustic features and quoted word strings play crucial roles in human language understanding, the two features could provide useful information for prosody generation. This idea was realized by employing conditional random field (CRF)-based models to predict MPMs, quoted word string locations, and their associated confidences (that is, PC and QC) for each word boundary. The predicted punctuation marks and their confidences were then combined with traditional linguistic features so that multilayer perceptrons could predict the prosodic–acoustic features needed for speech synthesis. Both objective and subjective tests demonstrated that the prosody generated with the proposed linguistic features was superior to that generated without them. Therefore, the proposed PC and QC are identified as promising features for Mandarin prosody generation.
first_indexed	2024-12-11T23:50:17Z
format	Article
id	doaj.art-a82870608c964bcc945312fd7108fbae
institution	Directory Open Access Journal
issn	1687-4722
language	English
last_indexed	2024-12-11T23:50:17Z
publishDate	2019-02-01
publisher	SpringerOpen
record_format	Article
series	EURASIP Journal on Audio, Speech, and Music Processing
spelling	doaj.art-a82870608c964bcc945312fd7108fbae | 2022-12-22T00:45:29Z | eng | SpringerOpen | EURASIP Journal on Audio, Speech, and Music Processing | 1687-4722 | 2019-02-01 | 20191122 | 10.1186/s13636-019-0147-y | Punctuation-generation-inspired linguistic features for Mandarin prosody generation | Chen-Yu Chiang (0), Yu-Ping Hung (1), Han-Yun Yeh (2), I-Bin Liao (3), Chen-Ming Pan (4) | Department of Communication Engineering, National Taipei University; Department of Communication Engineering, National Taipei University; Department of Communication Engineering, National Taipei University; Telecommunication Laboratories, Chunghwa Telecom; Telecommunication Laboratories, Chunghwa Telecom
title	Punctuation-generation-inspired linguistic features for Mandarin prosody generation

Similar Items