Classification of Phonation Modes in Classical Singing Using Modulation Power Spectral Features

In singing, the perceptual term “voice quality” is used to describe expressed emotions and singing styles. In voice physiology research, specific voice qualities are discussed using the term phonation modes and are directly related to the voicing produced by the vocal folds. Th...

Bibliographic Details
Main Authors: Manuel Brandner, Paul Armin Bereuter, Sudarsana Reddy Kadiri, Alois Sontacchi
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Subjects: Modulation power spectrum; phonation modes; singing voice analysis; voice qualities
Online Access: https://ieeexplore.ieee.org/document/10078264/
_version_ 1797850995271663616
author Manuel Brandner
Paul Armin Bereuter
Sudarsana Reddy Kadiri
Alois Sontacchi
collection DOAJ
description In singing, the perceptual term “voice quality” is used to describe expressed emotions and singing styles. In voice physiology research, specific voice qualities are discussed using the term phonation modes and are directly related to the voicing produced by the vocal folds. Control and awareness of phonation modes are vital for professional singers to maintain a healthy voice. Most studies on phonation modes have investigated speech and have used glottal inverse filtering to compute features from an estimated excitation signal. The performance of this method is reported to decrease at high pitches, which limits its usability for the singing voice. To overcome this limitation, this study proposes using features derived from the modulation power spectrum for phonation mode classification in the singing voice. The exploration of the modulation power spectrum is motivated by the fact that, in singing, temporal modulations (known as vocal vibrato) and spectral modulations carry information about vocal fold tension. Since no large publicly available dataset of phonation modes in singing exists, we created a new dataset consisting of six female and four male classical singers, who sang five vowels at different pitches in three phonation modes (breathy, modal, and pressed). Experimental results with a support vector machine classifier reveal that the proposed features show better classification performance than state-of-the-art reference features. The performance for the current dataset is at least 10% higher than that of the reference features (such as glottal source features and MFCCs) in the case of target labels and around 6% higher in the case of perceptually assessed labels.
first_indexed 2024-04-09T19:10:33Z
format Article
id doaj.art-bf73d8e9de724c2eb3e583e3066af48d
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-09T19:10:33Z
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-bf73d8e9de724c2eb3e583e3066af48d
2023-04-06T23:00:27Z
eng
IEEE
IEEE Access
2169-3536
2023-01-01
Volume 11, pages 29149-29161
DOI: 10.1109/ACCESS.2023.3260187
IEEE Xplore document: 10078264
Classification of Phonation Modes in Classical Singing Using Modulation Power Spectral Features
Manuel Brandner (https://orcid.org/0000-0002-3217-3497), Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Graz, Austria
Paul Armin Bereuter (https://orcid.org/0009-0003-9530-337X), Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Graz, Austria
Sudarsana Reddy Kadiri (https://orcid.org/0000-0001-5806-3053), Department of Information and Communications Engineering, Aalto University, Espoo, Finland
Alois Sontacchi (https://orcid.org/0009-0008-9205-209X), Institute of Electronic Music and Acoustics, University of Music and Performing Arts Graz, Graz, Austria
https://ieeexplore.ieee.org/document/10078264/
Keywords: Modulation power spectrum; phonation modes; singing voice analysis; voice qualities
title Classification of Phonation Modes in Classical Singing Using Modulation Power Spectral Features
topic Modulation power spectrum
phonation modes
singing voice analysis
voice qualities
url https://ieeexplore.ieee.org/document/10078264/