Ambulatory Phonation Monitoring With Wireless Microphones Based on the Speech Energy Envelope: Algorithm Development and Validation
Background: Voice disorders mainly result from chronic overuse or abuse, particularly in occupational voice users such as teachers. Previous studies proposed a contact microphone attached to the anterior neck for ambulatory voice monitoring; however, the inconvenience associated with taping and wiring...
Main Authors: | Wang, Chi-Te; Han, Ji-Yan; Fang, Shih-Hau; Lai, Ying-Hui |
---|---|
Format: | Article |
Language: | English |
Published: | JMIR Publications, 2020-12-01 |
Series: | JMIR mHealth and uHealth |
Online Access: | https://mhealth.jmir.org/2020/12/e16746 |
_version_ | 1818902460358459392 |
---|---|
author | Wang, Chi-Te; Han, Ji-Yan; Fang, Shih-Hau; Lai, Ying-Hui |
author_facet | Wang, Chi-Te; Han, Ji-Yan; Fang, Shih-Hau; Lai, Ying-Hui |
author_sort | Wang, Chi-Te |
collection | DOAJ |
description | Background: Voice disorders mainly result from chronic overuse or abuse, particularly in occupational voice users such as teachers. Previous studies proposed a contact microphone attached to the anterior neck for ambulatory voice monitoring; however, the inconvenience associated with taping and wiring, along with the lack of real-time processing, has limited its clinical application.
Objective: This study aims to (1) propose an automatic speech detection system using wireless microphones for real-time ambulatory voice monitoring, (2) examine the detection accuracy under controlled environment and noisy conditions, and (3) report the results of the phonation ratio in practical scenarios.
Methods: We designed an adaptive threshold function to detect the presence of speech based on the energy envelope. We invited 10 teachers to participate in this study and tested the performance of the proposed automatic speech detection system regarding detection accuracy and phonation ratio. Moreover, we investigated whether the unsupervised noise reduction algorithm (ie, log minimum mean square error) can overcome the influence of environmental noise in the proposed system.
Results: The proposed system exhibited an average accuracy of speech detection of 89.9%, ranging from 81.0% (67,357/83,157 frames) to 95.0% (199,201/209,685 frames). Subsequent analyses revealed a phonation ratio between 44.0% (33,019/75,044 frames) and 78.0% (68,785/88,186 frames) during teaching sessions of 40-60 minutes; the durations of most of the phonation segments were less than 10 seconds. The presence of background noise reduced the accuracy of the automatic speech detection system, and an adjuvant noise reduction function could effectively improve the accuracy, especially under stable noise conditions.
Conclusions: This study demonstrated an average detection accuracy of 89.9% in the proposed automatic speech detection system with wireless microphones. The preliminary results for the phonation ratio were comparable to those of previous studies. Although the wireless microphones are susceptible to background noise, an additional noise reduction function can alleviate this limitation. These results indicate that the proposed system can be applied for ambulatory voice monitoring in occupational voice users. |
first_indexed | 2024-12-19T20:36:00Z |
format | Article |
id | doaj.art-9c9c0ec56bb14402874fdf1be92f489f |
institution | Directory of Open Access Journals |
issn | 2291-5222 |
language | English |
last_indexed | 2024-12-19T20:36:00Z |
publishDate | 2020-12-01 |
publisher | JMIR Publications |
record_format | Article |
series | JMIR mHealth and uHealth |
title | Ambulatory Phonation Monitoring With Wireless Microphones Based on the Speech Energy Envelope: Algorithm Development and Validation |
url | https://mhealth.jmir.org/2020/12/e16746 |
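The Methods summary above describes detecting speech from the energy envelope with an adaptive threshold and then reporting a phonation ratio. The record does not give the threshold function itself, so the following is only a minimal sketch under stated assumptions: the frame length, hop size, the 6 dB margin, the smoothing factor `alpha`, and the running noise-floor update rule are all hypothetical choices made for illustration, not the authors' published algorithm. The function names (`frame_energy`, `detect_speech`, `phonation_ratio`) are likewise invented here.

```python
import numpy as np


def frame_energy(signal, frame_len=400, hop=160):
    """Short-time energy envelope: mean squared amplitude per frame.

    frame_len and hop are assumed values (25 ms / 10 ms at 16 kHz).
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        return np.empty(0)
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array([
        np.mean(signal[i * hop: i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])


def detect_speech(energy, margin_db=6.0, alpha=0.95):
    """Flag frames whose energy exceeds a running noise floor by margin_db.

    Hypothetical adaptive rule: the noise floor (in dB) is smoothed with
    factor alpha and updated only on frames judged to be non-speech.
    Assumes the recording starts with a non-speech frame.
    """
    energy = np.asarray(energy, dtype=float)
    flags = np.zeros(len(energy), dtype=bool)
    if len(energy) == 0:
        return flags
    energy_db = 10.0 * np.log10(energy + 1e-12)
    noise_db = energy_db[0]
    for i, e in enumerate(energy_db):
        if e > noise_db + margin_db:
            flags[i] = True
        else:
            noise_db = alpha * noise_db + (1.0 - alpha) * e
    return flags


def phonation_ratio(flags):
    """Fraction of frames flagged as speech (0.0 for an empty input)."""
    flags = np.asarray(flags, dtype=bool)
    return float(flags.mean()) if len(flags) else 0.0


if __name__ == "__main__":
    # Synthetic demo: 1 s noise, 1 s tone plus noise, 1 s noise at 16 kHz.
    rng = np.random.default_rng(0)
    sr = 16000
    t = np.arange(sr) / sr
    tone = 0.5 * np.sin(2 * np.pi * 200 * t)
    sig = np.concatenate([
        0.01 * rng.standard_normal(sr),
        tone + 0.01 * rng.standard_normal(sr),
        0.01 * rng.standard_normal(sr),
    ])
    print(phonation_ratio(detect_speech(frame_energy(sig))))
```

The phonation ratio here is the same kind of frame fraction the Results report, for example 33,019 speech frames out of 75,044 total frames for the 44.0% figure. Updating the noise floor only on non-speech frames is one common way to keep the threshold from drifting upward during long stretches of speech; whether the published system does this is not stated in the record.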