Enabling Real-Time On-Chip Audio Super Resolution for Bone-Conduction Microphones

Voice communication using an air-conduction microphone in noisy environments suffers from the degradation of speech audibility. Bone-conduction microphones (BCM) are robust against ambient noises but suffer from limited effective bandwidth due to their sensing mechanism. Although existing audio supe...

Bibliographic Details
Main Authors: Yuang Li, Yuntao Wang, Xin Liu, Yuanchun Shi, Shwetak Patel, Shao-Fu Shih
Format: Article
Language: English
Published: MDPI AG 2022-12-01
Series: Sensors
Subjects: audio super-resolution; bone-conduction microphone; real-time system; convolutional neural network
Online Access: https://www.mdpi.com/1424-8220/23/1/35
_version_ 1827617249925529600
author Yuang Li
Yuntao Wang
Xin Liu
Yuanchun Shi
Shwetak Patel
Shao-Fu Shih
author_facet Yuang Li
Yuntao Wang
Xin Liu
Yuanchun Shi
Shwetak Patel
Shao-Fu Shih
author_sort Yuang Li
collection DOAJ
description Voice communication using an air-conduction microphone in noisy environments suffers from the degradation of speech audibility. Bone-conduction microphones (BCM) are robust against ambient noises but suffer from limited effective bandwidth due to their sensing mechanism. Although existing audio super-resolution algorithms can recover the high-frequency loss to achieve high-fidelity audio, they require considerably more computational resources than is available in low-power hearable devices. This paper proposes the first-ever real-time on-chip speech audio super-resolution system for BCM. To accomplish this, we built and compared a series of lightweight audio super-resolution deep-learning models. Among all these models, ATS-UNet was the most cost-efficient because the proposed novel Audio Temporal Shift Module (ATSM) reduces the network’s dimensionality while maintaining sufficient temporal features from speech audio. Then, we quantized and deployed the ATS-UNet to low-end ARM micro-controller units for a real-time embedded prototype. The evaluation results show that our system achieved real-time inference speed on Cortex-M7 and higher quality compared with the baseline audio super-resolution method. Finally, we conducted a user study with ten experts and ten amateur listeners to evaluate our method’s effectiveness to human ears. Both groups perceived a significantly higher speech quality with our method when compared to the solutions with the original BCM or air-conduction microphone with cutting-edge noise-reduction algorithms.
first_indexed 2024-03-09T09:42:13Z
format Article
id doaj.art-f1d67f4223114ff788e260cbbb852be3
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T09:42:13Z
publishDate 2022-12-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-f1d67f4223114ff788e260cbbb852be3
2023-12-02T00:52:46Z
eng
MDPI AG
Sensors
1424-8220
2022-12-01
23
1
35
10.3390/s23010035
Enabling Real-Time On-Chip Audio Super Resolution for Bone-Conduction Microphones
Yuang Li [0]
Yuntao Wang [1]
Xin Liu [2]
Yuanchun Shi [3]
Shwetak Patel [4]
Shao-Fu Shih [5]
[0] Key Laboratory of Pervasive Computing, Ministry of Education, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
[1] Key Laboratory of Pervasive Computing, Ministry of Education, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
[2] Department of Computer Science and Engineering, Paul G. Allen School of Computer, University of Washington, Seattle, WA 98195, USA
[3] Key Laboratory of Pervasive Computing, Ministry of Education, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
[4] Department of Computer Science and Engineering, Paul G. Allen School of Computer, University of Washington, Seattle, WA 98195, USA
[5] Google Inc., Mountain View, CA 94043, USA
Voice communication using an air-conduction microphone in noisy environments suffers from the degradation of speech audibility. Bone-conduction microphones (BCM) are robust against ambient noises but suffer from limited effective bandwidth due to their sensing mechanism. Although existing audio super-resolution algorithms can recover the high-frequency loss to achieve high-fidelity audio, they require considerably more computational resources than is available in low-power hearable devices. This paper proposes the first-ever real-time on-chip speech audio super-resolution system for BCM. To accomplish this, we built and compared a series of lightweight audio super-resolution deep-learning models. Among all these models, ATS-UNet was the most cost-efficient because the proposed novel Audio Temporal Shift Module (ATSM) reduces the network’s dimensionality while maintaining sufficient temporal features from speech audio. Then, we quantized and deployed the ATS-UNet to low-end ARM micro-controller units for a real-time embedded prototype. The evaluation results show that our system achieved real-time inference speed on Cortex-M7 and higher quality compared with the baseline audio super-resolution method. Finally, we conducted a user study with ten experts and ten amateur listeners to evaluate our method’s effectiveness to human ears. Both groups perceived a significantly higher speech quality with our method when compared to the solutions with the original BCM or air-conduction microphone with cutting-edge noise-reduction algorithms.
https://www.mdpi.com/1424-8220/23/1/35
audio super-resolution
bone-conduction microphone
real-time system
convolutional neural network
spellingShingle Yuang Li
Yuntao Wang
Xin Liu
Yuanchun Shi
Shwetak Patel
Shao-Fu Shih
Enabling Real-Time On-Chip Audio Super Resolution for Bone-Conduction Microphones
Sensors
audio super-resolution
bone-conduction microphone
real-time system
convolutional neural network
title Enabling Real-Time On-Chip Audio Super Resolution for Bone-Conduction Microphones
title_full Enabling Real-Time On-Chip Audio Super Resolution for Bone-Conduction Microphones
title_fullStr Enabling Real-Time On-Chip Audio Super Resolution for Bone-Conduction Microphones
title_full_unstemmed Enabling Real-Time On-Chip Audio Super Resolution for Bone-Conduction Microphones
title_short Enabling Real-Time On-Chip Audio Super Resolution for Bone-Conduction Microphones
title_sort enabling real time on chip audio super resolution for bone conduction microphones
topic audio super-resolution
bone-conduction microphone
real-time system
convolutional neural network
url https://www.mdpi.com/1424-8220/23/1/35
work_keys_str_mv AT yuangli enablingrealtimeonchipaudiosuperresolutionforboneconductionmicrophones
AT yuntaowang enablingrealtimeonchipaudiosuperresolutionforboneconductionmicrophones
AT xinliu enablingrealtimeonchipaudiosuperresolutionforboneconductionmicrophones
AT yuanchunshi enablingrealtimeonchipaudiosuperresolutionforboneconductionmicrophones
AT shwetakpatel enablingrealtimeonchipaudiosuperresolutionforboneconductionmicrophones
AT shaofushih enablingrealtimeonchipaudiosuperresolutionforboneconductionmicrophones