A FPGA-based ultrasonic voice detection platform for whisper to voice reconstruction

Whisper speech is common for people in private conversation, but is also a side effect for laryngectomy patients who have had part or all of their larynx removed. A novel engineering approach building upon a Code Excited Linear Prediction (CELP) codec was developed [1] to turn whispers to voice. T...


Bibliographic Details
Main Author:	Fan, Gaofeng
Other Authors:	Goh Wooi Boon
Format:	Thesis
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering::Hardware::Logic design DRNTU::Engineering::Computer science and engineering::Hardware::Input/output and data communications
Online Access:	http://hdl.handle.net/10356/51912

_version_	1824455516202467328
author	Fan, Gaofeng
author2	Goh Wooi Boon
author_facet	Goh Wooi Boon Fan, Gaofeng
author_sort	Fan, Gaofeng
collection	NTU
description	Whisper speech is common for people in private conversation, but is also a side effect for laryngectomy patients who have had part or all of their larynx removed. A novel engineering approach building upon a Code Excited Linear Prediction (CELP) codec was developed [1] to turn whispers to voice. The intention was to help post-laryngectomized patients regain the ability to speak. The current project is devoted to the hardware implementation of the techniques used. Most importantly, an ultrasound voice activity detection system has been developed from this framework and implemented on an FPGA. It generates a low-frequency ultrasonic source signal (14-21 kHz) that is emitted toward the talker. The reflected waveform contains vital information about the speech, such as mouth opening and closing. This information can be further processed in real time on the FPGA to facilitate the rebuilding of voice from whispers. The scope of the current project is to build the hardware platform and evaluate the function of mouth state detection (which is an important method of synchronizing any speech codec and of rejecting acoustic background noise). The frequency range chosen for the ultrasound (14-21 kHz) is largely unnoticeable to most adults, yet it is compatible with a large number of standard microphones and speakers (thus not requiring specialized acoustic equipment). As a result, the system is also capable of assisting mobile communication through whisper-to-voice conversion when the speakers do not wish to speak loudly, or where speech-like background babble is present during a conversation. The hardware platform, comprising an audio codec and an FPGA, has been completed as part of this thesis, including the design and construction of a miniature, high-quality, low-frequency ultrasonic audio processing board. Digital systems for low-frequency ultrasonic (14-21 kHz) generation and mouth status detection were also designed and implemented on the FPGA. The work lays the foundation for future implementation of the whole modified CELP algorithm and the whisper-to-voice reconstruction system on FPGA.
first_indexed	2025-02-19T03:39:27Z
format	Thesis
id	ntu-10356/51912
institution	Nanyang Technological University
language	English
last_indexed	2025-02-19T03:39:27Z
publishDate	2013
record_format	dspace
spelling	ntu-10356/51912 2023-03-04T00:34:19Z A FPGA-based ultrasonic voice detection platform for whisper to voice reconstruction Fan, Gaofeng Goh Wooi Boon School of Computer Engineering Xilinx Centre for High Performance Embedded Systems Ian Vince McLoughlin Goh Wooi Boon DRNTU::Engineering::Computer science and engineering::Hardware::Logic design DRNTU::Engineering::Computer science and engineering::Hardware::Input/output and data communications Whisper speech is common for people in private conversation, but is also a side effect for laryngectomy patients who have had part or all of their larynx removed. A novel engineering approach building upon a Code Excited Linear Prediction (CELP) codec was developed [1] to turn whispers to voice. The intention was to help post-laryngectomized patients regain the ability to speak. The current project is devoted to the hardware implementation of the techniques used. Most importantly, an ultrasound voice activity detection system has been developed from this framework and implemented on an FPGA. It generates a low-frequency ultrasonic source signal (14-21 kHz) that is emitted toward the talker. The reflected waveform contains vital information about the speech, such as mouth opening and closing. This information can be further processed in real time on the FPGA to facilitate the rebuilding of voice from whispers. The scope of the current project is to build the hardware platform and evaluate the function of mouth state detection (which is an important method of synchronizing any speech codec and of rejecting acoustic background noise). The frequency range chosen for the ultrasound (14-21 kHz) is largely unnoticeable to most adults, yet it is compatible with a large number of standard microphones and speakers (thus not requiring specialized acoustic equipment). As a result, the system is also capable of assisting mobile communication through whisper-to-voice conversion when the speakers do not wish to speak loudly, or where speech-like background babble is present during a conversation. The hardware platform, comprising an audio codec and an FPGA, has been completed as part of this thesis, including the design and construction of a miniature, high-quality, low-frequency ultrasonic audio processing board. Digital systems for low-frequency ultrasonic (14-21 kHz) generation and mouth status detection were also designed and implemented on the FPGA. The work lays the foundation for future implementation of the whole modified CELP algorithm and the whisper-to-voice reconstruction system on FPGA. Master of Engineering (SCE) 2013-04-15T08:28:44Z 2013-04-15T08:28:44Z 2013 2013 Thesis http://hdl.handle.net/10356/51912 en 152p. application/pdf
spellingShingle	DRNTU::Engineering::Computer science and engineering::Hardware::Logic design DRNTU::Engineering::Computer science and engineering::Hardware::Input/output and data communications Fan, Gaofeng A FPGA-based ultrasonic voice detection platform for whisper to voice reconstruction
title	A FPGA-based ultrasonic voice detection platform for whisper to voice reconstruction
title_full	A FPGA-based ultrasonic voice detection platform for whisper to voice reconstruction
title_fullStr	A FPGA-based ultrasonic voice detection platform for whisper to voice reconstruction
title_full_unstemmed	A FPGA-based ultrasonic voice detection platform for whisper to voice reconstruction
title_short	A FPGA-based ultrasonic voice detection platform for whisper to voice reconstruction
title_sort	fpga based ultrasonic voice detection platform for whisper to voice reconstruction
topic	DRNTU::Engineering::Computer science and engineering::Hardware::Logic design DRNTU::Engineering::Computer science and engineering::Hardware::Input/output and data communications
url	http://hdl.handle.net/10356/51912
work_keys_str_mv	AT fangaofeng afpgabasedultrasonicvoicedetectionplatformforwhispertovoicereconstruction AT fangaofeng fpgabasedultrasonicvoicedetectionplatformforwhispertovoicereconstruction


Similar Items