Voice keyword recognition based on spiking convolutional neural network for human-machine interface

In this paper, a spiking convolutional neural network (SCNN) model for voice keyword recognition is presented. The model consists of an input pre-processing layer, a spiking neural network (SNN) layer with a built-in filter bank, and convolutional neural network (CNN) layers. A 16-channel infinite...


Bibliographic Details
Main Authors:	Hu, Jinhai; Goh, Wang Ling; Zhang, Zhongyi; Gao, Yuan
Other Authors:	School of Electrical and Electronic Engineering
Format:	Conference Paper
Language:	English
Published:	2024
Subjects:	Engineering; Spiking convolutional neural networks; Voice keyword recognition
Online Access:	https://hdl.handle.net/10356/179088 https://ieeexplore.ieee.org/abstract/document/9081859

_version_	1826122571999870976
author	Hu, Jinhai Goh, Wang Ling Zhang, Zhongyi Gao, Yuan
author2	School of Electrical and Electronic Engineering
author_facet	School of Electrical and Electronic Engineering Hu, Jinhai Goh, Wang Ling Zhang, Zhongyi Gao, Yuan
author_sort	Hu, Jinhai
collection	NTU
description	In this paper, a spiking convolutional neural network (SCNN) model for voice keyword recognition is presented. The model consists of an input pre-processing layer, a spiking neural network (SNN) layer with a built-in filter bank, and convolutional neural network (CNN) layers. A 16-channel infinite impulse response (IIR) filter bank with an energy detector extracts power from the voice signal band and converts it to spikes via the SNN layer. The spiking rate within a defined time window is used as the input to the subsequent CNN layers for classification. The network is trained using a voice digit dataset, and the weights of the convolutional layers are adjusted by training on the spike-integration results obtained from the spiking layer. This model has been implemented for voice keyword recognition and achieved 96.0% accuracy. The combination of SNN and CNN reduces the overall number of layers and neurons in the system without compromising classification accuracy. It is suitable for low-power hardware implementation in edge devices for human-machine interface (HMI) applications.
first_indexed	2024-10-01T05:50:40Z
format	Conference Paper
id	ntu-10356/179088
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T05:50:40Z
publishDate	2024
record_format	dspace
spelling	ntu-10356/1790882024-07-19T15:39:03Z Voice keyword recognition based on spiking convolutional neural network for human-machine interface Hu, Jinhai Goh, Wang Ling Zhang, Zhongyi Gao, Yuan School of Electrical and Electronic Engineering 2020 3rd International Conference on Intelligent Autonomous Systems (ICoIAS) Institute of Microelectronics, ASTAR Engineering Spiking convolutional neural networks Voice keyword recognition In this paper, a spiking convolutional neural network (SCNN) model for voice keyword recognition is presented. The model consists of an input pre-processing layer, a spiking neural network (SNN) layer with a built-in filter bank, and convolutional neural network (CNN) layers. A 16-channel infinite impulse response (IIR) filter bank with an energy detector extracts power from the voice signal band and converts it to spikes via the SNN layer. The spiking rate within a defined time window is used as the input to the subsequent CNN layers for classification. The network is trained using a voice digit dataset, and the weights of the convolutional layers are adjusted by training on the spike-integration results obtained from the spiking layer. This model has been implemented for voice keyword recognition and achieved 96.0% accuracy. The combination of SNN and CNN reduces the overall number of layers and neurons in the system without compromising classification accuracy. It is suitable for low-power hardware implementation in edge devices for human-machine interface (HMI) applications. Agency for Science, Technology and Research (ASTAR) Submitted/Accepted version This work is funded by A*STAR (Agency for Science, Technology and Research), Singapore under RIE2020 Advanced Manufacturing and Engineering (AME) Programmatic Grant (A18A4b0055). 2024-07-19T05:04:09Z 2024-07-19T05:04:09Z 2020 Conference Paper Hu, J., Goh, W. L., Zhang, Z. & Gao, Y. (2020). Voice keyword recognition based on spiking convolutional neural network for human-machine interface. 2020 3rd International Conference on Intelligent Autonomous Systems (ICoIAS), 77-82. https://dx.doi.org/10.1109/ICoIAS49312.2020.9081859 978-1-7281-6078-8 https://hdl.handle.net/10356/179088 10.1109/ICoIAS49312.2020.9081859 https://ieeexplore.ieee.org/abstract/document/9081859 77 82 en A18A4b0055 © 2020 IEEE. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. The Version of Record is available online at http://doi.org/10.1109/ICoIAS49312.2020.9081859. application/pdf
spellingShingle	Engineering Spiking convolutional neural networks Voice keyword recognition Hu, Jinhai Goh, Wang Ling Zhang, Zhongyi Gao, Yuan Voice keyword recognition based on spiking convolutional neural network for human-machine interface
title	Voice keyword recognition based on spiking convolutional neural network for human-machine interface
title_sort	voice keyword recognition based on spiking convolutional neural network for human machine interface
topic	Engineering Spiking convolutional neural networks Voice keyword recognition
url	https://hdl.handle.net/10356/179088 https://ieeexplore.ieee.org/abstract/document/9081859
work_keys_str_mv	AT hujinhai voicekeywordrecognitionbasedonspikingconvolutionalneuralnetworkforhumanmachineinterface AT gohwangling voicekeywordrecognitionbasedonspikingconvolutionalneuralnetworkforhumanmachineinterface AT zhangzhongyi voicekeywordrecognitionbasedonspikingconvolutionalneuralnetworkforhumanmachineinterface AT gaoyuan voicekeywordrecognitionbasedonspikingconvolutionalneuralnetworkforhumanmachineinterface
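
The front end outlined in the description field (a 16-channel IIR filter bank, an energy detector, spike conversion, and spike counts per time window fed to the CNN) can be illustrated with a minimal sketch. This is not the authors' implementation: the band-pass biquad design, the log-spaced center frequencies, the Q factor, the integrate-and-fire threshold, and the window length are all assumptions chosen for illustration, and every function name below is hypothetical. The CNN classifier itself is omitted.

```python
import math


def bandpass_biquad(center_hz, sample_rate, q=4.0):
    """Second-order band-pass IIR section (audio-EQ-cookbook form, unity peak gain).

    The filter design and the Q value are assumptions, not taken from the paper.
    Returns normalized (b, a) coefficient tuples with a[0] == 1.
    """
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def iir_filter(b, a, signal):
    """Apply one biquad to a sequence of samples (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in signal:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def spike_counts(band_signal, window, threshold):
    """Energy detector plus integrate-and-fire neuron for one channel.

    The rectified band output is accumulated; each time the accumulator
    reaches the threshold it emits one spike and the threshold is subtracted.
    Spikes are counted per window of `window` samples.
    """
    counts = []
    membrane = 0.0
    count = 0
    for n, y in enumerate(band_signal, start=1):
        membrane += abs(y)
        if membrane >= threshold:
            membrane -= threshold
            count += 1
        if n % window == 0:
            counts.append(count)
            count = 0
    return counts


def spike_rate_map(signal, sample_rate, n_channels=16,
                   f_lo=100.0, f_hi=3400.0, window=160, threshold=5.0):
    """Return (center frequencies, channels x windows grid of spike counts).

    The grid is the kind of 2-D input a CNN stage could classify.
    The frequency range, window, and threshold are illustrative assumptions.
    """
    centers = [f_lo * (f_hi / f_lo) ** (k / (n_channels - 1))
               for k in range(n_channels)]
    rows = []
    for fc in centers:
        b, a = bandpass_biquad(fc, sample_rate)
        rows.append(spike_counts(iir_filter(b, a, signal), window, threshold))
    return centers, rows
```

Calling `spike_rate_map(samples, 8000)` on 0.2 s of audio sampled at 8 kHz would give a grid of 16 channels by 10 windows. Subtracting the threshold on each spike, rather than resetting the accumulator to zero, keeps the spike count roughly proportional to the band energy in the window.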
