Neuromorphic machine learning for audio processing : from bio-inspiration to biomedical applications

Bibliographic Details
Main Author: Acharya, Jyotibdha
Other Authors: Arindam Basu
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/142608
Description
Summary: The recent success of Deep Neural Networks (DNNs) has renewed interest in machine learning and, in particular, in bio-inspired machine learning algorithms. A DNN is a neural network with multiple layers (typically two or more) in which the neurons are interconnected through tunable weights. Although these architectures are not new, the availability of massive amounts of data, huge computing power, and new training techniques has led to their great success in recent times. DNNs have been applied to a variety of fields such as image classification, face recognition in images, word recognition in speech, natural language processing, and game playing, and their success stories continue to multiply. With the progress in software, there has been a concomitant push to develop better hardware architectures to support the deployment as well as the training of these algorithms. While these methods are loosely inspired by the brain, in terms of actual implementation the similarity between the mammalian brain and these algorithms is merely superficial. More often than not, these algorithms consume large amounts of energy on real-world tasks because they are computation- and memory-heavy, which limits their potential application in energy-constrained scenarios such as the Internet of Things (IoT) or wearables. The IoT is a rapidly growing phenomenon in which millions of connected sensors are distributed to improve a variety of applications ranging from precision agriculture to smart factories. In recent years, there has also been a large shift in the biomedical industry towards reliable wearable devices for monitoring health conditions and detecting diseases early. To make IoT systems scalable to millions of nodes or sensors, one has to overcome limits on data rate and energy dissipation. A possible solution is edge computing, where part of the processing is done at the sensor (at the edge of the network) instead of shifting all processing to the cloud.
The common challenge for wide-scale adoption of edge computing in IoT and wearable applications is the constraint posed by the limited energy and memory available in these devices. Neuromorphic engineering is a possible solution to this problem: it uses approaches such as analog or physics-based processing, non-von Neumann architectures, low-precision digital datapaths, and event- or spike-based processing to overcome energy and memory bottlenecks. It is therefore no surprise that neuromorphic engineering was recently voted one of the top ten emerging technologies by the World Economic Forum, and that the market for neuromorphic hardware is expected to grow to ~$1.8B by 2023/2025. However, cross-layer innovations in neuromorphic algorithms, architectures, circuits, and devices are required to enable adaptive intelligence, especially on embedded systems with severe power and area constraints. Just as the success story of deep learning began with the massive improvements that deep neural networks brought to computer vision tasks, the same trend is being repeated in neuromorphic engineering. Spiking neural networks are already reaching performance close to that of their traditional deep learning counterparts, and several post-CMOS neuromorphic platforms have been shown to perform basic computer vision tasks such as digit recognition. The primary focus of this thesis is a cognitive task that is less explored from a neuromorphic perspective: audio processing. To this end, neuromorphic audio systems are explored from a diverse set of perspectives: neuromorphic audio sensors, novel neuromorphic nano-devices, and potential biomedical application areas for such systems. In the first work, low-power feature extraction and data preprocessing techniques tailored to neuromorphic audio sensors were explored.
Developments in neuromorphic spiking cochlea sensors and population-encoding-based extreme learning machine (ELM) hardware were brought together to design a real-time, power- and memory-efficient neuromorphic speech recognition system. The proposed hybrid feature extraction strategies were also extended to object tracking based on neuromorphic vision sensors. In the second work, a unique neuromorphic computing platform comprising photo-excitable neuristors (PENs) is proposed that expands the potential of memristor-based implementations beyond simple pattern matching to complex cognitive tasks such as speech recognition. Combining optical writing and electrical erasing schemes, a new method is developed to transfer the offline-learnt weights of a deep recurrent neural network onto the memristive device, resulting in highly energy-efficient speech classification due to electrical-mode inference and in-memory computing. In the third work, different feature extraction and classification strategies are implemented for audio-based respiratory anomaly detection, and low-precision representations of the proposed networks are then explored for memory-efficient wearable implementation. In the final work, the viability of spiking neural networks and bio-realistic features is examined for wider biomedical applications. By employing spiking neural networks for audio-based detection of cardiac anomalies, it is demonstrated that spiking neural networks can achieve a level of performance similar to that of their ANN counterparts at a fraction of the computational cost. Overall, the primary goal of this work is to devise brain-inspired strategies and algorithms that enable power- and memory-efficient audio processing, which can benefit a wide range of resource-constrained applications, from speech processing to audio-based biomedical monitoring.