Urban sound analysis and synthesis using artificial intelligence

With the advent of artificial intelligence and machine learning, multiple industries have gone through different kinds of revolution. For example, convolutional neural networks have drastically changed the conventional ways for computers to capture features of images and video, also known as computer vi...

Bibliographic Details
Main Author:	Guo, Zixun
Other Authors:	Gan Woon Seng
Format:	Final Year Project (FYP)
Language:	English
Published:	Nanyang Technological University 2020
Subjects:	Engineering::Electrical and electronic engineering
Online Access:	https://hdl.handle.net/10356/141355

_version_	1826113410349137920
author	Guo, Zixun
author2	Gan Woon Seng
author_facet	Gan Woon Seng Guo, Zixun
author_sort	Guo, Zixun
collection	NTU
description	With the advent of artificial intelligence and machine learning, multiple industries have gone through different kinds of revolution. For example, convolutional neural networks have drastically changed the conventional ways for computers to capture features of images and video, also known as computer vision. In the audio domain, artificial intelligence has been widely used in areas such as sound classification and speech-to-text conversion. In this work, I will mainly focus on the use of artificial intelligence in urban sound analysis and processing, where it has been shown to perform much better than conventional methods. Unlike images or videos, analog sound has to be sampled and quantized in order to be stored in digital format. This work considers only digital sound, since neural networks can only operate on digital values. Digital sound also has its own set of features, such as sampling frequency and bit depth. Various research works have also used sound features in the frequency domain, such as bandwidth. The sampling frequency, an important feature of digital sound, is normally above 8 kHz. This raises issues in audio processing, since one second of audio contains thousands of discrete digital values. In order to process large amounts of sound samples in a sequential manner, the focus of this work will be on recurrent neural networks, a type of network structure with its own memory mechanism that can deal with long-term dependencies. In this work I will focus on two topics: audio captioning and audio synthesis. Firstly, captioning using AI has been widely used in the field of computer vision. Audio captioning, in turn, would help people with hearing difficulties perceive sound information. Secondly, audio data collection can be time-consuming and costly. By learning audio patterns and inter-dependencies, however, sound synthesis could generate sound more efficiently.
first_indexed	2024-10-01T03:22:47Z
format	Final Year Project (FYP)
id	ntu-10356/141355
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T03:22:47Z
publishDate	2020
publisher	Nanyang Technological University
record_format	dspace
spelling	ntu-10356/141355 2023-07-07T18:38:57Z Urban sound analysis and synthesis using artificial intelligence Guo, Zixun Gan Woon Seng School of Electrical and Electronic Engineering Smart Nation TRANS Lab Information Communication Institute of Singapore Furi Andi Karnapi EWSGAN@ntu.edu.sg, furi@ntu.edu.sg Engineering::Electrical and electronic engineering Bachelor of Engineering (Electrical and Electronic Engineering) 2020-06-08T02:17:26Z 2020-06-08T02:17:26Z 2020 Final Year Project (FYP) https://hdl.handle.net/10356/141355 en A3090-191 application/pdf Nanyang Technological University
spellingShingle	Engineering::Electrical and electronic engineering Guo, Zixun Urban sound analysis and synthesis using artificial intelligence
title	Urban sound analysis and synthesis using artificial intelligence
topic	Engineering::Electrical and electronic engineering
url	https://hdl.handle.net/10356/141355
work_keys_str_mv	AT guozixun urbansoundanalysisandsynthesisusingartificialintelligence

Similar Items