Machine learning / deep learning approach to soundscape analysis

Bibliographic Details
Main Author:	Koh, Cheng Yong
Other Authors:	Gan Woon Seng
Format:	Final Year Project (FYP)
Language:	English
Published:	Nanyang Technological University 2022
Subjects:	Engineering::Electrical and electronic engineering
Online Access:	https://hdl.handle.net/10356/158057

Description
Summary:	Visual understanding of the soundscape environment is an enabling factor for a wide range of applications that study how humans perceive sounds, and audiovisual scene decomposition deepens that understanding. This project focuses on the decomposition of urban soundscapes such as parks, plazas, and streets. Because water sounds are a prominent sound source in urban landscapes, the project adds a new waterbody class to the segmentation model, a class that does not currently exist in most multiclass urban semantic segmentation models. The project proposes using the DeepLabV3+ model, with a ResNet50 backbone, trained on an extended Cityscapes dataset to perform semantic segmentation for urban scene decomposition. The training dataset includes additional waterbody images on top of the original Cityscapes images.
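
As an illustration of what adding a waterbody class involves, the following minimal sketch extends the 19 standard Cityscapes evaluation classes with one extra label, so that a segmentation head would need 20 output channels. The class name "waterbody", its position at the end of the list, the helper binary_water_mask_to_extended, and the choice to mark non-water pixels of the additional images as ignored are all assumptions made for this example; the record does not state how the project labels its data.

```python
# The 19 evaluation classes of the standard Cityscapes benchmark, in train-ID order.
CITYSCAPES_CLASSES = [
    "road", "sidewalk", "building", "wall", "fence", "pole",
    "traffic light", "traffic sign", "vegetation", "terrain", "sky",
    "person", "rider", "car", "truck", "bus", "train",
    "motorcycle", "bicycle",
]

# Assumption: the new class is appended last and named "waterbody".
EXTENDED_CLASSES = CITYSCAPES_CLASSES + ["waterbody"]
CLASS_TO_ID = {name: idx for idx, name in enumerate(EXTENDED_CLASSES)}

# Number of output channels a segmentation head would need for the extended label set.
NUM_CLASSES = len(EXTENDED_CLASSES)

# Cityscapes convention: pixels with this ID are excluded from the loss.
IGNORE_ID = 255


def binary_water_mask_to_extended(mask):
    """Map a binary water mask (1 = water, 0 = other) into the extended label space.

    Hypothetical helper: non-water pixels are set to IGNORE_ID because a
    water-only annotation says nothing about which other class they belong to.
    """
    water_id = CLASS_TO_ID["waterbody"]
    return [[water_id if px == 1 else IGNORE_ID for px in row] for row in mask]


if __name__ == "__main__":
    print(NUM_CLASSES)
    print(binary_water_mask_to_extended([[0, 1], [1, 0]]))
```

Under these assumptions, masks from the additional waterbody images can be mixed with the original Cityscapes masks during training, since both use the same label IDs.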
