Self-Supervised Transfer Learning from Natural Images for Sound Classification


Bibliographic Details
Main Authors: Sungho Shin, Jongwon Kim, Yeonguk Yu, Seongju Lee, Kyoobin Lee
Format: Article
Language: English
Published: MDPI AG 2021-03-01
Series: Applied Sciences
Subjects:
Online Access: https://www.mdpi.com/2076-3417/11/7/3043
Description
Summary: We propose transfer learning from natural images to audio-based images using self-supervised learning schemes. Through self-supervised learning, convolutional neural networks (CNNs) can learn general representations of natural images without labels. In this study, a convolutional neural network was pre-trained on natural images (ImageNet) via self-supervised learning and subsequently fine-tuned on the target audio samples. Pre-training with the self-supervised learning scheme significantly improved sound classification performance on the following benchmarks: ESC-50, UrbanSound8k, and GTZAN. The network pre-trained via self-supervised learning achieved a level of accuracy similar to that of networks pre-trained with a supervised method, which requires labels. We therefore demonstrate that transfer learning from natural images improves performance on audio-related tasks, and that self-supervised learning with natural images is an adequate pre-training scheme in terms of simplicity and effectiveness.
ISSN: 2076-3417
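
Illustrative note: the summary refers to "audio-based images" without stating how they are produced. The sketch below shows one common way to turn a waveform into an image-like array, assuming a log-magnitude spectrogram that is scaled to [0, 1] and copied into three channels so that it matches the input shape of a CNN pre-trained on natural images. The function name `spectrogram_image`, the frame size, the hop length, and the test tone are all hypothetical choices made for illustration; none of them come from the record, and this is not presented as the authors' pipeline.

```python
import cmath
import math


def spectrogram_image(samples, frame_size=64, hop=32):
    """Turn a mono waveform into a 3-channel log-magnitude spectrogram.

    Returns nested lists indexed as [channel][frequency_bin][frame].
    All parameter defaults are illustrative assumptions.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1  # non-negative frequencies only

    columns = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT, kept for clarity; a real pipeline would use an FFT library.
        column = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            column.append(math.log1p(abs(acc)))  # log compression
        columns.append(column)
    if not columns:
        raise ValueError("waveform is shorter than one frame")

    # Transpose to [frequency_bin][frame], then min-max scale to [0, 1].
    plane = [[col[k] for col in columns] for k in range(n_bins)]
    lo = min(min(row) for row in plane)
    hi = max(max(row) for row in plane)
    scale = (hi - lo) or 1.0
    plane = [[(v - lo) / scale for v in row] for row in plane]

    # Copy the single plane three times to mimic an RGB image.
    return [[row[:] for row in plane] for _ in range(3)]


# Usage: a 1 kHz tone sampled at 8 kHz (hypothetical test signal).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
image = spectrogram_image(tone)
print(len(image), len(image[0]), len(image[0][0]))  # channels, bins, frames
```

The resulting array could then be resized and passed to an image network for fine-tuning; that step depends on the deep learning framework in use and is left out here.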