Extended performance analysis of deep-learning algorithms for mice vocalization segmentation

Abstract: Analysis of ultrasonic vocalizations (USVs) is a fundamental tool for studying animal communication. It can be used for behavioral investigation of mice in ethological studies and in the fields of neuroscience and neuropharmacology. USVs are usually recorded with a microphone sensitive to ultrasound frequencies and then processed by dedicated software, which helps the operator identify and characterize different families of calls. Recently, many automated systems have been proposed to perform both the detection and the classification of USVs. USV segmentation is the crucial step in this framework, since the quality of the call processing depends strictly on how accurately the call itself has been detected. In this paper, we investigate the performance of three supervised deep-learning methods for automated USV segmentation: an Auto-Encoder neural network (AE), a U-NET neural network (UNET) and a Recurrent Neural Network (RNN). The proposed models receive as input the spectrogram associated with the recorded audio track and return as output the regions in which USV calls have been detected. To evaluate the models, we built a dataset by recording several audio tracks and manually segmenting the corresponding USV spectrograms generated with the Avisoft software, thereby producing the ground truth (GT) used for training. All three architectures achieved precision and recall above 90%, with UNET and AE exceeding 95%, surpassing the other state-of-the-art methods considered for comparison in this study. The evaluation was also extended to an external dataset, where UNET again exhibited the highest performance. We suggest that our experimental results may represent a valuable benchmark for future work.
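As a rough illustration of the pipeline the abstract describes (a model outputs per-frame detections over the spectrogram's time axis, which are turned into call regions and scored frame-wise with precision and recall), here is a minimal pure-Python sketch. This is not the authors' code; the function names and the toy masks are assumptions for illustration only.

```python
# Illustrative sketch (not the paper's implementation): given per-frame
# binary masks over a spectrogram's time axis, extract detected call
# regions and score a prediction frame-wise against the ground truth.

def mask_to_regions(mask):
    """Convert a binary frame mask to (start, end) index pairs, end exclusive.
    E.g. [0, 1, 1, 0, 1] -> [(1, 3), (4, 5)]."""
    regions, start = [], None
    for i, v in enumerate(mask):
        if v and start is None:
            start = i                      # a call region opens here
        elif not v and start is not None:
            regions.append((start, i))     # region closed at frame i
            start = None
    if start is not None:                  # region still open at the end
        regions.append((start, len(mask)))
    return regions

def frame_precision_recall(gt_mask, pred_mask):
    """Frame-wise precision and recall of a predicted mask vs. ground truth."""
    tp = sum(1 for g, p in zip(gt_mask, pred_mask) if g and p)
    fp = sum(1 for g, p in zip(gt_mask, pred_mask) if not g and p)
    fn = sum(1 for g, p in zip(gt_mask, pred_mask) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example: ground truth contains two calls; the prediction clips
# one frame off the first call.
gt   = [0, 1, 1, 1, 0, 0, 1, 1, 0]
pred = [0, 1, 1, 0, 0, 0, 1, 1, 0]
print(mask_to_regions(pred))             # [(1, 3), (6, 8)]
print(frame_precision_recall(gt, pred))  # (1.0, 0.8)
```

A real evaluation would additionally need the mapping from frame indices to milliseconds (given by the spectrogram hop size) and possibly a tolerance on region boundaries, but the frame-level counting above is the core of a precision/recall score such as the ones reported in the abstract.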

Bibliographic Details
Main Authors: Daniele Baggi, Marika Premoli, Alessandro Gnutti, Sara Anna Bonini, Riccardo Leonardi, Maurizio Memo, Pierangelo Migliorati
Format: Article
Language: English
Published: Nature Portfolio 2023-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-023-38186-7
ISSN: 2045-2322
Author affiliations: Department of Information Engineering, University of Brescia (Baggi, Gnutti, Leonardi, Migliorati); Department of Molecular and Translational Medicine, University of Brescia (Premoli, Bonini, Memo).