Automatic Recognition of Giant Panda Attributes from Their Vocalizations Based on Squeeze-and-Excitation Network

Bibliographic Details
Main Authors: Qijun Zhao, Yanqiu Zhang, Rong Hou, Mengnan He, Peng Liu, Ping Xu, Zhihe Zhang, Peng Chen
Format: Article
Language: English
Published: MDPI AG 2022-10-01
Series: Sensors
Subjects: giant panda; attribute recognition; bioacoustics; species conservation; deep learning; SENet
Online Access: https://www.mdpi.com/1424-8220/22/20/8015
_version_ 1797469945747996672
author Qijun Zhao
Yanqiu Zhang
Rong Hou
Mengnan He
Peng Liu
Ping Xu
Zhihe Zhang
Peng Chen
collection DOAJ
description The giant panda (Ailuropoda melanoleuca) has long attracted the attention of conservationists as a flagship and umbrella species. Collecting attribute information on the age structure and sex ratio of wild giant panda populations can support our understanding of their status and the design of more effective conservation schemes. Because traditional methods cannot automatically recognize the age and sex of giant pandas, we designed a SENet (Squeeze-and-Excitation Network)-based model to recognize these attributes automatically from their vocalizations. We focused on recognizing age group (juvenile or adult) and sex. We chose vocalizations because, among the modes of animal communication, sound offers long transmission distances, strong penetrating power, and rich information. We collected a dataset of calls from 28 captive giant panda individuals, with a total recording duration of 1298.02 s. We used MFCCs (Mel-frequency cepstral coefficients), an acoustic feature, as inputs to the SENet. Because small datasets hinder convergence during training, we increased the size of the training data via SpecAugment. In addition, we used focal loss to reduce the impact of data imbalance. The F1 scores of our method for recognizing age group and sex reached 96.46% ± 5.71% and 85.85% ± 7.99%, respectively, demonstrating that automatic recognition of giant panda attributes from their vocalizations is feasible and effective. This more convenient, faster, time-saving, and labor-saving attribute recognition method can be used in future investigations of wild giant pandas.
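The description says focal loss was used to reduce the impact of data imbalance but gives no formula or hyperparameters. As a rough illustration only, the sketch below implements the commonly cited binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions chosen for illustration, not values taken from this record.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a batch (illustrative sketch).

    probs  -- predicted probabilities of the positive class, each in [0, 1]
    labels -- ground-truth labels, each 0 or 1
    gamma  -- focusing parameter (assumed default; 0 disables focusing)
    alpha  -- weight of the positive class (assumed default)
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of well-classified examples.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# A confident correct prediction contributes far less than a confident wrong one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary binary cross-entropy, which makes a convenient sanity check for the sketch.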
first_indexed 2024-03-09T19:30:07Z
format Article
id doaj.art-cf9d3c40dc664878b4502f293f0628be
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T19:30:07Z
publishDate 2022-10-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-cf9d3c40dc664878b4502f293f0628be
2023-11-24T02:30:23Z
eng
MDPI AG
Sensors
1424-8220
2022-10-01
Volume 22, Issue 20, Article 8015
DOI 10.3390/s22208015
Automatic Recognition of Giant Panda Attributes from Their Vocalizations Based on Squeeze-and-Excitation Network
Qijun Zhao (0): National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, China
Yanqiu Zhang (1): National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, China
Rong Hou (2): Chengdu Research Base of Giant Panda Breeding, Sichuan Key Laboratory of Conservation Biology for Endangered Wildlife, Chengdu 610086, China
Mengnan He (3): Chengdu Research Base of Giant Panda Breeding, Sichuan Key Laboratory of Conservation Biology for Endangered Wildlife, Chengdu 610086, China
Peng Liu (4): Chengdu Research Base of Giant Panda Breeding, Sichuan Key Laboratory of Conservation Biology for Endangered Wildlife, Chengdu 610086, China
Ping Xu (5): Giant Panda National Park Chengdu Administration, Chengdu 610086, China
Zhihe Zhang (6): Sichuan Academy of Giant Panda, Chengdu 610086, China
Peng Chen (7): Chengdu Research Base of Giant Panda Breeding, Sichuan Key Laboratory of Conservation Biology for Endangered Wildlife, Chengdu 610086, China
https://www.mdpi.com/1424-8220/22/20/8015
giant panda; attribute recognition; bioacoustics; species conservation; deep learning; SENet
title Automatic Recognition of Giant Panda Attributes from Their Vocalizations Based on Squeeze-and-Excitation Network
topic giant panda
attribute recognition
bioacoustics
species conservation
deep learning
SENet
url https://www.mdpi.com/1424-8220/22/20/8015