Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records

Bibliographic Details
Main Authors: Jianqiang Sun, Ryo Futahashi, Takehiko Yamanaka
Format: Article
Language: English
Published: Frontiers Media S.A. 2021-12-01
Series: Frontiers in Ecology and Evolution
Subjects: citizen science, species identification, dragonfly, damselfly, deep learning, image recognition
Online Access: https://www.frontiersin.org/articles/10.3389/fevo.2021.762173/full
author Jianqiang Sun
Ryo Futahashi
Takehiko Yamanaka
collection DOAJ
description Citizen science is essential for nationwide ecological surveys of species distribution. While the accuracy of the information collected by beginner participants is not guaranteed, it is important to develop an automated system to assist species identification. Deep learning techniques for image recognition have been successfully applied in many fields and may contribute to species identification. However, deep learning techniques have not been utilized in ecological surveys of citizen science, because they require the collection of a large number of images, which is time-consuming and labor-intensive. To counter these issues, we propose a simple and effective strategy to construct species identification systems using fewer images. As an example, we collected 4,571 images of 204 species of Japanese dragonflies and damselflies from open-access websites (i.e., web scraping) and scanned 4,005 images from books and specimens for species identification. In addition, we obtained field occurrence records (i.e., range of distribution) of all species of dragonflies and damselflies from the National Biodiversity Center, Japan. Using the images and records, we developed a species identification system for Japanese dragonflies and damselflies. We validated that the accuracy of the species identification system was improved by combining web-scraped and scanned images; the top-1 accuracy of the system was 0.324 when trained using only web-scraped images, whereas it improved to 0.546 when trained using both web-scraped and scanned images. In addition, the combination of images and field occurrence records further improved the top-1 accuracy to 0.668. The values of top-3 accuracy under the three conditions were 0.565, 0.768, and 0.873, respectively. Thus, combining images with field occurrence records markedly improved the accuracy of the species identification system. 
The strategy of species identification proposed in this study can be applied to any group of organisms. Furthermore, it has the potential to strike a balance between continuously recruiting beginner participants and updating the data accuracy of citizen science.
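The abstract reports that adding field occurrence records to the image classifier raised top-1 accuracy from 0.546 to 0.668, but it does not say how the two sources are combined. The following is only a minimal sketch of one plausible scheme, assuming the classifier outputs per-species probabilities and the occurrence records are reduced to a presence/absence flag per species at the photo location. The function names `rerank_with_occurrence` and `top_k_accuracy` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def rerank_with_occurrence(probs: Sequence[float],
                           occurs: Sequence[bool]) -> List[float]:
    """Zero out species with no occurrence record, then renormalize.

    Assumption: a binary mask is used. The paper may weight records differently.
    """
    masked = [p if seen else 0.0 for p, seen in zip(probs, occurs)]
    total = sum(masked)
    if total == 0.0:
        # No recorded species has any probability mass: keep image-only scores.
        return list(probs)
    return [m / total for m in masked]


def top_k_accuracy(score_rows: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for scores, label in zip(score_rows, labels):
        ranked = sorted(range(len(scores)),
                        key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Toy data: 2 photos, 3 candidate species (indices 0, 1, 2).
image_probs = [
    [0.5, 0.4, 0.1],   # classifier prefers species 0, true species is 1
    [0.2, 0.7, 0.1],   # classifier already prefers the true species 1
]
occurrence = [
    [False, True, True],   # species 0 has no record where photo 1 was taken
    [True, True, False],
]
true_labels = [1, 1]

combined = [rerank_with_occurrence(p, o)
            for p, o in zip(image_probs, occurrence)]

print("top-1, images only:", top_k_accuracy(image_probs, true_labels, 1))
print("top-1, with records:", top_k_accuracy(combined, true_labels, 1))
```

On this toy data the first photo is misclassified from the image alone and corrected once the species without a local record is masked out, which mirrors the direction of the improvement reported in the abstract. The hard mask is a design simplification: occurrence records can be incomplete, so a softer prior may be preferable in practice.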
first_indexed 2024-12-21T23:44:22Z
format Article
id doaj.art-e73005eadade4cdc93a7900ff44005b1
institution Directory Open Access Journal
issn 2296-701X
language English
last_indexed 2024-12-21T23:44:22Z
publishDate 2021-12-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Ecology and Evolution
spelling doaj.art-e73005eadade4cdc93a7900ff44005b1; 2022-12-21T18:46:09Z; eng; Frontiers Media S.A.; Frontiers in Ecology and Evolution; 2296-701X; 2021-12-01; 9; 10.3389/fevo.2021.762173; 762173; Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records
Jianqiang Sun (Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, Japan)
Ryo Futahashi (Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan)
Takehiko Yamanaka (Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, Japan)
https://www.frontiersin.org/articles/10.3389/fevo.2021.762173/full
citizen science; species identification; dragonfly; damselfly; deep learning; image recognition
title Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records
topic citizen science
species identification
dragonfly
damselfly
deep learning
image recognition
url https://www.frontiersin.org/articles/10.3389/fevo.2021.762173/full