Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records
Citizen science is essential for nationwide ecological surveys of species distribution. While the accuracy of the information collected by beginner participants is not guaranteed, it is important to develop an automated system to assist species identification. Deep learning techniques for image reco...
Main Authors: | Jianqiang Sun, Ryo Futahashi, Takehiko Yamanaka |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2021-12-01 |
Series: | Frontiers in Ecology and Evolution |
Subjects: | citizen science; species identification; dragonfly; damselfly; deep learning; image recognition |
Online Access: | https://www.frontiersin.org/articles/10.3389/fevo.2021.762173/full |
_version_ | 1819095504982638592 |
---|---|
author | Jianqiang Sun; Ryo Futahashi; Takehiko Yamanaka |
author_facet | Jianqiang Sun; Ryo Futahashi; Takehiko Yamanaka |
author_sort | Jianqiang Sun |
collection | DOAJ |
description | Citizen science is essential for nationwide ecological surveys of species distribution. While the accuracy of the information collected by beginner participants is not guaranteed, it is important to develop an automated system to assist species identification. Deep learning techniques for image recognition have been successfully applied in many fields and may contribute to species identification. However, deep learning techniques have not been utilized in ecological surveys of citizen science, because they require the collection of a large number of images, which is time-consuming and labor-intensive. To counter these issues, we propose a simple and effective strategy to construct species identification systems using fewer images. As an example, we collected 4,571 images of 204 species of Japanese dragonflies and damselflies from open-access websites (i.e., web scraping) and scanned 4,005 images from books and specimens for species identification. In addition, we obtained field occurrence records (i.e., range of distribution) of all species of dragonflies and damselflies from the National Biodiversity Center, Japan. Using the images and records, we developed a species identification system for Japanese dragonflies and damselflies. We validated that the accuracy of the species identification system was improved by combining web-scraped and scanned images; the top-1 accuracy of the system was 0.324 when trained using only web-scraped images, whereas it improved to 0.546 when trained using both web-scraped and scanned images. In addition, the combination of images and field occurrence records further improved the top-1 accuracy to 0.668. The values of top-3 accuracy under the three conditions were 0.565, 0.768, and 0.873, respectively. Thus, combining images with field occurrence records markedly improved the accuracy of the species identification system. The strategy of species identification proposed in this study can be applied to any group of organisms. Furthermore, it has the potential to strike a balance between continuously recruiting beginner participants and updating the data accuracy of citizen science. |
first_indexed | 2024-12-21T23:44:22Z |
format | Article |
id | doaj.art-e73005eadade4cdc93a7900ff44005b1 |
institution | Directory of Open Access Journals |
issn | 2296-701X |
language | English |
last_indexed | 2024-12-21T23:44:22Z |
publishDate | 2021-12-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Ecology and Evolution |
spelling | doaj.art-e73005eadade4cdc93a7900ff44005b1; 2022-12-21T18:46:09Z; eng; Frontiers Media S.A.; Frontiers in Ecology and Evolution; 2296-701X; 2021-12-01; 9; 10.3389/fevo.2021.762173; 762173; Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records; Jianqiang Sun (Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, Japan); Ryo Futahashi (Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan); Takehiko Yamanaka (Research Center for Agricultural Information Technology, National Agriculture and Food Research Organization, Tsukuba, Japan); https://www.frontiersin.org/articles/10.3389/fevo.2021.762173/full; citizen science; species identification; dragonfly; damselfly; deep learning; image recognition |
spellingShingle | Jianqiang Sun; Ryo Futahashi; Takehiko Yamanaka; Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records; Frontiers in Ecology and Evolution; citizen science; species identification; dragonfly; damselfly; deep learning; image recognition |
title | Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records |
title_full | Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records |
title_fullStr | Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records |
title_full_unstemmed | Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records |
title_short | Improving the Accuracy of Species Identification by Combining Deep Learning With Field Occurrence Records |
title_sort | improving the accuracy of species identification by combining deep learning with field occurrence records |
topic | citizen science; species identification; dragonfly; damselfly; deep learning; image recognition |
url | https://www.frontiersin.org/articles/10.3389/fevo.2021.762173/full |
work_keys_str_mv | AT jianqiangsun improvingtheaccuracyofspeciesidentificationbycombiningdeeplearningwithfieldoccurrencerecords AT ryofutahashi improvingtheaccuracyofspeciesidentificationbycombiningdeeplearningwithfieldoccurrencerecords AT takehikoyamanaka improvingtheaccuracyofspeciesidentificationbycombiningdeeplearningwithfieldoccurrencerecords |
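
The abstract in this record reports that combining image-based predictions with field occurrence records raised top-1 and top-3 accuracy. The record does not say how the two sources are combined, so the following is only a minimal sketch under one simple assumption: species with no occurrence record in the survey area are masked out of the classifier's scores, and the remaining scores are renormalized. The function names, the species labels (`species_a`, `species_b`, `species_c`), and the scores are hypothetical placeholders, not data or code from the article.

```python
from typing import Dict, List, Set


def filter_by_occurrence(scores: Dict[str, float], recorded: Set[str]) -> Dict[str, float]:
    """Zero out species without an occurrence record, then renormalize.

    Assumption: a hard presence/absence mask. The article may use a
    different rule; this only illustrates the general idea.
    """
    masked = {sp: (p if sp in recorded else 0.0) for sp, p in scores.items()}
    total = sum(masked.values())
    if total == 0.0:
        # No candidate species is recorded in the area: keep image-only scores.
        return dict(scores)
    return {sp: p / total for sp, p in masked.items()}


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k species with the highest scores, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [sp for sp, _ in ranked[:k]]


def top_k_accuracy(predictions: List[Dict[str, float]], labels: List[str], k: int) -> float:
    """Fraction of samples whose true label is among the top-k predictions."""
    hits = sum(1 for scores, label in zip(predictions, labels) if label in top_k(scores, k))
    return hits / len(labels)


# Hypothetical classifier output for one photo and the set of species
# recorded in the area where the photo was taken.
image_scores = {"species_a": 0.5, "species_b": 0.3, "species_c": 0.2}
recorded_here = {"species_b", "species_c"}

combined = filter_by_occurrence(image_scores, recorded_here)
print(top_k(image_scores, 1))  # image-only best guess
print(top_k(combined, 1))      # best guess after applying occurrence records
```

In this toy case the image-only best guess is a species with no local record, so the mask promotes the next candidate. A softer weighting (for example, scaling by record frequency) would be another option, again as an assumption rather than the article's stated method.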