ICPTC: Iranian commercial pistachio tree cultivars standard dataset


Bibliographic Details
Main Authors: Ahmad Heidary-Sharifabad, Mohsen Sardari Zarchi, Gholamreza Zarei
Format: Article
Language: English
Published: Elsevier 2021-10-01
Series: Data in Brief
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340921006326
Description
Summary: This paper contains datasets related to the article “An efficient deep learning model for cultivar identification of a pistachio tree” [1]. There are about 11 species of pistachio, which often have high commercial and economic value in Iran and the United States. Because cultivars differ in their characteristics and traits, the ability to identify pistachio tree cultivars is crucial for harvesting optimal yields, reducing costs, and preventing damage. For this purpose, pistachio tree cultivars must be identified in their natural habitat. Identifying a cultivar from its appearance is a challenging vision task that can be facilitated by deep learning. The feasibility of applying deep learning algorithms to identify pistachio tree cultivars depends on access to an appropriate dataset. Therefore, the ICPTC dataset was collected from trees of different pistachio cultivars in their natural habitats, under real-world conditions, in pistachio orchards of the Chah-Afzal region in Ardakan County, Yazd, Iran. This imbalanced dataset comprises 526 RGB color images of 4 pistachio tree cultivars, with 109 to 171 images per cultivar. The trees of the Iranian commercial pistachio cultivars, named Jumbo (Kalle-Ghuchi), Long (Ahmad-Aghaei), Round (O'hadi), and Super-long (Akbari), have distinctive branch expansion, leaf patterns, leaf shapes, and colors. Images were taken of multiple trees for each cultivar, at different camera-to-target distances, viewpoints, and angles, in natural sunlight during April and May. The collected images are not pre-processed; they are only grouped into their respective classes (Jumbo, Long, Round, and Super-long). The images in each class are split into 20% for testing, 17% for validation, and 63% for training. Test images are selected from trees different from those in the training set; the training and validation images are then randomly separated from the remaining images in each category. The ICPTC dataset is publicly and freely available at https://data.mendeley.com/datasets/6mmjjkpd5m/draft?a=af46a232-df30-4cf1-b303-6071d90ac8ad
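The split described in the summary (test images held out by tree, then a random training/validation split of the remainder) could be reproduced roughly as in the sketch below. This is only an illustration, not the authors' procedure: the function name `split_cultivar`, the `(image_path, tree_id)` input format, and the availability of per-image tree identifiers are all assumptions, since the record does not say whether the dataset ships tree labels.

```python
import random
from collections import defaultdict


def split_cultivar(images, test_frac=0.20, val_frac=0.17, seed=0):
    """Split the images of ONE cultivar into train/val/test lists.

    `images` is a list of (image_path, tree_id) pairs. The tree_id field is
    hypothetical: it stands for whatever identifies the photographed tree.
    """
    rng = random.Random(seed)

    # Group image paths by the tree they were taken from.
    by_tree = defaultdict(list)
    for path, tree_id in images:
        by_tree[tree_id].append(path)

    n_total = len(images)
    n_test_target = round(test_frac * n_total)
    n_val = round(val_frac * n_total)

    # Assign whole trees to the test set until the target size is reached,
    # so that no test tree also contributes training or validation images.
    tree_ids = sorted(by_tree)
    rng.shuffle(tree_ids)
    test, remaining = [], []
    for tree_id in tree_ids:
        if len(test) < n_test_target:
            test.extend(by_tree[tree_id])
        else:
            remaining.extend(by_tree[tree_id])

    # Randomly separate validation and training images from what is left.
    rng.shuffle(remaining)
    val, train = remaining[:n_val], remaining[n_val:]
    return {"train": train, "val": val, "test": test}
```

Because whole trees are assigned to the test set, the realized test share can overshoot 20% when trees contribute unequal numbers of images; the percentages are targets rather than guarantees in this sketch.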
ISSN: 2352-3409