A GIS Pipeline to Produce GeoAI Datasets from Drone Overhead Imagery

Drone imagery is becoming the main source of overhead information to support decisions in many different fields, especially with deep learning integration. Datasets to train object detection and semantic segmentation models to solve geospatial data analysis are called GeoAI datasets. They are compos...


Bibliographic Details
Main Authors: John R. Ballesteros, German Sanchez-Torres, John W. Branch-Bedoya
Format: Article
Language: English
Published: MDPI AG 2022-09-01
Series: ISPRS International Journal of Geo-Information
Subjects: GeoAI, GIS, dataset, drone, orthomosaics, U-Net
Online Access: https://www.mdpi.com/2220-9964/11/10/508
_version_ 1797472909077250048
author John R. Ballesteros
German Sanchez-Torres
John W. Branch-Bedoya
collection DOAJ
description Drone imagery is becoming the main source of overhead information to support decisions in many fields, especially when combined with deep learning. Datasets used to train object detection and semantic segmentation models for geospatial data analysis are called GeoAI datasets. They are composed of images and corresponding labels, represented by full-size masks that are typically obtained by manual digitizing. GIS software provides a set of tools that can automate tasks on geo-referenced raster and vector layers. This work describes a workflow that uses GIS tools to produce GeoAI datasets. In particular, it describes the steps to obtain ground truth data from OpenStreetMap (OSM) and to apply methods for geometric and spectral augmentation and for data fusion of drone imagery. A method semi-automatically produces masks for point and line objects by calculating an optimum buffer distance. Tessellation into chips, pairing, and imbalance checking are performed over the image–mask pairs. The dataset is split randomly into train, validation, and test data. All of the code for the different methods is provided in the paper, together with point and road datasets produced as examples of point and line geometries and the original drone orthomosaic images produced during the research. Semantic segmentation results on the point and line datasets using a classical U-Net show that the semi-automatically produced masks, called primitive masks, obtained a higher mIoU than other equal-size masks and almost the same mIoU as full-size manual masks.
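The tessellation, pairing, imbalance checking, and random splitting steps named in the description can be illustrated with a minimal sketch. It is not the authors' code: the function names (tessellate, foreground_fraction, pair_chips, random_split), the choice to drop partial edge tiles, the foreground threshold, and the 70/20/10 split are all assumptions made for illustration, and a toy 4x4 grid stands in for a real orthomosaic and its mask.

```python
import random


def tessellate(grid, chip):
    """Cut a 2D grid (list of rows) into non-overlapping chip x chip tiles.

    Assumption: partial tiles at the right and bottom edges are dropped.
    Returns a dict keyed by the (row, col) origin of each tile.
    """
    rows, cols = len(grid), len(grid[0])
    chips = {}
    for r in range(0, rows - chip + 1, chip):
        for c in range(0, cols - chip + 1, chip):
            chips[(r, c)] = [row[c:c + chip] for row in grid[r:r + chip]]
    return chips


def foreground_fraction(mask_chip):
    """Share of pixels labelled 1 in a binary mask chip (imbalance check)."""
    flat = [value for row in mask_chip for value in row]
    return sum(flat) / len(flat)


def pair_chips(image_chips, mask_chips, min_fraction=0.0):
    """Pair image and mask chips by tile origin.

    Assumption: pairs whose mask has no more than min_fraction foreground
    are discarded, as one hypothetical way to limit class imbalance.
    """
    pairs = []
    for key in sorted(image_chips.keys() & mask_chips.keys()):
        if foreground_fraction(mask_chips[key]) > min_fraction:
            pairs.append((key, image_chips[key], mask_chips[key]))
    return pairs


def random_split(items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle the items and split them into train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_val = int(len(shuffled) * fractions[1])
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Toy stand-ins for a single-band image and its binary mask.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
mask = [[0, 0, 1, 1],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
        [0, 1, 0, 0]]

image_chips = tessellate(image, 2)
mask_chips = tessellate(mask, 2)
pairs = pair_chips(image_chips, mask_chips)
train, val, test = random_split(pairs)
```

In a real pipeline the grids would be raster windows read with a GIS library rather than nested lists, and the threshold and split fractions would be chosen per dataset.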
first_indexed 2024-03-09T20:07:48Z
format Article
id doaj.art-6f07b55ba53d4de4af0bf043deeea2c2
institution Directory Open Access Journal
issn 2220-9964
language English
last_indexed 2024-03-09T20:07:48Z
publishDate 2022-09-01
publisher MDPI AG
record_format Article
series ISPRS International Journal of Geo-Information
spelling doaj.art-6f07b55ba53d4de4af0bf043deeea2c2; 2023-11-24T00:27:16Z; eng; MDPI AG; ISPRS International Journal of Geo-Information; 2220-9964; 2022-09-01; 11; 10; 508; 10.3390/ijgi11100508; A GIS Pipeline to Produce GeoAI Datasets from Drone Overhead Imagery; John R. Ballesteros (0), German Sanchez-Torres (1), John W. Branch-Bedoya (2); (0) Facultad de Minas, Universidad Nacional de Colombia, Medellín 050041, Colombia; (1) Facultad de Ingeniería, Universidad del Magdalena, Santa Marta 470001, Colombia; (2) Facultad de Minas, Universidad Nacional de Colombia, Medellín 050041, Colombia; https://www.mdpi.com/2220-9964/11/10/508; GeoAI; GIS; dataset; drone; orthomosaics; U-Net
title A GIS Pipeline to Produce GeoAI Datasets from Drone Overhead Imagery
topic GeoAI
GIS
dataset
drone
orthomosaics
U-Net
url https://www.mdpi.com/2220-9964/11/10/508