Developing the Raster Big Data Benchmark: A Comparison of Raster Analysis on Big Data Platforms

Technologies around the world produce and interact with geospatial data instantaneously, from mobile web applications to satellite imagery that is collected and processed across the globe daily. Big raster data allow researchers to integrate and uncover new knowledge about geospatial patterns and pr...

Bibliographic Details
Main Authors: David Haynes, Philip Mitchell, Eric Shook
Format: Article
Language: English
Published: MDPI AG 2020-11-01
Series: ISPRS International Journal of Geo-Information
Subjects: geospatial, computation, spatial benchmark, cybergis
Online Access: https://www.mdpi.com/2220-9964/9/11/690
author David Haynes
Philip Mitchell
Eric Shook
collection DOAJ
description Technologies around the world produce and interact with geospatial data instantaneously, from mobile web applications to satellite imagery that is collected and processed across the globe daily. Big raster data allow researchers to integrate and uncover new knowledge about geospatial patterns and processes. However, we are at a critical moment, as we have an ever-growing number of big data platforms that are being co-opted to support spatial analysis. A gap in the literature is the lack of a robust assessment comparing the efficiency of raster data analysis on big data platforms. This research begins to address this issue by establishing a raster data benchmark that employs freely accessible datasets to provide a comprehensive performance evaluation and comparison of raster operations on big data platforms. The benchmark is critical for evaluating the performance of spatial operations on big data platforms. The benchmarking datasets and operations are applied to three big data platforms. We report computing times and performance bottlenecks so that GIScientists can make informed choices regarding the performance of each platform. Each platform is evaluated for five raster operations: pixel count, reclassification, raster add, focal averaging, and zonal statistics using three different raster datasets.
first_indexed 2024-03-10T14:45:08Z
format Article
id doaj.art-c161b11650e543c18de53f266bf462b5
institution Directory Open Access Journal
issn 2220-9964
language English
last_indexed 2024-03-10T14:45:08Z
publishDate 2020-11-01
publisher MDPI AG
record_format Article
series ISPRS International Journal of Geo-Information
spelling doaj.art-c161b11650e543c18de53f266bf462b5; 2023-11-20T21:30:45Z; eng; MDPI AG; ISPRS International Journal of Geo-Information; ISSN 2220-9964; 2020-11-01; volume 9, issue 11, article 690; DOI 10.3390/ijgi9110690
David Haynes (Institute for Health Informatics, University of Minnesota, Minneapolis, MN 55455, USA)
Philip Mitchell (Ali I. Al-Naimi Petroleum Engineering Research Center, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia)
Eric Shook (Geography Environment and Society, University of Minnesota, Minneapolis, MN 55455, USA)
title Developing the Raster Big Data Benchmark: A Comparison of Raster Analysis on Big Data Platforms
topic geospatial
computation
spatial benchmark
cybergis
url https://www.mdpi.com/2220-9964/9/11/690