Concept and benchmark results for Big Data energy forecasting based on Apache Spark

Bibliographic Details
Main Authors: Jorge Ángel González Ordiano, Andreas Bartschat, Nicole Ludwig, Eric Braun, Simon Waczowicz, Nicolas Renkamp, Nico Peter, Clemens Düpmeier, Ralf Mikut, Veit Hagenmeyer
Format: Article
Language: English
Published: SpringerOpen 2018-03-01
Series: Journal of Big Data
Online Access: http://link.springer.com/article/10.1186/s40537-018-0119-6
Description
Summary: The present article describes a concept for the creation and application of energy forecasting models in a distributed environment. Additionally, a benchmark comparing the time required for the training and application of data-driven forecasting models on a single computer and a computing cluster is presented. This comparison is based on a simulated dataset, and both R and Apache Spark are used. Furthermore, the obtained results show certain points in which the utilization of distributed computing based on Spark may be advantageous.
ISSN: 2196-1115