Forecasting East and West Coast Gasoline Prices with Tree-Based Machine Learning Algorithms

This study aims to forecast New York and Los Angeles gasoline spot prices at a daily frequency. The dataset includes gasoline prices and a large set of 128 other relevant variables spanning the period from 17 February 2004 to 26 March 2022. These variables were fed to three tree-based machine learning...

Bibliographic Details
Main Authors: Emmanouil Sofianos, Emmanouil Zaganidis, Theophilos Papadimitriou, Periklis Gogas
Format: Article
Language: English
Published: MDPI AG 2024-03-01
Series: Energies
Subjects: gasoline; decision tree; random forest; XGBoost; machine learning; forecasting
Online Access: https://www.mdpi.com/1996-1073/17/6/1296
author Emmanouil Sofianos
Emmanouil Zaganidis
Theophilos Papadimitriou
Periklis Gogas
collection DOAJ
description This study aims to forecast New York and Los Angeles gasoline spot prices at a daily frequency. The dataset includes gasoline prices and a large set of 128 other relevant variables spanning the period from 17 February 2004 to 26 March 2022. These variables were fed to three tree-based machine learning algorithms: decision trees, random forest, and XGBoost. Furthermore, a variable importance measure (VIM) technique was applied to identify and rank the most important explanatory variables. The optimal model, a trained random forest, achieves an out-of-sample mean absolute percentage error (MAPE) of 3.23% for New York and 3.78% for Los Angeles gasoline spot prices. The first lag of the gasoline price, AR(1), is the most important variable in both markets, and the top five variables are all energy-related. This paper can strengthen the understanding of price determinants and has the potential to inform strategic decisions and policy directions within the energy sector, making it relevant to both industry practitioners and policymakers.
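The two quantities the description relies on, the first lag of the price and the MAPE, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `mape` and `add_first_lag` are hypothetical, the prices are invented for illustration, and the naive "tomorrow equals today" forecast merely stands in for a trained model.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def add_first_lag(prices):
    """Pair each price with the previous observation (the first-lag term)."""
    return [(prices[t - 1], prices[t]) for t in range(1, len(prices))]


# Hypothetical daily spot prices in USD per gallon (made up, not from the study).
prices = [2.00, 2.10, 2.05, 2.20, 2.15]

pairs = add_first_lag(prices)
# Naive baseline (an assumption for illustration): forecast = previous day's price.
naive_forecast = [lag for lag, _ in pairs]
actual = [target for _, target in pairs]

naive_mape = mape(actual, naive_forecast)
print(f"Naive first-lag MAPE: {naive_mape:.2f}%")
```

In a setting like the one described, the lagged price would be one input column among many, and the MAPE would be computed only on observations held out from training.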
first_indexed 2024-04-24T18:21:27Z
format Article
id doaj.art-8e3232a73151443bba6c4208402590e5
institution Directory Open Access Journal
issn 1996-1073
language English
last_indexed 2024-04-24T18:21:27Z
publishDate 2024-03-01
publisher MDPI AG
record_format Article
series Energies
spelling doaj.art-8e3232a73151443bba6c4208402590e5 2024-03-27T13:35:22Z eng MDPI AG Energies 1996-1073 2024-03-01 17 6 1296 10.3390/en17061296
Forecasting East and West Coast Gasoline Prices with Tree-Based Machine Learning Algorithms
Emmanouil Sofianos (Bureau d’Economie Théorique et Appliquée (BETA), University of Strasbourg, 67085 Strasbourg, France)
Emmanouil Zaganidis (Department of Economics, Democritus University of Thrace, 69100 Komotini, Greece)
Theophilos Papadimitriou (Department of Economics, Democritus University of Thrace, 69100 Komotini, Greece)
Periklis Gogas (Department of Economics, Democritus University of Thrace, 69100 Komotini, Greece)
https://www.mdpi.com/1996-1073/17/6/1296
gasoline; decision tree; random forest; XGBoost; machine learning; forecasting
title Forecasting East and West Coast Gasoline Prices with Tree-Based Machine Learning Algorithms
topic gasoline
decision tree
random forest
XGBoost
machine learning
forecasting
url https://www.mdpi.com/1996-1073/17/6/1296