Housing Price Prediction Using Machine Learning Algorithms in COVID-19 Times

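Machine learning algorithms are being used for multiple real-life applications and in research. As a consequence of digital technology, large structured and georeferenced datasets are now more widely available, facilitating the use of these algorithms to analyze and identify patterns, as well as to make predictions that help users in decision making. This research aims to identify the best machine learning algorithms to predict house prices, and to quantify the impact of the COVID-19 pandemic on house prices in a Spanish city. The methodology addresses the phases of data preparation, feature engineering, hyperparameter training and optimization, model evaluation and selection, and finally model interpretation. Ensemble learning algorithms based on boosting (Gradient Boosting Regressor, Extreme Gradient Boosting, and Light Gradient Boosting Machine) and bagging (random forest and extra-trees regressor) are used and compared with a linear regression model. A case study is developed with georeferenced microdata of the real estate market in Alicante (Spain), before and after the pandemic declaration derived from COVID-19, together with information from other complementary sources such as the cadastre, socio-demographic and economic indicators, and satellite images. The results show that machine learning algorithms perform better than traditional linear models because they are better adapted to the nonlinearities of complex data such as real estate market data. Algorithms based on bagging (random forest and extra-trees regressor) show overfitting problems, while those based on boosting have better performance and lower overfitting. This research contributes to the literature on the Spanish real estate market by being one of the first studies to use machine learning and microdata to explore the incidence of the COVID-19 pandemic on house prices.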
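
The comparison summarized in this record's abstract (bagging and boosting ensembles against a linear regression baseline, with attention to overfitting) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses only scikit-learn, omits Extreme Gradient Boosting and Light Gradient Boosting Machine, and runs on synthetic data. The feature names (area, distance, age), the price formula, and the use of the train/test R² gap as an overfitting indicator are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for housing data (NOT the Alicante microdata).
rng = np.random.default_rng(0)
n = 600
area = rng.uniform(40, 200, n)   # hypothetical floor area, m^2
dist = rng.uniform(0, 10, n)     # hypothetical distance to centre, km
age = rng.uniform(0, 80, n)      # hypothetical building age, years
X = np.column_stack([area, dist, age])

# Hypothetical nonlinear price signal plus Gaussian noise.
y = (
    1500 * area * np.exp(-0.15 * dist)
    + 20 * (age - 40) ** 2
    + rng.normal(0, 15000, n)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "extra_trees": ExtraTreesRegressor(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Fit each model and record R^2 on the training and held-out sets.
# A large train/test gap is used here as a simple overfitting indicator.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    r2_train = r2_score(y_train, model.predict(X_train))
    r2_test = r2_score(y_test, model.predict(X_test))
    results[name] = {"train": r2_train, "test": r2_test, "gap": r2_train - r2_test}

for name, r in results.items():
    print(f"{name:18s} train={r['train']:.3f} test={r['test']:.3f} gap={r['gap']:.3f}")
```

With default settings, fully grown trees in the bagging-style models can fit the training set almost perfectly, so the gap column is the figure to watch. Results on synthetic data say nothing about the figures reported in the article.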

Bibliographic Details
Main Authors: Raul-Tomas Mora-Garcia, Maria-Francisca Cespedes-Lopez, V. Raul Perez-Sanchez
Format: Article
Language: English
Published: MDPI AG 2022-11-01
Series: Land
Subjects: machine learning; mass appraisal; real estate market; partial dependence plots; COVID-19
Online Access: https://www.mdpi.com/2073-445X/11/11/2100
author Raul-Tomas Mora-Garcia
Maria-Francisca Cespedes-Lopez
V. Raul Perez-Sanchez
collection DOAJ
description Machine learning algorithms are being used for multiple real-life applications and in research. As a consequence of digital technology, large structured and georeferenced datasets are now more widely available, facilitating the use of these algorithms to analyze and identify patterns, as well as to make predictions that help users in decision making. This research aims to identify the best machine learning algorithms to predict house prices, and to quantify the impact of the COVID-19 pandemic on house prices in a Spanish city. The methodology addresses the phases of data preparation, feature engineering, hyperparameter training and optimization, model evaluation and selection, and finally model interpretation. Ensemble learning algorithms based on boosting (Gradient Boosting Regressor, Extreme Gradient Boosting, and Light Gradient Boosting Machine) and bagging (random forest and extra-trees regressor) are used and compared with a linear regression model. A case study is developed with georeferenced microdata of the real estate market in Alicante (Spain), before and after the pandemic declaration derived from COVID-19, together with information from other complementary sources such as the cadastre, socio-demographic and economic indicators, and satellite images. The results show that machine learning algorithms perform better than traditional linear models because they are better adapted to the nonlinearities of complex data such as real estate market data. Algorithms based on bagging show overfitting problems (random forest and extra-trees regressor) and those based on boosting have better performance and lower overfitting. This research contributes to the literature on the Spanish real estate market by being one of the first studies to use machine learning and microdata to explore the incidence of the COVID-19 pandemic on house prices.
first_indexed 2024-03-09T18:14:02Z
format Article
id doaj.art-ab4470e8dfc64527a7438682b5bf26c6
institution Directory of Open Access Journals
issn 2073-445X
language English
last_indexed 2024-03-09T18:14:02Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series Land
spelling doaj.art-ab4470e8dfc64527a7438682b5bf26c6 | 2023-11-24T08:56:47Z | eng | MDPI AG | Land | 2073-445X | 2022-11-01 | vol. 11, no. 11, art. 2100 | 10.3390/land11112100
Housing Price Prediction Using Machine Learning Algorithms in COVID-19 Times
Raul-Tomas Mora-Garcia: Building Sciences and Urbanism Department, University of Alicante, 03690 San Vicente del Raspeig, Spain
Maria-Francisca Cespedes-Lopez: Building Sciences and Urbanism Department, University of Alicante, 03690 San Vicente del Raspeig, Spain
V. Raul Perez-Sanchez: Building Sciences and Urbanism Department, University of Alicante, 03690 San Vicente del Raspeig, Spain
https://www.mdpi.com/2073-445X/11/11/2100
machine learning; mass appraisal; real estate market; partial dependence plots; COVID-19
title Housing Price Prediction Using Machine Learning Algorithms in COVID-19 Times
topic machine learning
mass appraisal
real estate market
partial dependence plots
COVID-19
url https://www.mdpi.com/2073-445X/11/11/2100