Interpolation of GNSS Position Time Series Using GBDT, XGBoost, and RF Machine Learning Algorithms and Models Error Analysis

The global navigation satellite system (GNSS) position time series provides essential data for geodynamic and geophysical studies. Gaps in a GNSS position time series must be interpolated, because missing data can lead such studies to inaccurate conclusions. The spatio-temporal correlati...

Bibliographic Details
Main Authors: Zhen Li, Tieding Lu, Kegen Yu, Jie Wang
Format: Article
Language: English
Published: MDPI AG 2023-09-01
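Series: Remote Sensing
Subjects: GNSS position time series; interpolation; gradient boosting decision tree; eXtreme gradient boosting; random forest
Online Access: https://www.mdpi.com/2072-4292/15/18/4374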
author Zhen Li
Tieding Lu
Kegen Yu
Jie Wang
collection DOAJ
description The global navigation satellite system (GNSS) position time series provides essential data for geodynamic and geophysical studies. Gaps in a GNSS position time series must be interpolated, because missing data can lead such studies to inaccurate conclusions. Traditional interpolation methods cannot take into account the spatio-temporal correlations between GNSS reference stations. This paper examines the use of machine learning models that capture these correlations. To form the machine learning problem, the time series to be interpolated is treated as the output, and the time series from the remaining GNSS reference stations are used as the input. Specifically, three machine learning algorithms, the gradient boosting decision tree (GBDT), eXtreme gradient boosting (XGBoost), and random forest (RF), are used to interpolate time series data from five GNSS reference stations in North China. For the interpolation of discrete points, the three machine learning models achieve similar precision in the Up component, which is 45% better than that of traditional cubic spline interpolation. For the interpolation of continuous missing data, seasonal oscillations caused by thermal expansion effects in summer significantly affect the interpolation precision. The interpolation precision of the three models improves when data are added from five further stations that are highly correlated with the initial five GNSS reference stations. The interpolated North, East, and Up (NEU) time series are examined by principal component analysis (PCA), and the results show that the GBDT and RF models interpolate better than the XGBoost model.
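The problem formulation in the description (target station as output, remaining stations as input) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the names `make_stations`, `fit_stump`, `fit_gbdt`, `predict`, and `run_demo` are hypothetical, and the model is a toy gradient-boosted ensemble of one-split decision stumps written in plain Python. It is not the authors' implementation, and it does not reproduce their stations, hyperparameters, or results.

```python
import math
import random

def make_stations(n_days=400, n_stations=5, seed=0):
    """Synthetic daily Up series (mm): a shared annual signal plus station noise."""
    rng = random.Random(seed)
    return [[(4.0 + 0.5 * s) * math.sin(2 * math.pi * t / 365.25) + rng.gauss(0, 1.0)
             for t in range(n_days)] for s in range(n_stations)]

def fit_stump(X, r):
    """Least-squares regression stump (one split) fitted to residuals r."""
    n, total, best = len(r), sum(r), None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += r[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between equal feature values
            right_sum = total - left_sum
            # Maximizing this quantity is equivalent to minimizing squared error.
            gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
            if best is None or gain > best[0]:
                best = (gain, j, 0.5 * (lo + hi), left_sum / k, right_sum / (n - k))
    return best[1:]

def fit_gbdt(X, y, n_rounds=80, lr=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        j, thr, left_val, right_val = fit_stump(X, resid)
        stumps.append((j, thr, left_val, right_val))
        pred = [p + lr * (left_val if x[j] <= thr else right_val)
                for p, x in zip(pred, X)]
    return base, lr, stumps

def predict(model, x):
    base, lr, stumps = model
    return base + sum(lr * (lv if x[j] <= thr else rv) for j, thr, lv, rv in stumps)

def run_demo():
    series = make_stations()
    target, others = series[0], series[1:]      # output station, input stations
    n = len(target)
    missing = set(random.Random(1).sample(range(n), 40))  # simulated discrete gaps
    train_days = [t for t in range(n) if t not in missing]
    X_train = [[s[t] for s in others] for t in train_days]
    y_train = [target[t] for t in train_days]
    model = fit_gbdt(X_train, y_train)
    mean_y = sum(y_train) / len(y_train)
    err_model = [predict(model, [s[t] for s in others]) - target[t] for t in missing]
    err_mean = [mean_y - target[t] for t in missing]
    rmse = lambda e: math.sqrt(sum(v * v for v in e) / len(e))
    return rmse(err_model), rmse(err_mean)

rmse_model, rmse_mean = run_demo()
print(f"RMSE at gaps, boosted stumps: {rmse_model:.2f} mm")
print(f"RMSE at gaps, mean-only baseline: {rmse_mean:.2f} mm")
```

The sketch only shows how the feature matrix is built from the other stations at the same epochs and how the fitted model fills the gaps. In practice a library regressor (for example scikit-learn's `GradientBoostingRegressor` or `RandomForestRegressor`) would take the place of the toy booster, with the same input and output layout.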
first_indexed 2024-03-10T22:06:33Z
format Article
id doaj.art-9b10e09913e44f46a6cb5f7575729282
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-10T22:06:33Z
publishDate 2023-09-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-9b10e09913e44f46a6cb5f7575729282
Record timestamp: 2023-11-19T12:46:48Z
Language: eng
Publisher: MDPI AG
Journal: Remote Sensing, ISSN 2072-4292
Published: 2023-09-01, Volume 15, Issue 18, Article 4374
DOI: 10.3390/rs15184374
Zhen Li, School of Surveying and Geoinformation Engineering, East China University of Technology, Nanchang 330013, China
Tieding Lu, School of Surveying and Geoinformation Engineering, East China University of Technology, Nanchang 330013, China
Kegen Yu, School of Environmental Science and Spatial Informatics, China University of Mining and Technology, Xuzhou 221116, China
Jie Wang, School of Civil and Surveying & Mapping Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China
title Interpolation of GNSS Position Time Series Using GBDT, XGBoost, and RF Machine Learning Algorithms and Models Error Analysis
topic GNSS position time series
interpolation
gradient boosting decision tree
eXtreme gradient boosting
random forest
url https://www.mdpi.com/2072-4292/15/18/4374