Outlier treatments using interpolation on Malaysia tourist arrival forecasting: SARIMA and ANN approaches

Bibliographic Details
Main Author: Wahir, Norsoraya Azurin
Format: Thesis
Language: English
Published: 2020
Subjects:
Online Access: http://eprints.uthm.edu.my/1095/1/24p%20NORSORAYA%20AZURIN%20WAHIR.pdf
http://eprints.uthm.edu.my/1095/2/NORSORAYA%20AZURIN%20WAHIR%20COPYRIGHT%20DECLARATION.pdf
http://eprints.uthm.edu.my/1095/3/NORSORAYA%20AZURIN%20WAHIR%20WATERMARK.pdf
Description
Summary: Outliers are unusual observations that differ markedly from the rest of the data. The presence of an outlier may directly affect the variance, the model parameters, and the overall estimation, especially during forecasting. To obtain an accurate forecast, any outliers present in the data must be addressed. This research used monthly Malaysia tourist arrivals from 1998 to 2015 and an ARIMA outlier detection method to detect outliers in the original data. The detected outliers were regarded as missing values and then treated using two interpolation methods: linear interpolation and cubic spline interpolation. A SARIMA model and an Artificial Neural Network (ANN) model were used as forecasting tools on the data before and after outlier treatment. Forecast performance was compared across all models, on the data both before and after outlier treatment, using MSE, MAD, MAPE and R². The study found that once the outliers were treated, the ANN model fitted to the cubic spline interpolated data performed best among all models, with an R² of 95.65% on the validation test. In addition, the ANN approach outperformed the SARIMA approach on the data both before and after outlier treatment (6.05% and 2.52%).
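The treatment step described in the summary (regard detected outliers as missing, then fill them by interpolation) and the four comparison measures can be sketched roughly as follows. This is an illustrative sketch only, not the thesis's implementation. The function names are hypothetical, the outlier positions are assumed to be already known (the ARIMA detection step is not shown), the cubic spline is assumed to use natural boundary conditions, MAD is taken to mean the mean absolute deviation of the forecast errors, and flagged points are assumed to lie strictly between the first and last retained observations.

```python
from bisect import bisect_right


def _kept_points(values, outlier_idx):
    """Split a series into flagged positions and the retained (x, y) points."""
    flagged = set(outlier_idx)
    xs = [i for i in range(len(values)) if i not in flagged]
    ys = [float(values[i]) for i in xs]
    return flagged, xs, ys


def _segment(xs, i):
    """Index j of the retained point just left of position i (xs[j] < i < xs[j+1])."""
    j = bisect_right(xs, i) - 1
    if j < 0 or j >= len(xs) - 1:
        raise ValueError("flagged point must lie between retained observations")
    return j


def linear_fill(values, outlier_idx):
    """Replace flagged points by linear interpolation between retained neighbours."""
    flagged, xs, ys = _kept_points(values, outlier_idx)
    out = [float(v) for v in values]
    for i in sorted(flagged):
        j = _segment(xs, i)
        slope = (ys[j + 1] - ys[j]) / (xs[j + 1] - xs[j])
        out[i] = ys[j] + slope * (i - xs[j])
    return out


def cubic_spline_fill(values, outlier_idx):
    """Replace flagged points using a natural cubic spline through retained points."""
    flagged, xs, ys = _kept_points(values, outlier_idx)
    n = len(xs) - 1  # number of spline segments
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3 * (ys[i + 1] - ys[i]) / h[i] - 3 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Forward sweep of the tridiagonal system for the quadratic coefficients c.
    l, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]
    # Back substitution; c[0] = c[n] = 0 is the natural boundary condition.
    b, c, d = [0.0] * n, [0.0] * (n + 1), [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2 * c[j]) / 3
        d[j] = (c[j + 1] - c[j]) / (3 * h[j])
    out = [float(v) for v in values]
    for i in sorted(flagged):
        j = _segment(xs, i)
        t = i - xs[j]
        out[i] = ys[j] + b[j] * t + c[j] * t ** 2 + d[j] * t ** 3
    return out


def forecast_metrics(actual, forecast):
    """MSE, MAD, MAPE (in percent) and R² for a forecast against actual values."""
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    sse = sum(e * e for e in errors)
    mean_actual = sum(actual) / n
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MSE": sse / n,
        "MAD": sum(abs(e) for e in errors) / n,
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "R2": 1 - sse / sst,
    }
```

In use, each treated series (linear or cubic spline) would be passed to the forecasting model in place of the original, and `forecast_metrics` would then be applied to every model's forecasts so that the before-treatment and after-treatment results can be compared on the same measures.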