A MapReduce-Based Nearest Neighbor Approach for Big-Data-Driven Traffic Flow Prediction

Bibliographic Details
Main Authors: Dawen Xia, Huaqing Li, Binfeng Wang, Yantao Li, Zili Zhang
Format: Article
Language: English
Published: IEEE 2016-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/7499912/
Description
Summary: In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system comprising two key modules: offline distributed training (ODT) and online parallel prediction (OPP). Moreover, we build a parallel k-nearest neighbor optimization classifier, which incorporates correlation information among traffic flows into the classification process. Finally, we propose a novel prediction calculation method that combines the current data observed in OPP with the classification results obtained from large-scale historical data in ODT to generate traffic flow predictions in real time. An empirical study on real-world traffic flow big data using leave-one-out cross validation shows that TFPC significantly outperforms four state-of-the-art prediction approaches (autoregressive integrated moving average, Naïve Bayes, multilayer perceptron neural networks, and NN regression) in terms of accuracy, which improves by up to 90.07% in the best case, with an average mean absolute percent error of 5.53%. In addition, it displays excellent speedup, scaleup, and sizeup.
ISSN: 2169-3536
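
Note: The summary reports accuracy as mean absolute percent error under leave-one-out cross validation. The sketch below illustrates only those two evaluation ideas, using a plain single-machine nearest neighbor average as a stand-in predictor. It is not the TFPC method, it does not use correlation information or MapReduce, and the function names and the (features, target) sample layout are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical sample layout: (feature vector, observed traffic flow value)
Sample = Tuple[Sequence[float], float]


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percent error, in percent. Actual values must be nonzero."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def knn_predict(history: List[Sample], query: Sequence[float], k: int) -> float:
    """Average the targets of the k historical samples closest to the query."""
    def dist(u: Sequence[float], v: Sequence[float]) -> float:
        # Euclidean distance between two feature vectors
        return sum((x - y) ** 2 for x, y in zip(u, v)) ** 0.5

    nearest = sorted(history, key=lambda s: dist(s[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def leave_one_out_mape(samples: List[Sample], k: int) -> float:
    """Predict each sample from all the others, then score with MAPE."""
    actual, predicted = [], []
    for i, (features, target) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # hold out sample i
        actual.append(target)
        predicted.append(knn_predict(rest, features, k))
    return mape(actual, predicted)
```

Calling `leave_one_out_mape(samples, k)` on a small list of samples returns a percentage comparable in kind, though not in value, to the figure quoted in the summary.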