A MapReduce-Based Nearest Neighbor Approach for Big-Data-Driven Traffic Flow Prediction
In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation...
Main Authors: | Dawen Xia, Huaqing Li, Binfeng Wang, Yantao Li, Zili Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2016-01-01 |
Series: | IEEE Access |
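Subjects: | Big data analytics, traffic flow prediction, correlation analysis, parallel classifier, Hadoop MapReduce |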
Online Access: | https://ieeexplore.ieee.org/document/7499912/ |
_version_ | 1818876942393278464 |
---|---|
author | Dawen Xia, Huaqing Li, Binfeng Wang, Yantao Li, Zili Zhang |
collection | DOAJ |
description | In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system with two key modules: offline distributed training (ODT) and online parallel prediction (OPP). Moreover, we build a parallel k-nearest neighbor optimization classifier, which incorporates correlation information among traffic flows into the classification process. Finally, we propose a novel prediction calculation method that combines the current data observed in OPP with the classification results obtained from large-scale historical data in ODT to generate traffic flow predictions in real time. An empirical study on real-world traffic flow big data using leave-one-out cross validation shows that TFPC significantly outperforms four state-of-the-art prediction approaches (autoregressive integrated moving average, Naïve Bayes, multilayer perceptron neural networks, and NN regression) in accuracy, with an improvement of up to 90.07% in the best case and an average mean absolute percentage error of 5.53%. In addition, it displays excellent speedup, scaleup, and sizeup. |
first_indexed | 2024-12-19T13:50:24Z |
format | Article |
id | doaj.art-69d6934a622f4f14ac02fdb01b639de8 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-19T13:50:24Z |
publishDate | 2016-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-69d6934a622f4f14ac02fdb01b639de8; 2022-12-21T20:18:45Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2016-01-01; vol. 4, pp. 2920-2934; DOI 10.1109/ACCESS.2016.2570021; document 7499912; A MapReduce-Based Nearest Neighbor Approach for Big-Data-Driven Traffic Flow Prediction; Dawen Xia (https://orcid.org/0000-0002-0151-9643; School of Computer and Information Science, Southwest University, Chongqing, China), Huaqing Li (School of Electronics and Information Engineering, Southwest University, Chongqing, China), Binfeng Wang (School of Computer and Information Science, Southwest University, Chongqing, China), Yantao Li (School of Computer and Information Science, Southwest University, Chongqing, China), Zili Zhang (School of Computer and Information Science, Southwest University, Chongqing, China); https://ieeexplore.ieee.org/document/7499912/; Big data analytics, traffic flow prediction, correlation analysis, parallel classifier, Hadoop MapReduce |
title | A MapReduce-Based Nearest Neighbor Approach for Big-Data-Driven Traffic Flow Prediction |
topic | Big data analytics, traffic flow prediction, correlation analysis, parallel classifier, Hadoop MapReduce |
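
The description above evaluates nearest neighbor prediction with leave-one-out cross validation and reports mean absolute percentage error (MAPE). The following is a minimal single-machine sketch of that evaluation loop, assuming a plain k-nearest neighbor regressor with Euclidean distance. It is not the TFPC method from the record: the correlation analysis, the ODT and OPP modules, and the MapReduce parallelization are all omitted, and the function names and record layout are hypothetical.

```python
from math import sqrt


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def knn_predict(history, query, k):
    """Average the target flow of the k records closest to `query`.

    `history` is a list of (feature_vector, next_flow) pairs; this layout
    is an assumption made for illustration, not taken from the paper.
    """
    def distance(u, v):
        return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

    nearest = sorted(history, key=lambda rec: distance(rec[0], query))[:k]
    return sum(flow for _, flow in nearest) / len(nearest)


def leave_one_out_mape(records, k):
    """Hold out each record in turn, predict it from the rest, and score with MAPE."""
    actual, predicted = [], []
    for i, (features, flow) in enumerate(records):
        rest = records[:i] + records[i + 1:]
        actual.append(flow)
        predicted.append(knn_predict(rest, features, k))
    return mape(actual, predicted)


# Made-up toy data: (recent flow readings, flow in the next interval).
toy_records = [
    ((120.0, 130.0), 135.0),
    ((125.0, 128.0), 131.0),
    ((300.0, 310.0), 320.0),
    ((305.0, 315.0), 318.0),
]
print(leave_one_out_mape(toy_records, k=1))
```

In a MapReduce setting the distance computation would typically be distributed across mappers and the top-k selection merged in reducers, but how the paper partitions this work is not stated in the record.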