Data Balance Algorithm Based on Histogram in MapReduce

The MapReduce model is a typical distributed computing model that is widely used in large-scale data processing, and its performance depends largely on how the data are distributed. Because data content is often unbalanced and storage placement is random, the MapReduce model is prone to data skew p...


Bibliographic Details
Format: Article
Language: zho
Published: EDP Sciences 2018-06-01
Series: Xibei Gongye Daxue Xuebao
Online Access: https://www.jnwpu.org/articles/jnwpu/pdf/2018/03/jnwpu2018363p480.pdf