Efficient parallel simulation over large-scale social contact networks

Bibliographic Details
Main Authors: Wu, Yulin; Cai, Wentong; Li, Zengxiang; Tan, Wen Jun; Hou, Xiangting
Other Authors: School of Computer Science and Engineering
Format: Journal Article
Language: English
Published: 2020
Online Access: https://hdl.handle.net/10356/143058
Description
Summary: A social contact network (SCN) models the daily contacts between people in real life. It consists of agents and locations: when agents visit a location at the same time, social interactions can be established among them. Simulations over SCNs have been employed to study social dynamics such as disease spread in a population. Because of the scale of an SCN and the execution time requirement, these simulations are usually run in parallel. However, a challenge for parallel simulation is that the structure of an SCN is naturally skewed, with a few hub locations that have far more visitors than the others. These hub locations can cause load imbalance and heavy communication between partitions, which in turn degrade simulation performance. This article proposes a comprehensive solution to address this challenge. First, the hub locations are decomposed into small locations, so that the SCN can be divided into partitions with better-balanced workloads. Second, the agents are decomposed to exploit data locality, so that the overall communication across partitions can be greatly reduced. Third, two enhanced execution mechanisms are designed, one for locations and one for agents, to improve simulation parallelism. To evaluate the efficiency of the proposed solution, an epidemic simulation was developed and extensive experiments were conducted on two computer clusters using three SCN datasets of different scales. The results demonstrate that the approach can significantly improve the execution performance of the simulation.
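
The effect of hub locations on load balance, and of decomposing them, can be illustrated with a small sketch. This is not the article's algorithm. It assumes that a location's workload is simply its visitor count, that a hub is split into equal-sized chunks, and that locations are assigned to partitions with a greedy largest-first heuristic. The names `split_hubs`, `greedy_partition`, and `max_visitors` are hypothetical, and the toy data is made up for illustration. The sketch also ignores contacts between agents that end up in different sub-locations of the same hub, which a real simulation would have to handle.

```python
import heapq

def split_hubs(visits, max_visitors):
    """Split every location with more than max_visitors visitors into sub-locations.

    Assumption: equal-sized chunks; this is an illustrative rule only.
    """
    result = {}
    for loc, agents in visits.items():
        if len(agents) <= max_visitors:
            result[loc] = list(agents)
        else:
            for start in range(0, len(agents), max_visitors):
                sub_name = f"{loc}#{start // max_visitors}"
                result[sub_name] = list(agents[start:start + max_visitors])
    return result

def greedy_partition(visits, k):
    """Assign locations to k partitions, largest first, each to the least-loaded one.

    Assumption: workload of a location equals its number of visitors.
    """
    heap = [(0, p) for p in range(k)]  # (current load, partition id)
    heapq.heapify(heap)
    loads = [0] * k
    assignment = {}
    for loc, agents in sorted(visits.items(), key=lambda kv: -len(kv[1])):
        load, p = heapq.heappop(heap)
        assignment[loc] = p
        loads[p] = load + len(agents)
        heapq.heappush(heap, (loads[p], p))
    return assignment, loads

# Toy skewed network: one hub with 100 visitors and ten small locations with 5 each.
visits = {"mall": list(range(100))}
for h in range(10):
    visits[f"home_{h}"] = list(range(5 * h, 5 * h + 5))

_, loads_before = greedy_partition(visits, k=4)
_, loads_after = greedy_partition(split_hubs(visits, max_visitors=10), k=4)

print("loads without hub decomposition:", loads_before)
print("loads with hub decomposition:   ", loads_after)
```

In this toy case the unsplit hub pins one partition at a load of 100 regardless of how the remaining locations are placed, while splitting it lets the same total work spread almost evenly across the four partitions.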