Keywords
|
Parallel community detection, Label diffusion, Local similarity, Label selection, Spark, Social networks
|
Abstract
|
Parallel and distributed community detection in large-scale complex networks, such as social networks, is a challenging task. Designing a parallel and distributed algorithm with high accuracy and low computational complexity is one of the essential issues in the community detection field. In this paper, we propose a novel, fast, and accurate Spark-based community detection algorithm, parallel label diffusion and label selection (PLDLS), which combines a two-step label diffusion from core nodes with a new label selection (propagation) method. We use multi-factor criteria to compute node importance and adopt a new method for selecting core nodes. In the first phase, exploiting the fact that nodes forming triangles tend to be in the same community, the labels of core nodes are diffused in parallel up to two levels. In the second phase, through an iterative and parallel process, the most appropriate labels are assigned to the remaining nodes. PLDLS proposes an improved, robust version of the label propagation algorithm (LPA) by putting aside randomness parameter tuning. Furthermore, we use a fast, parallel merge phase to obtain denser and more accurate communities. Experiments on real-world and artificial networks indicate that PLDLS achieves better accuracy and lower execution time than the other examined methods.
|
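The two phases described in the abstract can be illustrated with a minimal single-machine sketch in plain Python. It is not the PLDLS implementation: it does not use Spark, it omits the merge phase, and the importance score (triangle count, then degree), the `num_cores` parameter, and all function names are assumptions made only for illustration. Node identifiers are assumed to be integers so that ties can be broken deterministically.

```python
from collections import Counter


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def triangle_count(adj, u):
    """Count triangles through u (each is seen from two neighbors, hence // 2)."""
    return sum(len(adj[u] & adj[v]) for v in adj[u]) // 2


def diffuse_from_core(adj, labels, core):
    """Phase 1 (sketch): spread the core's label up to two hops along triangle edges."""
    labels[core] = core
    frontier = [core]
    for _ in range(2):  # two diffusion levels
        next_frontier = []
        for v in frontier:
            for w in sorted(adj[v]):
                # Edge (v, w) lies on a triangle iff v and w share a neighbor.
                if w not in labels and adj[v] & adj[w]:
                    labels[w] = core
                    next_frontier.append(w)
        frontier = next_frontier


def select_labels(adj, labels):
    """Phase 2 (sketch): synchronous rounds of majority-label assignment."""
    while True:
        updates = {}
        for u in adj:
            if u in labels:
                continue
            votes = Counter(labels[v] for v in adj[u] if v in labels)
            if votes:
                best = max(votes.values())
                # Deterministic tie-break (smallest label) instead of a random pick.
                updates[u] = min(l for l, c in votes.items() if c == best)
        if not updates:
            break
        labels.update(updates)
    for u in adj:  # nodes never reached keep their own label
        labels.setdefault(u, u)
    return labels


def detect_communities(edges, num_cores):
    """Pick cores by the assumed importance score, then run both phases."""
    adj = build_adjacency(edges)
    order = sorted(adj, key=lambda n: (-triangle_count(adj, n), -len(adj[n]), n))
    labels = {}
    cores = 0
    for u in order:
        if cores == num_cores:
            break
        if u not in labels:  # a node already covered by a core is not a new core
            diffuse_from_core(adj, labels, u)
            cores += 1
    return select_labels(adj, labels)


if __name__ == "__main__":
    # Two 4-cliques joined by the bridge (3, 4), plus a pendant node 8.
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
             (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
             (3, 4), (7, 8)]
    print(detect_communities(edges, num_cores=2))
```

On this toy graph the bridge edge lies on no triangle, so the first phase keeps the two cliques apart, and the pendant node 8 receives its label in the second phase from its only neighbor. The synchronous update in `select_labels` mirrors how a round-based parallel framework would apply all label changes at once.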