Distributed Subgraph Query Processing Using Filtering Scores on Spark

Bibliographic Details
Main Authors: Kyoungsoo Bok, Minyoung Kim, Hyeonbyeong Lee, Dojin Choi, Jongtae Lim, Jaesoo Yoo
Format: Article
Language: English
Published: MDPI AG 2023-08-01
Series: Electronics
Online Access: https://www.mdpi.com/2079-9292/12/17/3645
Description
Summary: As services increasingly generate large-scale graphs that represent multiple relationships between objects, studies have been conducted on retrieving subgraphs that match particular patterns. In this paper, we propose a distributed query processing method that efficiently searches for subgraphs in a large graph on Spark. To reduce unnecessary processing costs, the search order is determined by filtering scores computed from a probability distribution. The query is partitioned into triplets based on the determined search order. The partitioned queries are then searched in parallel over the distributed graph on each slave node according to that order, and the local search results from each slave node are combined and returned. The performance of the proposed method is compared with that of existing methods to demonstrate its superiority.
ISSN: 2079-9292
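
Illustration: the pipeline outlined in the summary (score the query triplets, derive a search order, search partitions in parallel, combine local results) can be sketched in a few lines of plain Python. This is only a minimal sketch under stated assumptions, not the paper's implementation: it does not use Spark, the "filtering score" is assumed here to be the fraction of data edges whose endpoint labels match a query triplet (lower score searched first), partitions are simulated with list slices, and every name (`filtering_scores`, `search_order`, `local_search`, `combine`, `subgraph_query`, the toy graph) is hypothetical.

```python
from collections import Counter

# Toy data graph (hypothetical): vertex labels and directed edges.
LABELS = {1: "A", 2: "B", 3: "C", 4: "B", 5: "C", 6: "A"}
EDGES = [(1, 2), (1, 4), (6, 4), (2, 3), (4, 5)]

# Toy query (hypothetical): labeled query variables and query edges.
# Each query edge is treated as one triplet (source, edge, destination).
QUERY_LABELS = {"x": "A", "y": "B", "z": "C"}
QUERY_EDGES = [("x", "y"), ("y", "z")]


def filtering_scores(edges, labels, query_edges, query_labels):
    """Assumed score: empirical probability that a data edge matches the triplet."""
    pair_counts = Counter((labels[s], labels[d]) for s, d in edges)
    total = len(edges)
    return {
        (u, v): pair_counts[(query_labels[u], query_labels[v])] / total
        for u, v in query_edges
    }


def search_order(scores):
    """Most selective triplet (lowest score) first; ties broken by name."""
    return sorted(scores, key=lambda t: (scores[t], t))


def local_search(partition, labels, triplet, query_labels):
    """Candidate bindings for one triplet inside one partition (one worker)."""
    u, v = triplet
    return [
        {u: s, v: d}
        for s, d in partition
        if labels[s] == query_labels[u] and labels[d] == query_labels[v]
    ]


def combine(order, per_triplet):
    """Join candidates triplet by triplet, keeping bindings that agree on shared variables."""
    results = [{}]
    for t in order:
        results = [
            {**r, **c}
            for r in results
            for c in per_triplet[t]
            if all(r.get(k, val) == val for k, val in c.items())
        ]
    return results


def subgraph_query(edges, labels, query_edges, query_labels, workers=2):
    order = search_order(filtering_scores(edges, labels, query_edges, query_labels))
    # Simulated distribution: round-robin edge partitions, one per worker.
    partitions = [edges[i::workers] for i in range(workers)]
    per_triplet = {
        t: [c for p in partitions for c in local_search(p, labels, t, query_labels)]
        for t in order
    }
    return order, combine(order, per_triplet)


if __name__ == "__main__":
    order, matches = subgraph_query(EDGES, LABELS, QUERY_EDGES, QUERY_LABELS)
    print("search order:", order)
    for m in matches:
        print(m)
```

In this toy graph the B-to-C triplet matches fewer edges than the A-to-B triplet, so it is searched first and the intermediate result set stays smaller. The join does not enforce that distinct query variables bind to distinct vertices; a real matcher would add that check, and a Spark version would replace the list slices with distributed partitions.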