High Impact Factor : 4.396 icon | Submit Manuscript Online icon |

Subgraph Matching with Set Similarity using Apache Spark

Author(s):

Nisha yadav , MRK Institute of Technology and Management Rewari, Haryana, India; Anil Kumar, MRK Institute of Technology and Management Rewari, Haryana, India

Keywords:

Graph database, Pruning techniques, Graph topology, Selection algorithm, Data mining

Abstract

Graph databases have been widely used as important tools to model and query complex graph data with the emergence of many real applications such as social networks, Semantic Web, biological networks, and so on, wherein each vertex in a graph usually contains information, which can be modeled by a set of tokens or elements. Liang Hong, Lei Zou, Xiang Lian, Philip S. Yu proposed the method for subgraph matching with set similarity (SMS2) query over a large graph database, which retrieves subgraphs that are structurally isomorphic to the query graph, and meanwhile satisfy the condition of vertex pair matching with the (dynamic) weighted set similarity. To handle the above problem, efficient pruning techniques are proposed by considering both vertex set similarity and graph topology. Frequent patterns are determined for the element sets of the vertices of data graph, and lightweight signatures for both query vertices and data vertices are designed in order to process the SMS2 query efficiently. Based on the determined frequent patterns of elements sets of vertices and signatures, an efficient two-phase pruning strategy including set similarity pruning and structure-based pruning is proposed to facilitate online pruning. Finally, an efficient dominating-set-based subgraph match algorithm guided by a dominating set selection algorithm is proposed to find subgraph matches in order to achieve better query performance. All the techniques are designed for the distributed systems using Apache Spark in order to offer a better price/performance ratio than centralized systems and to increase availability using redundancy when parts of a system fail.

Other Details

Paper ID: IJSRDV4I50592
Published in: Volume : 4, Issue : 5
Publication Date: 01/08/2016
Page(s): 1731-1736

Article Preview

Download Article