NESL Technical Report #: 2011-4-6
Abstract: Complex social and information network search becomes impor- tant with a variety of applications. In the core of these applications, lies a common and critical problem: Given a labeled network and a query graph, how to efficiently search the query graph in the tar- get network. The presence of noise and the incomplete knowledge about the structure and content of the target network make it unre- alistic to find an exact match. Rather, it is more appealing to find the top-k approximate matches. In this paper, we propose a neighborhood-based similarity mea- sure that could avoid costly graph isomorphism and edit distance computation. Under this new measure, we prove that subgraph sim- ilarity search is NP hard, while graph similarity match is polyno- mial. By studying the principles behind this measure, we found an information propagation model that is able to convert a large net- work into a set of multidimensional vectors, where sophisticated indexing and similarity search algorithms are available. The pro- posed method, called Ness (Neighborhood Based Similarity Search), is appropriate for graphs with low automorphism and high noise, which are common in many social and information networks. Ness is not only efficient, but also robust against structural noise and in- formation loss. Empirical results show that it can quickly and accu- rately find high-quality matches in large networks, with negligible cost.
Publication Forum: 2011 ACM SIGMOD
NESL Document?: Yes
Document category: Conference Paper