Imperial College London


Faculty of EngineeringDepartment of Computing

Professor of Computer Systems



+44 (0)20 7594 8375j.mccann Website




Miss Teresa Ng +44 (0)20 7594 8300




260ACE ExtensionSouth Kensington Campus






BibTex format

author = {Yu, W and McCann, J},
doi = {10.14778/2735479.2735489},
journal = {Proceedings of the VLDB Endowment International Conference on Very Large Data Bases},
pages = {569--580},
title = {Efficient Partial-Pairs SimRank Search on Large Graphs},
url = {},
volume = {8},
year = {2015}

RIS format (EndNote, RefMan)

AB - The assessment of node-to-node similarities based on graph topology arises in a myriad of applications, e.g., web search. SimRank is a notable measure of this type, with the intuition that "two nodes are similar if their in-neighbors are similar". While most existing work retrieving SimRank only considers all-pairs SimRank s(, ) and single-source SimRank s(, j) (scores between every node and query j), there are appealing applications for partial-pairs SimRank, e.g., similarity join. Given two node subsets A and B in a graph, partial-pairs SimRank assessment aims to retrieve only {s(a, b)}∀aεA,∀bεB. However, the best-known solution appears not self-contained since it hinges on the premise that the SimRank scores with node-pairs in an h-go cover set must be given beforehand.This paper focuses on efficient assessment of partial-pairs SimRank in a self-contained manner. (1) We devise a novel "seed germination" model that computes partial-pairs SimRank in O(k|E| min{|A|, |B|}) time and O(|E| + k|V|) memory for k iterations on a graph of |V| nodes and |E| edges. (2) We further eliminate unnecessary edge access to improve the time of partial-pairs SimRank to O(m min{|A|, |B|}), where m ≤ min{k|E|, Δ2k}, and Δ is the maximum degree. (3) We show that our partial-pairs SimRank model also can handle the computations of all-pairs and single-source SimRanks. (4) We empirically verify that our algorithms are (a) 38x faster than the best-known competitors, and (b) memory-efficient, allowing scores to be assessed accurately on graphs with tens of millions of links.
AU - Yu,W
AU - McCann,J
DO - 10.14778/2735479.2735489
EP - 580
PY - 2015///
SN - 2150-8097
SP - 569
TI - Efficient Partial-Pairs SimRank Search on Large Graphs
T2 - Proceedings of the VLDB Endowment International Conference on Very Large Data Bases
UR -
UR -
VL - 8
ER -