Imperial College London

ProfessorJulieMcCann

Faculty of EngineeringDepartment of Computing

Professor of Computer Systems
 
 
 
//

Contact

 

+44 (0)20 7594 8375j.mccann Website

 
 
//

Location

 

258ACE ExtensionSouth Kensington Campus

//

Summary

 

Publications

Citation

BibTex format

@article{Yu:2015:10.1109/TKDE.2014.2339828,
author = {Yu, W and Lin, X and Zhang, W and McCann, J},
doi = {10.1109/TKDE.2014.2339828},
journal = {IEEE Transactions on Knowledge and Data Engineering},
pages = {1810--1823},
title = {Fast All-Pairs SimRank Assessment on Large Graphs and Bipartite Domains},
url = {http://dx.doi.org/10.1109/TKDE.2014.2339828},
volume = {27},
year = {2015}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB - SimRank is a powerful model for assessing vertex-pair similarities in a graph. It follows the concept that two vertices are similar if they are referenced by similar vertices. The prior work [18] exploits partial sums memoization to compute SimRankin O(Kmn) time on a graph of n vertices and m edges, for K iterations. However, the computations among different partial sums may have duplicate redundancy. Besides, to guarantee a given accuracy , the existing SimRank needs K = ⌈logC ⌉iterations, where C is a damping factor, but the geometric rate of convergence is slow if a high accuracy is expected. In this paper, (1) a novel clustering strategy is proposed to eliminate duplicate computations occurring in partial sums, and an efcient algorithm is then devised to accelerate SimRank computation to O(Kd′n2) time, where d′ is typically much smaller than m n . (2) A new differential SimRank equation is proposed, which can represent the SimRank matrix as an exponential sum of transition matrices, as opposed to the geometric sum of the conventional counterpart. This leads to a further speedup in the convergence rate of SimRank iterations. (3) In bipartite domains, a novel ner-grained partial max clustering method is developed to speed up the computation of the Minimax SimRank variation from O(Kmn) to O(Km′n) time, where m′ (≤ m) is the number of edges ina reduced graph after edge clustering, which can be typically much smaller than m. Using real and synthetic data, we empirically verify that (1) our approach of partial sums sharing outperforms the best known algorithm by up to one order of magnitude; (2) the revised notion of SimRank further achieves a 5X speedup on large graphs while also fairly preserving the relative order of original SimRank scores; (3) our ner-grained partial max memoization for the Minimax SimRank variation in bipartite domains is 0.5–1.2 orders of magnitude faster than the baselines.
AU - Yu,W
AU - Lin,X
AU - Zhang,W
AU - McCann,J
DO - 10.1109/TKDE.2014.2339828
EP - 1823
PY - 2015///
SN - 1041-4347
SP - 1810
TI - Fast All-Pairs SimRank Assessment on Large Graphs and Bipartite Domains
T2 - IEEE Transactions on Knowledge and Data Engineering
UR - http://dx.doi.org/10.1109/TKDE.2014.2339828
UR - http://hdl.handle.net/10044/1/23640
VL - 27
ER -