Imperial College London

ProfessorJulieMcCann

Faculty of EngineeringDepartment of Computing

Professor of Computer Systems
 
 
 
//

Contact

 

+44 (0)20 7594 8375j.mccann Website

 
 
//

Location

 

258ACE ExtensionSouth Kensington Campus

//

Summary

 

Publications

Citation

BibTex format

@article{Yu:2015:10.14778/2735479.2735489,
author = {Yu, W and McCann, J},
doi = {10.14778/2735479.2735489},
journal = {Proceedings of the VLDB Endowment International Conference on Very Large Data Bases},
pages = {569--580},
title = {Efficient Partial-Pairs SimRank Search on Large Graphs},
url = {http://dx.doi.org/10.14778/2735479.2735489},
volume = {8},
year = {2015}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB - The assessment of node-to-node similarities based on graph topology arises in a myriad of applications, e.g., web search. SimRank is a notable measure of this type, with the intuition that "two nodes are similar if their in-neighbors are similar". While most existing work retrieving SimRank only considers all-pairs SimRank s(, ) and single-source SimRank s(, j) (scores between every node and query j), there are appealing applications for partial-pairs SimRank, e.g., similarity join. Given two node subsets A and B in a graph, partial-pairs SimRank assessment aims to retrieve only {s(a, b)}∀aεA,∀bεB. However, the best-known solution appears not self-contained since it hinges on the premise that the SimRank scores with node-pairs in an h-go cover set must be given beforehand.This paper focuses on efficient assessment of partial-pairs SimRank in a self-contained manner. (1) We devise a novel "seed germination" model that computes partial-pairs SimRank in O(k|E| min{|A|, |B|}) time and O(|E| + k|V|) memory for k iterations on a graph of |V| nodes and |E| edges. (2) We further eliminate unnecessary edge access to improve the time of partial-pairs SimRank to O(m min{|A|, |B|}), where m ≤ min{k|E|, Δ2k}, and Δ is the maximum degree. (3) We show that our partial-pairs SimRank model also can handle the computations of all-pairs and single-source SimRanks. (4) We empirically verify that our algorithms are (a) 38x faster than the best-known competitors, and (b) memory-efficient, allowing scores to be assessed accurately on graphs with tens of millions of links.
AU - Yu,W
AU - McCann,J
DO - 10.14778/2735479.2735489
EP - 580
PY - 2015///
SN - 2150-8097
SP - 569
TI - Efficient Partial-Pairs SimRank Search on Large Graphs
T2 - Proceedings of the VLDB Endowment International Conference on Very Large Data Bases
UR - http://dx.doi.org/10.14778/2735479.2735489
UR - http://hdl.handle.net/10044/1/23679
VL - 8
ER -