Imperial College London


Faculty of Engineering, Department of Computing

Professor of Computer Engineering



+44 (0)20 7594 8313
w.luk




434 Huxley Building, South Kensington Campus






BibTeX format

author = {Burovskiy, P and Girdlestone, S and Davies, C and Sherwin, S and Luk, W},
doi = {10.1109/FPL.2014.6927453},
title = {Dataflow acceleration of Krylov subspace sparse banded problems},
url = {},
year = {2014}

RIS format (EndNote, RefMan)

AB - © 2014 Technical University of Munich (TUM). Most of the efforts in the FPGA community related to sparse linear algebra focus on increasing the degree of internal parallelism in matrix-vector multiply kernels. We propose a parametrisable dataflow architecture presenting an alternative and complementary approach to support acceleration of banded sparse linear algebra problems which benefit from building a Krylov subspace. We use banded structure of a matrix A to overlap the computations Ax, A^2x, ..., A^kx by building a pipeline of matrix-vector multiplication processing elements (PEs) each performing A^ix. Due to on-chip data locality, FLOPS rate sustainable by such pipeline scales linearly with k. Our approach enables trade-off between the number k of overlapped matrix power actions and the level of parallelism in a PE. We illustrate our approach for Google PageRank computation by power iteration for large banded single precision sparse matrices. Our design scales up to 32 sequential PEs with floating point accumulation and 80 PEs with fixed point accumulation on Stratix V D8 FPGA. With 80 single-pipe fixed point PEs clocked at 160 MHz, our design sustains 12.7 GFLOPS.
AU - Burovskiy,P
AU - Girdlestone,S
AU - Davies,C
AU - Sherwin,S
AU - Luk,W
DO - 10.1109/FPL.2014.6927453
PY - 2014///
TI - Dataflow acceleration of Krylov subspace sparse banded problems
UR -
ER -