Imperial College London

Professor Paul Kelly

Faculty of Engineering, Department of Computing

Professor of Software Technology
 
 
 

Contact

 

+44 (0)20 7594 8332
p.kelly

 
 

Location

 

Level 3 (upstairs), William Penney Building, room 304, William Penney Laboratory, South Kensington Campus



 

Publications

Citation

BibTeX format

@inproceedings{Kelly:2013,
author = {Kelly, PH and Cohen, A and Grosser, T and Ramanujam, J and Sadayappan, P and Verdoolaege, S},
publisher = {ACM Press},
title = {Split Tiling for GPUs: Automatic Parallelization Using Trapezoidal Tiles},
url = {http://dl.acm.org/citation.cfm?id=2458526&CFID=347882156&CFTOKEN=62979459},
year = {2013}
}

RIS format (EndNote, RefMan)

TY  - CPAPER
AB  - Tiling is a key technique to enhance data reuse. For computations structured as one sequential outer "time" loop enclosing a set of parallel inner loops, tiling only the parallel inner loops may not enable enough data reuse in the cache. Tiling the inner loops along with the outer time loop enhances data locality but may require other transformations like loop skewing that inhibit inter-tile parallelism. One approach to tiling that enhances data locality without inhibiting inter-tile parallelism is split tiling, where tiles are subdivided into a sequence of trapezoidal computation steps. In this paper, we develop an approach to generate split tiled code for GPUs in the PPCG polyhedral code generator. We propose a generic algorithm to calculate index-set splitting that enables us to perform tiling for locality and synchronization avoidance, while simultaneously maintaining parallelism, without the need for skewing or redundant computations. Our algorithm performs split tiling for an arbitrary number of dimensions and without the need to construct any large integer linear program. The method and its implementation are evaluated on standard stencil kernels and compared with a state-of-the-art polyhedral compiler and with a domain-specific stencil compiler, both targeting CUDA GPUs.
AU  - Kelly,PH
AU  - Cohen,A
AU  - Grosser,T
AU  - Ramanujam,J
AU  - Sadayappan,P
AU  - Verdoolaege,S
PB  - ACM Press
PY  - 2013///
TI  - Split Tiling for GPUs: Automatic Parallelization Using Trapezoidal Tiles
UR  - http://dl.acm.org/citation.cfm?id=2458526&CFID=347882156&CFTOKEN=62979459
ER  - 