Imperial College London

Professor Paul Kelly

Faculty of Engineering, Department of Computing

Professor of Software Technology
 
 
 

Contact

 

+44 (0)20 7594 8332
p.kelly

 
 

Location

 

Level 3 (upstairs), William Penney Building, room 304, William Penney Laboratory, South Kensington Campus



 

Publications


162 results found

Saeedi S, Bodin B, Wagstaff H, Nisbet A, Nardi L, Mawer J, Melot N, Palomar O, Vespa E, Spink T, Gorgovan C, Webb A, Clarkson J, Tomusk E, Debrunner T, Kaszyk K, Gonzalez-De-Aledo P, Rodchenko A, Riley G, Kotselidis C, Franke B, O'Boyle MFP, Davison AJ, Kelly PHJ, Lujan M, Furber S et al., 2018, Navigating the Landscape for Real-Time Localization and Mapping for Robotics and Virtual and Augmented Reality, PROCEEDINGS OF THE IEEE, Vol: 106, Pages: 2020-2039, ISSN: 0018-9219

JOURNAL ARTICLE

Kelly PHJ, 2018, IEEE Cluster 2018 Message from the Program Chair, ISSN: 1552-5244

CONFERENCE PAPER

Bodin B, Nardi L, Wagstaff H, Kelly PHJ, O'Boyle M et al., 2018, Algorithmic Performance-Accuracy Trade-off in 3D Vision Applications, Pages: 123-124

© 2018 IEEE. Simultaneous Localisation And Mapping (SLAM) is a key component of robotics and augmented reality (AR) systems. While a large number of SLAM algorithms have been presented, there has been little effort to unify the interface of such algorithms, or to perform a holistic comparison of their capabilities. This is particularly true when it comes to evaluating the potential trade-offs between computation speed, accuracy, and power consumption. SLAMBench is a benchmarking framework to evaluate existing and future SLAM systems, both open and closed source, over an extensible list of datasets, while using a comparable and clearly specified list of performance metrics. SLAMBench is a publicly-available software framework which represents a starting point for quantitative, comparable and validatable experimental research to investigate trade-offs in performance, accuracy and energy consumption across SLAM systems. In this poster we give an overview of SLAMBench and in particular we show how this framework can be used within Design Space Exploration and large-scale performance evaluation on mobile phones.

CONFERENCE PAPER

Bodin B, Wagstaff H, Saeedi S, Nardi L, Vespa E, Mawer J, Nisbet A, Lujan M, Furber S, Davison AJ, Kelly PHJ, O'Boyle MFP et al., 2018, SLAMBench2: multi-objective head-to-head benchmarking for visual SLAM, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE, Pages: 3637-3644, ISSN: 1050-4729

SLAM is becoming a key component of robotics and augmented reality (AR) systems. While a large number of SLAM algorithms have been presented, there has been little effort to unify the interface of such algorithms, or to perform a holistic comparison of their capabilities. This is a problem since different SLAM applications can have different functional and non-functional requirements. For example, a mobile phone-based AR application has a tight energy budget, while a UAV navigation system usually requires high accuracy. SLAMBench2 is a benchmarking framework to evaluate existing and future SLAM systems, both open and closed source, over an extensible list of datasets, while using a comparable and clearly specified list of performance metrics. A wide variety of existing SLAM algorithms and datasets is supported, e.g. ElasticFusion, InfiniTAM, ORB-SLAM2, OKVIS, and integrating new ones is straightforward and clearly specified by the framework. SLAMBench2 is a publicly-available software framework which represents a starting point for quantitative, comparable and validatable experimental research to investigate trade-offs across SLAM systems.

CONFERENCE PAPER

Vespa E, Nikolov N, Grimm M, Nardi L, Kelly PHJ, Leutenegger S et al., 2018, Efficient Octree-Based Volumetric SLAM Supporting Signed-Distance and Occupancy Mapping, IEEE ROBOTICS AND AUTOMATION LETTERS, Vol: 3, Pages: 1144-1151, ISSN: 2377-3766

JOURNAL ARTICLE

Nica A, Vespa E, González de Aledo P, Kelly PHJ et al., 2018, Investigating automatic vectorization for real-time 3D scene understanding

© 2018 Association for Computing Machinery. Simultaneous Localization And Mapping (SLAM) is the problem of building a representation of a geometric space while simultaneously estimating the observer’s location within the space. While this seems to be a chicken-and-egg problem, several algorithms have appeared in the last decades that approximately and iteratively solve this problem. SLAM algorithms are tailored to the available resources, hence aimed at balancing the precision of the map with the constraints that the computational platform imposes and the desire to obtain real-time results. Working with KinectFusion, an established SLAM implementation, we explore in this work the vectorization opportunities present in this scenario, with the goal of using the CPU to its full potential. Using ISPC, an automatic vectorization tool, we produce a partially vectorized version of KinectFusion. Along the way we explore a number of optimization strategies, including tiling to exploit ray coherence and outer-loop vectorization, obtaining up to 4x speed-up over the baseline on an 8-wide vector machine.

CONFERENCE PAPER

Unat D, Dubey A, Hoefler T, Shalf J, Abraham M, Bianco M, Chamberlain BL, Cledat R, Edwards HC, Finkel H, Fuerlinger K, Hannig F, Jeannot E, Kamil A, Keasler J, Kelly PHJ, Leung V, Ltaief H, Maruyama N, Newburn CJ, Pericas M et al., 2017, Trends in Data Locality Abstractions for HPC Systems, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, Vol: 28, Pages: 3007-3020, ISSN: 1045-9219

JOURNAL ARTICLE

Bolten M, Franchetti F, Kelly PHJ, Lengauer C, Mohr M et al., 2017, Algebraic description and automatic generation of multigrid methods in SPIRAL, CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, Vol: 29, ISSN: 1532-0626

JOURNAL ARTICLE

Saeedi S, Nardi L, Johns E, Bodin B, Kelly PHJ, Davison AJ et al., 2017, Application-oriented design space exploration for SLAM algorithms, Pages: 5716-5723, ISSN: 1050-4729

© 2017 IEEE. In visual SLAM, there are many software and hardware parameters, such as algorithmic thresholds and GPU frequency, that need to be tuned; however, this tuning should also take into account the structure and motion of the camera. In this paper, we determine the complexity of the structure and motion with a few parameters calculated using information theory. Depending on this complexity and the desired performance metrics, suitable parameters are explored and determined. Additionally, based on the proposed structure and motion parameters, several applications are presented, including a novel active SLAM approach which guides the camera in such a way that the SLAM algorithm achieves the desired performance metrics. Real-world and simulated experimental results demonstrate the effectiveness of the proposed design space and its applications.

CONFERENCE PAPER

Luporini F, Ham DA, Kelly PHJ, 2017, An Algorithm for the Optimization of Finite Element Integration Loops, ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, Vol: 44, ISSN: 0098-3500

JOURNAL ARTICLE

Nardi L, Bodin B, Saeedi S, Vespa E, Davison AJ, Kelly PHJ et al., 2017, Algorithmic Performance-Accuracy Trade-off in 3D Vision Applications Using HyperMapper, 31st IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPS), Publisher: IEEE, Pages: 1434-1443, ISSN: 2164-7062

CONFERENCE PAPER

Rathgeber F, Ham DA, Mitchell L, Lange M, Luporini F, McRae ATT, Bercea G-T, Markall GR, Kelly PHJ et al., 2017, Firedrake: Automating the Finite Element Method by Composing Abstractions, ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, Vol: 43, ISSN: 0098-3500

JOURNAL ARTICLE

Bodin B, Nardi L, Zia MZ, Wagstaff H, Shenoy GS, Emani M, Mawer J, Kotselidis C, Nisbet A, Lujan M, Franke B, Kelly PHJ, O’Boyle M et al., 2016, Integrating Algorithmic Parameters into Benchmarking and Design Space Exploration in 3D Scene Understanding, International conference on Parallel Architectures and Compilation Techniques, Publisher: IEEE

System designers typically use well-studied benchmarks to evaluate and improve new architectures and compilers. We design tomorrow's systems based on yesterday's applications. In this paper we investigate an emerging application, 3D scene understanding, likely to be significant in the mobile space in the near future. Until now, this application could only run in real-time on desktop GPUs. In this work, we examine how it can be mapped to power constrained embedded systems. Key to our approach is the idea of incremental co-design exploration, where optimization choices that concern the domain layer are incrementally explored together with low-level compiler and architecture choices. The goal of this exploration is to reduce execution time while minimizing power and meeting our quality of result objective. As the design space is too large to exhaustively evaluate, we use active learning based on a random forest predictor to find good designs. We show that our approach can, for the first time, achieve dense 3D mapping and tracking in the real-time range within a 1W power budget on a popular embedded device. This is a 4.8x execution time improvement and a 2.8x power reduction compared to the state-of-the-art.

CONFERENCE PAPER

Bercea G-T, McRae ATT, Ham DA, Mitchell L, Rathgeber F, Nardi L, Luporini F, Kelly PHJ et al., 2016, A structure-exploiting numbering algorithm for finite elements on extruded meshes, and its performance evaluation in Firedrake, GEOSCIENTIFIC MODEL DEVELOPMENT, Vol: 9, Pages: 3803-3815, ISSN: 1991-959X

JOURNAL ARTICLE

Reguly IZ, Mudalige GR, Bertolli C, Giles MB, Betts A, Kelly PHJ, Radford D et al., 2016, Acceleration of a Full-Scale Industrial CFD Application with OP2, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, Vol: 27, Pages: 1265-1278, ISSN: 1045-9219

JOURNAL ARTICLE

Wozniak BD, Witherden FD, Russell FP, Vincent PE, Kelly PHJ et al., 2016, GiMMiK-Generating bespoke matrix multiplication kernels for accelerators: Application to high-order Computational Fluid Dynamics, COMPUTER PHYSICS COMMUNICATIONS, Vol: 202, Pages: 12-22, ISSN: 0010-4655

JOURNAL ARTICLE

Zia MZ, Nardi L, Jack A, Vespa E, Bodin B, Kelly PHJ, Davison AJ et al., 2016, Comparative Design Space Exploration of Dense and Semi-Dense SLAM, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE, Pages: 1292-1299, ISSN: 1050-4729

CONFERENCE PAPER

Bodin B, Nardi L, Kelly PHJ, O'Boyle MFP et al., 2016, Diplomat: Mapping of Multi-kernel Applications Using a Static Dataflow Abstraction, 24th IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), Publisher: IEEE, Pages: 241-250, ISSN: 1526-7539

CONFERENCE PAPER

Russell FP, Wilkinson KA, Kelly PHJ, Skylaris C-K et al., 2015, Optimised three-dimensional Fourier interpolation: An analysis of techniques and application to a linear-scaling density functional theory code, COMPUTER PHYSICS COMMUNICATIONS, Vol: 187, Pages: 8-19, ISSN: 0010-4655

JOURNAL ARTICLE

Gorman GJ, Rokos G, Southern J, Kelly PHJ et al., 2015, Thread-parallel anisotropic mesh adaptation, SEMA SIMAI Springer Series, Vol: 5, Pages: 113-137, ISSN: 2199-3041

© 2015, Springer International Publishing Switzerland. Mesh adaptation is a powerful way to minimise the computational cost of mesh based computation. It is particularly successful for multi-scale problems where the required mesh resolution can vary by orders of magnitude across the domain. The end result is local control over solution accuracy and reduced time to solution. In the case of large scale simulations, where the time to solution is unacceptable or the memory requirements exceed available RAM, mesh based computation is typically parallelised using domain decomposition methods using the Message Passing Interface (MPI). This allows a simulation to run in parallel on a distributed memory computer. While this has been a highly successful strategy up until now, the drive towards low power multi- and many-core architectures means that an even higher degree of parallelism is required and the memory hierarchy exploited to maximise memory bandwidth. For this reason application codes are increasingly adopting a hybrid parallel approach whereby decomposition methods, implemented using the Message Passing Interface (MPI), are applied for inter-node parallelisation, while a threaded programming model is used for intra-node parallelisation. Mesh adaptivity has been successfully parallelised using MPI by a number of groups, and can be implemented efficiently with few modifications to the serial code. However, thread-level parallelism is significantly more challenging because each thread modifies the mesh data and therefore must be carefully marshalled to avoid data races while still ensuring enough parallelism is exposed to achieve good parallel efficiency. Here we describe a new thread-parallel algorithm for anisotropic mesh adaptation algorithms. For each mesh optimisation phase (refinement, coarsening, swapping and smoothing) we describe how independent sets of tasks are defined. We show how a deferred updates strategy can be used to update the mesh data structure

JOURNAL ARTICLE

Rokos G, Gorman G, Kelly PHJ, 2015, A Fast and Scalable Graph Coloring Algorithm for Multi-core and Many-core Architectures, 21st International Conference on Parallel and Distributed Computing (Euro-Par), Publisher: SPRINGER-VERLAG BERLIN, Pages: 414-425, ISSN: 0302-9743

CONFERENCE PAPER

Nardi L, Bodin B, Zia MZ, Mawer J, Nisbet A, Kelly PHJ, Davison AJ, Lujan M, O'Boyle MFP, Riley G, Topham N, Furber S et al., 2015, Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE COMPUTER SOC, Pages: 5783-5790, ISSN: 1050-4729

CONFERENCE PAPER

Popovici DT, Russell FP, Wilkinson K, Skylaris C-K, Kelly PHJ, Franchetti F et al., 2015, Generating Optimized Fourier Interpolation Routines for Density Functional Theory using SPIRAL, 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Publisher: IEEE, Pages: 743-752, ISSN: 1530-2075

CONFERENCE PAPER

Luporini F, Varbanescu AL, Rathgeber F, Bercea G-T, Ramanujam J, Ham DA, Kelly PHJ et al., 2014, Cross-Loop Optimization of Arithmetic Intensity for Finite Element Local Assembly, ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, Vol: 11, ISSN: 1544-3566

JOURNAL ARTICLE

Collingbourne P, Cadar C, Kelly PHJ, 2014, Symbolic Crosschecking of Data-Parallel Floating-Point Code, IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, Vol: 40, Pages: 710-737, ISSN: 0098-5589

JOURNAL ARTICLE

Konstantinidis A, Kelly PHJ, Ramanujam J, Sadayappan P et al., 2014, Parametric GPU Code Generation for Affine Loop Programs, 26th International Workshop on Languages and Compilers for Parallel Computing (LCPC), Publisher: SPRINGER-VERLAG BERLIN, Pages: 136-151, ISSN: 0302-9743

CONFERENCE PAPER

Strout MM, Luporini F, Krieger CD, Bertolli C, Bercea G-T, Olschanowsky C, Ramanujam J, Kelly PHJ et al., 2014, Generalizing Run-time Tiling with the Loop Chain Abstraction, IEEE 28th International Parallel & Distributed Processing Symposium (IPDPS), Publisher: IEEE, ISSN: 1530-2075

CONFERENCE PAPER

Salas-Moreno RF, Glocker B, Kelly PHJ, Davison AJ et al., 2014, Dense Planar SLAM, IEEE International Symposium on Mixed and Augmented Reality (ISMAR) - Science and Technology, Publisher: IEEE, Pages: 157-164, ISSN: 1554-7868

CONFERENCE PAPER

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
