Imperial College London

Professor Wayne Luk

Faculty of Engineering, Department of Computing

Professor of Computer Engineering
 
 
 

Contact

 

+44 (0)20 7594 8313, w.luk, Website

 
 

Location

 

434 Huxley Building, South Kensington Campus



 

Publications


611 results found

Todman T, Luk W, 2013, Runtime Assertions and Exceptions for Streaming Systems, 23rd International Conference on Field Programmable Logic and Applications (FPL), Publisher: IEEE, ISSN: 1946-1488

CONFERENCE PAPER

Gan L, Fu H, Luk W, Yang C, Xue W, Yang G et al., 2013, Global Atmospheric Simulation on a Reconfigurable Platform, 21st Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 230-230

CONFERENCE PAPER

Eele A, Maciejowski J, Chau T, Luk W et al., 2013, Parallelisation of sequential Monte Carlo for real-time control in air traffic management, Pages: 4859-4864, ISSN: 0191-2216

This paper presents the parallelisation of a Sequential Monte Carlo algorithm, and the associated changes required when applied to the problem of conflict resolution and aircraft trajectory control in air traffic management. The target problem is non-linear constrained, non-convex and multi-agent. The new method is shown to have a 98.5% computational time saving over that of a previous sequential implementation, with no degradation in path quality. The computation saving is enough to allow real-time implementation. © 2013 IEEE.

CONFERENCE PAPER

Kurek M, Becker T, Luk W, 2013, Parametric Optimization of Reconfigurable Designs Using Machine Learning, 9th International Applied Reconfigurable Computing Symposium (ARC), Publisher: SPRINGER-VERLAG BERLIN, Pages: 134-145, ISSN: 0302-9743

CONFERENCE PAPER

Petrov Z, Zaykov PG, Cardoso JMP, Coutinho JGF, Diniz PC, Luk W et al., 2013, An Aspect-Oriented Approach for Designing Safety-Critical Systems, IEEE Aerospace Conference, Publisher: IEEE, ISSN: 1095-323X

CONFERENCE PAPER

Niu X, Chau TCP, Jin Q, Luk W, Liu Q et al., 2013, Automating elimination of idle functions by run-time reconfiguration, 21st Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 97-104

CONFERENCE PAPER

Eele A, Maciejowski J, Chau T, Luk W et al., 2013, Parallelisation of Sequential Monte Carlo for Real-Time Control in Air Traffic Management, 52nd IEEE Annual Conference on Decision and Control (CDC), Publisher: IEEE, Pages: 4853-4858, ISSN: 0743-1546

CONFERENCE PAPER

Grigoras P, Niu X, Coutinho JGF, Luk W, Bower J, Pell O et al., 2013, Aspect Driven Compilation for Dataflow Designs, IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors (ASAP), Publisher: IEEE, Pages: 18-25, ISSN: 2160-0511

CONFERENCE PAPER

Cattaneo R, Niu X, Pilato C, Becker T, Luk W, Santambrogio MD et al., 2013, A Framework for Effective Exploitation of Partial Reconfiguration in Dataflow Computing, 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC), Publisher: IEEE

CONFERENCE PAPER

Arram J, Luk W, Jiang P, 2013, Reconfigurable Filtered Acceleration of Short Read Alignment, 12th International Conference on Field-Programmable Technology (FPT), Publisher: IEEE, Pages: 438-441

CONFERENCE PAPER

Denholm S, Inoue H, Takenaka T, Luk W et al., 2013, Application-Specific Customisation of Market Data Feed Arbitration, 12th International Conference on Field-Programmable Technology (FPT), Publisher: IEEE, Pages: 322-325

CONFERENCE PAPER

Niu X, Coutinho JGF, Wang Y, Luk W et al., 2013, Computing nodes in reconfigurable clusters are occupied and released by applications during their, 12th International Conference on Field-Programmable Technology (FPT), Publisher: IEEE, Pages: 214-221

CONFERENCE PAPER

Arram J, Tsoi KH, Luk W, Jiang P et al., 2013, Reconfigurable Acceleration of Short Read Mapping, 21st Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 210-217

CONFERENCE PAPER

Guo C, Luk W, 2013, Accelerating HAC Estimation for Multivariate Time Series, IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors (ASAP), Publisher: IEEE, Pages: 42-49, ISSN: 2160-0511

CONFERENCE PAPER

Chau TCP, Kwok K-W, Chow GCT, Tsoi KH, Lee K-H, Tse Z, Cheung PYK, Luk W et al., 2013, Acceleration of Real-time Proximity Query for Dynamic Active Constraints, 12th International Conference on Field-Programmable Technology (FPT), Publisher: IEEE, Pages: 206-213

CONFERENCE PAPER

Inggs G, Thomas D, Luk W, 2013, A Heterogeneous Computing framework for Computational Finance, 42nd Annual International Conference on Parallel Processing (ICPP), Publisher: IEEE, Pages: 688-697, ISSN: 0190-3918

CONFERENCE PAPER

Ruan H, Huang X, Fu H, Yang G, Luk W, Racaniere S, Pell O, Han W et al., 2013, An FPGA-Based Data Flow Engine For Gaussian Copula Model, 21st Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 218-225

CONFERENCE PAPER

Niu X, Chau TCP, Jin Q, Luk W, Liu Q et al., 2013, Automating resource optimisation in reconfigurable design (abstract only), Publisher: ACM, Pages: 275-275

CONFERENCE PAPER

Le Masle A, Luk W, 2012, Detecting power attacks on reconfigurable hardware, Pages: 14-19

We present a novel framework to detect power attacks on crypto-systems implemented on reconfigurable hardware. We monitor the device supply voltage with a ring oscillator-based on-chip power monitor. In order to detect the insertion of power measurement circuits onto a device's power rail, a power attack detection strategy taking into account abnormal supply voltages and power rail resistance values is developed. Our strategy is integrated into an on-chip attack detector. The entire framework implementation only takes 3300 LUTs of a Spartan-6 LX45 FPGA, which is 12% of the total area available. Our results on an AES and RSA crypto-system show that our attack detection framework can reach false-positive and false-negative rates as low as 0% over all our test cases if proper operating margins are set. © 2012 IEEE.

CONFERENCE PAPER

Niu X, Jin Q, Luk W, Liu Q, Pell O et al., 2012, Exploiting run-time reconfiguration in stencil computation, Pages: 173-180

Stencil computation is computationally intensive and required by many applications. This paper proposes an approach to exploit run-time reconfigurability of field-programmable accelerators for stencil computation. System throughput is optimized by partitioning, analysing and scheduling tasks in applications to remove idle functions. To evaluate the proposed approach, Reverse Time Migration (RTM), a high performance application, is developed. Our optimized run-time reconfigurable solution, which targets a Virtex-6 FPGA in a Maxeler MAX3424A system, achieves an improved throughput of 102.8 GFlop/s, up to two orders of magnitude faster than the CPU reference designs, 1.59 times faster than the best published GPU and FPGA results, and 1.45 times faster than an optimized static implementation. © 2012 IEEE.

CONFERENCE PAPER

Chau TCP, Luk W, Cheung PYK, Eele A, Maciejowski J et al., 2012, Adaptive sequential Monte Carlo approach for real-time applications, Pages: 527-530

This paper presents an adaptive Sequential Monte Carlo approach for real-time applications. The Sequential Monte Carlo method is employed to estimate the states of dynamic systems using weighted particles. The proposed approach reduces the run-time computational complexity by adapting the size of the particle set. Multiple processing elements on FPGAs are dynamically allocated for improved energy efficiency without violating real-time constraints. A robot localisation application is developed based on the proposed approach. Compared to a non-adaptive implementation, the dynamic energy consumption is reduced by up to 70% without affecting the quality of solutions. © 2012 IEEE.

CONFERENCE PAPER

Betkaoui B, Wang Y, Thomas DB, Luk W et al., 2012, Parallel FPGA-based all pairs shortest paths for sparse networks: A human brain connectome case study, Pages: 99-104

This paper proposes a highly parallel and scalable reconfigurable design for the All-Pairs Shortest-Paths (APSP) algorithm for very sparse networks. Our work is motivated by a computationally intensive bioinformatics application that employs this memory-latency bound algorithm. The proposed design methodology takes advantage of distributed on-chip memory resources of modern FPGAs to reduce accesses to high-latency off-chip memories. We develop design optimisations that yield different FPGA configurations which are selected at run time based on the input graph data. Using human brain network data, we are able to achieve performance results superior to those from multi-core CPU and GPU, while attaining linear scaling over the number of processors introduced. Our FPGA-based APSP design is over 10 times faster than a quad-core CPU implementation and 2-5 times faster than an AMD Cypress GPU implementation. © 2012 IEEE.

CONFERENCE PAPER

Todman T, Luk W, 2012, Verification of streaming designs by combining symbolic simulation and equivalence checking, Pages: 203-208

As design complexity grows, verification becomes a bottleneck in design development and implementation. This paper describes a novel approach for verifying reconfigurable streaming designs, based on symbolic simulation and equivalence checking. Compared with numerical simulation, symbolic simulation provides a more informative way of showing that a design behaves as expected; equivalence checking enables automatic checking of equivalence of symbolic expressions. Our approach has been implemented for designs targeting Maxeler technologies, using an easy-to-use symbolic simulator and the Yices equivalence checker, together with other facilities such as an output combiner to support an automated verification flow. Several benchmarks, including one-dimensional convolution and finite difference computation, are used to evaluate the proposed approach. © 2012 IEEE.

CONFERENCE PAPER

Jin Q, Becker T, Luk W, Thomas D et al., 2012, Optimising explicit finite difference option pricing for dynamic constant reconfiguration, Pages: 165-172

This paper demonstrates a novel optimisation methodology to adjust stencil-based numerical procedures at the algorithm level, so as to reduce not only the amount of hardware resource consumption per kernel but also the amount of computation required to achieve the desired result accuracy, when mapping the algorithm to reconfigurable hardware using dynamic constant reconfiguration. As a result, less area is needed to support run-time reconfiguration, and fewer computational steps are required in the numerical procedure to obtain a result with a given error tolerance. We analyse one thousand fixed-point implementations on a Virtex-6 XC6VLX760 FPGA for randomly generated option pricing problems, which are representative of industrial computation. When comparing optimised implementations to the un-optimised ones, the reconfiguration area upper bound is reduced by 22%; the average number of computational steps is reduced by 23%; and the area-computation-time product is reduced by 40%; while the numerical errors of the results are kept below the error tolerance level used in industry. © 2012 IEEE.

CONFERENCE PAPER

Cardoso JMP, Carvalho T, Coutinho JGF, Diniz PC, Petrov Z, Luk W et al., 2012, Controlling hardware synthesis with aspects, Pages: 226-233

The synthesis and mapping of applications to configurable embedded systems is a notoriously hard process. Tools have a wide range of parameters, which interact in very unpredictable ways, thus creating a large and complex design space. When exploring this space, designers must understand the interfaces to the various tools and apply, often manually, a sequence of tool-specific transformations making this an extremely cumbersome and error-prone process. This paper describes the use of aspect-oriented techniques for capturing synthesis strategies for tuning the performance of applications' kernels. We illustrate the use of this approach when designing application-specific architectures generated by a high-level synthesis tool. The results highlight the impact of the various strategies when targeting custom hardware and expose the difficulties in devising these strategies. © 2012 IEEE.

CONFERENCE PAPER

Spacey S, Luk W, Kelly PHJ, Kuhn D et al., 2012, Improving communication latency with the write-only architecture, JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, Vol: 72, Pages: 1617-1627, ISSN: 0743-7315

JOURNAL ARTICLE

Liu Q, Todman T, Luk W, Constantinides GA et al., 2012, Optimizing Hardware Design by Composing Utility-Directed Transformations, IEEE TRANSACTIONS ON COMPUTERS, Vol: 61, Pages: 1800-1812, ISSN: 0018-9340

JOURNAL ARTICLE

Pnevmatikatos D, Becker T, Brokalakis A, Bruneel K, Gaydadjiev G, Luk W, Papadimitriou K, Papaefstathiou I, Pell O, Pilato C, Robart M, Santambrogio MD, Sciuto D, Stroobandt D, Todman T et al., 2012, FASTER: Facilitating analysis and synthesis technologies for effective reconfiguration, Pages: 234-241

The FASTER project aims to ease the definition, implementation and use of dynamically changing hardware systems. Our motivation stems from the promise reconfigurable systems hold for achieving better performance and extending product functionality and lifetime via the addition of new features that work at hardware speed. This is a clear advantage over the more straightforward software component adaptivity. However, designing a changing hardware system is both challenging and time consuming. The FASTER project will facilitate the use of reconfigurable technology by providing a complete methodology that enables designers to easily specify, analyse, implement and verify applications on platforms with general-purpose processors and acceleration modules implemented in the latest reconfigurable technology. To better adapt to different application requirements, the tool-chain will support both region-based and micro-reconfiguration and provide a flexible run-time system that will efficiently manage the reconfigurable resources. We will use applications from the embedded, high performance computing, and desktop domains to demonstrate the potential benefits of the FASTER tools on metrics such as performance, power consumption and total ownership cost. © 2012 IEEE.

CONFERENCE PAPER

Cheung K, Schultz SR, Luk W, 2012, A large-scale spiking neural network accelerator for FPGA systems, Pages: 113-120, ISSN: 0302-9743

Spiking neural networks (SNN) aim to mimic membrane potential dynamics of biological neurons. They have been used widely in neuromorphic applications and neuroscience modeling studies. We design a parallel SNN accelerator for producing large-scale cortical simulation targeting an off-the-shelf Field-Programmable Gate Array (FPGA)-based system. The accelerator parallelizes synaptic processing with run time proportional to the firing rate of the network. Using only one FPGA, this accelerator is estimated to support simulation of 64K neurons 2.5 times real-time, and achieves a spike delivery rate which is at least 1.4 times faster than a recent GPU accelerator with a benchmark toroidal network. © 2012 Springer-Verlag.

CONFERENCE PAPER

Yu C, Smith AM, Luk W, Leong PHW, Wilton SJE et al., 2012, Optimizing Floating Point Units in Hybrid FPGAs, IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, Vol: 20, Pages: 1295-1303, ISSN: 1063-8210

JOURNAL ARTICLE

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
