Imperial College London

Professor Wayne Luk

Faculty of Engineering, Department of Computing

Professor of Computer Engineering
 
 
 

Contact

 

+44 (0)20 7594 8313
w.luk

 
 

Location

 

434, Huxley Building, South Kensington Campus



 

Publications


611 results found

Fan X, Wu D, Cao W, Luk W, Wang L et al., 2018, Stream Processing Dual-Track CGRA for Object Inference, IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, Vol: 26, Pages: 1098-1111, ISSN: 1063-8210

JOURNAL ARTICLE

Funie A-I, Grigoras P, Burovskiy P, Luk W, Salmon M et al., 2018, Run-time Reconfigurable Acceleration for Genetic Programming Fitness Evaluation in Trading Strategies, JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, Vol: 90, Pages: 39-52, ISSN: 1939-8018

JOURNAL ARTICLE

Lee K-H, Fu KCD, Guo Z, Dong Z, Leong MCW, Cheung C-L, Lee AP-W, Luk W, Kwok K-W et al., 2018, MR Safe Robotic Manipulator for MRI-Guided Intracardiac Catheterization, IEEE-ASME TRANSACTIONS ON MECHATRONICS, Vol: 23, Pages: 586-595, ISSN: 1083-4435

JOURNAL ARTICLE

Liang S, Yin S, Liu L, Luk W, Wei S et al., 2018, FP-BNN: Binarized neural network on FPGA, NEUROCOMPUTING, Vol: 275, Pages: 1072-1086, ISSN: 0925-2312

JOURNAL ARTICLE

Ng HC, Liu S, Luk W, 2018, ADAM: Automated design analysis and merging for speeding up FPGA development, Pages: 189-198

© 2018 Association for Computing Machinery. This paper introduces ADAM, an approach for merging multiple FPGA designs into a single hardware design, so that multiple place-and-route tasks can be replaced by a single task to speed up functional evaluation of designs, especially during the development process. ADAM has three key elements. First, a novel approximate maximum common subgraph detection algorithm with linear time complexity to maximize sharing of resources in the merged design. Second, a prototype tool implementing this common subgraph detection algorithm for dataflow graphs derived from Verilog designs; this tool also generates the appropriate control circuits to enable selection of the original designs at runtime. Third, a comprehensive analysis of compilation time versus degree of similarity to identify the optimized user parameters for the proposed approach. Experimental results show that ADAM can reduce compilation time by around 5 times when each design is 95% similar to the others, and the compilation time is reduced from 1 hour to 10 minutes in the case of binomial filters.

CONFERENCE PAPER

Russell FP, Targett JS, Luk W, 2018, From Tensor Algebra to Hardware Accelerators: Generating Streaming Architectures for Solving Partial Differential Equations, ISSN: 1063-6862

© 2018 IEEE. Hardware accelerators are attractive targets for running scientific simulations due to their power efficiency. Since large software simulations can take person-years to develop, it is often impractical to use hardware acceleration, which requires significantly more development effort and expertise than software development. We present the design and implementation of a proof-of-concept compiler toolchain which enables rapid prototyping of hardware finite difference solvers for partial differential equations, generated from a high-level domain-specific language. Multiple fields, grid staggering and non-linear terms are supported. We demonstrate that our approach is practical by generating and evaluating hardware designs derived from the heat and simplified shallow water equations.

CONFERENCE PAPER

Shao S, Tsai J, Mysior M, Luk W, Chau T, Warren A, Jeppesen B et al., 2018, Towards Hardware Accelerated Reinforcement Learning for Application-Specific Robotic Control, ISSN: 1063-6862

© 2018 IEEE. Reinforcement Learning (RL) is an area of machine learning in which an agent interacts with the environment by making sequential decisions. The agent receives reward from the environment based on how good the decisions are and tries to find an optimal decision-making policy that maximises its long-term cumulative reward. This paper presents a novel approach which has shown promise in applying accelerated simulation of RL policy training to automating the control of a real robot arm for specific applications. The approach has two steps. First, design space exploration techniques are developed to enhance performance of an FPGA accelerator for RL policy training based on Trust Region Policy Optimisation (TRPO), which results in a 43% speed improvement over a previous FPGA implementation, while achieving 4.65 times speed up against deep learning libraries running on GPU and 19.29 times speed up against CPU. Second, the trained RL policy is transferred to a real robot arm. Our experiments show that the trained arm can successfully reach and pick up predefined objects, demonstrating the feasibility of our approach.

CONFERENCE PAPER

Zhao R, Liu S, Ng HC, Wang E, Davis JJ, Niu X, Wang X, Shi H, Constantinides GA, Cheung PYK, Luk W et al., 2018, Hardware Compilation of Deep Neural Networks: An Overview, ISSN: 1063-6862

© 2018 IEEE. Deploying a deep neural network model on a reconfigurable platform, such as an FPGA, is challenging due to the enormous design spaces of both network models and hardware design. A neural network model has various layer types, connection patterns and data representations, and the corresponding implementation can be customised with different architectural and modular parameters. Rather than manually exploring this design space, it is more effective to automate optimisation throughout an end-to-end compilation process. This paper provides an overview of recent literature proposing novel approaches to achieve this aim. We organise materials to mirror a typical compilation flow: front end, platform-independent optimisation and back end. Design templates for neural network accelerators are studied with a specific focus on their derivation methodologies. We also review previous work on network compilation and optimisation for other hardware platforms to gain inspiration regarding FPGA implementation. Finally, we propose some future directions for related research.

CONFERENCE PAPER

Zhao R, Ng H-C, Luk W, Niu X et al., 2018, Towards Efficient Convolutional Neural Network for Domain-Specific Applications on FPGA.

CONFERENCE PAPER

Zhao R, Niu X, Luk W, 2018, Automatic Optimising CNN with Depthwise Separable Convolution on FPGA: (Abstract Only)., Publisher: ACM, Pages: 285-285

CONFERENCE PAPER

Arram J, Kaplan T, Luk W, Jiang P et al., 2017, Leveraging FPGAs for Accelerating Short Read Alignment, IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, Vol: 14, Pages: 668-677, ISSN: 1545-5963

JOURNAL ARTICLE

Burovskiy P, Grigoras P, Sherwin S, Luk W et al., 2017, Efficient Assembly for High-Order Unstructured FEM Meshes (FPL 2015), ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, Vol: 10, ISSN: 1936-7406

JOURNAL ARTICLE

Chau T, Burovskiy P, Flynn M, Luk W et al., 2017, Advances in Dataflow Systems, ADVANCES IN COMPUTERS, VOL 106, Editors: Hurson, Milutinovic, Publisher: ELSEVIER ACADEMIC PRESS INC, Pages: 21-62, ISBN: 978-0-12-812230-3

BOOK CHAPTER

Chau TCP, Burovskiy P, Flynn MJ, Luk W et al., 2017, Chapter Two - Advances in Dataflow Systems., Advances in Computers, Vol: 105, Pages: 21-62

JOURNAL ARTICLE

Cooper B, Girdlestone S, Burovskiy P, Gaydadjiev G, Averbukh V, Knowles PJ, Luo W et al., 2017, Quantum Chemistry in Dataflow: Density-Fitting MP2, JOURNAL OF CHEMICAL THEORY AND COMPUTATION, Vol: 13, Pages: 5265-5272, ISSN: 1549-9618

JOURNAL ARTICLE

Fan H, Niu X, Liu Q, Luk W et al., 2017, F-C3D: FPGA-based 3-Dimensional Convolutional Neural Network, 27th International Conference on Field Programmable Logic and Applications (FPL), Publisher: IEEE, ISSN: 1946-1488

CONFERENCE PAPER

Fu H, He C, Luk W, Li W, Yang G et al., 2017, A Nanosecond-level Hybrid Table Design for Financial Market Data Generators, 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 227-234

CONFERENCE PAPER

Fu H, He C, Ruan H, Greenspon I, Luk W, Zheng Y, Liao J, Zhang Q, Yang G et al., 2017, Accelerating Financial Market Server through Hybrid List Design (Abstract Only)., Publisher: ACM, Pages: 289-290

CONFERENCE PAPER

Funie A-I, Guo L, Niu X, Luk W, Salmon M et al., 2017, Custom Framework for Run-Time Trading Strategies, 13th International Symposium on Applied Reconfigurable Computing (ARC), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 154-167, ISSN: 0302-9743

CONFERENCE PAPER

Gan L, Fu H, Luk W, Yang C, Xue W, Yang G et al., 2017, Solving Mesoscale Atmospheric Dynamics Using a Reconfigurable Dataflow Architecture, IEEE MICRO, Vol: 37, Pages: 40-50, ISSN: 0272-1732

JOURNAL ARTICLE

Gan L, Fu H, Mencer O, Luk W, Yang G et al., 2017, Data Flow Computing in Geoscience Applications, Editors: Hurson, Milutinovic, Publisher: ELSEVIER ACADEMIC PRESS INC, Pages: 125-158, ISBN: 978-0-12-811955-6

BOOK CHAPTER

Gan L, Fu H, Mencer O, Luk W, Yang G et al., 2017, Chapter Four - Data Flow Computing in Geoscience Applications., Advances in Computers, Vol: 104, Pages: 125-158

JOURNAL ARTICLE

Grigoras P, Burovskiy P, Arram J, Niu X, Cheung K, Xie J, Luk W et al., 2017, dfesnippets: An Open-Source Library for Dataflow Acceleration on FPGAs, 13th International Symposium on Applied Reconfigurable Computing (ARC), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 299-310, ISSN: 0302-9743

CONFERENCE PAPER

He C, Fu H, Guo C, Luk W, Yang G et al., 2017, A Fully-Pipelined Hardware Design for Gaussian Mixture Models, IEEE TRANSACTIONS ON COMPUTERS, Vol: 66, Pages: 1837-1850, ISSN: 0018-9340

JOURNAL ARTICLE

He C, Fu H, Luk W, Li W, Yang G et al., 2017, Exploring the Potential of Reconfigurable Platforms for Order Book Update, 27th International Conference on Field Programmable Logic and Applications (FPL), Publisher: IEEE, ISSN: 1946-1488

CONFERENCE PAPER

Hung E, Todman T, Luk W, 2017, Transparent In-Circuit Assertions for FPGAs, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, Vol: 36, Pages: 1193-1202, ISSN: 0278-0070

JOURNAL ARTICLE

Inggs G, Thomas DB, Luk W, 2017, A Domain Specific Approach to High Performance Heterogeneous Computing, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, Vol: 28, Pages: 2-15, ISSN: 1045-9219

JOURNAL ARTICLE

Lee K-H, Leong MCW, Chow MCK, Fu H-C, Luk W, Sze K-Y, Yeung C-K, Kwok K-W et al., 2017, FEM-based Soft Robotic Control Framework for Intracavitary Navigation, IEEE International Conference on Real-time Computing and Robotics (RCAR), Publisher: IEEE, Pages: 11-16

CONFERENCE PAPER

Leong PHW, Amano H, Anderson J, Bertels K, Cardoso JMP, Diessel O, Gogniat G, Hutton M, Lee J, Luk W, Lysaght P, Platzner M, Prasanna VK, Rissa T, Silvano C, So HK-H, Wang Y et al., 2017, The First 25 Years of the FPL Conference: Significant Papers, ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, Vol: 10, ISSN: 1936-7406

JOURNAL ARTICLE

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
