Fan X, Wu D, Cao W, et al., 2018, Stream Processing Dual-Track CGRA for Object Inference, IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, Vol: 26, Pages: 1098-1111, ISSN: 1063-8210
Funie A-I, Grigoras P, Burovskiy P, et al., 2018, Run-time Reconfigurable Acceleration for Genetic Programming Fitness Evaluation in Trading Strategies, JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, Vol: 90, Pages: 39-52, ISSN: 1939-8018
Lee K-H, Fu KCD, Guo Z, et al., 2018, MR Safe Robotic Manipulator for MRI-Guided Intracardiac Catheterization, IEEE-ASME TRANSACTIONS ON MECHATRONICS, Vol: 23, Pages: 586-595, ISSN: 1083-4435
Liu S, Niu X, Luk W, 2018, A Low-Power Deconvolutional Accelerator for Convolutional Neural Network Based Segmentation on FPGA: Abstract Only., Publisher: ACM, Pages: 293-293
Ng HC, Liu S, Luk W, 2018, ADAM: Automated design analysis and merging for speeding up FPGA development, Pages: 189-198
© 2018 Association for Computing Machinery. This paper introduces ADAM, an approach for merging multiple FPGA designs into a single hardware design, so that multiple place-and-route tasks can be replaced by a single task to speed up functional evaluation of designs, especially during the development process. ADAM has three key elements. First, a novel approximate maximum common subgraph detection algorithm with linear time complexity to maximize sharing of resources in the merged design. Second, a prototype tool implementing this common subgraph detection algorithm for dataflow graphs derived from Verilog designs; this tool also generates the appropriate control circuits to enable selection of the original designs at runtime. Third, a comprehensive analysis of compilation time versus degree of similarity to identify the optimized user parameters for the proposed approach. Experimental results show that ADAM can reduce compilation time by around 5 times when each design is 95% similar to the others, and the compilation time is reduced from 1 hour to 10 minutes in the case of binomial filters.
Russell FP, Targett JS, Luk W, 2018, From Tensor Algebra to Hardware Accelerators: Generating Streaming Architectures for Solving Partial Differential Equations, ISSN: 1063-6862
© 2018 IEEE. Hardware accelerators are attractive targets for running scientific simulations due to their power efficiency. Since large software simulations can take person-years to develop, it is often impractical to use hardware acceleration, which requires significantly more development effort and expertise than software development. We present the design and implementation of a proof-of-concept compiler toolchain which enables rapid prototyping of hardware finite difference solvers for partial differential equations, generated from a high-level domain-specific language. Multiple fields, grid staggering and non-linear terms are supported. We demonstrate that our approach is practical by generating and evaluating hardware designs derived from the heat and simplified shallow water equations.
Shao S, Tsai J, Mysior M, et al., 2018, Towards Hardware Accelerated Reinforcement Learning for Application-Specific Robotic Control, ISSN: 1063-6862
© 2018 IEEE. Reinforcement Learning (RL) is an area of machine learning in which an agent interacts with the environment by making sequential decisions. The agent receives reward from the environment based on how good the decisions are and tries to find an optimal decision-making policy that maximises its long-term cumulative reward. This paper presents a novel approach which has shown promise in applying accelerated simulation of RL policy training to automating the control of a real robot arm for specific applications. The approach has two steps. First, design space exploration techniques are developed to enhance performance of an FPGA accelerator for RL policy training based on Trust Region Policy Optimisation (TRPO), which results in a 43% speed improvement over a previous FPGA implementation, while achieving 4.65 times speed up against deep learning libraries running on GPU and 19.29 times speed up against CPU. Second, the trained RL policy is transferred to a real robot arm. Our experiments show that the trained arm can successfully reach and pick up predefined objects, demonstrating the feasibility of our approach.
Zhao R, Liu S, Ng HC, et al., 2018, Hardware Compilation of Deep Neural Networks: An Overview, ISSN: 1063-6862
© 2018 IEEE. Deploying a deep neural network model on a reconfigurable platform, such as an FPGA, is challenging due to the enormous design spaces of both network models and hardware design. A neural network model has various layer types, connection patterns and data representations, and the corresponding implementation can be customised with different architectural and modular parameters. Rather than manually exploring this design space, it is more effective to automate optimisation throughout an end-to-end compilation process. This paper provides an overview of recent literature proposing novel approaches to achieve this aim. We organise materials to mirror a typical compilation flow: front end, platform-independent optimisation and back end. Design templates for neural network accelerators are studied with a specific focus on their derivation methodologies. We also review previous work on network compilation and optimisation for other hardware platforms to gain inspiration regarding FPGA implementation. Finally, we propose some future directions for related research.
Zhao R, Ng H-C, Luk W, et al., 2018, Towards Efficient Convolutional Neural Network for Domain-Specific Applications on FPGA.
Zhao R, Niu X, Luk W, 2018, Automatic Optimising CNN with Depthwise Separable Convolution on FPGA: (Abstract Only)., Publisher: ACM, Pages: 285-285
Arram J, Kaplan T, Luk W, et al., 2017, Leveraging FPGAs for Accelerating Short Read Alignment, IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, Vol: 14, Pages: 668-677, ISSN: 1545-5963
Burovskiy P, Grigoras P, Sherwin S, et al., 2017, Efficient Assembly for High-Order Unstructured FEM Meshes (FPL 2015), ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, Vol: 10, ISSN: 1936-7406
Chau T, Burovskiy P, Flynn M, et al., 2017, Advances in Dataflow Systems, ADVANCES IN COMPUTERS, VOL 106, Editors: Hurson, Milutinovic, Publisher: ELSEVIER ACADEMIC PRESS INC, Pages: 21-62, ISBN: 978-0-12-812230-3
Chau TCP, Burovskiy P, Flynn MJ, et al., 2017, Chapter Two - Advances in Dataflow Systems., Advances in Computers, Vol: 105, Pages: 21-62
Cooper B, Girdlestone S, Burovskiy P, et al., 2017, Quantum Chemistry in Dataflow: Density-Fitting MP2, JOURNAL OF CHEMICAL THEORY AND COMPUTATION, Vol: 13, Pages: 5265-5272, ISSN: 1549-9618
Fan H, Niu X, Liu Q, et al., 2017, F-C3D: FPGA-based 3-Dimensional Convolutional Neural Network, 27th International Conference on Field Programmable Logic and Applications (FPL), Publisher: IEEE, ISSN: 1946-1488
Fu H, He C, Luk W, et al., 2017, A Nanosecond-level Hybrid Table Design for Financial Market Data Generators, 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 227-234
Fu H, He C, Ruan H, et al., 2017, Accelerating Financial Market Server through Hybrid List Design (Abstract Only)., Publisher: ACM, Pages: 289-290
Funie A-I, Guo L, Niu X, et al., 2017, Custom Framework for Run-Time Trading Strategies, 13th International Symposium on Applied Reconfigurable Computing (ARC), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 154-167, ISSN: 0302-9743
Gan L, Fu H, Luk W, et al., 2017, Solving Mesoscale Atmospheric Dynamics Using a Reconfigurable Dataflow Architecture, IEEE MICRO, Vol: 37, Pages: 40-50, ISSN: 0272-1732
Gan L, Fu H, Mencer O, et al., 2017, Chapter Four - Data Flow Computing in Geoscience Applications., Advances in Computers, Vol: 104, Pages: 125-158
Grigoras P, Burovskiy P, Arram J, et al., 2017, dfesnippets: An Open-Source Library for Dataflow Acceleration on FPGAs, 13th International Symposium on Applied Reconfigurable Computing (ARC), Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 299-310, ISSN: 0302-9743
He C, Fu H, Guo C, et al., 2017, A Fully-Pipelined Hardware Design for Gaussian Mixture Models, IEEE TRANSACTIONS ON COMPUTERS, Vol: 66, Pages: 1837-1850, ISSN: 0018-9340
He C, Fu H, Luk W, et al., 2017, Exploring the Potential of Reconfigurable Platforms for Order Book Update, 27th International Conference on Field Programmable Logic and Applications (FPL), Publisher: IEEE, ISSN: 1946-1488
Hung E, Todman T, Luk W, 2017, Transparent In-Circuit Assertions for FPGAs, IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, Vol: 36, Pages: 1193-1202, ISSN: 0278-0070
Inggs G, Thomas DB, Luk W, 2017, A Domain Specific Approach to High Performance Heterogeneous Computing, IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, Vol: 28, Pages: 2-15, ISSN: 1045-9219
Lee K-H, Leong MCW, Chow MCK, et al., 2017, FEM-based Soft Robotic Control Framework for Intracavitary Navigation, IEEE International Conference on Real-time Computing and Robotics (RCAR), Publisher: IEEE, Pages: 11-16
Leong PHW, Amano H, Anderson J, et al., 2017, The First 25 Years of the FPL Conference: Significant Papers, ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, Vol: 10, ISSN: 1936-7406
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.