Todman T, Luk W, 2013, Runtime Assertions and Exceptions for Streaming Systems, 23rd International Conference on Field Programmable Logic and Applications (FPL), Publisher: IEEE, ISSN: 1946-1488
Gan L, Fu H, Luk W, et al., 2013, Global Atmospheric Simulation on a Reconfigurable Platform, 21st Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 230-230
Eele A, Maciejowski J, Chau T, et al., 2013, Parallelisation of sequential Monte Carlo for real-time control in air traffic management, Pages: 4859-4864, ISSN: 0191-2216
This paper presents the parallelisation of a Sequential Monte Carlo algorithm, and the associated changes required when applied to the problem of conflict resolution and aircraft trajectory control in air traffic management. The target problem is non-linear constrained, non-convex and multi-agent. The new method is shown to have a 98.5% computational time saving over that of a previous sequential implementation, with no degradation in path quality. The computation saving is enough to allow real-time implementation. © 2013 IEEE.
Kurek M, Becker T, Luk W, 2013, Parametric Optimization of Reconfigurable Designs Using Machine Learning, 9th International Applied Reconfigurable Computing Symposium (ARC), Publisher: SPRINGER-VERLAG BERLIN, Pages: 134-145, ISSN: 0302-9743
Petrov Z, Zaykov PG, Cardoso JMP, et al., 2013, An Aspect-Oriented Approach for Designing Safety-Critical Systems, IEEE Aerospace Conference, Publisher: IEEE, ISSN: 1095-323X
Niu X, Chau TCP, Jin Q, et al., 2013, Automating elimination of idle functions by run-time reconfiguration, 21st Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 97-104
Eele A, Maciejowski J, Chau T, et al., 2013, Parallelisation of Sequential Monte Carlo for Real-Time Control in Air Traffic Management, 52nd IEEE Annual Conference on Decision and Control (CDC), Publisher: IEEE, Pages: 4853-4858, ISSN: 0743-1546
Grigoras P, Niu X, Coutinho JGF, et al., 2013, Aspect Driven Compilation for Dataflow Designs, IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors (ASAP), Publisher: IEEE, Pages: 18-25, ISSN: 2160-0511
Cattaneo R, Niu X, Pilato C, et al., 2013, A Framework for Effective Exploitation of Partial Reconfiguration in Dataflow Computing, 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC), Publisher: IEEE
Arram J, Luk W, Jiang P, 2013, Reconfigurable Filtered Acceleration of Short Read Alignment, 12th International Conference on Field-Programmable Technology (FPT), Publisher: IEEE, Pages: 438-441
Denholm S, Inoue H, Takenaka T, et al., 2013, Application-Specific Customisation of Market Data Feed Arbitration, 12th International Conference on Field-Programmable Technology (FPT), Publisher: IEEE, Pages: 322-325
Niu X, Coutinho JGF, Wang Y, et al., 2013, Computing nodes in reconfigurable clusters are occupied and released by applications during their, 12th International Conference on Field-Programmable Technology (FPT), Publisher: IEEE, Pages: 214-221
Arram J, Tsoi KH, Luk W, et al., 2013, Reconfigurable Acceleration of Short Read Mapping, 21st Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 210-217
Guo C, Luk W, 2013, Accelerating HAC Estimation for Multivariate Time Series, IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors (ASAP), Publisher: IEEE, Pages: 42-49, ISSN: 2160-0511
Chau TCP, Kwok K-W, Chow GCT, et al., 2013, Acceleration of Real-time Proximity Query for Dynamic Active Constraints, 12th International Conference on Field-Programmable Technology (FPT), Publisher: IEEE, Pages: 206-213
Inggs G, Thomas D, Luk W, 2013, A Heterogeneous Computing framework for Computational Finance, 42nd Annual International Conference on Parallel Processing (ICPP), Publisher: IEEE, Pages: 688-697, ISSN: 0190-3918
Ruan H, Huang X, Fu H, et al., 2013, An FPGA-Based Data Flow Engine For Gaussian Copula Model, 21st Annual International IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 218-225
Niu X, Chau TCP, Jin Q, et al., 2013, Automating resource optimisation in reconfigurable design (abstract only), Publisher: ACM, Pages: 275-275
Le Masle A, Luk W, 2012, Detecting power attacks on reconfigurable hardware, Pages: 14-19
We present a novel framework to detect power attacks on crypto-systems implemented on reconfigurable hardware. We monitor the device supply voltage with a ring oscillator-based on-chip power monitor. In order to detect the insertion of power measurement circuits onto a device's power rail, a power attack detection strategy taking into account abnormal supply voltages and power rail resistance values is developed. Our strategy is integrated into an on-chip attack detector. The entire framework implementation only takes 3300 LUTs of a Spartan-6 LX45 FPGA, which is 12% of the total area available. Our results on an AES and RSA crypto-system show that our attack detection framework can reach false-positive and false-negative rates as low as 0% over all our test cases if proper operating margins are set. © 2012 IEEE.
Niu X, Jin Q, Luk W, et al., 2012, Exploiting run-time reconfiguration in stencil computation, Pages: 173-180
Stencil computation is computationally intensive and required by many applications. This paper proposes an approach to exploit run-time reconfigurability of field-programmable accelerators for stencil computation. System throughput is optimized by partitioning, analysing and scheduling tasks in applications to remove idle functions. To evaluate the proposed approach, Reverse Time Migration (RTM), a high performance application, is developed. Our optimized run-time reconfigurable solution, which targets a Virtex-6 FPGA in a Maxeler MAX3424A system, achieves an improved throughput of 102.8 GFlop/s, up to two orders of magnitude faster than the CPU reference designs, 1.59 times faster than the best published GPU and FPGA results, and 1.45 times faster than an optimized static implementation. © 2012 IEEE.
Chau TCP, Luk W, Cheung PYK, et al., 2012, Adaptive sequential Monte Carlo approach for real-time applications, Pages: 527-530
This paper presents an adaptive Sequential Monte Carlo approach for real-time applications. The Sequential Monte Carlo method is employed to estimate the states of dynamic systems using weighted particles. The proposed approach reduces the run-time computational complexity by adapting the size of the particle set. Multiple processing elements on FPGAs are dynamically allocated for improved energy efficiency without violating real-time constraints. A robot localisation application is developed based on the proposed approach. Compared to a non-adaptive implementation, the dynamic energy consumption is reduced by up to 70% without affecting the quality of solutions. © 2012 IEEE.
Betkaoui B, Wang Y, Thomas DB, et al., 2012, Parallel FPGA-based all pairs shortest paths for sparse networks: A human brain connectome case study, Pages: 99-104
This paper proposes a highly parallel and scalable reconfigurable design for the All-Pairs Shortest-Paths (APSP) algorithm for very sparse networks. Our work is motivated by a computationally intensive bioinformatics application that employs this memory-latency bound algorithm. The proposed design methodology takes advantage of distributed on-chip memory resources of modern FPGAs to reduce accesses to high-latency off-chip memories. We develop design optimisations that yield different FPGA configurations which are selected at run time based on the input graph data. Using human brain network data, we are able to achieve performance results superior to those from multi-core CPU and GPU, while attaining linear scaling over the number of processors introduced. Our FPGA-based APSP design is over 10 times faster than a quad-core CPU implementation and 2-5 times faster than an AMD Cypress GPU implementation. © 2012 IEEE.
Todman T, Luk W, 2012, Verification of streaming designs by combining symbolic simulation and equivalence checking, Pages: 203-208
As design complexity grows, verification becomes a bottleneck in design development and implementation. This paper describes a novel approach for verifying reconfigurable streaming designs, based on symbolic simulation and equivalence checking. Compared with numerical simulation, symbolic simulation provides a more informative way of showing that a design behaves as expected; equivalence checking enables automatic checking of equivalence of symbolic expressions. Our approach has been implemented for designs targeting Maxeler technologies, using an easy-to-use symbolic simulator and the Yices equivalence checker, together with other facilities such as an output combiner to support an automated verification flow. Several benchmarks, including one-dimensional convolution and finite difference computation, are used to evaluate the proposed approach. © 2012 IEEE.
Jin Q, Becker T, Luk W, et al., 2012, Optimising explicit finite difference option pricing for dynamic constant reconfiguration, Pages: 165-172
This paper demonstrates a novel optimisation methodology to adjust stencil-based numerical procedures at the algorithm level, so as to reduce not only the amount of hardware resource consumption per kernel but also the amount of computation required to achieve the desired result accuracy, when mapping the algorithm to reconfigurable hardware using dynamic constant reconfiguration. As a result, less area is needed to support run-time reconfiguration, and fewer computational steps are required in the numerical procedure to obtain a result with a given error tolerance. We analyse one thousand fixed-point implementations on a Virtex-6 XC6VLX760 FPGA for randomly generated option pricing problems, which are representative of industrial computation. When comparing optimised implementations to the un-optimised ones, the reconfiguration area upper bound is reduced by 22%; the average number of computational steps is reduced by 23%; and the area-computation-time product is reduced by 40%; while the numerical errors of the results are kept below the error tolerance level used in industry. © 2012 IEEE.
The synthesis and mapping of applications to configurable embedded systems is a notoriously hard process. Tools have a wide range of parameters, which interact in very unpredictable ways, thus creating a large and complex design space. When exploring this space, designers must understand the interfaces to the various tools and apply, often manually, a sequence of tool-specific transformations making this an extremely cumbersome and error-prone process. This paper describes the use of aspect-oriented techniques for capturing synthesis strategies for tuning the performance of applications' kernels. We illustrate the use of this approach when designing application-specific architectures generated by a high-level synthesis tool. The results highlight the impact of the various strategies when targeting custom hardware and expose the difficulties in devising these strategies. © 2012 IEEE.
Spacey S, Luk W, Kelly PHJ, et al., 2012, Improving communication latency with the write-only architecture, JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, Vol: 72, Pages: 1617-1627, ISSN: 0743-7315
Liu Q, Todman T, Luk W, et al., 2012, Optimizing Hardware Design by Composing Utility-Directed Transformations, IEEE TRANSACTIONS ON COMPUTERS, Vol: 61, Pages: 1800-1812, ISSN: 0018-9340
Pnevmatikatos D, Becker T, Brokalakis A, et al., 2012, FASTER: Facilitating analysis and synthesis technologies for effective reconfiguration, Pages: 234-241
The FASTER project aims to ease the definition, implementation and use of dynamically changing hardware systems. Our motivation stems from the promise reconfigurable systems hold for achieving better performance and extending product functionality and lifetime via the addition of new features that work at hardware speed. This is a clear advantage over the more straightforward software component adaptivity. However, designing a changing hardware system is both challenging and time consuming. The FASTER project will facilitate the use of reconfigurable technology by providing a complete methodology that enables designers to easily specify, analyse, implement and verify applications on platforms with general-purpose processors and acceleration modules implemented in the latest reconfigurable technology. To better adapt to different application requirements, the tool-chain will support both region-based and micro-reconfiguration and provide a flexible run-time system that will efficiently manage the reconfigurable resources. We will use applications from the embedded, high performance computing, and desktop domains to demonstrate the potential benefits of the FASTER tools on metrics such as performance, power consumption and total ownership cost. © 2012 IEEE.
Cheung K, Schultz SR, Luk W, 2012, A large-scale spiking neural network accelerator for FPGA systems, Pages: 113-120, ISSN: 0302-9743
Spiking neural networks (SNN) aim to mimic membrane potential dynamics of biological neurons. They have been used widely in neuromorphic applications and neuroscience modeling studies. We design a parallel SNN accelerator for producing large-scale cortical simulation targeting an off-the-shelf Field-Programmable Gate Array (FPGA)-based system. The accelerator parallelizes synaptic processing with run time proportional to the firing rate of the network. Using only one FPGA, this accelerator is estimated to support simulation of 64K neurons 2.5 times real-time, and achieves a spike delivery rate which is at least 1.4 times faster than a recent GPU accelerator with a benchmark toroidal network. © 2012 Springer-Verlag.