Imperial College London

Professor Peter Y. K. Cheung

Faculty of Engineering, Dyson School of Design Engineering

Head of the Dyson School of Design Engineering
 
 
 

Contact

 

+44 (0)20 7594 6200 | p.cheung | Website

 
 

Assistant

 

Mrs Wiesia Hsissen +44 (0)20 7594 6261

 

Location

 

910B, Electrical Engineering, South Kensington Campus


Publications


340 results found

Liu J, Bouganis C, Cheung PK, 2013, Domain-specific Progressive Sampling of Face Images, IEEE Global Conference on Signal and Information Processing (GlobalSIP), Publisher: IEEE, Pages: 1021-1024, ISSN: 2376-4066

Conference paper

Chau TCP, Kwok K-W, Chow GCT, Tsoi KH, Lee K-H, Tse Z, Cheung PYK, Luk W et al., 2013, Acceleration of Real-time Proximity Query for Dynamic Active Constraints, 12th International Conference on Field-Programmable Technology (FPT), Publisher: IEEE, Pages: 206-213

Conference paper

Chau TCP, Niu X, Eele A, Luk W, Cheung PYK, Maciejowski J et al., 2013, Heterogeneous Reconfigurable System for Adaptive Particle Filters in Real-Time Applications, 9th International Applied Reconfigurable Computing Symposium (ARC), Publisher: SPRINGER-VERLAG BERLIN, Pages: 1-12, ISSN: 0302-9743

Conference paper

Chau TCP, Luk W, Cheung PYK, Eele A, Maciejowski J et al., 2012, Adaptive sequential Monte Carlo approach for real-time applications, Pages: 527-530

This paper presents an adaptive Sequential Monte Carlo approach for real-time applications. The Sequential Monte Carlo method is employed to estimate the states of dynamic systems using weighted particles. The proposed approach reduces the run-time computational complexity by adapting the size of the particle set. Multiple processing elements on FPGAs are dynamically allocated for improved energy efficiency without violating real-time constraints. A robot localisation application is developed based on the proposed approach. Compared to a non-adaptive implementation, the dynamic energy consumption is reduced by up to 70% without affecting the quality of solutions. © 2012 IEEE.

Conference paper

Levine JM, Stott E, Constantinides GA, Cheung PYK et al., 2012, Online Measurement of Timing in Circuits: for Health Monitoring and Dynamic Voltage & Frequency Scaling, 20th IEEE International Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 109-116

Conference paper

Cheung PYK, 2011, Introduction to Special Section FPGA 2009, ACM Transactions on Reconfigurable Technology and Systems, Vol: 4, ISSN: 1936-7406

Journal article

Cope B, Cheung PYK, Luk W, Howes L et al., 2011, A systematic design space exploration approach to customising multi-processor architectures: Exemplified using graphics processors, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol: 6760 LNCS, Pages: 63-83, ISSN: 0302-9743

A systematic approach to customising Homogeneous Multi-Processor (HoMP) architectures is described. The approach involves a novel design space exploration tool and a parameterisable system model. Post-fabrication customisation options for using reconfigurable logic with a HoMP are classified. The adoption of the approach in exploring pre- and post-fabrication customisation options to optimise an architecture's critical paths is then described. The approach and steps are demonstrated using the architecture of a graphics processor. We also analyse on-chip and off-chip memory access for systems with one or more processing elements (PEs), and study the impact of the number of threads per PE on the amount of off-chip memory access and the number of cycles for each output. It is shown that post-fabrication customisation of a graphics processor can provide up to four times performance improvement for negligible area cost. © 2011 Springer-Verlag Berlin Heidelberg.

Journal article

Angelopoulou M, Bouganis C-S, Cheung PYK, 2011, Blur Identification with Assumption Validation for Sensor-based Video Reconstruction and its Implementation on FPGA, IET Computers & Digital Techniques

Journal article

Cheung PYK, Sedcole NP, Wong JS, 2011, Method of Measuring Delay in An Integrated Circuit, US 0095768 A1

A method of measuring signal delay in an integrated circuit comprising applying a common clock signal at a circuit input and output, applying a test signal at the circuit input, detecting a corresponding output signal at the circuit output and detecting whether the test signal and output signal occur in a common part of the clock signal.

Patent

Wong JSJ, Cheung PYK, 2011, Improved Delay Measurement Method in FPGA based on Transition Probability, 19th Annual ACM International Symposium on Field-Programmable Gate Arrays, Publisher: ASSOC COMPUTING MACHINERY, Pages: 163-172

Conference paper

Mak T, Cheung PYK, Lam KP, Luk W et al., 2011, Adaptive routing in network-on-chips using a dynamic-programming network, IEEE Transactions on Industrial Electronics, Vol: 58, Pages: 3701-3716

Journal article

Levine JM, Stott EA, Constantinides GA, Cheung PYK et al., 2011, Health monitoring of live circuits in FPGAs based on time delay measurement (abstract only), Pages: 284-284

Conference paper

Lysaght P, Luk W, Cheung PYK, 2010, Proceedings - 2010 International Conference on Field Programmable Logic and Applications, FPL 2010: Message from the steering committee, Proceedings - 2010 International Conference on Field Programmable Logic and Applications, FPL 2010

Journal article

Wu Y, Kuvinichkul P, Cheung PYK, Demiris Y et al., 2010, Towards Anthropomorphic Robot Thereminist, International Conference on Robotics and Biomimetics (ROBIO), Publisher: IEEE, Pages: 235-240

The Theremin is an electronic musical instrument often considered the most difficult to play: it demands high precision and stability from the player's hands, because any change of position near the instrument's antennae can alter the pitch or volume. In contrast to previous Theremin-playing robots, we propose a Humanoid Thereminist System that goes beyond a single degree of freedom, opening up the possibility for a robot to acquire more complex skills, such as aerial fingering, and to include musical expression when playing the Theremin. The proposed system consists of two phases, a calibration phase and a playing phase, which can be executed independently. During the playing phase, the system takes input from a MIDI file and performs path planning using a combination of a minimum-energy strategy in joint space and feedback error correction for the next note to be played. Three experiments have been conducted to evaluate the developed system quantitatively and qualitatively by playing a selection of music files. The experiments demonstrate that the proposed system can effectively utilise multiple degrees of freedom while maintaining minimum pitch error margins.

Conference paper

Smith AM, Constantinides GA, Cheung PYK, 2010, FPGA Architecture Optimization Using Geometric Programming, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol: 29, Pages: 1163-1176, ISSN: 0278-0070

This paper is concerned with the application of geometric programming to the design of homogeneous field programmable gate array (FPGA) architectures. The paper builds on an increasing body of work concerned with modeling reconfigurable architectures, and presents a full area and delay model of an FPGA. We use a geometric programming framework to show how transistor sizing and high-level architecture parameter selection can now be solved as a concurrent optimization problem. We validate the model through the use of simulation program with integrated circuit emphasis (SPICE) models and the versatile place and route (VPR) FPGA architecture simulation tool. Not only does the optimization framework allow architectures to be optimized orders of magnitude faster than previous work, but the combined optimization can lead to different architectural conclusions compared to conventional methods by exploring the coupling between the two sets of optimization variables. Specifically, we show that as delay takes more significance in the objective of the optimization, there should be more lookup tables in a logic block, whereas conventional techniques suggest that there should be fewer lookup tables in an FPGA logic block.

Journal article

Kahoul A, Smith AM, Constantinides GA, Cheung PYK et al., 2010, Efficient Heterogeneous Architecture Floorplan Optimization using Analytical Methods, ACM Transactions on Reconfigurable Technology and Systems

Journal article

Smith AM, Constantinides GA, Cheung PYK, 2010, An Automated Flow for Arithmetic Component Generation in Field-Programmable Gate Arrays, ACM Transactions on Reconfigurable Technology and Systems

Journal article

Angelopoulou M, Bouganis C-S, Cheung PYK, 2010, A sensor-based approach to linear blur identification for real-time video enhancement

Conference paper

Cope B, Cheung PYK, Luk W, Howes L et al., 2010, Performance comparison of graphics processors to reconfigurable logic: a case study, IEEE Transactions on Computers, Vol: 54, Pages: 433-448, ISSN: 0018-9340

A systematic approach to the comparison of the graphics processor (GPU) and reconfigurable logic is defined in terms of three throughput drivers. The approach is applied to five case study algorithms, characterized by their arithmetic complexity, memory access requirements, and data dependence, and two target devices: the nVidia GeForce 7900 GTX GPU and a Xilinx Virtex-4 field programmable gate array (FPGA). Two orders of magnitude speedup, over a general-purpose processor, is observed for each device for arithmetic intensive algorithms. An FPGA is superior, over a GPU, for algorithms requiring large numbers of regular memory accesses, while the GPU is superior for algorithms with variable data reuse. In the presence of data dependence, the implementation of a customized data path in an FPGA exceeds GPU performance by up to eight times. The trends of the analysis to newer and future technologies are analyzed.

Journal article

Bouganis C, Pournara I, Cheung PYK, 2010, Exploration of Heterogeneous FPGAs for Mapping Linear Projection Designs, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol: 18

Journal article

Jamieson P, Becker T, Cheung PYK, Luk W, Rissa T, Pitkanen T et al., 2010, Benchmarking and evaluating reconfigurable architectures targeting the mobile domain, ACM Transactions on Design Automation of Electronic Systems (TODAES), Vol: 15

We present the GroundHog 2009 benchmarking suite that evaluates the power consumption of reconfigurable technology for applications targeting the mobile computing domain. This benchmark suite includes seven designs; one design targets fine-grained FPGA fabrics allowing for quick state-of-the-art evaluation, and six designs are specified at a high level allowing them to target a range of existing and future reconfigurable technologies. Each of the six designs can be stimulated with the help of synthetically generated input stimuli created by an open-source tool included in the downloadable suite. Another tool is included to help verify the correctness of each implemented design. To demonstrate the potential of this benchmark suite, we evaluate the power consumption of two modern industrial FPGAs targeting the mobile domain. Also, we show how an academic FPGA framework, VPR 5.0, that has been updated for power estimates can be used to estimate the power consumption of different FPGA architectures and an open-source CAD flow mapping to these architectures.

Journal article

Mak T, Sedcole P, Cheung PYK, Luk W et al., 2010, Wave-pipelined intra-chip signaling for on-FPGA communications, Integration, the VLSI Journal, Vol: 43, Pages: 188-201

On-FPGA communication is becoming more problematic as the long interconnection performance is deteriorating in technology scaling. In this paper, we address this issue by proposing a novel wave-pipelined signaling scheme to achieve substantial throughput improvement in FPGAs. A new analytical model capturing the electrical characteristics in FPGA interconnects is presented. Based on the model, throughput and power consumption of a wave-pipelined link have been derived analytically and compared to the conventional synchronous links. Two circuit designs are proposed to realize wave-pipelined link using FPGA fabrics. The proposed approaches are also compared with conventional synchronous and asynchronous pipelining techniques. It is shown that the wave-pipelined approach can achieve up to 5.7 times improvement in throughput and 13% improvement in power consumption versus conventional delay-based on-chip communication schemes. Also, trade-offs between power, throughput and area consumption between the proposed and conventional designs are studied. The wave-pipelining approach provides a new alternative for on-FPGA communications and can potentially become a promising solution to mitigate the future interconnect scaling challenge.

Journal article

Lopez S, Sarmiento R, Potter PG, Luk W, Cheung PYK et al., 2010, Exploration of Hardware Sharing for Image Encoders, Design, Automation and Test in Europe Conference and Exhibition (DATE), Publisher: IEEE, Pages: 1737-1742, ISSN: 1530-1591

Conference paper

Becker T, Luk W, Cheung PYK, 2010, Energy-aware optimisation for run-time reconfiguration, 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE COMPUTER SOC, Pages: 55-62

Conference paper

Jones DH, Powell A, Bouganis C-S, Cheung PYK et al., 2010, A Salient Region Detector for GPU Using a Cellular Automata Architecture, 17th International Conference on Neural Information Processing, Publisher: SPRINGER-VERLAG BERLIN, Pages: 501-508, ISSN: 0302-9743

Conference paper

Becker T, Jamieson P, Luk W, Cheung PYK, Rissa T et al., 2010, Power characterisation for fine-grain reconfigurable fabrics, International Journal of Reconfigurable Computing, Vol: 2010

This paper proposes a benchmarking methodology for characterising the power consumption of the fine-grain fabric in reconfigurable architectures. This methodology is part of the GroundHog 2009 power benchmarking suite. It covers active and inactive power as well as advanced low-power modes. A method based on random number generators is adopted for comparing activity modes. We illustrate our approach using five field-programmable gate arrays (FPGAs) that span a range of process technologies: Xilinx Virtex-II Pro, Spartan-3E, Spartan-3AN, Virtex-5, and Silicon Blue iCE65. We find that, despite improvements through process technology and low-power modes, current devices need further improvements to be sufficiently power efficient for mobile applications. The Silicon Blue device demonstrates that performance can be traded off to achieve lower leakage.

Journal article

Stott EA, Wong JS, Sedcole NP, Cheung PYK et al., 2010, Degradation in FPGAs: measurement and modelling, International symposium on field programmable gate arrays, Pages: 229-238

Conference paper

Stott EA, Sedcole NP, Cheung PYK, 2010, Fault tolerance and reliability in field-programmable gate arrays, Pages: 196-210

Conference paper

Mak T, Cheung PYK, Luk W, Lam KP et al., 2009, A DP-network for optimal dynamic routing in network-on-chip, Pages: 119-127

Dynamic routing is desirable because of its substantial improvement in communication bandwidth and intelligent adaptation to faulty links and congested traffic. However, implementation of adaptive routing in a network-on-chip (NoC) system is not trivial and is further complicated by the requirements of deadlock freedom and real-time optimal decision making. In this paper, we present a deadlock-free routing architecture which employs a dynamic programming (DP) network to provide on-the-fly optimal path planning and network monitoring for packet switching. Also, a new routing strategy called k-step look ahead is introduced. This new strategy can substantially reduce the size of the routing table while maintaining a high quality of adaptation, which leads to a scalable dynamic routing solution with minimal hardware overhead. Our results based on a cycle-accurate simulator demonstrate the effectiveness of the DP-network, which outperforms both the deterministic and adaptive routing algorithms in average delay on various traffic scenarios by 22.3%. Moreover, the hardware overhead for the DP-network is insignificant based on the results obtained from the hardware implementations. Copyright 2009 ACM.

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
