Ahmadi N, Constandinou T, Bouganis C, 2019, Decoding Hand Kinematics from Local Field Potentials Using Long Short-Term Memory (LSTM) Network, 2019 9th International IEEE/EMBS Conference on Neural Engineering (NER 2019), Pages: 1-5
Local field potential (LFP) has gained increasing interest as an alternative input signal for brain-machine interfaces (BMIs) due to its informative features, long-term stability, and low frequency content. However, despite these interesting properties, LFP-based BMIs have been reported to yield low decoding performances compared to spike-based BMIs. In this paper, we propose a new decoder based on long short-term memory (LSTM) network which aims to improve the decoding performance of LFP-based BMIs. We compare offline decoding performance of the proposed LSTM decoder to a commonly used Kalman filter (KF) decoder on hand kinematics prediction tasks from multichannel LFPs. We also benchmark the performance of LFP-driven LSTM decoder against KF decoder driven by two types of spike signals: single-unit activity (SUA) and multi-unit activity (MUA). Our results show that LFP-driven LSTM decoder achieves significantly better decoding performance than LFP-, SUA-, and MUA-driven KF decoders. This suggests that LFPs coupled with LSTM decoder could provide high decoding performance, robust, and low power BMIs.
Ahmadi N, Cavuto M, Feng P, et al., 2019, Towards a Distributed, Chronically-Implantable Neural Interface, IEEE/EMBS Conference on Neural Engineering (NER), Pages: 1-6
We present a platform technology encompassing a family of innovations that together aim to tackle key challenges with existing implantable brain machine interfaces. The ENGINI (Empowering Next Generation Implantable Neural Interfaces) platform utilizes a 3-tier network (external processor, cranial transponder, intracortical probes) to inductively couple power to, and communicate data from, a distributed array of freely-floating mm-scale probes. Novel features integrated into each probe include: (1) an array of niobium microwires for observing local field potentials (LFPs) along the cortical column; (2) ultra-low power instrumentation for signal acquisition and data reduction; (3) an autonomous, self-calibrating wireless transceiver for receiving power and transmitting data; and (4) a hermetically-sealed micropackage suitable for chronic use. We are additionally engineering a surgical tool, to facilitate manual and robot-assisted insertion, within a streamlined neurosurgical workflow. Ongoing work is focused on system integration and preclinical testing.
Venieris S, Bouganis C-S, 2019, fpgaConvNet: Mapping Regular and Irregular Convolutional Neural Networks on FPGAs, IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, Vol: 30, Pages: 326-342, ISSN: 2162-237X
Boikos K, Bouganis C-S, A Scalable FPGA-based Architecture for Depth Estimation in SLAM
The current state of the art of Simultaneous Localisation and Mapping, or SLAM, on low power embedded systems is about sparse localisation and mapping with low resolution results in the name of efficiency. Meanwhile, research in this field has provided many advances for information rich processing and semantic understanding, combined with high computational requirements for real-time processing. This work provides a solution to bridging this gap, in the form of a scalable SLAM-specific architecture for depth estimation for direct semi-dense SLAM. Targeting an off-the-shelf FPGA-SoC this accelerator architecture achieves a rate of more than 60 mapped frames/sec at a resolution of 640x480 achieving performance on par to a highly-optimised parallel implementation on a high-end desktop CPU with an order of magnitude improved power consumption. Furthermore, the developed architecture is combined with our previous work for the task of tracking, to form the first complete accelerator for semi-dense SLAM on FPGAs, establishing the state of the art in the area of embedded low-power systems.
Kyrkou C, Theocharides T, Bouganis C-S, et al., 2018, Boosting the hardware-efficiency of cascade support vector machines for embedded classification applications, International Journal of Parallel Programming, Vol: 46, Pages: 1220-1246, ISSN: 0885-7458
Support Vector Machines (SVMs) are considered a state-of-the-art classification algorithm capable of high accuracy rates for a diverse range of applications. When arranged in a cascade structure, SVMs can efficiently handle problems where the majority of data belongs to one of the two classes, such as image object classification, and hence can provide speedups over monolithic (single) SVM classifiers. However, the SVM classification process is still computationally demanding due to the number of support vectors. Consequently, in this paper we propose a hardware architecture optimized for cascaded SVM processing to boost performance and hardware efficiency, along with a hardware reduction method to reduce the overheads from the implementation of additional stages in the cascade, leading to significant resource and power savings. The architecture was evaluated for the application of object detection on 800×600 resolution images on a Spartan 6 Industrial Video Processing FPGA platform achieving over 30 frames-per-second. Moreover, by utilizing the proposed hardware reduction method we were able to reduce the utilization of FPGA custom-logic resources by ∼30%, and simultaneously observed ∼20% peak power reduction compared to a baseline implementation.
Ahmadi N, Constandinou TG, Bouganis C-S, 2018, Estimation of neuronal firing rate using Bayesian Adaptive Kernel Smoother (BAKS), PLOS ONE, Vol: 13, ISSN: 1932-6203
Kouris A, Venieris SI, Bouganis C-S, 2018, Cascade^CNN: Pushing the Performance Limits of Quantisation in Convolutional Neural Networks, 2018 28th International Conference on Field Programmable Logic and Applications (FPL), Publisher: IEEE
Ahmadi N, Constandinou TG, Bouganis C-S, 2018, Spike Rate Estimation Using Bayesian Adaptive Kernel Smoother (BAKS) and Its Application to Brain Machine Interfaces, Pages: 2547-2550, ISSN: 1557-170X
Brain Machine Interfaces (BMIs) mostly utilise spike rate as an input feature for decoding a desired motor output, as it conveys a useful measure of the underlying neuronal activity. The spike rate is typically estimated using a non-overlapping binning method that yields a coarse estimate. There exist several methods that can produce a smooth estimate which could potentially improve the decoding performance. However, these methods are relatively computationally heavy for real-time BMIs. To address this issue, we propose a new method for estimating spike rate that is able to yield a smooth estimate and is also amenable to real-time BMIs. The proposed method, referred to as Bayesian adaptive kernel smoother (BAKS), employs a kernel smoothing technique that considers the bandwidth as a random variable with a prior distribution which is adaptively updated through a Bayesian framework. With appropriate selection of prior distribution and kernel function, an analytical expression can be achieved for the kernel bandwidth. We apply BAKS and evaluate its impact on offline BMI decoding performance using a Kalman filter. The results reveal that BAKS can improve the decoding performance compared to the binning method. This suggests the feasibility and the potential use of BAKS for real-time BMIs.
Venieris SI, Kouris A, Bouganis C-S, 2018, Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions, ACM Comput. Surv., Vol: 51, Pages: 56:1-56:1, ISSN: 0360-0300
In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.
Vasileiadis M, Malassiotis S, Giakoumis D, et al., 2018, Robust Human Pose Tracking For Realistic Service Robot Applications, 16th IEEE International Conference on Computer Vision (ICCV), Publisher: IEEE, Pages: 1363-1372, ISSN: 2473-9936
Robust human pose estimation and tracking plays an integral role in assistive service robot applications, as it provides information regarding the body pose and motion of the user in a scene. Even though current solutions provide high-accuracy results in controlled environments, they fail to successfully deal with problems encountered under real-life situations such as tracking initialization and failure, body part intersection, large object handling and partial-view body-part tracking. This paper presents a framework tailored for deployment under real-life situations addressing the above limitations. The framework is based on the articulated 3D-SDF data representation model, and has been extended with complementary mechanisms for addressing the above challenges. Extensive evaluation on public datasets demonstrates the framework's state-of-the-art performance, while experimental results on a challenging realistic human motion dataset exhibit its robustness in real life scenarios.
Kouris A, Bouganis C-S, 2018, Learning to Fly by MySelf: A Self-Supervised CNN-based Approach for Autonomous Navigation, 25th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 5216-5223, ISSN: 2153-0858
Kyrkou C, Plastiras G, Theocharides T, et al., 2018, DroNet: efficient convolutional neural network detector for real-time UAV applications, Design, Automation and Test in Europe Conference and Exhibition (DATE), Publisher: IEEE, Pages: 967-972, ISSN: 1530-1591
Unmanned Aerial Vehicles (drones) are emerging as a promising technology for both environmental and infrastructure monitoring, with broad use in a plethora of applications. Many such applications require the use of computer vision algorithms in order to analyse the information captured from an on-board camera. Such applications include detecting vehicles for emergency response and traffic monitoring. This paper therefore explores the trade-offs involved in the development of a single-shot object detector based on deep convolutional neural networks (CNNs) that can enable UAVs to perform vehicle detection in a resource-constrained environment. The paper presents a holistic approach for designing such systems: the data collection and training stages, the CNN architecture, and the optimizations necessary to efficiently map such a CNN on a lightweight embedded processing platform suitable for deployment on UAVs. Through the analysis we propose a CNN architecture that is capable of detecting vehicles from aerial UAV images and can operate at 5-18 frames-per-second on a variety of platforms with an overall accuracy of ~95%. Overall, the proposed architecture is suitable for UAV applications, utilizing low-power embedded processors that can be deployed on commercial UAVs.
Shafique M, Theocharides T, Bouganis C-S, et al., 2018, An overview of next-generation architectures for machine learning: Roadmap, opportunities and challenges in the IoT era, Publisher: IEEE, Pages: 827-832
Rizakis M, Venieris SI, Kouris A, et al., 2018, Approximate FPGA-Based LSTMs Under Computation Time Constraints, Publisher: Springer, Pages: 3-15
Kouris A, Venieris SI, Bouganis C-S, 2018, CascadeCNN: Pushing the performance limits of quantisation
Venieris SI, Bouganis C-S, 2018, f-CNNx: A Toolflow for Mapping Multiple Convolutional Neural Networks on FPGAs
Venieris SI, Kouris A, Bouganis C-S, 2018, Deploying Deep Neural Networks in the Embedded Space
de Souza Rosa L, Bouganis C-S, Bonato V, 2018, Scaling Up Modulo Scheduling For High-Level Synthesis, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Pages: 1-1, ISSN: 0278-0070
Venieris SI, Bouganis C-S, fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs, Conference on Neural Information Processing Systems
In recent years, Convolutional Neural Networks (ConvNets) have become an enabling technology for a wide range of novel embedded Artificial Intelligence systems. Across the range of applications, the performance needs vary significantly, from high-throughput video surveillance to the very low-latency requirements of autonomous cars. In this context, FPGAs can provide a potential platform that can be optimally configured based on the different performance needs. However, the complexity of ConvNet models keeps increasing making their mapping to an FPGA device a challenging task. This work presents fpgaConvNet, an end-to-end framework for mapping ConvNets on FPGAs. The proposed framework employs an automated design methodology based on the Synchronous Dataflow (SDF) paradigm and defines a set of SDF transformations in order to efficiently explore the architectural design space. By selectively optimising for throughput, latency or multiobjective criteria, the presented tool is able to efficiently explore the design space and generate hardware designs from high-level ConvNet specifications, explicitly optimised for the performance metric of interest. Overall, our framework yields designs that improve the performance by up to 6.65x over highly optimised embedded GPU designs for the same power constraints in embedded environments.
Bouganis C-S, Gorgon M, Bonato V, 2017, Special issue on applied reconfigurable computing, MICROPROCESSORS AND MICROSYSTEMS, Vol: 52, Pages: 1-1, ISSN: 0141-9331
Liu S, Mingas G, Bouganis C-S, 2017, An Unbiased MCMC FPGA-Based Accelerator in the Land of Custom Precision Arithmetic, IEEE TRANSACTIONS ON COMPUTERS, Vol: 66, Pages: 745-758, ISSN: 0018-9340
Mingas G, Bottolo L, Bouganis C-S, 2017, Particle MCMC algorithms and architectures for accelerating inference in state-space models, INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, Vol: 83, Pages: 413-433, ISSN: 0888-613X
Vavouras M, Duarte RP, Armato A, et al., 2017, A Hybrid ASIC/FPGA Fault-Tolerant Artificial Pancreas, International Conference on Embedded Computer Systems - Architectures, Modeling and Simulation (SAMOS), Publisher: IEEE, Pages: 261-267
Venieris SI, Bouganis C-S, 2017, Latency-Driven Design for FPGA-based Convolutional Neural Networks, 27th International Conference on Field Programmable Logic and Applications (FPL), Publisher: IEEE, ISSN: 1946-1488
Liu S, Bouganis C-S, 2017, Communication-Aware MCMC Method for Big Data Applications on FPGAs, 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), Publisher: IEEE, Pages: 9-16
Boikos K, Bouganis C-S, 2017, A high-performance system-on-chip architecture for direct tracking for SLAM, Publisher: IEEE, Pages: 1-7
Venieris SI, Bouganis C-S, 2017, fpgaConvNet: Automated Mapping of Convolutional Neural Networks on FPGAs (Abstract Only), International Symposium on Field-Programmable Gate Arrays, Publisher: ACM, Pages: 291-292
Rabieah MB, Bouganis C-S, 2016, FPGASVM: A Framework for Accelerating Kernelized Support Vector Machine, BigMine-2016, Publisher: JMLR.org, Pages: 68-84
Liu J, Bouganis C, Cheung PYK, 2016, Context-based image acquisition from memory in digital systems, Journal of Real-Time Image Processing, ISSN: 1861-8200
Mingas G, Bouganis C-S, 2016, Population-Based MCMC on Multi-Core CPUs, GPUs and FPGAs, IEEE TRANSACTIONS ON COMPUTERS, Vol: 65, Pages: 1283-1296, ISSN: 0018-9340