Publications

Journal article

Kormushev P, Ugurlu B, Caldwell DG, Tsagarakis NG, et al., 2019,

Learning to exploit passive compliance for energy-efficient gait generation on a compliant humanoid

, Autonomous Robots, Vol: 43, Pages: 79-95, ISSN: 0929-5593

Modern humanoid robots include not only active compliance but also passive compliance. Apart from improved safety and dependability, availability of passive elements, such as springs, opens up new possibilities for improving the energy efficiency. With this in mind, this paper addresses the challenging open problem of exploiting the passive compliance for the purpose of energy efficient humanoid walking. To this end, we develop a method comprising two parts: an optimization part that finds an optimal vertical center-of-mass trajectory, and a walking pattern generator part that uses this trajectory to produce a dynamically-balanced gait. For the optimization part, we propose a reinforcement learning approach that dynamically evolves the policy parametrization during the learning process. By gradually increasing the representational power of the policy parametrization, it manages to find better policies in a faster and computationally efficient way. For the walking generator part, we develop a variable-center-of-mass-height ZMP-based bipedal walking pattern generator. The method is tested in real-world experiments with the bipedal robot COMAN and achieves a significant 18% reduction in the electric energy consumption by learning to efficiently use the passive compliance of the robot.

Conference paper

Wang M-Y, Kogkas AA, Darzi A, Mylonas GP, et al., 2019,

Free-view, 3D gaze-guided, assistive robotic system for activities of daily living

, 25th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 2355-2361, ISSN: 2153-0858

Patients suffering from quadriplegia have limited body motion which prevents them from performing daily activities. We have developed an assistive robotic system with an intuitive free-view gaze interface. The user's point of regard is estimated in 3D space while allowing free head movement and is combined with object recognition and trajectory planning. This framework allows the user to interact with objects using fixations. Two operational modes have been implemented to cater for different eventualities. The automatic mode performs a pre-defined task associated with a gaze-selected object, while the manual mode allows gaze control of the robot's end-effector position on the user's frame of reference. User studies reported effortless operation in automatic mode. A manual pick and place task achieved a success rate of 100% on the users' first attempt.

Conference paper

Zolotas M, Elsdon J, Demiris Y, 2019,

Head-mounted augmented reality for explainable robotic wheelchair assistance

, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, ISSN: 2153-0866

Robotic wheelchairs with built-in assistive features, such as shared control, are an emerging means of providing independent mobility to severely disabled individuals. However, patients often struggle to build a mental model of their wheelchair’s behaviour under different environmental conditions. Motivated by the desire to help users bridge this gap in perception, we propose a novel augmented reality system using a Microsoft Hololens as a head-mounted aid for wheelchair navigation. The system displays visual feedback to the wearer as a way of explaining the underlying dynamics of the wheelchair’s shared controller and its predicted future states. To investigate the influence of different interface design options, a pilot study was also conducted. We evaluated the acceptance rate and learning curve of an immersive wheelchair training regime, revealing preliminary insights into the potential beneficial and adverse nature of different augmented reality cues for assistive navigation. In particular, we demonstrate that care should be taken in the presentation of information, with effort-reducing cues for augmented information acquisition (for example, a rear-view display) being the most appreciated.

Conference paper

Wang R, Amadori P, Demiris Y, 2019,

Real-time workload classification during driving using hyperNetworks

, International Conference on Intelligent Robots and Systems (IROS 2018), Publisher: IEEE, ISSN: 2153-0866

Classifying human cognitive states from behavioral and physiological signals is a challenging problem with important applications in robotics. The problem is challenging due to the data variability among individual users, and sensor artifacts. In this work, we propose an end-to-end framework for real-time cognitive workload classification with mixture Hyper Long Short Term Memory Networks (m-HyperLSTM), a novel variant of HyperNetworks. Evaluating the proposed approach on an eye-gaze pattern dataset collected from simulated driving scenarios of different cognitive demands, we show that the proposed framework outperforms previous baseline methods and achieves 83.9% precision and 87.8% recall during test. We also demonstrate the merit of our proposed architecture by showing improved performance over other LSTM-based methods.

Conference paper

Vrielink TJCO, Puyal JG-B, Kogkas A, Darzi A, Mylonas G, et al., 2019,

Intuitive gaze-control of a robotized flexible endoscope

, 25th IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 1776-1782, ISSN: 2153-0858

Flexible endoscopy is a routinely performed procedure that has predominantly remained unchanged for decades despite its many challenges. This paper introduces a novel, more intuitive and ergonomic platform that can be used with any flexible endoscope, allowing easier navigation and manipulation. A standard endoscope is robotized and a gaze control system based on eye-tracking is developed and implemented, allowing hands-free manipulation. The system characteristics and step response have been evaluated using visual servoing. Further, the robotized system has been compared with a manually controlled endoscope during a user study. The users (n=11) showed a preference for the gaze-controlled endoscope and a lower task load when the task was performed with the gaze control. In addition, gaze control was related to a higher success rate and a lower time to perform the task. The results presented validate the system's technical performance and demonstrate the intuitiveness of hands-free gaze control in flexible endoscopy.

Conference paper

Zhao M, Oude Vrielink J, Kogkas A, Mylonas G, Elson D, et al., 2019,

Prototype Designs of a Cable-driven Parallel Robot for Transoral Laser Surgery

, Hamlyn Symposium on Medical Robotics


Book chapter

Di Veroli C, Le CA, Lemaire T, Makabu E, Nur A, Ooi V, Park JY, Sanna F, Chacon R, Demiris Y, et al., 2019,

LibRob: An autonomous assistive librarian

, Pages: 15-26, ISBN: 9783030253318

This study explores how new robotic systems can help library users efficiently locate the book they require. A survey conducted among Imperial College students has shown an absence of a time-efficient and organised method to find the books they are looking for in the college library. The solution implemented, LibRob, is an automated assistive robot that gives guidance to the users in finding the book they are searching for in an interactive manner to deliver a more satisfactory experience. LibRob is able to process a search request either by speech or by text and return a list of relevant books by author, subject or title. Once the user selects the book of interest, LibRob guides them to the shelf containing the book, then returns to its base station on completion. Experimental results demonstrate that the robot reduces the time necessary to find a book by 47.4%, and left 80% of the users satisfied with their experience, proving that human-robot interactions can greatly improve the efficiency of basic activities within a library environment.


Journal article

Debrunner T, Saeedi Gharahbolagh S, Kelly P, 2019,

AUKE: Automatic Kernel Code Generation for an analogue SIMD Focal-plane Sensor-Processor Array

, ACM Transactions on Architecture and Code Optimization, Vol: 15, ISSN: 1544-3973

Focal-plane Sensor-Processor Arrays (FPSPs) are new imaging devices with parallel Single Instruction Multiple Data (SIMD) computational capabilities built into every pixel. Compared to traditional imaging devices, FPSPs allow for massive pixel-parallel execution of image processing algorithms. This enables the application of certain algorithms at extreme frame rates (>10,000 frames per second). By performing some early-stage processing in-situ, systems incorporating FPSPs can consume less power compared to conventional approaches using standard digital cameras. In this article, we explore code generation for an FPSP whose 256 × 256 processors operate on analogue signal data, leading to further opportunities for power reduction, as well as additional code synthesis challenges. While rudimentary image processing algorithms have been demonstrated on FPSPs before, progress with higher-level computer vision algorithms has been sparse due to the unique architecture and limits of the devices. This article presents a code generator for convolution filters for the SCAMP-5 FPSP, with applications in many high-level tasks such as convolutional neural networks, pose estimation, and so on. The SCAMP-5 FPSP has no effective multiply operator. Convolutions have to be implemented through sequences of more primitive operations such as additions, subtractions, and multiplications/divisions by two. We present a code generation algorithm to optimise convolutions by identifying common factors in the different weights and by determining an optimised pattern of pixel-to-pixel data movements to exploit them. We present evaluation in terms of both speed and energy consumption for a suite of well-known convolution filters. Furthermore, an application of the method is shown by the implementation of a Viola-Jones face detection algorithm.
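As a rough illustration of the constraint described in this abstract (no multiply operator, only additions, subtractions and halving), the sketch below applies a small kernel whose weights are integers over a power of two. It is not the AUKE code generator: the function names `halve`, `scale` and `convolve_1d` are hypothetical, the signal is one-dimensional, and the common-factor sharing and data-movement optimisation that the article describes are left out entirely.

```python
def halve(value, times):
    """Divide by two `times` times; stands in for a divide-by-two primitive."""
    for _ in range(times):
        value = value / 2
    return value


def scale(value, numerator, shift):
    """Return value * numerator / 2**shift using only add, subtract and halving."""
    # Validation only: keeps every halving count below non-negative.
    if abs(numerator) >= 2 ** (shift + 1):
        raise ValueError("weight magnitude must be below 2 for this sketch")
    magnitude = abs(numerator)
    total = 0.0
    bit = 0
    while magnitude:
        if magnitude & 1:
            # Set bit `bit` contributes value * 2**bit / 2**shift.
            total = total + halve(value, shift - bit)
        magnitude >>= 1
        bit += 1
    return total if numerator >= 0 else 0.0 - total


def convolve_1d(signal, numerators, shift):
    """Slide the kernel numerators / 2**shift over the signal (valid positions only).

    The kernel is not flipped, so this matches a convolution only for
    symmetric kernels.
    """
    width = len(numerators)
    out = []
    for start in range(len(signal) - width + 1):
        acc = 0.0
        for offset, numerator in enumerate(numerators):
            acc = acc + scale(signal[start + offset], numerator, shift)
        out.append(acc)
    return out


# Smoothing kernel [1, 2, 1] / 4 applied without any multiplication.
print(convolve_1d([0, 4, 8, 4, 0], [1, 2, 1], 2))  # [4.0, 6.0, 4.0]
```

Each weight is handled independently here, so work is repeated across weights; reusing shared partial results is the kind of saving the abstract attributes to identifying common factors.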

Book chapter

Kogkas A, Ezzat A, Thakkar R, Darzi A, Mylonas G, et al., 2019,

Free-View, 3D Gaze-Guided Robotic Scrub Nurse

, Editors: Shen, Liu, Peters, Staib, Essert, Zhou, Yap, Khan, Publisher: SPRINGER INTERNATIONAL PUBLISHING AG, Pages: 164-172, ISBN: 978-3-030-32253-3


Conference paper

Choi J, Chang HJ, Fischer T, Yun S, Lee K, Jeong J, Demiris Y, Choi JY, et al., 2018,

Context-aware deep feature compression for high-speed visual tracking

, IEEE Conference on Computer Vision and Pattern Recognition, Publisher: Institute of Electrical and Electronics Engineers, Pages: 479-488, ISSN: 1063-6919

We propose a new context-aware correlation filter based tracking framework to achieve both high computational speed and state-of-the-art performance among real-time trackers. The major contribution to the high computational speed lies in the proposed deep feature compression that is achieved by a context-aware scheme utilizing multiple expert auto-encoders; a context in our framework refers to the coarse category of the tracking target according to appearance patterns. In the pre-training phase, one expert auto-encoder is trained per category. In the tracking phase, the best expert auto-encoder is selected for a given target, and only this auto-encoder is used. To achieve high tracking performance with the compressed feature map, we introduce extrinsic denoising processes and a new orthogonality loss term for pre-training and fine-tuning of the expert auto-encoders. We validate the proposed context-aware framework through a number of experiments, where our method achieves a comparable performance to state-of-the-art trackers which cannot run in real-time, while running at a significantly fast speed of over 100 fps.

Journal article

Moulin-Frier C, Fischer T, Petit M, Pointeau G, Puigbo JY, Pattacini U, Low SC, Camilleri D, Nguyen P, Hoffmann M, Chang HJ, Zambelli M, Mealier AL, Damianou A, Metta G, Prescott TJ, Demiris Y, Dominey PF, Verschure PFMJ, et al., 2018,

DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self

, IEEE Transactions on Cognitive and Developmental Systems, Vol: 10, Pages: 1005-1022, ISSN: 2379-8920

This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.

Journal article

Chang HJ, Fischer T, Petit M, Zambelli M, Demiris Y, et al., 2018,

Learning kinematic structure correspondences using multi-order similarities

, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol: 40, Pages: 2920-2934, ISSN: 0162-8828

We present a novel framework for finding the kinematic structure correspondences between two articulated objects in videos via hypergraph matching. In contrast to appearance and graph alignment based matching methods, which have been applied among two similar static images, the proposed method finds correspondences between two dynamic kinematic structures of heterogeneous objects in videos. Thus our method allows matching the structure of objects which have similar topologies or motions, or a combination of the two. Our main contributions are summarised as follows: (i) casting the kinematic structure correspondence problem into a hypergraph matching problem by incorporating multi-order similarities with normalising weights, (ii) introducing a structural topology similarity measure by aggregating topology constrained subgraph isomorphisms, (iii) measuring kinematic correlations between pairwise nodes, and (iv) proposing a combinatorial local motion similarity measure using geodesic distance on the Riemannian manifold. We demonstrate the robustness and accuracy of our method through a number of experiments on synthetic and real data, showing that various other recent and state-of-the-art methods are outperformed. Our method is not limited to a specific application or sensor, and can be used as a building block in applications such as action recognition, human motion retargeting to robots, and articulated object manipulation.

Conference paper

Wang K, Shah A, Kormushev P, 2018,

SLIDER: A Bipedal Robot with Knee-less Legs and Vertical Hip Sliding Motion

, 21st International Conference on Climbing and Walking Robots and Support Technologies for Mobile Machines (CLAWAR 2018)

Journal article

Sarabia M, Young N, Canavan K, Edginton T, Demiris Y, Vizcaychipi MP, et al., 2018,

Assistive robotic technology to combat social isolation in acute hospital settings

, International Journal of Social Robotics, Vol: 10, Pages: 607-620, ISSN: 1875-4791

Social isolation in hospitals is a well-established risk factor for complications such as cognitive decline and depression. Assistive robotic technology has the potential to combat this problem, but first it is critical to investigate how hospital patients react to this technology. In order to address this question, we introduced a remotely operated NAO humanoid robot which conversed, made jokes, played music, danced and exercised with patients in a London hospital. In total, 49 patients aged between 18 and 100 took part in the study, 7 of whom had dementia. Our results show that a majority of patients enjoyed their interaction with NAO. We also found that age and dementia significantly affect the interaction, whereas gender does not. These results indicate that hospital patients enjoy socialising with robots, opening new avenues for future research into the potential health benefits of a social robotic companion.

Journal article

Saeedi Gharahbolagh S, Bodin B, Wagstaff H, Nisbet A, Nardi L, Mawer J, Melot N, Palomar O, Vespa E, Gorgovan C, Webb A, Clarkson J, Tomusk E, Debrunner T, Kaszyk K, Gonzalez P, Rodchenko A, Riley G, Kotselidis C, Franke B, OBoyle M, Davison A, Kelly P, Lujan M, Furber S, et al., 2018,

Navigating the landscape for real-time localisation and mapping for robotics, virtual and augmented reality

, Proceedings of the IEEE, Vol: 106, Pages: 2020-2039, ISSN: 0018-9219

Visual understanding of 3-D environments in real time, at low power, is a huge computational challenge. Often referred to as simultaneous localization and mapping (SLAM), it is central to applications spanning domestic and industrial robotics, autonomous vehicles, and virtual and augmented reality. This paper describes the results of a major research effort to assemble the algorithms, architectures, tools, and systems software needed to enable delivery of SLAM, by supporting applications specialists in selecting and configuring the appropriate algorithm and the appropriate hardware, and compilation pathway, to meet their performance, accuracy, and energy consumption goals. The major contributions we present are: 1) tools and methodology for systematic quantitative evaluation of SLAM algorithms; 2) automated, machine-learning-guided exploration of the algorithmic and implementation design space with respect to multiple objectives; 3) end-to-end simulation tools to enable optimization of heterogeneous, accelerated architectures for the specific algorithmic requirements of the various SLAM algorithmic approaches; and 4) tools for delivering, where appropriate, accelerated, adaptive SLAM solutions in a managed, JIT-compiled, adaptive runtime context.

Conference paper

Pardo F, Levdik V, Kormushev P, 2018,

Q-map: A convolutional approach for goal-oriented reinforcement learning.

Goal-oriented learning has become a core concept in reinforcement learning (RL), extending the reward signal as a sole way to define tasks. However, as parameterizing value functions with goals increases the learning complexity, efficiently reusing past experience to update estimates towards several goals at once becomes desirable but usually requires independent updates per goal. Considering that a significant number of RL environments can support spatial coordinates as goals, such as on-screen location of the character in ATARI or SNES games, we propose a novel goal-oriented agent called Q-map that utilizes an autoencoder-like neural network to predict the minimum number of steps towards each coordinate in a single forward pass. This architecture is similar to Horde with parameter sharing and allows the agent to discover correlations between visual patterns and navigation. For example learning how to use a ladder in a game could be transferred to other ladders later. We show how this network can be efficiently trained with a 3D variant of Q-learning to update the estimates towards all goals at once. While the Q-map agent could be used for a wide range of applications, we propose a novel exploration mechanism in place of epsilon-greedy that relies on goal selection at a desired distance followed by several steps taken towards it, allowing long and coherent exploratory steps in the environment. We demonstrate the accuracy and generalization qualities of the Q-map agent on a grid-world environment and then demonstrate the efficiency of the proposed exploration mechanism on the notoriously difficult Montezuma's Revenge and Super Mario All-Stars games.

Conference paper

Fischer T, Chang HJ, Demiris Y, 2018,

RT-GENE: Real-time eye gaze estimation in natural environments

, European Conference on Computer Vision, Publisher: Springer Verlag, Pages: 339-357, ISSN: 0302-9743

In this work, we consider the problem of robust gaze estimation in natural environments. Large camera-to-subject distances and high variations in head pose and eye gaze angles are common in such environments. This leads to two main shortfalls in state-of-the-art methods for gaze estimation: hindered ground truth gaze annotation and diminished gaze estimation accuracy as image resolution decreases with distance. We first record a novel dataset of varied gaze and head pose images in a natural environment, addressing the issue of ground truth annotation by measuring head pose using a motion capture system and eye gaze using mobile eyetracking glasses. We apply semantic image inpainting to the area covered by the glasses to bridge the gap between training and testing images by removing the obtrusiveness of the glasses. We also present a new real-time algorithm involving appearance-based deep convolutional neural networks with increased capacity to cope with the diverse images in the new dataset. Experiments with this network architecture are conducted on a number of diverse eye-gaze datasets including our own, and in cross dataset evaluations. We demonstrate state-of-the-art performance in terms of estimation accuracy in all experiments, and the architecture performs well even on lower resolution images.

Conference paper

Nguyen P, Fischer T, Chang HJ, Pattacini U, Metta G, Demiris Y, et al., 2018,

Transferring visuomotor learning from simulation to the real world for robotics manipulation tasks

, IEEE/RSJ International Conference on Intelligent Robots and Systems, Publisher: IEEE, Pages: 6667-6674, ISSN: 2153-0866

Hand-eye coordination is a requirement for many manipulation tasks including grasping and reaching. However, accurate hand-eye coordination has been shown to be especially difficult to achieve in complex robots like the iCub humanoid. In this work, we solve the hand-eye coordination task using a visuomotor deep neural network predictor that estimates the arm's joint configuration given a stereo image pair of the arm and the underlying head configuration. As there are various unavoidable sources of sensing error on the physical robot, we train the predictor on images obtained from simulation. The images from simulation were modified to look realistic using an image-to-image translation approach. In various experiments, we first show that the visuomotor predictor provides accurate joint estimates of the iCub's hand in simulation. We then show that the predictor can be used to obtain the systematic error of the robot's joint measurements on the physical iCub robot. We demonstrate that a calibrator can be designed to automatically compensate for this error. Finally, we validate that this enables accurate reaching of objects while circumventing manual fine-calibration of the robot.

Conference paper

Chacon Quesada R, Demiris Y, 2018,

Augmented reality control of smart wheelchair using eye-gaze–enabled selection of affordances

, https://www.idiap.ch/workshop/iros2018/files/, IROS 2018 Workshop on Robots for Assisted Living

In this paper we present a novel augmented reality head-mounted display user interface for controlling a robotic wheelchair for people with limited mobility. To lower the cognitive requirements needed to control the wheelchair, we propose integration of a smart wheelchair with an eye-tracking enabled head-mounted display. We propose a novel platform that integrates multiple user interface interaction methods for aiming at and selecting affordances derived by on-board perception capabilities such as laser-scanner readings and cameras. We demonstrate the effectiveness of the approach by evaluating our platform in two realistic scenarios: 1) Door detection, where the affordance corresponds to a Door object and the Go-Through action and 2) People detection, where the affordance corresponds to a Person and the Approach action. To the best of our knowledge, this is the first demonstration of an augmented reality head-mounted display user interface for controlling a smart wheelchair.

Conference paper

Saputra RP, Kormushev P, 2018,

Casualty detection from 3D point cloud data for autonomous ground mobile rescue robots

, SSRR 2018, Publisher: IEEE

One of the most important features of mobile rescue robots is the ability to autonomously detect casualties, i.e. human bodies, which are usually lying on the ground. This paper proposes a novel method for autonomously detecting casualties lying on the ground using obtained 3D point-cloud data from an on-board sensor, such as an RGB-D camera or a 3D LIDAR, on a mobile rescue robot. In this method, the obtained 3D point-cloud data is projected onto the detected ground plane, i.e. floor, within the point cloud. Then, this projected point cloud is converted into a grid-map that is used afterwards as an input for the algorithm to detect human body shapes. The proposed method is evaluated by performing detections of a human dummy, placed in different random positions and orientations, using an on-board RGB-D camera on a mobile rescue robot called ResQbot. To evaluate the robustness of the casualty detection method to different camera angles, the orientation of the camera is set to different angles. The experimental results show that using the point-cloud data from the on-board RGB-D camera, the proposed method successfully detects the casualty in all tested body positions and orientations relative to the on-board camera, as well as in all tested camera angles.
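The first two steps named in this abstract, projecting the point cloud onto the ground plane and converting the result into a grid-map, can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: the plane is assumed to be already detected and given as a unit normal and offset, the example uses the plane z = 0 so that the x and y coordinates serve directly as in-plane coordinates, and the names `project_to_plane` and `to_grid` as well as the cell size are hypothetical. The body-shape detector that consumes the grid is not shown.

```python
def project_to_plane(points, normal, offset):
    """Project 3D points onto the plane normal . p + offset = 0.

    `normal` is assumed to have unit length.
    """
    projected = []
    for x, y, z in points:
        # Signed distance from the point to the plane.
        dist = normal[0] * x + normal[1] * y + normal[2] * z + offset
        projected.append((x - dist * normal[0],
                          y - dist * normal[1],
                          z - dist * normal[2]))
    return projected


def to_grid(points, cell_size, cols, rows):
    """Mark grid cells that contain at least one projected point.

    Assumes the ground plane is z = 0, so x and y index the grid directly.
    Points falling outside the grid are ignored.
    """
    grid = [[0] * cols for _ in range(rows)]
    for x, y, _ in points:
        col = int(x // cell_size)
        row = int(y // cell_size)
        if 0 <= row < rows and 0 <= col < cols:
            grid[row][col] = 1
    return grid


cloud = [(0.1, 0.1, 0.5), (0.9, 0.1, 0.2), (0.6, 0.6, 1.0)]
flat = project_to_plane(cloud, normal=(0.0, 0.0, 1.0), offset=0.0)
print(to_grid(flat, cell_size=0.5, cols=2, rows=2))  # [[1, 1], [0, 1]]
```

A binary occupancy grid is the simplest choice here; a grid storing point counts or heights per cell would carry more information, and the abstract does not say which form the method uses.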

Imperial College London
