Below is a list of all relevant publications authored by Robotics Forum members.
- Showing results for:
- Reset all filters
Conference paperZolotas M, Elsdon J, Demiris Y, 2019,
Robotic wheelchairs with built-in assistive fea-tures, such as shared control, are an emerging means ofproviding independent mobility to severely disabled individuals.However, patients often struggle to build a mental model oftheir wheelchair’s behaviour under different environmentalconditions. Motivated by the desire to help users bridge thisgap in perception, we propose a novel augmented realitysystem using a Microsoft Hololens as a head-mounted aid forwheelchair navigation. The system displays visual feedback tothe wearer as a way of explaining the underlying dynamicsof the wheelchair’s shared controller and its predicted futurestates. To investigate the influence of different interface designoptions, a pilot study was also conducted. We evaluated theacceptance rate and learning curve of an immersive wheelchairtraining regime, revealing preliminary insights into the potentialbeneficial and adverse nature of different augmented realitycues for assistive navigation. In particular, we demonstrate thatcare should be taken in the presentation of information, witheffort-reducing cues for augmented information acquisition (forexample, a rear-view display) being the most appreciated.
Conference paperWang R, Amadori P, Demiris Y, 2019,
Classifying human cognitive states from behavioral and physiological signals is a challenging problem with important applications in robotics. The problem is challenging due to the data variability among individual users, and sensor artifacts. In this work, we propose an end-to-end framework for real-time cognitive workload classification with mixture Hyper Long Short Term Memory Networks (m-HyperLSTM), a novelvariant of HyperNetworks. Evaluating the proposed approach on an eye-gaze pattern dataset collected from simulated driving scenarios of different cognitive demands, we show that the proposed framework outperforms previous baseline methods and achieves 83.9% precision and 87.8% recall during test. We also demonstrate the merit of our proposed architecture by showing improved performance over other LSTM-basedmethods
Book chapterKogkas A, Ezzat A, Thakkar R, et al., 2019,
Book chapterDi Veroli C, Le CA, Lemaire T, et al., 2019,
This study explores how new robotic systems can help library users efficiently locate the book they require. A survey conducted among Imperial College students has shown an absence of a time-efficient and organised method to find the books they are looking for in the college library. The solution implemented, LibRob, is an automated assistive robot that gives guidance to the users in finding the book they are searching for in an interactive manner to deliver a more satisfactory experience. LibRob is able to process a search request either by speech or by text and return a list of relevant books by author, subject or title. Once the user selects the book of interest, LibRob guides them to the shelf containing the book, then returns to its base station on completion. Experimental results demonstrate that the robot reduces the time necessary to find a book by 47.4%, and left 80% of the users satisfied with their experience, proving that human-robot interactions can greatly improve the efficiency of basic activities within a library environment.
Conference paperZhao M, Oude Vrielink J, Kogkas A, et al., 2019,
Prototype Designs of a Cable-driven Parallel Robot for Transoral Laser Surgery, Hamlyn Symposium on Medical Robotics
Journal articleDebrunner T, Saeedi Gharahbolagh S, Kelly P, 2019,
Focal-plane Sensor-Processor Arrays (FPSPs) are new imaging devices with parallel Single Instruction Multiple Data (SIMD) computational capabilities built into every pixel. Compared to traditional imaging devices, FPSPs allow for massive pixel-parallel execution of image processing algorithms. This enables the application of certain algorithms at extreme frame rates (>10,000 frames per second). By performing some early-stage processing in-situ, systems incorporating FPSPs can consume less power compared to conventional approaches using standard digital cameras. In this article, we explore code generation for an FPSP whose 256 × 256 processors operate on analogue signal data, leading to further opportunities for power reduction—and additional code synthesis challenges.While rudimentary image processing algorithms have been demonstrated on FPSPs before, progress with higher-level computer vision algorithms has been sparse due to the unique architecture and limits of the devices. This article presents a code generator for convolution filters for the SCAMP-5 FPSP, with applications in many high-level tasks such as convolutional neural networks, pose estimation, and so on. The SCAMP-5 FPSP has no effective multiply operator. Convolutions have to be implemented through sequences of more primitive operations such as additions, subtractions, and multiplications/divisions by two. We present a code generation algorithm to optimise convolutions by identifying common factors in the different weights and by determining an optimised pattern of pixel-to-pixel data movements to exploit them. We present evaluation in terms of both speed and energy consumption for a suite of well-known convolution filters. Furthermore, an application of the method is shown by the implementation of a Viola-Jones face detection algorithm.
Conference paperChoi J, Chang HJ, Fischer T, et al., 2018,
We propose a new context-aware correlation filter based tracking framework to achieve both high computational speed and state-of-the-art performance among real-time trackers. The major contribution to the high computational speed lies in the proposed deep feature compression that is achieved by a context-aware scheme utilizing multiple expert auto-encoders; a context in our framework refers to the coarse category of the tracking target according to appearance patterns. In the pre-training phase, one expert auto-encoder is trained per category. In the tracking phase, the best expert auto-encoder is selected for a given target, and only this auto-encoder is used. To achieve high tracking performance with the compressed feature map, we introduce extrinsic denoising processes and a new orthogonality loss term for pre-training and fine-tuning of the expert auto-encoders. We validate the proposed context-aware framework through a number of experiments, where our method achieves a comparable performance to state-of-the-art trackers which cannot run in real-time, while running at a significantly fast speed of over 100 fps.
Journal articleChang HJ, Fischer T, Petit M, et al., 2018,
We present a novel framework for finding the kinematic structure correspondences between two articulated objects in videos via hypergraph matching. In contrast to appearance and graph alignment based matching methods, which have been applied among two similar static images, the proposed method finds correspondences between two dynamic kinematic structures of heterogeneous objects in videos. Thus our method allows matching the structure of objects which have similar topologies or motions, or a combination of the two. Our main contributions are summarised as follows: (i)casting the kinematic structure correspondence problem into a hypergraph matching problem by incorporating multi-order similarities with normalising weights, (ii)introducing a structural topology similarity measure by aggregating topology constrained subgraph isomorphisms, (iii)measuring kinematic correlations between pairwise nodes, and (iv)proposing a combinatorial local motion similarity measure using geodesic distance on the Riemannian manifold. We demonstrate the robustness and accuracy of our method through a number of experiments on synthetic and real data, showing that various other recent and state of the art methods are outperformed. Our method is not limited to a specific application nor sensor, and can be used as building block in applications such as action recognition, human motion retargeting to robots, and articulated object manipulation.
Journal articleMoulin-Frier C, Fischer T, Petit M, et al., 2018,
DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self, IEEE Transactions on Cognitive and Developmental Systems, Vol: 10, Pages: 1005-1022, ISSN: 2379-8920
This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.
Conference paperWang K, Shah A, Kormushev P, 2018,
SLIDER: A Bipedal Robot with Knee-less Legs and Vertical Hip Sliding Motion, 21st International Conference on Climbing and Walking Robots and Support Technologies for Mobile Machines (CLAWAR 2018)
Journal articleSarabia M, Young N, Canavan K, et al., 2018,
Social isolation in hospitals is a well established risk factor for complications such as cognitive decline and depression. Assistive robotic technology has the potential to combat this problem, but first it is critical to investigate how hospital patients react to this technology. In order to address this question, we introduced a remotely operated NAO humanoid robot which conversed, made jokes, played music, danced and exercised with patients in a London hospital. In total, 49 patients aged between 18–100 took part in the study, 7 of whom had dementia. Our results show that a majority of patients enjoyed their interaction with NAO. We also found that age and dementia significantly affect the interaction, whereas gender does not. These results indicate that hospital patients enjoy socialising with robots, opening new avenues for future research into the potential health benefits of a social robotic companion.
Journal articleSaeedi Gharahbolagh S, Bodin B, Wagstaff H, et al., 2018,
Navigating the landscape for real-time localisation and mapping for robotics, virtual and augmented reality, Proceedings of the IEEE, Vol: 106, Pages: 2020-2039, ISSN: 0018-9219
Visual understanding of 3-D environments in real time, at low power, is a huge computational challenge. Often referred to as simultaneous localization and mapping (SLAM), it is central to applications spanning domestic and industrial robotics, autonomous vehicles, and virtual and augmented reality. This paper describes the results of a major research effort to assemble the algorithms, architectures, tools, and systems software needed to enable delivery of SLAM, by supporting applications specialists in selecting and configuring the appropriate algorithm and the appropriate hardware, and compilation pathway, to meet their performance, accuracy, and energy consumption goals. The major contributions we present are: 1) tools and methodology for systematic quantitative evaluation of SLAM algorithms; 2) automated, machine-learning-guided exploration of the algorithmic and implementation design space with respect to multiple objectives; 3) end-to-end simulation tools to enable optimization of heterogeneous, accelerated architectures for the specific algorithmic requirements of the various SLAM algorithmic approaches; and 4) tools for delivering, where appropriate, accelerated, adaptive SLAM solutions in a managed, JIT-compiled, adaptive runtime context.
Conference paperFischer T, Chang HJ, Demiris Y, 2018,
In this work, we consider the problem of robust gaze estimation in natural environments. Large camera-to-subject distances and high variations in head pose and eye gaze angles are common in such environments. This leads to two main shortfalls in state-of-the-art methods for gaze estimation: hindered ground truth gaze annotation and diminished gaze estimation accuracy as image resolution decreases with distance. We first record a novel dataset of varied gaze and head pose images in a natural environment, addressing the issue of ground truth annotation by measuring head pose using a motion capture system and eye gaze using mobile eyetracking glasses. We apply semantic image inpainting to the area covered by the glasses to bridge the gap between training and testing images by removing the obtrusiveness of the glasses. We also present a new real-time algorithm involving appearance-based deep convolutional neural networks with increased capacity to cope with the diverse images in the new dataset. Experiments with this network architecture are conducted on a number of diverse eye-gaze datasets including our own, and in cross dataset evaluations. We demonstrate state-of-the-art performance in terms of estimation accuracy in all experiments, and the architecture performs well even on lower resolution images.
Conference paperPardo F, Levdik V, Kormushev P, 2018,
Q-map: A convolutional approach for goal-oriented reinforcement learning.
Goal-oriented learning has become a core concept in reinforcement learning(RL), extending the reward signal as a sole way to define tasks. However, asparameterizing value functions with goals increases the learning complexity,efficiently reusing past experience to update estimates towards several goalsat once becomes desirable but usually requires independent updates per goal.Considering that a significant number of RL environments can support spatialcoordinates as goals, such as on-screen location of the character in ATARI orSNES games, we propose a novel goal-oriented agent called Q-map that utilizesan autoencoder-like neural network to predict the minimum number of stepstowards each coordinate in a single forward pass. This architecture is similarto Horde with parameter sharing and allows the agent to discover correlationsbetween visual patterns and navigation. For example learning how to use aladder in a game could be transferred to other ladders later. We show how thisnetwork can be efficiently trained with a 3D variant of Q-learning to updatethe estimates towards all goals at once. While the Q-map agent could be usedfor a wide range of applications, we propose a novel exploration mechanism inplace of epsilon-greedy that relies on goal selection at a desired distancefollowed by several steps taken towards it, allowing long and coherentexploratory steps in the environment. We demonstrate the accuracy andgeneralization qualities of the Q-map agent on a grid-world environment andthen demonstrate the efficiency of the proposed exploration mechanism on thenotoriously difficult Montezuma's Revenge and Super Mario All-Stars games.
Conference paperNguyen P, Fischer T, Chang HJ, et al., 2018,
Hand-eye coordination is a requirement for many manipulation tasks including grasping and reaching. However, accurate hand-eye coordination has shown to be especially difficult to achieve in complex robots like the iCub humanoid. In this work, we solve the hand-eye coordination task using a visuomotor deep neural network predictor that estimates the arm's joint configuration given a stereo image pair of the arm and the underlying head configuration. As there are various unavoidable sources of sensing error on the physical robot, we train the predictor on images obtained from simulation. The images from simulation were modified to look realistic using an image-to-image translation approach. In various experiments, we first show that the visuomotor predictor provides accurate joint estimates of the iCub's hand in simulation. We then show that the predictor can be used to obtain the systematic error of the robot's joint measurements on the physical iCub robot. We demonstrate that a calibrator can be designed to automatically compensate this error. Finally, we validate that this enables accurate reaching of objects while circumventing manual fine-calibration of the robot.
Conference paperChacon Quesada R, Demiris Y, 2018,
Augmented reality control of smart wheelchair using eye-gaze–enabled selection of affordances, https://www.idiap.ch/workshop/iros2018/files/, IROS 2018 Workshop on Robots for Assisted Living
In this paper we present a novel augmented reality head mounted display user interface for controlling a robotic wheelchair for people with limited mobility. To lower the cognitive requirements needed to control the wheelchair, we propose integration of a smart wheelchair with an eye-tracking enabled head-mounted display. We propose a novel platform that integrates multiple user interface interaction methods for aiming at and selecting affordances derived by on-board perception capabilities such as laser-scanner readings and cameras. We demonstrate the effectiveness of the approach by evaluating our platform in two realistic scenarios: 1) Door detection, where the affordance corresponds to a Door object and the Go-Through action and 2) People detection, where the affordance corresponds to a Person and the Approach action. To the best of our knowledge, this is the first demonstration of a augmented reality head-mounted display user interface for controlling a smart wheelchair.
Conference paperSaputra RP, Kormushev P, 2018,
One of the most important features of mobilerescue robots is the ability to autonomously detect casualties,i.e. human bodies, which are usually lying on the ground. Thispaper proposes a novel method for autonomously detectingcasualties lying on the ground using obtained 3D point-clouddata from an on-board sensor, such as an RGB-D camera ora 3D LIDAR, on a mobile rescue robot. In this method, theobtained 3D point-cloud data is projected onto the detectedground plane, i.e. floor, within the point cloud. Then, thisprojected point cloud is converted into a grid-map that isused afterwards as an input for the algorithm to detecthuman body shapes. The proposed method is evaluated byperforming detections of a human dummy, placed in differentrandom positions and orientations, using an on-board RGB-Dcamera on a mobile rescue robot called ResQbot. To evaluatethe robustness of the casualty detection method to differentcamera angles, the orientation of the camera is set to differentangles. The experimental results show that using the point-clouddata from the on-board RGB-D camera, the proposed methodsuccessfully detects the casualty in all tested body positions andorientations relative to the on-board camera, as well as in alltested camera angles.
Conference paperPittiglio G, Kogkas A, Vrielink JO, et al., 2018,
Dynamic Control of Cable Driven Parallel Robots with Unknown Cable Stiffness: a Joint Space Approach, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE COMPUTER SOC, Pages: 948-955, ISSN: 1050-4729
Conference paperVrielink TJCO, Chao M, Darzi A, et al., 2018,
Gastrointestinal (GI) cancers account for 1.5 million deaths worldwide. Endoscopic Submucosal Dissection (ESD) is an advanced therapeutic endoscopy technique with superior clinical outcome due to the minimally invasive and en bloc removal of tumours. In the western world, ESD is seldom carried out, due to its complex and challenging nature. Various surgical systems are being developed to make this therapy accessible, however, these solutions have shown limited operational workspace, dexterity, or low force exertion capabilities. The current paper shows the ESD CYCLOPS system, a bimanual surgical robotic attachment that can be mounted at the end of any flexible endoscope. The system is able to achieve forces of up to 46N, and showed a mean error of 0.217mm during an elliptical tracing task. The workspace and instrument dexterity is shown by pre-clinical ex vivo trials, in which ESD is successfully performed by a GI surgeon. The system is currently undergoing pre-clinical in vivo validation.
Conference paperRunciman M, Darzi A, Mylonas G, 2018,
Deployable disposable self-propelling and variable stiffness devices for minimally invasive surgery, Conference on New Technologies for Computer/Robot Assisted Surgery
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.