Publications

Journal article

Zhao M, Oude Vrielink TJC, Kogkas A, Runciman M, Elson D, Mylonas G et al., 2020,

LaryngoTORS: a novel cable-driven parallel robotic system for transoral laser phonosurgery

, IEEE Robotics and Automation Letters, Vol: 5, Pages: 1516-1523, ISSN: 2377-3766

Transoral laser phonosurgery is a commonly used surgical procedure in which a laser beam is used to perform incision, ablation or photocoagulation of laryngeal tissues. Two techniques are commonly practiced: free beam and fiber delivery. For free beam delivery, a laser scanner is integrated into a surgical microscope to provide an accurate laser scanning pattern. This approach can only be used under direct line of sight, which may cause increased postoperative pain and injury to the patient and is uncomfortable for the surgeon during prolonged operations; in addition, manipulability is poor and extensive training is required. In contrast, in the fiber delivery technique, a flexible fiber is used to transmit the laser beam and therefore does not require direct line of sight. However, this can only achieve manual level accuracy, repeatability and velocity, and does not allow for pattern scanning. Robotic systems have been developed to overcome the limitations of both techniques. However, these systems offer limited workspace and degrees-of-freedom (DoF), limiting their clinical applicability. This work presents the LaryngoTORS, a robotic system that aims at overcoming the limitations of the two techniques, by using a cable-driven parallel mechanism (CDPM) attached at the end of a curved laryngeal blade for controlling the end tip of the laser fiber. The system allows autonomous generation of scanning patterns or user driven freepath scanning. Path scan validation demonstrated errors as low as 0.054±0.028 mm and high repeatability of 0.027±0.020 mm (6×2 mm arc line). Ex vivo tests on chicken tissue have been carried out. The results show the ability of the system to overcome limitations of current methods with high accuracy and repeatability using the superior fiber delivery approach.

Journal article

Liow L, Clark A, Rojas N, 2020,

OLYMPIC: a modular, tendon-driven prosthetic hand with novel finger and wrist coupling mechanisms

, IEEE Robotics and Automation Letters, Vol: 5, Pages: 299-306, ISSN: 2377-3766

Prosthetic hands, while having shown significant progress in affordability, typically suffer from limited repairability, specifically by the user themselves. Several modular hands have been proposed to address this, but these solutions require handling of intricate components or are unsuitable for prosthetic use due to the large volume and weight resulting from added mechanical complexity to achieve this modularity. In this paper, we propose a fully modular design for a prosthetic hand with finger and wrist level modularity, allowing the removal and attachment of tendon-driven fingers without the need for tools, retendoning, and rewiring. Our innovative design enables placement of the motors behind the hand for remote actuation of the tendons, which are contained solely within the fingers. Details of the novel coupling-transmission mechanisms enabling this are presented; and the capabilities of a prototype using a control-independent grasping benchmark are discussed. The modular detachment torque of the fingers is also computed to analyse the trade-off between intentional removal and the ability to withstand external loads. Experiment results demonstrate that the prosthetic hand is able to grasp a wide range of household and food items, of different shape, size, and weight, without resulting in the ejection of fingers, while allowing a user to remove them easily using a single hand.

Journal article

Gao Y, Chang HJ, Demiris Y, 2020,

User modelling using multimodal information for personalised dressing assistance

, IEEE Access, Vol: 8, Pages: 45700-45714, ISSN: 2169-3536

Conference paper

Nunes UM, Demiris Y, 2020,

Online unsupervised learning of the 3D kinematic structure of arbitrary rigid bodies

, IEEE/CVF International Conference on Computer Vision (ICCV), Publisher: IEEE Computer Soc, Pages: 3808-3816, ISSN: 1550-5499

This work addresses the problem of 3D kinematic structure learning of arbitrary articulated rigid bodies from RGB-D data sequences. Typically, this problem is addressed by offline methods that process a batch of frames, assuming that complete point trajectories are available. However, this approach is not feasible when considering scenarios that require continuity and fluidity, for instance, human-robot interaction. In contrast, we propose to tackle this problem in an online unsupervised fashion, by recursively maintaining the metric distance of the scene's 3D structure, while achieving real-time performance. The influence of noise is mitigated by building a similarity measure based on a linear embedding representation and incorporating this representation into the original metric distance. The kinematic structure is then estimated based on a combination of implicit motion and spatial properties. The proposed approach achieves competitive performance both quantitatively and qualitatively in terms of estimation accuracy, even compared to offline methods.

Conference paper

Pardo F, Levdik V, Kormushev P, 2020,

Scaling all-goals updates in reinforcement learning using convolutional neural networks

, 34th AAAI Conference on Artificial Intelligence (AAAI 2020), Publisher: Association for the Advancement of Artificial Intelligence, Pages: 5355-5362, ISSN: 2374-3468

Being able to reach any desired location in the environment can be a valuable asset for an agent. Learning a policy to navigate between all pairs of states individually is often not feasible. An all-goals updating algorithm uses each transition to learn Q-values towards all goals simultaneously and off-policy. However, the expensive numerous updates in parallel limited the approach to small tabular cases so far. To tackle this problem we propose to use convolutional network architectures to generate Q-values and updates for a large number of goals at once. We demonstrate the accuracy and generalization qualities of the proposed method on randomly generated mazes and Sokoban puzzles. In the case of on-screen goal coordinates the resulting mapping from frames to distance-maps directly informs the agent about which places are reachable and in how many steps. As an example of application we show that replacing the random actions in ε-greedy exploration by several actions towards feasible goals generates better exploratory trajectories on Montezuma’s Revenge and Super Mario All-Stars games.
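
The tabular all-goals update that the abstract takes as its starting point can be sketched roughly as follows. This is a minimal illustration, not the paper's convolutional method: the function names, the table layout `q[goal][state][action]`, and the reward convention (1.0 on reaching the goal, 0.0 otherwise, with the goal treated as terminal) are all assumptions made for the example.

```python
def make_q(n_states, n_actions):
    """Create a zero-initialised table q[goal][state][action] (hypothetical layout)."""
    return [[[0.0] * n_actions for _ in range(n_states)] for _ in range(n_states)]


def all_goals_update(q, s, a, s_next, alpha=0.5, gamma=0.9):
    """Use one transition (s, a, s_next) to update Q-values towards every goal.

    Assumed reward convention: 1.0 when s_next is the goal (terminal),
    0.0 otherwise. The update is off-policy, as in ordinary Q-learning.
    """
    for g in range(len(q)):
        if s_next == g:
            target = 1.0                          # goal reached, no bootstrap
        else:
            target = gamma * max(q[g][s_next])    # bootstrap from next state
        q[g][s][a] += alpha * (target - q[g][s][a])
```

Every transition touches one entry per goal, which is why the cost grows with the number of goals and why the tabular form does not scale to large state spaces.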

Conference paper

Chacon-Quesada R, Demiris Y, 2020,

Augmented reality controlled smart wheelchair using dynamic signifiers for affordance representation

, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE

The design of augmented reality interfaces for people with mobility impairments is a novel area with great potential, as well as multiple outstanding research challenges. In this paper we present an augmented reality user interface for controlling a smart wheelchair with a head-mounted display to provide assistance for mobility restricted people. Our motivation is to reduce the cognitive requirements needed to control a smart wheelchair. A key element of our platform is the ability to control the smart wheelchair using the concepts of affordances and signifiers. In addition to the technical details of our platform, we present a baseline study by evaluating our platform through user-trials of able-bodied individuals and two different affordances: 1) Door Go Through and 2) People Approach. To present these affordances to the user, we evaluated fixed symbol based signifiers versus our novel dynamic signifiers in terms of ease to understand the suggested actions and its relation with the objects. Our results show a clear preference for dynamic signifiers. In addition, we show that the task load reported by participants is lower when controlling the smart wheelchair with our augmented reality user interface compared to using the joystick, which is consistent with their qualitative answers.

Conference paper

Saputra RP, Rakicevic N, Kormushev P, 2020,

Sim-to-real learning for casualty detection from ground projected point cloud data

, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019), Publisher: IEEE

This paper addresses the problem of human body detection, particularly a human body lying on the ground (a.k.a. casualty), using point cloud data. This ability to detect a casualty is one of the most important features of mobile rescue robots, in order for them to be able to operate autonomously. We propose a deep-learning-based casualty detection method using a deep convolutional neural network (CNN). This network is trained to be able to detect a casualty using a point-cloud data input. In the method we propose, the point cloud input is pre-processed to generate a depth image-like ground-projected heightmap. This heightmap is generated based on the projected distance of each point onto the detected ground plane within the point cloud data. The generated heightmap, in image form, is then used as an input for the CNN to detect a human body lying on the ground. To train the neural network, we propose a novel sim-to-real approach, in which the network model is trained using synthetic data obtained in simulation and then tested on real sensor data. To make the model transferable to real data implementations, during the training we adopt specific data augmentation strategies with the synthetic training data. The experimental results show that data augmentation introduced during the training process is essential for improving the performance of the trained model on real data. More specifically, the results demonstrate that the data augmentations on raw point-cloud data have contributed to a considerable improvement of the trained model performance.
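
One way to picture the ground-projected heightmap described above is the sketch below. It is only an illustration under stated assumptions: the ground plane is already known as a unit `normal` and `offset` (so that `normal . p + offset = 0` on the plane), the ground is roughly horizontal so that x and y can index the grid directly, and each cell keeps the largest height seen. The function name, grid size and cell size are hypothetical and are not taken from the paper.

```python
def ground_heightmap(points, normal, offset, cell=0.5, size=4):
    """Build a size x size heightmap from 3D points.

    Each cell stores the largest signed distance above the ground plane
    among the points falling into it (0.0 where no point lands).
    Assumes `normal` has unit length and the ground is near-horizontal.
    """
    grid = [[0.0] * size for _ in range(size)]
    for x, y, z in points:
        # Signed distance of the point from the plane normal . p + offset = 0.
        height = normal[0] * x + normal[1] * y + normal[2] * z + offset
        col = int(x // cell)
        row = int(y // cell)
        inside = 0 <= row < size and 0 <= col < size
        if inside and height > grid[row][col]:
            grid[row][col] = height
    return grid
```

The resulting grid can be treated as a single-channel image, which is the form a CNN detector would consume.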

Conference paper

Matheson E, Secoli R, Galvan S, Baena FRY et al., 2020,

Human-robot visual interface for 3D steering of a flexible, bioinspired needle for neurosurgery

, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE

Robotic minimally invasive surgery has been a subject of intense research and development over the last three decades, due to the clinical advantages it holds for patients and doctors alike. Particularly for drug delivery mechanisms, higher precision and the ability to follow complex trajectories in three dimensions (3D) have led to interest in flexible, steerable needles such as the programmable bevel-tip needle (PBN). Steering in 3D, however, holds practical challenges for surgeons, as interfaces are traditionally designed for straight line paths. This work presents a pilot study undertaken to evaluate a novel human-machine visual interface for the steering of a robotic PBN, where both qualitative evaluation of the interface and quantitative evaluation of the performance of the subjects in following a 3D path are measured. A series of needle insertions are performed in phantom tissue (gelatin) by the experiment subjects. Users could adequately use the system with little training and low workload, and reach the target point at the end of the path with millimeter range accuracy.

Conference paper

Zolotas M, Demiris Y, 2020,

Towards explainable shared control using augmented reality

, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019), Publisher: IEEE, Pages: 3020-3026

Shared control plays a pivotal role in establishing effective human-robot interactions. Traditional control-sharing methods strive to complement a human’s capabilities at safely completing a task, and thereby rely on users forming a mental model of the expected robot behaviour. However, these methods can often bewilder or frustrate users whenever their actions do not elicit the intended system response, forming a misalignment between the respective internal models of the robot and human. To resolve this model misalignment, we introduce Explainable Shared Control as a paradigm in which assistance and information feedback are jointly considered. Augmented reality is presented as an integral component of this paradigm, by visually unveiling the robot’s inner workings to human operators. Explainable Shared Control is instantiated and tested for assistive navigation in a setup involving a robotic wheelchair and a Microsoft HoloLens with add-on eye tracking. Experimental results indicate that the introduced paradigm facilitates transparent assistance by improving recovery times from adverse events associated with model misalignment.

Journal article

Runciman M, Avery J, Zhao M, Darzi A, Mylonas GP et al., 2020,

Deployable, variable stiffness, cable driven robot for minimally invasive surgery

, Frontiers in Robotics and AI, Vol: 6, Pages: 1-16, ISSN: 2296-9144

Minimally Invasive Surgery (MIS) imposes a trade-off between non-invasive access and surgical capability. Treatment of early gastric cancers over 20 mm in diameter can be achieved by performing Endoscopic Submucosal Dissection (ESD) with a flexible endoscope; however, this procedure is technically challenging, suffers from extended operation times and requires extensive training. To facilitate the ESD procedure, we have created a deployable cable driven robot that increases the surgical capabilities of the flexible endoscope while attempting to minimize the impact on the access that they offer. Using a low-profile inflatable support structure in the shape of a hollow hexagonal prism, our robot can fold around the flexible endoscope and, when the target site has been reached, achieve a 73.16% increase in volume and increase its radial stiffness. A sheath around the variable stiffness structure delivers a series of force transmission cables that connect to two independent tubular end-effectors through which standard flexible endoscopic instruments can pass and be anchored. Using a simple control scheme based on the length of each cable, the pose of the two instruments can be controlled by haptic controllers in each hand of the user. The forces exerted by a single instrument were measured, and a maximum magnitude of 8.29 N observed along a single axis. The working channels and tip control of the flexible endoscope remain in use in conjunction with our robot and were used during a procedure imitating the demands of ESD, which was successfully carried out by a novice user. Not only does this robot facilitate difficult surgical techniques, but it can be easily customized and rapidly produced at low cost due to a programmatic design approach.
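
A control scheme based on cable lengths can be illustrated with a simplified geometric sketch. The assumptions here are considerable and are not taken from the paper: the end-effector is reduced to a single point, every cable runs straight and taut from a fixed anchor to that point, and the names `cable_lengths` and `length_changes` are hypothetical. The real robot controls the full pose of tubular end-effectors.

```python
import math


def cable_lengths(anchors, tip):
    """Length of each straight, taut cable from its anchor to the tip point."""
    return [math.dist(anchor, tip) for anchor in anchors]


def length_changes(anchors, tip_from, tip_to):
    """Change in each cable length needed to move the tip between two points.

    Negative values mean the cable must be reeled in, positive values
    mean it must be paid out.
    """
    before = cable_lengths(anchors, tip_from)
    after = cable_lengths(anchors, tip_to)
    return [new - old for old, new in zip(before, after)]
```

For example, with anchors at `(0, 0, 0)` and `(6, 0, 0)`, moving the tip from `(3, 4, 0)` to `(0, 4, 0)` shortens the first cable by 1 unit and lengthens the second.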

Conference paper

Johns E, Liu S, Davison A, 2020,

End-to-end multi-task learning with attention

, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, Publisher: IEEE

We propose a novel multi-task learning architecture, which allows learning of task-specific feature-level attention. Our design, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with a soft-attention module for each task. These modules allow for learning of task-specific features from the global features, whilst simultaneously allowing for features to be shared across different tasks. The architecture can be trained end-to-end and can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. We evaluate our approach on a variety of datasets, across both image-to-image predictions and image classification tasks. We show that our architecture is state-of-the-art in multi-task learning compared to existing methods, and is also less sensitive to various weighting schemes in the multi-task loss function. Code is available at https://github.com/lorenmt/mtan.

Journal article

Escribano Macias J, Angeloudis P, Ochieng W, 2020,

Optimal hub selection for rapid medical deliveries using unmanned aerial vehicles

, Transportation Research Part C: Emerging Technologies, Vol: 110, Pages: 56-80, ISSN: 0968-090X

Unmanned Aerial Vehicles (UAVs) are being increasingly deployed in humanitarian response operations. Beyond regulations, vehicle range and integration with the humanitarian supply chain inhibit their deployment. To address these issues, we present a novel bi-stage operational planning approach that consists of a trajectory optimisation algorithm (that considers multiple flight stages), and a hub selection-routing algorithm that incorporates a new battery management heuristic. We apply the algorithm to a hypothetical response mission in Taiwan after the Chi-Chi earthquake of 1999 considering mission duration and distribution fairness. Our analysis indicates that UAV fleets can be used to provide rapid relief to populations of 20,000 individuals in under 24 h. Additionally, the proposed methodology achieves significant reductions in mission duration and battery stock requirements with respect to conservative energy estimations and other heuristics.

Journal article

Zambelli M, Cully A, Demiris Y, 2020,

Multimodal representation models for prediction and control from partial information

, Robotics and Autonomous Systems, Vol: 123, ISSN: 0921-8890

Similar to humans, robots benefit from interacting with their environment through a number of different sensor modalities, such as vision, touch and sound. However, learning from different sensor modalities is difficult, because the learning model must be able to handle diverse types of signals, and learn a coherent representation even when parts of the sensor inputs are missing. In this paper, a multimodal variational autoencoder is proposed to enable an iCub humanoid robot to learn representations of its sensorimotor capabilities from different sensor modalities. The proposed model is able to (1) reconstruct missing sensory modalities, (2) predict the sensorimotor state of self and the visual trajectories of other agents' actions, and (3) control the agent to imitate an observed visual trajectory. Also, the proposed multimodal variational autoencoder can capture the kinematic redundancy of the robot motion through the learned probability distribution. Training multimodal models is not trivial due to the combinatorial complexity given by the possibility of missing modalities. We propose a strategy to train multimodal models, which successfully achieves improved performance of different reconstruction models. Finally, extensive experiments have been carried out using an iCub humanoid robot, showing high performance in multiple reconstruction, prediction and imitation tasks.

Conference paper

Buizza C, Fischer T, Demiris Y, 2020,

Real-time multi-person pose tracking using data assimilation

, IEEE Winter Conference on Applications of Computer Vision, Publisher: IEEE

We propose a framework for the integration of data assimilation and machine learning methods in human pose estimation, with the aim of enabling any pose estimation method to be run in real-time, whilst also increasing consistency and accuracy. Data assimilation and machine learning are complementary methods: the former allows us to make use of information about the underlying dynamics of a system but lacks the flexibility of a data-based model, which we can instead obtain with the latter. Our framework presents a real-time tracking module for any single or multi-person pose estimation system. Specifically, tracking is performed by a number of Kalman filters initiated for each new person appearing in a motion sequence. This permits tracking of multiple skeletons and reduces the frequency with which computationally expensive pose estimation has to be run, enabling online pose tracking. The module tracks for N frames while the pose estimates are calculated for frame (N+1). This also results in increased consistency of person identification and reduced inaccuracies due to missing joint locations and inversion of left- and right-side joints.
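
The role a Kalman filter plays in such a tracking module can be sketched for a single joint coordinate. This is a minimal scalar filter under a random-walk motion model, which is an assumption of the example: the abstract does not specify the state model, noise values or class names used by the authors, so `ScalarKalman` and its parameters are hypothetical.

```python
class ScalarKalman:
    """Scalar Kalman filter for one joint coordinate (assumed random-walk model)."""

    def __init__(self, x0, p0=1.0, q=0.01, r=0.1):
        self.x = x0   # state estimate (e.g. a joint's x position)
        self.p = p0   # estimate variance
        self.q = q    # process noise variance (assumed)
        self.r = r    # measurement noise variance (assumed)

    def predict(self):
        """Propagate one frame: the state is kept, the uncertainty grows."""
        self.p += self.q
        return self.x

    def update(self, z):
        """Correct the estimate with a new pose measurement z."""
        k = self.p / (self.p + self.r)   # Kalman gain
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)
        return self.x
```

In a tracking loop of this kind, `predict` would be called on the frames where no fresh pose estimate is available, and `update` whenever the pose estimator delivers one.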

Conference paper

Liu S, Davison A, Johns E, 2019,

Self-supervised generalisation with meta auxiliary learning

, 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Publisher: Neural Information Processing Systems Foundation, Inc.

Learning with auxiliary tasks can improve the ability of a primary task to generalise. However, this comes at the cost of manually labelling auxiliary data. We propose a new method which automatically learns appropriate labels for an auxiliary task, such that any supervised learning task can be improved without requiring access to any further data. The approach is to train two neural networks: a label-generation network to predict the auxiliary labels, and a multi-task network to train the primary task alongside the auxiliary task. The loss for the label-generation network incorporates the loss of the multi-task network, and so this interaction between the two networks can be seen as a form of meta learning with a double gradient. We show that our proposed method, Meta AuXiliary Learning (MAXL), outperforms single-task learning on 7 image datasets, without requiring any additional data. We also show that MAXL outperforms several other baselines for generating auxiliary labels, and is even competitive when compared with human-defined auxiliary labels. The self-supervised nature of our method leads to a promising new direction towards automated generalisation. Source code can be found at https://github.com/lorenmt/maxl.

Journal article

Rakicevic N, Kormushev P, 2019,

Active learning via informed search in movement parameter space for efficient robot task learning and transfer

, Autonomous Robots, Vol: 43, Pages: 1917-1935, ISSN: 0929-5593

Learning complex physical tasks via trial-and-error is still challenging for high-degree-of-freedom robots. The greatest challenges are devising a suitable objective function that defines the task, and the high sample complexity of learning the task. We propose a novel active learning framework, consisting of decoupled task model and exploration components, which does not require an objective function. The task model is specific to a task and maps the parameter space, defining a trial, to the trial outcome space. The exploration component enables efficient search in the trial-parameter space to generate the subsequent most informative trials, by simultaneously exploiting all the information gained from previous trials and reducing the task model’s overall uncertainty. We analyse the performance of our framework in a simulation environment and further validate it on a challenging bimanual-robot puck-passing task. Results show that the robot successfully acquires the necessary skills after only 100 trials without any prior information about the task or target positions. Decoupling the framework’s components also enables efficient skill transfer to new environments, which is validated experimentally.

Journal article

Neerincx MA, van Vught W, Henkemans OB, Oleari E, Broekens J, Peters R, Kaptein F, Demiris Y, Kiefer B, Fumagalli D, Bierman B et al., 2019,

Socio-cognitive engineering of a robotic partner for child's diabetes self-management

, Frontiers in Robotics and AI, Vol: 6, Pages: 1-16, ISSN: 2296-9144

Social or humanoid robots do hardly show up in “the wild,” aiming at pervasive and enduring human benefits such as child health. This paper presents a socio-cognitive engineering (SCE) methodology that guides the ongoing research & development for an evolving, longer-lasting human-robot partnership in practice. The SCE methodology has been applied in a large European project to develop a robotic partner that supports the daily diabetes management processes of children, aged between 7 and 14 years (i.e., Personal Assistant for a healthy Lifestyle, PAL). Four partnership functions were identified and worked out (joint objectives, agreements, experience sharing, and feedback & explanation) together with a common knowledge-base and interaction design for child's prolonged disease self-management. In an iterative refinement process of three cycles, these functions, knowledge base and interactions were built, integrated, tested, refined, and extended so that the PAL robot could more and more act as an effective partner for diabetes management. The SCE methodology helped to integrate into the human-agent/robot system: (a) theories, models, and methods from different scientific disciplines, (b) technologies from different fields, (c) varying diabetes management practices, and (d) last but not least, the diverse individual and context-dependent needs of the patients and caregivers. The resulting robotic partner proved to support the children on the three basic needs of the Self-Determination Theory: autonomy, competence, and relatedness. This paper presents the R&D methodology and the human-robot partnership framework for prolonged “blended” care of children with a chronic disease (children could use it up to 6 months; the robot in the hospitals and diabetes camps, and its avatar at home). It represents a new type of human-agent/robot systems with an evolving collective intelligence. The underlying ontology and design rationale can be used

Conference paper

Schettino V, Demiris Y, 2019,

Inference of user-intention in remote robot wheelchair assistance using multimodal interfaces

, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 4600-4606, ISSN: 2153-0858

Conference paper

Vespa E, Funk N, Kelly PHJ, Leutenegger S et al., 2019,

Adaptive-resolution octree-based volumetric SLAM

, 7th International Conference on 3D Vision (3DV), Publisher: IEEE COMPUTER SOC, Pages: 654-662, ISSN: 2378-3826

We introduce a novel volumetric SLAM pipeline for the integration and rendering of depth images at an adaptive level of detail. Our core contribution is a fusion algorithm which dynamically selects the appropriate integration scale based on the effective sensor resolution given the distance from the observed scene, addressing aliasing issues, reconstruction quality, and efficiency simultaneously. We implement our approach using an efficient octree structure which supports multi-resolution rendering allowing for online frame-to-model alignment. Our qualitative and quantitative experiments demonstrate significantly improved reconstruction quality and up to six-fold execution time speed-ups compared to single resolution grids.

Conference paper

Cortacero K, Fischer T, Demiris Y, 2019,

RT-BENE: A Dataset and Baselines for Real-Time Blink Estimation in Natural Environments

, IEEE International Conference on Computer Vision Workshops, Publisher: Institute of Electrical and Electronics Engineers Inc.

In recent years gaze estimation methods have made substantial progress, driven by the numerous application areas including human-robot interaction, visual attention estimation and foveated rendering for virtual reality headsets. However, many gaze estimation methods typically assume that the subject's eyes are open; for closed eyes, these methods provide irregular gaze estimates. Here, we address this assumption by first introducing a new open-sourced dataset with annotations of the eye-openness of more than 200,000 eye images, including more than 10,000 images where the eyes are closed. We further present baseline methods that allow for blink detection using convolutional neural networks. In extensive experiments, we show that the proposed baselines perform favourably in terms of precision and recall. We further incorporate our proposed RT-BENE baselines in the recently presented RT-GENE gaze estimation framework where it provides a real-time inference of the openness of the eyes. We argue that our work will benefit both gaze estimation and blink estimation methods, and we take steps towards unifying these methods.

Imperial College London
