Imperial College London

Dr Petar Kormushev

Faculty of Engineering, Dyson School of Design Engineering

Lecturer
 
 
 

Contact

 

+44 (0)20 7594 9235 p.kormushev Website

 
 

Location

 

25 Exhibition Road, 3rd floor, Dyson Building, South Kensington Campus


Publications

92 results found

Saputra RP, Rakicevic N, Kormushev P, 2019, Sim-to-real learning for casualty detection from ground projected point cloud data, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019), Publisher: IEEE

This paper addresses the problem of human body detection, particularly a human body lying on the ground (a.k.a. casualty), using point cloud data. This ability to detect a casualty is one of the most important features of mobile rescue robots, in order for them to be able to operate autonomously. We propose a deep-learning-based casualty detection method using a deep convolutional neural network (CNN). This network is trained to be able to detect a casualty using a point-cloud data input. In the method we propose, the point cloud input is pre-processed to generate a depth image-like ground-projected heightmap. This heightmap is generated based on the projected distance of each point onto the detected ground plane within the point cloud data. The generated heightmap, in image form, is then used as an input for the CNN to detect a human body lying on the ground. To train the neural network, we propose a novel sim-to-real approach, in which the network model is trained using synthetic data obtained in simulation and then tested on real sensor data. To make the model transferable to real data implementations, during the training we adopt specific data augmentation strategies with the synthetic training data. The experimental results show that data augmentation introduced during the training process is essential for improving the performance of the trained model on real data. More specifically, the results demonstrate that the data augmentations on raw point-cloud data have contributed to a considerable improvement of the trained model performance.

Conference paper

Falck F, Doshi S, Smuts N, Lingi J, Rants K, Kormushev P et al., 2019, Human-centered manipulation and navigation with robot DE NIRO

Social assistance robots in health and elderly care have the potential to support and ease human lives. Given the macrosocial trends of aging and long-lived populations, robotics-based care research mainly focused on helping the elderly live independently. In this paper, we introduce Robot DE NIRO, a research platform that aims to support the supporter (the caregiver) and also offers direct human-robot interaction for the care recipient. Augmented by several sensors, DE NIRO is capable of complex manipulation tasks. It reliably interacts with humans and can autonomously and swiftly navigate through dynamically changing environments. We describe preliminary experiments in a demonstrative scenario and discuss DE NIRO's design and capabilities. We put particular emphases on safe, human-centered interaction procedures implemented in both hardware and software, including collision avoidance in manipulation and navigation as well as an intuitive perception stack through speech and face recognition.

Working paper

AlAttar A, Rouillard L, Kormushev P, 2019, Autonomous air-hockey playing cobot using optimal control and vision-based Bayesian tracking, Towards Autonomous Robotic Systems, Publisher: Springer, ISSN: 0302-9743

This paper presents a novel autonomous air-hockey playing collaborative robot (cobot) that provides human-like gameplay against human opponents. Vision-based Bayesian tracking of the puck and striker are used in an Analytic Hierarchy Process (AHP)-based probabilistic tactical layer for high-speed perception. The tactical layer provides commands for an active control layer that controls the Cartesian position and yaw angle of a custom end effector. The active layer uses optimal control of the cobot’s posture inside the task nullspace. The kinematic redundancy is resolved using a weighted Moore-Penrose pseudo-inversion technique. Experiments with human players show high-speed human-like gameplay with potential applications in the growing field of entertainment robotics.

Conference paper

Falck F, Larppichet K, Kormushev P, 2019, DE VITO: A dual-arm, high degree-of-freedom, lightweight, inexpensive, passive upper-limb exoskeleton for robot teleoperation, TAROS: Annual Conference Towards Autonomous Robotic Systems, Publisher: Springer, ISSN: 0302-9743

While robotics has made significant advances in perception, planning and control in recent decades, the vast majority of tasks easily completed by a human, especially acting in dynamic, unstructured environments, are far from being autonomously performed by a robot. Teleoperation, remotely controlling a slave robot by a human operator, can be a realistic, complementary transition solution that uses the motion intelligence of a human in complex tasks while exploiting the robot’s autonomous reliability and precision in less challenging situations. We introduce DE VITO, a seven degree-of-freedom, dual-arm upper-limb exoskeleton that passively measures the pose of a human arm. DE VITO is a lightweight, simplistic and energy-efficient design with a total material cost of at least an order of magnitude less than previous work. Given the estimated human pose, we implement both joint and Cartesian space kinematic control algorithms and present qualitative experimental results on various complex manipulation tasks teleoperating Robot DE NIRO, a research platform for mobile manipulation, that demonstrate the functionality of DE VITO. We provide the CAD models, open-source code and supplementary videos of DE VITO at http://www.imperial.ac.uk/robot-intelligence/robots/de_vito/.

Conference paper

Rakicevic N, Kormushev P, 2019, Active learning via informed search in movement parameter space for efficient robot task learning and transfer, Autonomous Robots, ISSN: 0929-5593

Learning complex physical tasks via trial-and-error is still challenging for high-degree-of-freedom robots. Greatest challenges are devising a suitable objective function that defines the task, and the high sample complexity of learning the task. We propose a novel active learning framework, consisting of decoupled task model and exploration components, which does not require an objective function. The task model is specific to a task and maps the parameter space, defining a trial, to the trial outcome space. The exploration component enables efficient search in the trial-parameter space to generate the subsequent most informative trials, by simultaneously exploiting all the information gained from previous trials and reducing the task model’s overall uncertainty. We analyse the performance of our framework in a simulation environment and further validate it on a challenging bimanual-robot puck-passing task. Results show that the robot successfully acquires the necessary skills after only 100 trials without any prior information about the task or target positions. Decoupling the framework’s components also enables efficient skill transfer to new environments which is validated experimentally.

Journal article

Tavakoli A, Levdik V, Islam R, Kormushev P et al., 2019, Prioritizing starting states for reinforcement learning

Online, off-policy reinforcement learning algorithms are able to use an experience memory to remember and replay past experiences. In prior work, this approach was used to stabilize training by breaking the temporal correlations of the updates and avoiding the rapid forgetting of possibly rare experiences. In this work, we propose a conceptually simple framework that uses an experience memory to help exploration by prioritizing the starting states from which the agent starts acting in the environment, importantly, in a fashion that is also compatible with on-policy algorithms. Given the capacity to restart the agent in states corresponding to its past observations, we achieve this objective by (i) enabling the agent to restart in states belonging to significant past experiences (e.g., nearby goals), and (ii) promoting faster coverage of the state space through starting from a more diverse set of states. While, using a good priority measure to identify significant past transitions, we expect case (i) to more considerably help exploration in certain domains (e.g., sparse reward tasks), we hypothesize that case (ii) will generally be beneficial, even without any prioritization. We show empirically that our approach improves learning performance for both off-policy and on-policy deep reinforcement learning methods, with most notable gains in highly sparse reward tasks.

Working paper

Wang K, Shah A, Kormushev P, 2018, SLIDER: A Bipedal Robot with Knee-less Legs and Vertical Hip Sliding Motion, 21st International Conference on Climbing and Walking Robots and Support Technologies for Mobile Machines (CLAWAR 2018)

Conference paper

Pardo F, Levdik V, Kormushev P, 2018, Q-map: A convolutional approach for goal-oriented reinforcement learning.

Goal-oriented learning has become a core concept in reinforcement learning (RL), extending the reward signal as a sole way to define tasks. However, as parameterizing value functions with goals increases the learning complexity, efficiently reusing past experience to update estimates towards several goals at once becomes desirable but usually requires independent updates per goal. Considering that a significant number of RL environments can support spatial coordinates as goals, such as on-screen location of the character in ATARI or SNES games, we propose a novel goal-oriented agent called Q-map that utilizes an autoencoder-like neural network to predict the minimum number of steps towards each coordinate in a single forward pass. This architecture is similar to Horde with parameter sharing and allows the agent to discover correlations between visual patterns and navigation. For example learning how to use a ladder in a game could be transferred to other ladders later. We show how this network can be efficiently trained with a 3D variant of Q-learning to update the estimates towards all goals at once. While the Q-map agent could be used for a wide range of applications, we propose a novel exploration mechanism in place of epsilon-greedy that relies on goal selection at a desired distance followed by several steps taken towards it, allowing long and coherent exploratory steps in the environment. We demonstrate the accuracy and generalization qualities of the Q-map agent on a grid-world environment and then demonstrate the efficiency of the proposed exploration mechanism on the notoriously difficult Montezuma's Revenge and Super Mario All-Stars games.

Working paper

Saputra RP, Kormushev P, 2018, Casualty detection for mobile rescue robots via ground-projected point clouds, Towards Autonomous Robotic Systems (TAROS) 2018, Publisher: Springer, Cham, Pages: 473-475, ISSN: 0302-9743

In order to operate autonomously, mobile rescue robots need to be able to detect human casualties in disaster situations. In this paper, we propose a novel method for autonomous detection of casualties lying down on the ground based on point-cloud data. This data can be obtained from different sensors, such as an RGB-D camera or a 3D LIDAR sensor. The method is based on a ground-projected point-cloud (GPPC) image to achieve human body shape detection. A preliminary experiment has been conducted using the RANSAC method for floor detection, and the HOG feature and the SVM classifier to detect human body shape. The results show that the proposed method succeeds in identifying a casualty from point-cloud data in a wide range of viewing angles.

Conference paper

Pardo F, Tavakoli A, Levdik V, Kormushev P et al., 2018, Time limits in reinforcement learning, International Conference on Machine Learning, Pages: 4042-4051

In reinforcement learning, it is common to let an agent interact for a fixed amount of time with its environment before resetting it and repeating the process in a series of episodes. The task that the agent has to learn can either be to maximize its performance over (i) that fixed period, or (ii) an indefinite period where time limits are only used during training to diversify experience. In this paper, we provide a formal account for how time limits could effectively be handled in each of the two cases and explain why not doing so can cause state-aliasing and invalidation of experience replay, leading to suboptimal policies and training instability. In case (i), we argue that the terminations due to time limits are in fact part of the environment, and thus a notion of the remaining time should be included as part of the agent’s input to avoid violation of the Markov property. In case (ii), the time limits are not part of the environment and are only used to facilitate learning. We argue that this insight should be incorporated by bootstrapping from the value of the state at the end of each partial episode. For both cases, we illustrate empirically the significance of our considerations in improving the performance and stability of existing reinforcement learning algorithms, showing state-of-the-art results on several control tasks.

Conference paper

Wang K, Shah A, Kormushev P, 2018, SLIDER: a novel bipedal walking robot without knees, Towards Autonomous Robotic Systems (TAROS) 2018, Publisher: Springer International Publishing AG, part of Springer Nature, Pages: 471-472, ISSN: 0302-9743

Conference paper

Saputra RP, Kormushev P, 2018, ResQbot: a mobile rescue robot with immersive teleperception for casualty extraction, Towards Autonomous Robotic Systems (TAROS) 2018, Publisher: Springer International Publishing AG, part of Springer Nature, Pages: 209-220, ISSN: 0302-9743

In this work, we propose a novel mobile rescue robot equipped with an immersive stereoscopic teleperception and a teleoperation control. This robot is designed with the capability to perform safely a casualty-extraction procedure. We have built a proof-of-concept mobile rescue robot called ResQbot for the experimental platform. An approach called “loco-manipulation” is used to perform the casualty-extraction procedure using the platform. The performance of this robot is evaluated in terms of task accomplishment and safety by conducting a mock rescue experiment. We use a custom-made human-sized dummy that has been sensorised to be used as the casualty. In terms of safety, we observe several parameters during the experiment including impact force, acceleration, speed and displacement of the dummy’s head. We also compare the performance of the proposed immersive stereoscopic teleperception to conventional monocular teleperception. The results of the experiments show that the observed safety parameters are below key safety thresholds which could possibly lead to head or neck injuries. Moreover, the teleperception comparison results demonstrate an improvement in task-accomplishment performance when the operator is using the immersive teleperception.

Conference paper

Saputra RP, Kormushev P, 2018, Casualty detection from 3D point cloud data for autonomous ground mobile rescue robots, SSRR 2018, Publisher: IEEE

One of the most important features of mobile rescue robots is the ability to autonomously detect casualties, i.e. human bodies, which are usually lying on the ground. This paper proposes a novel method for autonomously detecting casualties lying on the ground using obtained 3D point-cloud data from an on-board sensor, such as an RGB-D camera or a 3D LIDAR, on a mobile rescue robot. In this method, the obtained 3D point-cloud data is projected onto the detected ground plane, i.e. floor, within the point cloud. Then, this projected point cloud is converted into a grid-map that is used afterwards as an input for the algorithm to detect human body shapes. The proposed method is evaluated by performing detections of a human dummy, placed in different random positions and orientations, using an on-board RGB-D camera on a mobile rescue robot called ResQbot. To evaluate the robustness of the casualty detection method to different camera angles, the orientation of the camera is set to different angles. The experimental results show that using the point-cloud data from the on-board RGB-D camera, the proposed method successfully detects the casualty in all tested body positions and orientations relative to the on-board camera, as well as in all tested camera angles.

Conference paper

Saputra RP, Kormushev P, 2018, ResQbot: A mobile rescue robot for casualty extraction, 2018 ACM/IEEE International Conference on Human-Robot Interaction (HRI 2018), Publisher: Association for Computing Machinery, Pages: 239-240

Performing search and rescue missions in disaster-struck environments is challenging. Despite the advances in the robotic search phase of the rescue missions, few works have been focused on the physical casualty extraction phase. In this work, we propose a mobile rescue robot that is capable of performing a safe casualty extraction routine. To perform this routine, this robot adopts a loco-manipulation approach. We have designed and built a mobile rescue robot platform called ResQbot as a proof of concept of the proposed system. We have conducted preliminary experiments using a sensorised human-sized dummy as a victim, to confirm that the platform is capable of performing a safe casualty extraction procedure.

Conference paper

Kormushev P, Ugurlu B, Caldwell DG, Tsagarakis NG et al., 2019, Learning to exploit passive compliance for energy-efficient gait generation on a compliant humanoid, Autonomous Robots, ISSN: 1573-7527

Modern humanoid robots include not only active compliance but also passive compliance. Apart from improved safety and dependability, availability of passive elements, such as springs, opens up new possibilities for improving the energy efficiency. With this in mind, this paper addresses the challenging open problem of exploiting the passive compliance for the purpose of energy efficient humanoid walking. To this end, we develop a method comprising two parts: an optimization part that finds an optimal vertical center-of-mass trajectory, and a walking pattern generator part that uses this trajectory to produce a dynamically-balanced gait. For the optimization part, we propose a reinforcement learning approach that dynamically evolves the policy parametrization during the learning process. By gradually increasing the representational power of the policy parametrization, it manages to find better policies in a faster and computationally efficient way. For the walking generator part, we develop a variable-center-of-mass-height ZMP-based bipedal walking pattern generator. The method is tested in real-world experiments with the bipedal robot COMAN and achieves a significant 18% reduction in the electric energy consumption by learning to efficiently use the passive compliance of the robot.

Journal article

Tavakoli A, Pardo F, Kormushev P, 2018, Action branching architectures for deep reinforcement learning, AAAI 2018, Publisher: AAAI

Discrete-action algorithms have been central to numerous recent successes of deep reinforcement learning. However, applying these algorithms to high-dimensional action tasks requires tackling the combinatorial increase of the number of possible actions with the number of action dimensions. This problem is further exacerbated for continuous-action tasks that require fine control of actions via discretization. In this paper, we propose a novel neural architecture featuring a shared decision module followed by several network branches, one for each action dimension. This approach achieves a linear increase of the number of network outputs with the number of degrees of freedom by allowing a level of independence for each individual action dimension. To illustrate the approach, we present a novel agent, called Branching Dueling Q-Network (BDQ), as a branching variant of the Dueling Double Deep Q-Network (Dueling DDQN). We evaluate the performance of our agent on a set of challenging continuous control tasks. The empirical results show that the proposed agent scales gracefully to environments with increasing action dimensionality and indicate the significance of the shared decision module in coordination of the distributed action branches. Furthermore, we show that the proposed agent performs competitively against a state-of-the-art continuous control algorithm, Deep Deterministic Policy Gradient (DDPG).

Conference paper

Kanajar P, Caldwell DG, Kormushev P, 2017, Climbing over large obstacles with a humanoid robot via multi-contact motion planning, IEEE RO-MAN 2017: 26th IEEE International Symposium on Robot and Human Interactive Communication, Publisher: IEEE, Pages: 1202-1209

Incremental progress in humanoid robot locomotion over the years has achieved important capabilities such as navigation over flat or uneven terrain, stepping over small obstacles and climbing stairs. However, the locomotion research has mostly been limited to using only bipedal gait and only foot contacts with the environment, using the upper body for balancing without considering additional external contacts. As a result, challenging locomotion tasks like climbing over large obstacles relative to the size of the robot have remained unsolved. In this paper, we address this class of open problems with an approach based on multi-body contact motion planning guided through physical human demonstrations. Our goal is to make the humanoid locomotion problem more tractable by taking advantage of objects in the surrounding environment instead of avoiding them. We propose a multi-contact motion planning algorithm for humanoid robot locomotion which exploits the whole-body motion and multi-body contacts including both the upper and lower body limbs. The proposed motion planning algorithm is applied to a challenging task of climbing over a large obstacle. We demonstrate successful execution of the climbing task in simulation using our multi-contact motion planning algorithm initialized via a transfer from real-world human demonstrations of the task and further optimized.

Conference paper

Tavakoli A, Pardo F, Kormushev P, 2017, Action Branching Architectures for Deep Reinforcement Learning, Deep Reinforcement Learning Symposium, 31st Conference on Neural Information Processing Systems (NIPS 2017)

Conference paper

Rakicevic N, Kormushev P, 2017, Efficient Robot Task Learning and Transfer via Informed Search in Movement Parameter Space, Workshop on Acting and Interacting in the Real World: Challenges in Robot Learning, 31st Conference on Neural Information Processing Systems (NIPS 2017)

Conference paper

Kormushev P, Ahmadzadeh SR, 2016, Robot Learning for Persistent Autonomy, Handling Uncertainty and Networked Structure in Robot Control, Editors: Busoniu, Tamás, Publisher: Springer International Publishing, Pages: 3-28, ISBN: 978-3-319-26327-4

Book chapter

Ahmadzadeh SR, Kormushev P, 2016, Visuospatial Skill Learning, Handling Uncertainty and Networked Structure in Robot Control, Editors: Busoniu, Tamás, Publisher: Springer International Publishing, Pages: 75-99, ISBN: 978-3-319-26327-4

Book chapter

Maurelli F, Lane D, Kormushev P, Caldwell D, Carreras M, Salvi J, Fox M, Long D, Kyriakopoulos K, Karras G et al., 2016, The PANDORA project: a success story in AUV autonomy, OCEANS Conference 2016, Publisher: IEEE, ISSN: 0197-7385

This paper presents some of the results of the EU-funded project PANDORA - Persistent Autonomy Through Learning Adaptation Observation and Re-planning. The project was three and a half years long and involved several organisations across Europe. The application domain is underwater inspection and intervention, a topic particularly interesting for the oil and gas sector, whose representatives constituted the Industrial Advisory Board. Field trials were performed at The Underwater Centre, in Loch Linnhe, Scotland, and in harbour conditions close to Girona, Spain.

Conference paper

Palomeras N, Carrera A, Hurtós N, Karras GC, Bechlioulis CP, Cashmore M, Magazzeni D, Long D, Fox M, Kyriakopoulos KJ, Kormushev P, Salvi J, Carreras M et al., 2016, Toward persistent autonomous intervention in a subsea panel, Autonomous Robots, Vol: 40, Pages: 1279-1306

Journal article

Jamisola RS, Kormushev P, Roberts RG, Caldwell DG et al., 2016, Task-Space Modular Dynamics for Dual-Arms Expressed through a Relative Jacobian, Journal of Intelligent & Robotic Systems, Pages: 1-14, ISSN: 1573-0409

Journal article

Kryczka P, Kormushev P, Tsagarakis N, Caldwell DG et al., 2015, Online Regeneration of Bipedal Walking Gait Optimizing Footstep Placement and Timing

Conference paper

Kormushev P, Demiris Y, Caldwell DG, 2015, Kinematic-free Position Control of a 2-DOF Planar Robot Arm

Conference paper

Carrera A, Palomeras N, Hurtós N, Kormushev P, Carreras M et al., 2015, Cognitive System for Autonomous Underwater Intervention, Pattern Recognition Letters, ISSN: 0167-8655

Journal article

Jamali N, Kormushev P, Carrera A, Carreras M, Caldwell DG et al., 2015, Underwater Robot-Object Contact Perception using Machine Learning on Force/Torque Sensor Feedback

Conference paper

Ahmadzadeh SR, Paikan A, Mastrogiovanni F, Natale L, Kormushev P, Caldwell DG et al., 2015, Learning Symbolic Representations of Actions from Human Demonstrations

Conference paper

Carrera A, Palomeras N, Hurtos N, Kormushev P, Carreras M et al., 2015, Learning multiple strategies to perform a valve turning with underwater currents using an I-AUV

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
