Rakicevic N, Kormushev P, 2019, Active learning via informed search in movement parameter space for efficient robot task learning and transfer, Autonomous Robots, ISSN: 0929-5593
Learning complex physical tasks via trial-and-error is still challenging for high-degree-of-freedom robots. The greatest challenges are devising a suitable objective function that defines the task, and the high sample complexity of learning it. We propose a novel active learning framework, consisting of decoupled task model and exploration components, which does not require an objective function. The task model is specific to a task and maps the parameter space, which defines a trial, to the trial outcome space. The exploration component enables efficient search in the trial-parameter space to generate the subsequent most informative trials, by simultaneously exploiting all the information gained from previous trials and reducing the task model’s overall uncertainty. We analyse the performance of our framework in a simulation environment and further validate it on a challenging bimanual-robot puck-passing task. Results show that the robot successfully acquires the necessary skills after only 100 trials without any prior information about the task or target positions. Decoupling the framework’s components also enables efficient skill transfer to new environments, which is validated experimentally.
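The exploration component described in this abstract can be illustrated with a generic uncertainty-driven selection rule: rank candidate trial parameters by how much an ensemble of task models disagrees on the predicted outcome, and run the trial where disagreement is largest. This is a hypothetical minimal sketch of that idea, not the paper's actual algorithm; the function name and toy models are invented here.

```python
def most_informative_trial(candidates, ensemble):
    """Pick the candidate trial parameter on which an ensemble of task
    models disagrees most (highest predictive variance), a common proxy
    for 'most informative' in uncertainty-driven active learning."""
    def variance(param):
        preds = [model(param) for model in ensemble]
        mean = sum(preds) / len(preds)
        return sum((p - mean) ** 2 for p in preds) / len(preds)
    return max(candidates, key=variance)

# Two toy 'task models' that agree near 0 and diverge for larger parameters,
# so the largest candidate is the most informative one to try next.
ensemble = [lambda p: p * 1.0, lambda p: p * 2.0]
print(most_informative_trial([0.1, 0.5, 1.0], ensemble))  # 1.0
```

After each real trial, the outcome would be fed back to refit the ensemble, shrinking uncertainty around the tried parameters.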
Kormushev P, Ugurlu B, Caldwell DG, et al., 2019, Learning to exploit passive compliance for energy-efficient gait generation on a compliant humanoid, AUTONOMOUS ROBOTS, Vol: 43, Pages: 79-95, ISSN: 0929-5593
Wang K, Shah A, Kormushev P, 2018, SLIDER: A Bipedal Robot with Knee-less Legs and Vertical Hip Sliding Motion, 21st International Conference on Climbing and Walking Robots and Support Technologies for Mobile Machines (CLAWAR 2018)
Saputra RP, Kormushev P, 2018, Casualty Detection from 3D Point Cloud Data for Autonomous Ground Mobile Rescue Robots
© 2018 IEEE. One of the most important features of mobile rescue robots is the ability to autonomously detect casualties, i.e. human bodies, which are usually lying on the ground. This paper proposes a novel method for autonomously detecting casualties lying on the ground using 3D point-cloud data obtained from an on-board sensor, such as an RGB-D camera or a 3D LIDAR, on a mobile rescue robot. In this method, the obtained 3D point-cloud data is projected onto the detected ground plane, i.e. the floor, within the point cloud. Then, this projected point cloud is converted into a grid-map that is used afterwards as an input for the algorithm to detect human body shapes. The proposed method is evaluated by performing detections of a human dummy, placed in different random positions and orientations, using an on-board RGB-D camera on a mobile rescue robot called ResQbot. To evaluate the robustness of the casualty detection method, the camera is mounted at several different angles. The experimental results show that, using the point-cloud data from the on-board RGB-D camera, the proposed method successfully detects the casualty in all tested body positions and orientations relative to the on-board camera, as well as at all tested camera angles.
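The projection-and-gridding step described in this abstract can be sketched in a few lines. This is a simplified illustration with an invented function name and grid parameters; the real method first detects the ground plane within the cloud (e.g. via plane fitting), whereas here the floor is assumed to already be the z = 0 plane.

```python
def ground_projected_grid(points, cell_size=0.1, grid_w=8, grid_h=8):
    """Project 3D points onto the ground plane and accumulate them into a
    2D grid map (per-cell point counts), which a downstream detector can
    then scan for human body shapes."""
    grid = [[0] * grid_w for _ in range(grid_h)]
    for x, y, _z in points:            # projection: simply drop the height
        col = int(x // cell_size)
        row = int(y // cell_size)
        if 0 <= row < grid_h and 0 <= col < grid_w:
            grid[row][col] += 1
    return grid

# Three points fall into two cells of a 0.1 m grid.
cloud = [(0.05, 0.05, 0.7), (0.06, 0.04, 0.2), (0.35, 0.12, 0.5)]
grid = ground_projected_grid(cloud)
print(grid[0][0], grid[1][3])  # 2 1
```

Counting points per cell is one simple choice; the resulting 2D image is what makes standard shape detectors applicable to 3D data.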
Pardo F, Tavakoli A, Levdik V, et al., 2018, Time limits in reinforcement learning, International Conference on Machine Learning, Pages: 4042-4051
In reinforcement learning, it is common to let an agent interact for a fixed amount of time with its environment before resetting it and repeating the process in a series of episodes. The task that the agent has to learn can either be to maximize its performance over (i) that fixed period, or (ii) an indefinite period where time limits are only used during training to diversify experience. In this paper, we provide a formal account for how time limits could effectively be handled in each of the two cases and explain why not doing so can cause state-aliasing and invalidation of experience replay, leading to suboptimal policies and training instability. In case (i), we argue that the terminations due to time limits are in fact part of the environment, and thus a notion of the remaining time should be included as part of the agent’s input to avoid violation of the Markov property. In case (ii), the time limits are not part of the environment and are only used to facilitate learning. We argue that this insight should be incorporated by bootstrapping from the value of the state at the end of each partial episode. For both cases, we illustrate empirically the significance of our considerations in improving the performance and stability of existing reinforcement learning algorithms, showing state-of-the-art results on several control tasks.
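The case (ii) recommendation, bootstrapping at time-outs, amounts to a one-line change in the one-step TD target: a time-out is not a property of the environment, so the return should not be truncated there. This is a hypothetical minimal sketch of that distinction, not the authors' code; case (i) additionally requires adding the remaining time to the agent's input, which is omitted here.

```python
def td_target(reward, next_value, terminated, timed_out, gamma=0.99):
    """One-step TD target that treats time-outs differently from true
    environment terminations: at a time-out we still bootstrap from the
    estimated value of the final state instead of truncating the return."""
    if terminated and not timed_out:
        # Genuine terminal state: no future reward to bootstrap from.
        return reward
    # Non-terminal step, or a partial episode cut short by a time limit:
    # bootstrap from the estimated value of the next state.
    return reward + gamma * next_value

# A time-out keeps the bootstrap term ...
print(td_target(1.0, 10.0, terminated=True, timed_out=True))   # reward + gamma * next_value
# ... while a true termination does not.
print(td_target(1.0, 10.0, terminated=True, timed_out=False))  # reward only
```

Conflating the two cases, i.e. treating every episode end as terminal, is exactly what the paper argues causes state-aliasing and invalid replay targets.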
Saputra RP, Kormushev P, 2018, ResQbot: A Mobile Rescue Robot for Casualty Extraction, Pages: 239-240
© 2018 Authors. Performing search and rescue missions in disaster-struck environments is challenging. Despite advances in the robotic search phase of rescue missions, few works have focused on the physical casualty-extraction phase. In this work, we propose a mobile rescue robot that is capable of performing a safe casualty-extraction routine. To perform this routine, the robot adopts a loco-manipulation approach. We have designed and built a mobile rescue robot platform called ResQbot as a proof of concept of the proposed system. We have conducted preliminary experiments using a sensorised human-sized dummy as a victim, to confirm that the platform is capable of performing a safe casualty-extraction procedure.
Saputra RP, Kormushev P, 2018, ResQbot: A mobile rescue robot with immersive teleperception for casualty extraction, Pages: 209-220, ISSN: 0302-9743
© Springer International Publishing AG, part of Springer Nature 2018. In this work, we propose a novel mobile rescue robot equipped with immersive stereoscopic teleperception and teleoperation control. This robot is designed with the capability to perform a casualty-extraction procedure safely. We have built a proof-of-concept mobile rescue robot called ResQbot as the experimental platform. An approach called “loco-manipulation” is used to perform the casualty-extraction procedure with the platform. The performance of this robot is evaluated in terms of task accomplishment and safety by conducting a mock rescue experiment. We use a custom-made, human-sized dummy that has been sensorised to serve as the casualty. In terms of safety, we observe several parameters during the experiment, including impact force, acceleration, speed and displacement of the dummy’s head. We also compare the performance of the proposed immersive stereoscopic teleperception to conventional monocular teleperception. The results of the experiments show that the observed safety parameters remain below the key safety thresholds associated with head or neck injuries. Moreover, the teleperception comparison results demonstrate an improvement in task-accomplishment performance when the operator uses the immersive teleperception.
Saputra RP, Kormushev P, 2018, Casualty detection for mobile rescue robots via ground-projected point clouds, Pages: 473-475, ISSN: 0302-9743
© Springer International Publishing AG, part of Springer Nature 2018. In order to operate autonomously, mobile rescue robots need to be able to detect human casualties in disaster situations. In this paper, we propose a novel method for autonomous detection of casualties lying on the ground based on point-cloud data. This data can be obtained from different sensors, such as an RGB-D camera or a 3D LIDAR sensor. The method is based on a ground-projected point-cloud (GPPC) image to achieve human body shape detection. A preliminary experiment has been conducted using the RANSAC method for floor detection, and HOG features with an SVM classifier to detect human body shape. The results show that the proposed method succeeds in identifying a casualty from point-cloud data across a wide range of viewing angles.
Tavakoli A, Pardo F, Kormushev P, 2018, Action branching architectures for deep reinforcement learning, Pages: 4131-4138
Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Discrete-action algorithms have been central to numerous recent successes of deep reinforcement learning. However, applying these algorithms to high-dimensional action tasks requires tackling the combinatorial increase of the number of possible actions with the number of action dimensions. This problem is further exacerbated for continuous-action tasks that require fine control of actions via discretization. In this paper, we propose a novel neural architecture featuring a shared decision module followed by several network branches, one for each action dimension. This approach achieves a linear increase of the number of network outputs with the number of degrees of freedom by allowing a level of independence for each individual action dimension. To illustrate the approach, we present a novel agent, called Branching Dueling Q-Network (BDQ), as a branching variant of the Dueling Double Deep Q-Network (Dueling DDQN). We evaluate the performance of our agent on a set of challenging continuous control tasks. The empirical results show that the proposed agent scales gracefully to environments with increasing action dimensionality and indicate the significance of the shared decision module in coordination of the distributed action branches. Furthermore, we show that the proposed agent performs competitively against a state-of-the-art continuous control algorithm, Deep Deterministic Policy Gradient (DDPG).
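The linear-versus-combinatorial scaling argument in this abstract is easy to make concrete: with one monolithic discrete-action head, the number of outputs is the number of joint actions, while the branching architecture keeps one small head per action dimension. This is an illustrative calculation with an invented function name, not code from the paper.

```python
def output_counts(num_dims, bins_per_dim):
    """Number of network outputs for a discretised action space with
    `num_dims` action dimensions and `bins_per_dim` bins per dimension:
    a monolithic head must enumerate every joint action (combinatorial),
    whereas a branching architecture needs one head of `bins_per_dim`
    outputs per dimension (linear)."""
    monolithic = bins_per_dim ** num_dims
    branching = num_dims * bins_per_dim
    return monolithic, branching

# e.g. a 6-DoF arm with 11 bins per joint:
print(output_counts(6, 11))  # (1771561, 66)
```

The shared decision module then coordinates the per-dimension branches, which is what keeps the partially independent heads from acting at cross purposes.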
Wang K, Shah A, Kormushev P, 2018, SLIDER: A novel bipedal walking robot without knees, Pages: 471-472, ISSN: 0302-9743
© Springer International Publishing AG, part of Springer Nature 2018. This extended abstract describes our work on SLIDER: a novel bipedal robot with knee-less legs and hip sliding motion. Compared with a conventional anthropomorphic leg design, SLIDER’s design gives the robot very lightweight legs and is suitable for agile locomotion. To validate this design, we created a dynamics model and implemented a walk pattern generator capable of walking at a speed of 0.18 m/s in Gazebo. Currently, a physical prototype is under construction for real-world testing. The initial mechanical design and the control strategy for SLIDER are introduced.
Kanajar P, Caldwell DG, Kormushev P, 2017, Climbing over large obstacles with a humanoid robot via multi-contact motion planning, IEEE RO-MAN 2017: 26th IEEE International Symposium on Robot and Human Interactive Communication, Publisher: IEEE, Pages: 1202-1209
Incremental progress in humanoid robot locomotion over the years has achieved important capabilities such as navigation over flat or uneven terrain, stepping over small obstacles and climbing stairs. However, the locomotion research has mostly been limited to using only bipedal gait and only foot contacts with the environment, using the upper body for balancing without considering additional external contacts. As a result, challenging locomotion tasks like climbing over large obstacles relative to the size of the robot have remained unsolved. In this paper, we address this class of open problems with an approach based on multi-body contact motion planning guided through physical human demonstrations. Our goal is to make the humanoid locomotion problem more tractable by taking advantage of objects in the surrounding environment instead of avoiding them. We propose a multi-contact motion planning algorithm for humanoid robot locomotion which exploits the whole-body motion and multi-body contacts including both the upper and lower body limbs. The proposed motion planning algorithm is applied to a challenging task of climbing over a large obstacle. We demonstrate successful execution of the climbing task in simulation using our multi-contact motion planning algorithm initialized via a transfer from real-world human demonstrations of the task and further optimized.
Tavakoli A, Pardo F, Kormushev P, 2017, Action Branching Architectures for Deep Reinforcement Learning, Deep Reinforcement Learning Symposium, 31st Conference on Neural Information Processing Systems (NIPS 2017)
Rakicevic N, Kormushev P, 2017, Efficient Robot Task Learning and Transfer via Informed Search in Movement Parameter Space, Workshop on Acting and Interacting in the Real World: Challenges in Robot Learning, 31st Conference on Neural Information Processing Systems (NIPS 2017)
Palomeras N, Carrera A, Hurtos N, et al., 2016, Toward persistent autonomous intervention in a subsea panel, AUTONOMOUS ROBOTS, Vol: 40, Pages: 1279-1306, ISSN: 0929-5593
Jamisola RS, Kormushev PS, Roberts RG, et al., 2016, Task-Space Modular Dynamics for Dual-Arms Expressed through a Relative Jacobian, JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, Vol: 83, Pages: 205-218, ISSN: 0921-0296
Maurelli F, Lane D, Kormushev P, et al., 2016, The PANDORA project: a success story in AUV autonomy, OCEANS Conference, Publisher: IEEE, ISSN: 0197-7385
Carrera A, Palomeras N, Hurtos N, et al., 2015, Cognitive system for autonomous underwater intervention, PATTERN RECOGNITION LETTERS, Vol: 67, Pages: 91-99, ISSN: 0167-8655
Lane DM, Maurelli F, Kormushev P, et al., 2015, PANDORA - Persistent autonomy through learning, adaptation, observation and replanning, Pages: 238-243
© 2015, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. PANDORA is an EU FP7 project that is developing new computational methods to make underwater robots persistently autonomous, significantly reducing the frequency of assistance requests. The aim of the project is to extend the range of tasks that can be carried out autonomously and to increase their complexity, while reducing the need for operator assistance. Dynamic adaptation to changing conditions is very important when addressing autonomy in the real world, not just in well-known situations. The key to PANDORA is the ability to recognise failure and respond to it, at all levels of abstraction. Under the guidance of major industrial players, validation tasks of inspection, cleaning and valve turning have been trialled with partners' AUVs in Scotland and Spain.
Bimbo J, Kormushev P, Althoefer K, et al., 2015, Global estimation of an object's pose using tactile sensing, ADVANCED ROBOTICS, Vol: 29, Pages: 363-374, ISSN: 0169-1864
Takano W, Asfour T, Kormushev P, 2015, Preface: Special Issue on Humanoid Robotics, ADVANCED ROBOTICS, Vol: 29, Pages: 301-301, ISSN: 0169-1864
Kormushev P, Ahmadzadeh SR, 2015, Robot learning for persistent autonomy, Studies in Systems, Decision and Control, Pages: 3-28
© Springer International Publishing Switzerland 2015. Autonomous robots are not very good at being autonomous. They work well in structured environments, but fail quickly in the real world facing uncertainty and dynamically changing conditions. In this chapter, we describe robot learning approaches that help to elevate robot autonomy to the next level, the so-called ‘persistent autonomy’. For a robot to be ‘persistently autonomous’ means to be able to perform missions over extended time periods (e.g. days or months) in dynamic, uncertain environments without the need for human assistance. In particular, persistent autonomy is extremely important for robots in difficult-to-reach environments such as underwater, rescue, and space robotics. There are many facets of persistent autonomy, such as: coping with uncertainty, reacting to changing conditions, disturbance rejection, fault tolerance, energy efficiency and so on. This chapter presents a collection of robot learning approaches that address many of these facets. Experiments with robot manipulators and autonomous underwater vehicles demonstrate the usefulness of these learning approaches in real-world scenarios.
Kryczka P, Kormushev P, Tsagarakis NG, et al., 2015, Online Regeneration of Bipedal Walking Gait Pattern Optimizing Footstep Placement and Timing, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 3352-3357, ISSN: 2153-0858
Kormushev P, Demiris Y, Caldwell DG, 2015, Kinematic-free Position Control of a 2-DOF Planar Robot Arm, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 5518-5525, ISSN: 2153-0858
Kormushev P, Demiris Y, Caldwell DG, 2015, Encoderless Position Control of a Two-Link Robot Manipulator, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE COMPUTER SOC, Pages: 943-949, ISSN: 1050-4729
Ahmadzadeh SR, Paikan A, Mastrogiovanni F, et al., 2015, Learning Symbolic Representations of Actions from Human Demonstrations, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE COMPUTER SOC, Pages: 3801-3808, ISSN: 1050-4729
Ahmadzadeh SR, Kormushev P, 2015, Visuospatial skill learning, Studies in Systems, Decision and Control, Pages: 75-99
© Springer International Publishing Switzerland 2015. This chapter introduces Visuospatial Skill Learning (VSL), a novel interactive robot learning approach. VSL is based on visual perception, allowing a robot to acquire new skills by observing a single demonstration while interacting with a tutor. The focus of VSL is on achieving a desired goal configuration of objects relative to one another. VSL captures the object’s context for each demonstrated action. This context is the basis of the visuospatial representation and implicitly encodes the relative positioning of the object with respect to multiple other objects simultaneously. VSL is capable of learning and generalizing multi-operation skills from a single demonstration, while requiring minimal a priori knowledge about the environment. Different capabilities of VSL, such as learning and generalization of object reconfiguration, classification, and turn-taking interaction, are illustrated through both simulation and real-world experiments.
Jamali N, Kormushev P, Vinas AC, et al., 2015, Underwater Robot-Object Contact Perception using Machine Learning on Force/Torque Sensor Feedback, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE COMPUTER SOC, Pages: 3915-3920, ISSN: 1050-4729
Carrera A, Palomeras N, Hurtos N, et al., 2015, Learning multiple strategies to perform a valve turning with underwater currents using an I-AUV, Oceans 2015 Genova, Publisher: IEEE
Jamisola RS, Kormushev P, Caldwell DG, et al., 2015, Modular Relative Jacobian for Dual-Arms and the Wrench Transformation Matrix, Proceedings of the 2015 7th IEEE International Conference on Cybernetics and Intelligent Systems (CIS) And Robotics, Automation and Mechatronics (RAM), Publisher: IEEE
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.