Search results

  • CONFERENCE PAPER
    Saputra RP, Kormushev P, 2018,

    Casualty Detection from 3D Point Cloud Data for Autonomous Ground Mobile Rescue Robots

    © 2018 IEEE. One of the most important features of mobile rescue robots is the ability to autonomously detect casualties, i.e. human bodies, which are usually lying on the ground. This paper proposes a novel method for autonomously detecting casualties lying on the ground using obtained 3D point-cloud data from an on-board sensor, such as an RGB-D camera or a 3D LIDAR, on a mobile rescue robot. In this method, the obtained 3D point-cloud data is projected onto the detected ground plane, i.e. floor, within the point cloud. Then, this projected point cloud is converted into a grid-map that is used afterwards as an input for the algorithm to detect human body shapes. The proposed method is evaluated by performing detections of a human dummy, placed in different random positions and orientations, using an on-board RGB-D camera on a mobile rescue robot called ResQbot. To evaluate the robustness of the casualty detection method to different camera angles, the orientation of the camera is set to different angles. The experimental results show that using the point-cloud data from the on-board RGB-D camera, the proposed method successfully detects the casualty in all tested body positions and orientations relative to the on-board camera, as well as in all tested camera angles.

  • CONFERENCE PAPER
    Dutordoir V, Salimbeni H, Deisenroth M, Hensman J et al., 2018,

    Gaussian Process Conditional Density Estimation

    Conditional Density Estimation (CDE) models deal with estimating conditional distributions. The conditions imposed on the distribution are the inputs of the model. CDE is a challenging task as there is a fundamental trade-off between model complexity, representational capacity and overfitting. In this work, we propose to extend the model's input with latent variables and use Gaussian processes (GP) to map this augmented input onto samples from the conditional distribution. Our Bayesian approach allows for the modeling of small datasets, but we also provide the machinery for it to be applied to big data using stochastic variational inference. Our approach can be used to model densities even in sparse data regions, and allows for sharing learned structure between conditions. We illustrate the effectiveness and wide-reaching applicability of our model on a variety of real-world problems, such as spatio-temporal density estimation of taxi drop-offs, non-Gaussian noise modeling, and few-shot learning on omniglot images.

  • CONFERENCE PAPER
    Wang K, Shah A, Kormushev P, 2018,

    SLIDER: A Bipedal Robot with Knee-less Legs and Vertical Hip Sliding Motion

  • JOURNAL ARTICLE
    Creswell A, Bharath AA, 2018,

    Denoising Adversarial Autoencoders.

    , IEEE Trans Neural Netw Learn Syst

    Unsupervised learning is of growing interest because it unlocks the potential held in vast amounts of unlabeled data to learn useful representations for inference. Autoencoders, a form of generative model, may be trained by learning to reconstruct unlabeled input data from a latent representation space. More robust representations may be produced by an autoencoder if it learns to recover clean input samples from corrupted ones. Representations may be further improved by introducing regularization during training to shape the distribution of the encoded data in the latent space. We suggest denoising adversarial autoencoders (AAEs), which combine denoising and regularization, shaping the distribution of latent space using adversarial training. We introduce a novel analysis that shows how denoising may be incorporated into the training and sampling of AAEs. Experiments are performed to assess the contributions that denoising makes to the learning of representations for classification and sample synthesis. Our results suggest that autoencoders trained using a denoising criterion achieve higher classification performance and can synthesize samples that are more consistent with the input data than those trained without a corruption process.

  • CONFERENCE PAPER
    Sæmundsson S, Hofmann K, Deisenroth MP, 2018,

    Meta reinforcement learning with latent variable Gaussian processes

    , Uncertainty in Artificial Intelligence (UAI) 2018, Publisher: Association for Uncertainty in Artificial Intelligence (AUAI)

    Learning from small data sets is critical in many practical applications where data collection is time consuming or expensive, e.g., robotics, animal experiments or drug design. Meta learning is one way to increase the data efficiency of learning algorithms by generalizing learned concepts from a set of training tasks to unseen, but related, tasks. Often, this relationship between tasks is hard coded or relies in some other way on human expertise. In this paper, we frame meta learning as a hierarchical latent variable model and infer the relationship between tasks automatically from data. We apply our framework in a model-based reinforcement learning setting and show that our meta-learning model effectively generalizes to novel tasks by identifying how new tasks relate to prior ones from minimal data. This results in up to a 60% reduction in the average interaction time needed to solve tasks compared to strong baselines.

  • CONFERENCE PAPER
    Pardo F, Tavakoli A, Levdik V, Kormushev P et al., 2018,

    Time limits in reinforcement learning

    , International Conference on Machine Learning, Pages: 4042-4051

    In reinforcement learning, it is common to let an agent interact for a fixed amount of time with its environment before resetting it and repeating the process in a series of episodes. The task that the agent has to learn can either be to maximize its performance over (i) that fixed period, or (ii) an indefinite period where time limits are only used during training to diversify experience. In this paper, we provide a formal account for how time limits could effectively be handled in each of the two cases and explain why not doing so can cause state-aliasing and invalidation of experience replay, leading to suboptimal policies and training instability. In case (i), we argue that the terminations due to time limits are in fact part of the environment, and thus a notion of the remaining time should be included as part of the agent’s input to avoid violation of the Markov property. In case (ii), the time limits are not part of the environment and are only used to facilitate learning. We argue that this insight should be incorporated by bootstrapping from the value of the state at the end of each partial episode. For both cases, we illustrate empirically the significance of our considerations in improving the performance and stability of existing reinforcement learning algorithms, showing state-of-the-art results on several control tasks.
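
The two cases in this abstract can be illustrated with a minimal sketch. The helper names `time_aware_observation` and `td_target`, the normalisation of the remaining time, and the one-step TD form are assumptions made for illustration; they are not taken from the paper's code.

```python
def time_aware_observation(observation, step, time_limit):
    """Case (i): append the normalised remaining time to the agent's input.

    The time limit is treated as part of the environment, so the agent
    needs to observe how much time is left.
    """
    remaining = (time_limit - step) / time_limit
    return list(observation) + [remaining]


def td_target(reward, next_value, terminated, timed_out, gamma=0.99):
    """Case (ii): one-step TD target with bootstrapping at time limits.

    `terminated` is True when the episode ended for any reason;
    `timed_out` is True when that reason was the time limit. A time-limit
    cut-off is not a real terminal state, so the target still bootstraps
    from the value estimate of the next state.
    """
    if terminated and not timed_out:
        return reward  # genuine terminal state: no future return
    return reward + gamma * next_value


# Example usage with made-up numbers.
obs = time_aware_observation([0.5], step=25, time_limit=100)
real_end = td_target(1.0, 10.0, terminated=True, timed_out=False, gamma=0.9)
cut_off = td_target(1.0, 10.0, terminated=True, timed_out=True, gamma=0.9)
```

Keeping the two flags separate is the design point: a replay buffer that stores only a single "done" flag cannot distinguish a real terminal state from a time-limit cut-off.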

  • CONFERENCE PAPER
    Ceran ET, Gunduz D, Gyorgy A, 2018,

    Average age of information with hybrid ARQ under a resource constraint

    , Wireless Communications and Networking Conference (WCNC), Publisher: IEEE, ISSN: 1525-3511

    Scheduling the transmission of status updates over an error-prone communication channel is studied in order to minimize the long-term average age of information (AoI) at the destination under a constraint on the average number of transmissions at the source node. After each transmission, the source receives an instantaneous ACK/NACK feedback, and decides on the next update without prior knowledge on the success of future transmissions. First, the optimal scheduling policy is studied under different feedback mechanisms when the channel statistics are known; in particular, the standard automatic repeat request (ARQ) and hybrid ARQ (HARQ) protocols are considered. Then, for an unknown environment, an average-cost reinforcement learning (RL) algorithm is proposed that learns the system parameters and the transmission policy in real time. The effectiveness of the proposed methods is verified through numerical simulations.

  • CONFERENCE PAPER
    Kamthe S, Deisenroth MP, 2018,

    Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control.

    , Artificial Intelligence and Statistics, Publisher: PMLR, Pages: 1701-1710

  • CONFERENCE PAPER
    Saputra RP, Kormushev P, 2018,

    ResQbot: A Mobile Rescue Robot for Casualty Extraction

    , Pages: 239-240

    © 2018 Authors. Performing search and rescue missions in disaster-struck environments is challenging. Despite the advances in the robotic search phase of the rescue missions, few works have been focused on the physical casualty extraction phase. In this work, we propose a mobile rescue robot that is capable of performing a safe casualty extraction routine. To perform this routine, this robot adopts a loco-manipulation approach. We have designed and built a mobile rescue robot platform called ResQbot as a proof of concept of the proposed system. We have conducted preliminary experiments using a sensorised human-sized dummy as a victim, to confirm that the platform is capable of performing a safe casualty extraction procedure.

  • JOURNAL ARTICLE
    Kormushev P, Ugurlu B, Caldwell DG, Tsagarakis NG et al., 2018,

    Learning to exploit passive compliance for energy-efficient gait generation on a compliant humanoid

    , Autonomous Robots, Pages: 1-17, ISSN: 0929-5593

    © 2018 Springer Science+Business Media, LLC, part of Springer Nature. Modern humanoid robots include not only active compliance but also passive compliance. Apart from improved safety and dependability, availability of passive elements, such as springs, opens up new possibilities for improving the energy efficiency. With this in mind, this paper addresses the challenging open problem of exploiting the passive compliance for the purpose of energy efficient humanoid walking. To this end, we develop a method comprising two parts: an optimization part that finds an optimal vertical center-of-mass trajectory, and a walking pattern generator part that uses this trajectory to produce a dynamically-balanced gait. For the optimization part, we propose a reinforcement learning approach that dynamically evolves the policy parametrization during the learning process. By gradually increasing the representational power of the policy parametrization, it manages to find better policies in a faster and computationally efficient way. For the walking generator part, we develop a variable-center-of-mass-height ZMP-based bipedal walking pattern generator. The method is tested in real-world experiments with the bipedal robot COMAN and achieves a significant 18% reduction in the electric energy consumption by learning to efficiently use the passive compliance of the robot.

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
