Below is a list of all relevant publications authored by Robotics Forum members.

  • Journal article
    Grillotti L, Cully A, 2022,

    Unsupervised behaviour discovery with quality-diversity optimisation

    , IEEE Transactions on Evolutionary Computation, Vol: 26, Pages: 1539-1552, ISSN: 1089-778X

    Quality-Diversity algorithms refer to a class of evolutionary algorithms designed to find a collection of diverse and high-performing solutions to a given problem. In robotics, such algorithms can be used for generating a collection of controllers covering most of the possible behaviours of a robot. To do so, these algorithms associate a behavioural descriptor to each of these behaviours. Each behavioural descriptor is used for estimating the novelty of one behaviour compared to the others. In most existing algorithms, the behavioural descriptor needs to be hand-coded, thus requiring prior knowledge about the task to solve. In this paper, we introduce: Autonomous Robots Realising their Abilities, an algorithm that uses a dimensionality reduction technique to automatically learn behavioural descriptors based on raw sensory data. The performance of this algorithm is assessed on three robotic tasks in simulation. The experimental results show that it performs similarly to traditional hand-coded approaches without the requirement to provide any hand-coded behavioural descriptor. In the collection of diverse and high-performing solutions, it also manages to find behaviours that are novel with respect to more features than its hand-coded baselines. Finally, we introduce a variant of the algorithm which is robust to the dimensionality of the behavioural descriptor space.
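
    As a rough illustration of how a behavioural descriptor can be used to estimate novelty, the sketch below scores a behaviour by its mean Euclidean distance to its k nearest neighbours among previously seen descriptors. This is the common novelty-search formulation and not necessarily the measure used in the paper; the function name `novelty`, the value of `k`, and the 2-D descriptors are assumptions made for the example.

```python
import math


def novelty(descriptor, others, k=3):
    """Mean Euclidean distance from `descriptor` to its k nearest neighbours."""
    if not others:
        return float("inf")  # nothing to compare against: treat as maximally novel
    distances = sorted(math.dist(descriptor, other) for other in others)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Hypothetical 2-D descriptors, e.g. final (x, y) positions reached by a robot.
seen = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)]
print(novelty((0.05, 0.05), seen))  # inside a cluster of known behaviours
print(novelty((3.0, 3.0), seen))    # far from every known behaviour
```

    A hand-coded descriptor fixes what "distance" means in advance; a learned descriptor, as described in the abstract, would replace the tuples above with a low-dimensional encoding of raw sensory data.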

  • Conference paper
    Lim BWT, Grillotti L, Bernasconi L, Cully A et al., 2022,

    Dynamics-aware quality-diversity for efficient learning of skill repertoires

    , IEEE International Conference on Robotics and Automation, Publisher: IEEE, Pages: 5360-5366

    Quality-Diversity (QD) algorithms are powerful exploration algorithms that allow robots to discover large repertoires of diverse and high-performing skills. However, QD algorithms are sample inefficient and require millions of evaluations. In this paper, we propose Dynamics-Aware Quality-Diversity (DA-QD), a framework to improve the sample efficiency of QD algorithms through the use of dynamics models. We also show how DA-QD can then be used for continual acquisition of new skill repertoires. To do so, we incrementally train a deep dynamics model from experience obtained when performing skill discovery using QD. We can then perform QD exploration in imagination with an imagined skill repertoire. We evaluate our approach on three robotic experiments. First, our experiments show DA-QD is 20 times more sample efficient than existing QD approaches for skill discovery. Second, we demonstrate learning an entirely new skill repertoire in imagination to perform zero-shot learning. Finally, we show how DA-QD is useful and effective for solving a long horizon navigation task and for damage adaptation in the real world. Videos and source code are available at: https://sites.google.com/view/da-qd.

  • Conference paper
    Lim BWT, Reichenbach A, Cully A, 2022,

    Learning to walk autonomously via reset-free quality-diversity

    , The Genetic and Evolutionary Computation Conference (GECCO)

    Quality-Diversity (QD) algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills. However, the generation of behavioural repertoires has mainly been limited to simulation environments instead of real-world learning. This is because existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions. This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments. We build on Dynamics-Aware Quality-Diversity (DA-QD) and introduce a behaviour selection policy that leverages the diversity of the imagined repertoire and environmental information to intelligently select behaviours that can act as automatic resets. We demonstrate this through a task of learning to walk within defined training zones with obstacles. Our experiments show that we can learn full repertoires of legged locomotion controllers autonomously without manual resets with high sample efficiency in spite of harsh safety constraints. Finally, using an ablation of different target objectives, we show that it is important for RF-QD to have diverse types of solutions available for the behaviour selection policy over solutions optimised with a specific objective. Videos and code available at this https URL.

  • Conference paper
    Allard M, Smith Bize S, Chatzilygeroudis K, Cully A et al., 2022,

    Hierarchical Quality-Diversity For Online Damage Recovery

    , The Genetic and Evolutionary Computation Conference, Publisher: ACM

    Adaptation capabilities, like damage recovery, are crucial for the deployment of robots in complex environments. Several works have demonstrated that using repertoires of pre-trained skills can enable robots to adapt to unforeseen mechanical damages in a few minutes. These adaptation capabilities are directly linked to the behavioural diversity in the repertoire. The more alternatives the robot has to execute a skill, the better are the chances that it can adapt to a new situation. However, solving complex tasks, like maze navigation, usually requires multiple different skills. Finding a large behavioural diversity for these multiple skills often leads to an intractable exponential growth of the number of required solutions. In this paper, we introduce the Hierarchical Trial and Error algorithm, which uses a hierarchical behavioural repertoire to learn diverse skills and leverages them to make the robot more adaptive to different situations. We show that the hierarchical decomposition of skills enables the robot to learn more complex behaviours while keeping the learning of the repertoire tractable. The experiments with a hexapod robot show that our method solves maze navigation tasks with 20% fewer actions in the most challenging scenarios than the best baseline while having 57% fewer complete failures.

  • Conference paper
    Grillotti L, Cully A, 2022,

    Relevance-guided unsupervised discovery of abilities with quality-diversity algorithms

    , Genetic and Evolutionary Computation Conference (GECCO), Publisher: ACM, Pages: 77-85

    Quality-Diversity algorithms provide efficient mechanisms to generate large collections of diverse and high-performing solutions, which have been shown to be instrumental for solving downstream tasks. However, most of those algorithms rely on a behavioural descriptor to characterise the diversity that is hand-coded, hence requiring prior knowledge about the considered tasks. In this work, we introduce Relevance-guided Unsupervised Discovery of Abilities, a Quality-Diversity algorithm that autonomously finds a behavioural characterisation tailored to the task at hand. In particular, our method introduces a custom diversity metric that leads to higher densities of solutions near the areas of interest in the learnt behavioural descriptor space. We evaluate our approach on a simulated robotic environment, where the robot has to autonomously discover its abilities based on its full sensory data. We evaluated the algorithms on three tasks: navigation to random targets, moving forward with a high velocity, and performing half-rolls. The experimental results show that our method manages to discover collections of solutions that are not only diverse, but also well-adapted to the considered downstream task.

  • Journal article
    Zhang F, Demiris Y, 2022,

    Learning garment manipulation policies toward robot-assisted dressing

    , Science Robotics, Vol: 7, Pages: eabm6010-eabm6010, ISSN: 2470-9476

    Assistive robots have the potential to support people with disabilities in a variety of activities of daily living, such as dressing. People who have completely lost their upper limb movement functionality may benefit from robot-assisted dressing, which involves complex deformable garment manipulation. Here, we report a dressing pipeline intended for these people and experimentally validate it on a medical training manikin. The pipeline is composed of the robot grasping a hospital gown hung on a rail, fully unfolding the gown, navigating around a bed, and lifting up the user's arms in sequence to finally dress the user. To automate this pipeline, we address two fundamental challenges: first, learning manipulation policies to bring the garment from an uncertain state into a configuration that facilitates robust dressing; second, transferring the deformable object manipulation policies learned in simulation to the real world to leverage cost-effective data generation. We tackle the first challenge by proposing an active pre-grasp manipulation approach that learns to isolate the garment grasping area before grasping. The approach combines prehensile and nonprehensile actions and thus alleviates grasping-only behavioral uncertainties. For the second challenge, we bridge the sim-to-real gap of deformable object policy transfer by approximating the simulator to real-world garment physics. A contrastive neural network is introduced to compare pairs of real and simulated garment observations, measure their physical similarity, and account for simulator parameter inaccuracies. The proposed method enables a dual-arm robot to put back-opening hospital gowns onto a medical manikin with a success rate of more than 90%.

  • Journal article
    Cursi F, Bai W, Yeatman EM, Kormushev P et al., 2022,

    GlobDesOpt: a global optimization framework for optimal robot manipulator design

    , IEEE Access, Vol: 10, Pages: 5012-5023, ISSN: 2169-3536

    Robot design is a major component in robotics, as it allows building robots capable of performing properly in given tasks. However, designing a robot with multiple types of parameters and constraints and defining an optimization function analytically for the robot design problem may be intractable or even impossible. Therefore, black-box optimization approaches are generally preferred. In this work we propose GlobDesOpt, a simple-to-use open-source optimization framework for robot design based on global optimization methods. The framework allows selecting various design parameters and optimizing for both single and dual-arm robots. The functionalities of the framework are shown here to optimally design a dual-arm surgical robot, comparing two different optimization strategies.

  • Journal article
    AlAttar A, Chappell D, Kormushev P, 2022,

    Kinematic-model-free predictive control for robotic manipulator target reaching with obstacle avoidance

    , Frontiers in Robotics and AI, Vol: 9, Pages: 1-9, ISSN: 2296-9144

    Model predictive control is a widely used optimal control method for robot path planning and obstacle avoidance. This control method, however, requires a system model to optimize control over a finite time horizon and possible trajectories. Certain types of robots, such as soft robots, continuum robots, and transforming robots, can be challenging to model, especially in unstructured or unknown environments. Kinematic-model-free control can overcome these challenges by learning local linear models online. This paper presents a novel perception-based robot motion controller, the kinematic-model-free predictive controller, that is capable of controlling robot manipulators without any prior knowledge of the robot’s kinematic structure and dynamic parameters and is able to perform end-effector obstacle avoidance. Simulations and physical experiments were conducted to demonstrate the ability and adaptability of the controller to perform simultaneous target reaching and obstacle avoidance.

  • Journal article
    Wang K, Fei H, Kormushev P, 2022,

    Fast online optimization for terrain-blind bipedal robot walking with a decoupled actuated SLIP model

    , Frontiers in Robotics and AI, Vol: 9, Pages: 1-11, ISSN: 2296-9144

    We present an online optimization algorithm which enables bipedal robots to blindly walk over various kinds of uneven terrains while resisting pushes. The proposed optimization algorithm performs high level motion planning of footstep locations and center-of-mass height variations using the decoupled actuated Spring Loaded Inverted Pendulum (aSLIP) model. The decoupled aSLIP model simplifies the original aSLIP with Linear Inverted Pendulum (LIP) dynamics in horizontal states and spring dynamics in the vertical state. The motion planning can be formulated as a discrete-time Model Predictive Control (MPC) problem and solved at a frequency of 1 kHz. The output of the motion planner is fed into an inverse-dynamics based whole body controller for execution on the robot. A key result of this controller is that the feet of the robot are compliant, which further extends the robot’s ability to be robust to unobserved terrain variations. We evaluate our method in simulation with the bipedal robot SLIDER. Results show the robot can blindly walk over various uneven terrains including slopes, wave fields and stairs. It can also resist pushes of up to 40 N for a duration of 0.1 s while walking on uneven terrain.
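
    To make the horizontal part of such a model concrete, the sketch below propagates the textbook Linear Inverted Pendulum over one discrete time step using its closed-form solution. It is not the authors' planner: the constant centre-of-mass height `z0`, the 1 ms step (chosen only to match the 1 kHz rate quoted above), the initial state, and all function names are assumptions made for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def lip_step(x, v, z0, dt):
    """Advance the LIP state (x, v) by dt, where x_ddot = (G / z0) * x.

    x is the horizontal centre-of-mass position relative to the stance foot,
    v its velocity, and z0 an assumed constant centre-of-mass height.
    """
    omega = math.sqrt(G / z0)
    c, s = math.cosh(omega * dt), math.sinh(omega * dt)
    x_next = c * x + (s / omega) * v
    v_next = omega * s * x + c * v
    return x_next, v_next


def orbital_energy(x, v, z0):
    """Quantity conserved by the LIP dynamics; handy as a sanity check."""
    return 0.5 * v * v - 0.5 * (G / z0) * x * x


# Hypothetical start: centre of mass 10 cm behind the foot, moving forward.
x, v = -0.1, 0.5
for _ in range(100):  # 100 steps of 1 ms
    x, v = lip_step(x, v, z0=0.7, dt=0.001)
print(x, v)
```

    Because each step is linear in the state, a sequence of such steps can be stacked into the equality constraints of a discrete-time MPC problem; the vertical spring dynamics described in the abstract are not modelled here.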

  • Conference paper
    Cursi F, Chappell D, Kormushev P, 2022,

    Augmenting loss functions of feedforward neural networks with differential relationships for robot kinematic modelling

    , Ljubljana, Slovenia, 20th International Conference on Advanced Robotics (ICAR), Publisher: IEEE, Pages: 201-207

    Model learning is a crucial aspect of robotics as it enables the use of traditional and consolidated model-based controllers to perform desired motion tasks. However, due to the increasing complexity of robotic structures, modelling robots is becoming more and more challenging, and analytical models are very difficult to build, particularly for redundant robots. Machine learning approaches have shown great capabilities in learning complex mappings and have widely been used in robot model learning and control. Generally, inverse kinematics is learned, directly obtaining the desired control commands given a desired task. However, learning forward kinematics is simpler and allows the computation of the robot Jacobian and enables the exploitation of the optimality of controllers. Nevertheless, typical learning methods have no knowledge about the differential relationship between the position and velocity mappings. In this work, we present two novel loss functions for training feedforward Artificial Neural Networks (ANNs) that incorporate this information when learning the forward kinematic model of robotic structures, and carry out a comparison with standard ANN training using position data only. Simulation results show that incorporating the knowledge of the velocity mapping improves the suitability of the learnt model for control tasks.

  • Conference paper
    Cursi F, Kormushev P, 2021,

    Pre-operative offline optimization of insertion point location for safe and accurate surgical task execution

    , Prague, Czech Republic, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021), Publisher: IEEE, Pages: 4040-4047

    In robotically assisted surgical procedures the surgical tool is usually inserted in the patient’s body through a small incision, which acts as a constraint for the motion of the robot, known as remote center of motion (RCM). The location of the insertion point on the patient’s body has huge effects on the performances of the surgical robot. In this work we present an offline pre-operative framework to identify the optimal insertion point location in order to guarantee accurate and safe surgical task execution. The approach is validated using a serial-link manipulator in conjunction with a surgical robotic tool to perform a tumor resection task, while avoiding nearby organs. Results show that the framework is capable of identifying the best insertion point ensuring high dexterity, high tracking accuracy, and safety in avoiding nearby organs.

  • Conference paper
    Cursi F, Bai W, Kormushev P, 2021,

    Kalibrot: a simple-to-use Matlab package for robot kinematic calibration

    , Prague, Czech Republic, International Conference on Intelligent Robots and Systems (IROS) 2021, Pages: 8852-8859

    Robot modelling is an essential part to properly understand how a robotic system moves and how to control it. The kinematic model of a robot is usually obtained by using Denavit-Hartenberg convention, which relies on a set of parameters to describe the end-effector pose in a Cartesian space. These parameters are assigned based on geometrical considerations of the robotic structure, however, the assigned values may be inaccurate. The purpose of robot kinematic calibration is therefore to find optimal parameters which improve the accuracy of the robot model. In this work we present Kalibrot, an open source Matlab package for robot kinematic calibration. Kalibrot has been designed to simplify robot calibration and easily assess the calibration results. Besides computing the optimal parameters, Kalibrot provides a visualization layer showing the values of the calibrated parameters, what parameters can be identified, and the calibrated robotic structure. The capabilities of the package are here shown through simulated and real world experiments.
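
    For readers unfamiliar with the convention, the sketch below chains standard (distal) Denavit-Hartenberg transforms to obtain an end-effector position, and shows how a small error in one parameter shifts the result, which is the kind of discrepancy calibration aims to remove. It is written in Python for illustration only and does not reflect Kalibrot's Matlab interface; the two-link planar arm, its parameter values, and every function name are assumptions.

```python
import math


def dh_transform(theta, d, a, alpha):
    """4x4 homogeneous transform for one link in the standard DH convention."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def end_effector_position(dh_rows):
    """Chain the link transforms and return the (x, y, z) translation."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, d, a, alpha in dh_rows:
        T = matmul(T, dh_transform(theta, d, a, alpha))
    return (T[0][3], T[1][3], T[2][3])


# Hypothetical two-link planar arm; each row is (theta, d, a, alpha).
nominal = [(math.pi / 2, 0.0, 1.0, 0.0), (-math.pi / 2, 0.0, 1.0, 0.0)]
# Same joint angles, but the first link is actually 2 cm longer than assumed.
actual = [(math.pi / 2, 0.0, 1.02, 0.0), (-math.pi / 2, 0.0, 1.0, 0.0)]

print(end_effector_position(nominal))
print(end_effector_position(actual))
```

    A calibration routine would adjust the parameters in `nominal` until the predicted positions match measured ones over many joint configurations.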

  • Conference paper
    La Barbera V, Pardo F, Tassa Y, Daley M, Richards C, Kormushev P, Hutchinson J et al., 2021,

    OstrichRL: a musculoskeletal ostrich simulation to study bio-mechanical locomotion

    , NeurIPS 2021

    Muscle-actuated control is a research topic of interest spanning different fields, in particular biomechanics, robotics and graphics. This type of control is particularly challenging because models are often overactuated, and dynamics are delayed and non-linear. It is however a very well tested and tuned actuation model that has undergone millions of years of evolution and that involves interesting properties exploiting passive forces of muscle-tendon units and efficient energy storage and release. To facilitate research on muscle-actuated simulation, we release a 3D musculoskeletal simulation of an ostrich based on the MuJoCo simulator. Ostriches are one of the fastest bipeds on earth and are therefore an excellent model for studying muscle-actuated bipedal locomotion. The model is based on CT scans and dissections used to gather actual muscle data such as insertion sites, lengths and pennation angles. Along with this model, we also provide a set of reinforcement learning tasks, including reference motion tracking and a reaching task with the neck. The reference motion data are based on motion capture clips of various behaviors which we pre-processed and adapted to our model. This paper describes how the model was built and iteratively improved using the tasks. We evaluate the accuracy of the muscle actuation patterns by comparing them to experimentally collected electromyographic data from locomoting birds. We believe that this work can be a useful bridge between the biomechanics, reinforcement learning, graphics and robotics communities, by providing a fast and easy to use simulation.

  • Conference paper
    Wang K, Saputra RP, Foster JP, Kormushev P et al., 2021,

    Improved energy efficiency via parallel elastic elements for the straight-legged vertically-compliant robot SLIDER

    , Japan, 24th International Conference on Climbing and Walking Robots and the Support Technologies for Mobile Machines, Publisher: Springer, Pages: 129-140

    Most state-of-the-art bipedal robots are designed to be anthropomorphic, and therefore possess articulated legs with knees. Whilst this facilitates smoother, human-like locomotion, there are implementation issues that make walking with straight legs difficult. Many robots have to move with a constant bend in the legs to avoid a singularity occurring at the knee joints. The actuators must constantly work to maintain this stance, which can result in the negation of energy-saving techniques employed. Furthermore, vertical compliance disappears when the leg is straight and the robot undergoes high-energy loss events such as impacts from running and jumping, as the impact force travels through the fully extended joints to the hips. In this paper, we attempt to improve energy efficiency in a simple yet effective way: attaching bungee cords as elastic elements in parallel to the legs of a novel, knee-less biped robot SLIDER, and show that the robot’s prismatic hip joints preserve vertical compliance despite the legs being constantly straight. Due to the nonlinear dynamics of the bungee cords and various sources of friction, Bayesian Optimization is utilized to find the optimal configuration of bungee cords that achieves the largest reduction in energy consumption. The optimal solution found saves 15% of the energy consumption compared to the robot configuration without parallel elastic elements. Additional Video: https://youtu.be/ZTaG9−Dz8A

  • Conference paper
    Cully A, 2021,

    Multi-Emitter MAP-Elites: Improving quality, diversity and convergence speed with heterogeneous sets of emitters

    , Genetic and Evolutionary Computation Conference (GECCO), Publisher: ACM, Pages: 84-92

    Quality-Diversity (QD) optimisation is a new family of learning algorithms that aims at generating collections of diverse and high-performing solutions. Among those algorithms, MAP-Elites is a simple yet powerful approach that has shown promising results in numerous applications. In this paper, we introduce a novel algorithm named Multi-Emitter MAP-Elites (ME-MAP-Elites) that improves the quality, diversity and convergence speed of MAP-Elites. It is based on the recently introduced concept of emitters, which are used to drive the algorithm's exploration according to predefined heuristics. ME-MAP-Elites leverages the diversity of a heterogeneous set of emitters, in which each emitter type is designed to improve differently the optimisation process. Moreover, a bandit algorithm is used to dynamically find the best emitter set depending on the current situation. We evaluate the performance of ME-MAP-Elites on six tasks, ranging from standard optimisation problems (in 100 dimensions) to complex locomotion tasks in robotics. Our comparisons against MAP-Elites and existing approaches using emitters show that ME-MAP-Elites is faster at providing collections of solutions that are significantly more diverse and higher performing. Moreover, in the rare cases where no fruitful synergy can be found between the different emitters, ME-MAP-Elites is equivalent to the best of the compared algorithms.
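
    For context, the baseline that this entry builds on can be sketched in a few lines. The code below is plain MAP-Elites with a grid archive and Gaussian mutation on a toy problem; it does not implement emitters or the bandit-based emitter selection of ME-MAP-Elites. The toy fitness and descriptor functions, the number of bins, the mutation scale and the iteration budget are all assumptions chosen for the example.

```python
import random


def map_elites(fitness, descriptor, dim, bins=10, iterations=2000,
               sigma=0.1, seed=0):
    """Plain MAP-Elites on [0, 1]^dim; returns {cell: (fitness, solution)}."""
    rng = random.Random(seed)
    archive = {}

    def cell_of(solution):
        # Discretise each descriptor coordinate (assumed to lie in [0, 1]).
        return tuple(min(int(b * bins), bins - 1) for b in descriptor(solution))

    for i in range(iterations):
        if i < 100 or not archive:
            # Bootstrap the archive with uniformly random solutions.
            candidate = [rng.random() for _ in range(dim)]
        else:
            # Pick a random elite and mutate it, clamping to the search bounds.
            _, parent = rng.choice(list(archive.values()))
            candidate = [min(1.0, max(0.0, x + rng.gauss(0.0, sigma)))
                         for x in parent]
        f = fitness(candidate)
        cell = cell_of(candidate)
        # Elitism: keep the candidate only if its cell is empty or it is better.
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, candidate)
    return archive


def toy_fitness(x):
    """Higher is better; the optimum is at the centre of the search space."""
    return -sum((xi - 0.5) ** 2 for xi in x)


def toy_descriptor(x):
    """Use the first two parameters as a 2-D behavioural descriptor."""
    return (x[0], x[1])


archive = map_elites(toy_fitness, toy_descriptor, dim=4)
print(len(archive), "cells filled out of", 10 * 10)
print("best fitness:", max(f for f, _ in archive.values()))
```

    In emitter-based variants, the uniform "pick a random elite and mutate" step is replaced by several candidate-generation strategies, and the abstract describes a bandit choosing among them.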

  • Conference paper
    Rakicevic N, Cully A, Kormushev P, 2021,

    Policy manifold search: exploring the manifold hypothesis for diversity-based neuroevolution

    , Genetic and Evolutionary Computation Conference (GECCO '21), Pages: 901-909

    Neuroevolution is an alternative to gradient-based optimisation that has the potential to avoid local minima and allows parallelisation. The main limiting factor is that usually it does not scale well with parameter space dimensionality. Inspired by recent work examining neural network intrinsic dimension and loss landscapes, we hypothesise that there exists a low-dimensional manifold, embedded in the policy network parameter space, around which a high-density of diverse and useful policies are located. This paper proposes a novel method for diversity-based policy search via Neuroevolution, that leverages learned representations of the policy network parameters, by performing policy search in this learned representation space. Our method relies on the Quality-Diversity (QD) framework which provides a principled approach to policy search, and maintains a collection of diverse policies, used as a dataset for learning policy representations. Further, we use the Jacobian of the inverse-mapping function to guide the search in the representation space. This ensures that the generated samples remain in the high-density regions, after mapping back to the original space. Finally, we evaluate our contributions on four continuous-control tasks in simulated environments, and compare to diversity-based baselines.

  • Journal article
    Saputra RP, Rakicevic N, Kuder I, Bilsdorfer J, Gough A, Dakin A, Cocker ED, Rock S, Harpin R, Kormushev P et al., 2021,

    ResQbot 2.0: an improved design of a mobile rescue robot with an inflatable neck securing device for safe casualty extraction

    , Applied Sciences, Vol: 11, Pages: 1-18, ISSN: 2076-3417

    Despite the fact that a large number of research studies have been conducted in the field of search and rescue robotics, significantly little attention has been given to the development of rescue robots capable of performing physical rescue interventions, including loading and transporting victims to a safe zone, i.e. casualty extraction tasks. The aim of this study is to develop a mobile rescue robot that could assist first responders when saving casualties from a danger area by performing a casualty extraction procedure, whilst ensuring that no additional injury is caused by the operation and no additional lives are put at risk. In this paper, we present a novel design of ResQbot 2.0, a mobile rescue robot designed for performing the casualty extraction task. This robot is a stretcher-type casualty extraction robot, which is a significantly improved version of the initial proof-of-concept prototype, ResQbot (retrospectively referred to as ResQbot 1.0), that has been developed in our previous work. The proposed designs and development of the mechanical system of ResQbot 2.0, as well as the method for safely loading a full body casualty onto the robot’s ‘stretcher bed’, are described in detail based on the conducted literature review, evaluation of our previous work and feedback provided by medical professionals. To verify the proposed design and the casualty extraction procedure, we perform simulation experiments in Gazebo physics engine simulator. The simulation results demonstrate the capability of ResQbot 2.0 to successfully carry out safe casualty extractions.

  • Conference paper
    Frazelle C, Walker I, AlAttar A, Kormushev P et al., 2021,

    Kinematic-model-free control for space operations with continuum manipulators

    , USA, IEEE Conference on Aerospace, Publisher: IEEE, Pages: 1-11, ISSN: 1095-323X

    Continuum robots have strong potential for application in Space environments. However, their modeling is challenging in comparison with traditional rigid-link robots. The Kinematic-Model-Free (KMF) robot control method has been shown to be extremely effective in permitting a rigid-link robot to learn approximations of local kinematics and dynamics (“kinodynamics”) at various points in the robot's task space. These approximations enable the robot to follow various trajectories and even adapt to changes in the robot's kinematic structure. In this paper, we present the adaptation of the KMF method to a three-section, nine degrees-of-freedom continuum manipulator for both planar and spatial task spaces. Using only an external 3D camera, we show that the KMF method allows the continuum robot to converge to various desired set points in the robot's task space, avoiding the complexities inherent in solving this problem using traditional inverse kinematics. The success of the method shows that a continuum robot can “learn” enough information from an external camera to reach and track desired points and trajectories, without needing knowledge of exact shape or position of the robot. We similarly apply the method in a simulated example of a continuum robot performing an inspection task on board the ISS.

  • Journal article
    AlAttar A, Cursi F, Kormushev P, 2021,

    Kinematic-model-free redundancy resolution using multi-point tracking and control for robot manipulation

    , Applied Sciences, Vol: 11, Pages: 1-15, ISSN: 2076-3417

    Robots have been predominantly controlled using conventional control methods that require prior knowledge of the robots’ kinematic and dynamic models. These controllers can be challenging to tune and cannot directly adapt to changes in kinematic structure or dynamic properties. On the other hand, model-learning controllers can overcome such challenges. Our recently proposed model-learning orientation controller has shown promising ability to simultaneously control a three-degrees-of-freedom robot manipulator’s end-effector pose. However, this controller does not perform optimally with robots of higher degrees-of-freedom nor does it resolve redundancies. The research presented in this paper extends the state-of-the-art kinematic-model-free controller to perform pose control of hyper-redundant robot manipulators and resolve redundancies by tracking and controlling multiple points along the robot’s serial chain. The results show that with more control points, the controller is able to reach desired poses in fewer steps, yielding an improvement of up to 66%, and capable of achieving complex configurations. The algorithm was validated by running the simulation 100 times and it was found that 82% of the times the robot successfully reached the desired target pose within 150 steps.

  • Conference paper
    Tavakoli A, Fatemi M, Kormushev P, 2021,

    Learning to represent action values as a hypergraph on the action vertices

    , Vienna, Austria, International Conference on Learning Representations

    Action-value estimation is a critical component of many reinforcement learning (RL) methods whereby sample complexity relies heavily on how fast a good estimator for action value can be learned. By viewing this problem through the lens of representation learning, good representations of both state and action can facilitate action-value estimation. While advances in deep learning have seamlessly driven progress in learning state representations, given the specificity of the notion of agency to RL, little attention has been paid to learning action representations. We conjecture that leveraging the combinatorial structure of multi-dimensional action spaces is a key ingredient for learning good representations of action. To test this, we set forth the action hypergraph networks framework, a class of functions for learning action representations in multi-dimensional discrete action spaces with a structural inductive bias. Using this framework we realise an agent class based on a combination with deep Q-networks, which we dub hypergraph Q-networks. We show the effectiveness of our approach on a myriad of domains: illustrative prediction problems under minimal confounding effects, Atari 2600 games, and discretised physical control benchmarks.

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
