Imperial College London

Dr A. Aldo Faisal

Faculty of Engineering, Department of Bioengineering

Professor of AI & Neuroscience
 
 
 

Contact

 

+44 (0)20 7594 6373, a.faisal

 
 

Assistant

 

Miss Teresa Ng, +44 (0)20 7594 8300

 

Location

 

4.08, Royal School of Mines, South Kensington Campus



 

Publications


179 results found

Auepanwiriyakul C, Waibel S, Songa J, Bentley P, Faisal A et al., 2020, Accuracy and Acceptability of Wearable Motion Tracking for Inpatient Monitoring using Smartwatches, Sensors, ISSN: 1424-8220

Journal article

Haar Millo S, van Assel C, Faisal A, 2020, Motor learning in real-world pool billiards, Scientific Reports, ISSN: 2045-2322

Journal article

Ortega San Miguel P, Zhao T, Faisal AA, 2020, HYGRIP: Full-stack characterisation of neurobehavioural signals (fNIRS, EEG, EMG, force and breathing) during a bimanual grip force control task, Frontiers in Neuroscience, ISSN: 1662-453X

Brain-computer interfaces (BCIs) have achieved important milestones in recent years, but most breakthroughs in the continuous control of movement have focused on invasive neural interfaces with motor cortex or peripheral nerves. In contrast, non-invasive BCIs have primarily made progress in continuous decoding using event-related data, while the direct decoding of movement command or muscle force from brain data is an open challenge. Multi-modal signals from human cortex, obtained from mobile brain imaging that combines oxygenation and electrical neuronal signals, do not yet exploit their full potential due to the lack of computational techniques able to fuse and decode these hybrid measurements. To stimulate the research community and bring machine learning techniques closer to the state of the art in artificial intelligence, we release herewith a holistic data set of hybrid non-invasive measures for continuous force decoding: the Hybrid Dynamic Grip (HYGRIP) data set. We aim to provide a complete data set that comprises the target force for the left/right hand, cortical brain signals in the form of electroencephalography (EEG) with high temporal resolution and functional near-infrared spectroscopy (fNIRS), which captures a BOLD-like cortical brain response at higher spatial resolution, as well as the muscle activity (EMG) of the grip muscles, the force generated at the grip sensor (force), and confounding noise sources, such as breathing and eye movement activity during the task. In total, 14 right-handed subjects performed a uni-manual dynamic grip force task within 25-50% of each hand's maximum voluntary contraction. HYGRIP is intended as a benchmark with two open challenges and research questions for grip-force decoding. First, the exploitation and fusion of data from brain signals spanning very different time-scales, as EEG changes about three orders of magnitude faster than fNIRS. Second, the decoding of whole-brain signals associated with the use of

Journal article

Haar Millo S, Faisal A, 2020, Brain activity reveals multiple motor-learning mechanisms in a real-world task, Frontiers in Human Neuroscience, Vol: 14, ISSN: 1662-5161

Many recent studies found signatures of motor learning in neural beta oscillations (13–30 Hz), and specifically in the post-movement beta rebound (PMBR). All these studies were in controlled laboratory tasks in which the task was designed to induce the studied learning mechanism. Interestingly, these studies reported opposing dynamics of the PMBR magnitude over learning for the error-based and reward-based tasks (increase versus decrease, respectively). Here we explored the PMBR dynamics during real-world motor-skill learning in a billiards task using mobile brain imaging. Our EEG recordings highlight the opposing dynamics of PMBR magnitudes (increase versus decrease) between different subjects performing the same task. The groups of subjects, defined by their neural dynamics, also showed behavioural differences expected for different learning mechanisms. Our results suggest that, when faced with the complexity of the real world, different subjects might use different learning mechanisms for the same complex task. We speculate that all subjects combine multi-modal mechanisms of learning, but different subjects have different predominant learning mechanisms.

Journal article

Rito Lima I, Haar Millo S, Di Grassi L, Faisal A et al., 2020, Neurobehavioural signatures in race car driving: a case study, Scientific Reports, Vol: 10, Pages: 1-9, ISSN: 2045-2322

Recent technological developments in mobile brain and body imaging are enabling new frontiers of real-world neuroscience. Simultaneous recordings of body movement and brain activity from highly skilled individuals as they demonstrate their exceptional skills in real-world settings can shed new light on the neurobehavioural structure of human expertise. Driving is a real-world skill which many of us acquire to different levels of expertise. Here we ran a case study on a subject with the highest level of driving expertise, a Formula E Champion. We studied the driver’s neural and motor patterns while he drove a sports car on the “Top Gear” race track under extreme conditions (high speed, low visibility, low temperature, wet track). His brain activity, eye movements and hand/foot movements were recorded. Brain activity in the delta, alpha, and beta frequency bands showed a causal relation to hand movements. We herein demonstrate the feasibility of using mobile brain and body imaging even in very extreme conditions (race car driving) to study the sensory inputs, motor outputs, and brain states which characterise complex human skills.

Journal article

Shafti SA, Tjomsland J, Dudley W, Faisal A et al., 2020, Real-world human-robot collaborative reinforcement learning, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, ISSN: 2153-0866

The intuitive collaboration of humans and intelligent robots (embodied AI) in the real world is an essential objective for many desirable applications of robotics. Whilst there is much research regarding explicit communication, we focus on how humans and robots interact implicitly, at the motor adaptation level. We present a real-world setup of a human-robot collaborative maze game, designed to be non-trivial and only solvable through collaboration, by limiting the actions to rotations of two orthogonal axes, and assigning each axis to one player. This results in neither the human nor the agent being able to solve the game on their own. We use deep reinforcement learning for the control of the robotic agent, and achieve results within 30 minutes of real-world play, without any type of pre-training. We then use this setup to perform systematic experiments on human/agent behaviour and adaptation when co-learning a policy for the collaborative game. We present results on how co-policy learning occurs over time between the human and the robotic agent, resulting in each participant’s agent serving as a representation of how they would play the game. This allows us to relate a person’s success when playing with different agents than their own, by comparing the policy of the agent with that of their own agent.

Conference paper

Shafti A, Haar S, Zaldivar RM, Guilleminot P, Faisal AA et al., 2020, Learning to play the piano with the Supernumerary Robotic 3rd Thumb, Publisher: Cold Spring Harbor Laboratory

We wanted to study the ability of our brains and bodies to be augmented by supernumerary robot limbs, here extra fingers. We developed a mechanically highly functional supernumerary robotic 3rd thumb actuator, the SR3T, and interfaced it with human users enabling them to play the piano with 11 fingers. We devised a set of measurement protocols and behavioural “biomarkers”, the Human Augmentation Motor Coordination Assessment (HAMCA), which allowed us a priori to predict how well each individual human user could, after training, play the piano with a two-thumbs-hand. To evaluate augmented music playing ability we devised a simple musical score, as well as metrics for assessing the accuracy of playing the score. We evaluated the SR3T (supernumerary robotic 3rd thumb) on 12 human subjects including 6 naïve and 6 experienced piano players. We demonstrated that humans can learn to play the piano with a 6-fingered hand within one hour of training. For each subject we could predict individually, based solely on their HAMCA performance before training, how well they were able to perform with the extra robotic thumb, after training (training end-point performance). Our work demonstrates the feasibility of robotic human augmentation with supernumerary robotic limbs within short time scales. We show how linking the neuroscience of motor learning with dexterous robotics and human-robot interfacing can be used to inform a priori how far individual motor impaired patients or healthy manual workers could benefit from robotic augmentation solutions.

Working paper

Albert-Smet I, McPherson D, Navaie W, Stocker T, Faisal AA et al., 2020, Regulations & exemptions during the COVID-19 pandemic for new medical technology, health services & data

The rapid evolution of the COVID-19 pandemic has sparked a large unmet need for new or additional medical technology and healthcare services to be made available urgently. Healthcare, Academic, Government and Industry organizations and individuals have risen to this challenge by designing, developing, manufacturing or implementing innovation. However, both they and healthcare stakeholders are hampered as it is unclear how to introduce and deploy the products of this innovation quickly and legally within the healthcare system. Our paper outlines the key regulations and processes innovators need to comply with, and how these change during a public health emergency via dedicated exemptions. Our work includes references to the formal documents regarding UK healthcare regulation and governance, and is meant to serve as a guide for those who wish to act quickly but are uncertain of the legal and regulatory pathways that allow a new device or service to be fast-tracked.

Report

Haar S, Sundar G, Faisal A, 2020, Embodied virtual reality for the study of real-world motor learning, Publisher: bioRxiv

Background: The motor learning literature focuses on relatively simple laboratory tasks because they are highly controlled and make it easy to apply different manipulations to induce learning and adaptation. In recent work we introduced a billiards paradigm and demonstrated the feasibility of real-world neuroscience using wearables for naturalistic full-body motion tracking and mobile brain imaging. Here we developed an embodied virtual reality (VR) environment for our real-world billiards paradigm, which allows us to control the visual feedback for this complex real-world task, while maintaining the sense of embodiment. Methods: The setup was validated by comparing real-world ball trajectories with the embodied VR trajectories, calculated by the physics engine. We then ran our real-world learning protocol in the embodied VR. 10 healthy human subjects played repeated trials of the same billiard shot, holding the physical cue and hitting a physical ball on the table while seeing it all in VR. Results: We found comparable learning trends in the embodied VR to those we previously reported in the real-world task. Conclusions: Embodied VR can be used for learning real-world tasks in a highly controlled VR environment which enables applying visual manipulations, common in laboratory tasks and in rehabilitation, to a real-world full-body task. Such a setup can be used for rehabilitation, where the use of VR is gaining popularity but the transfer to the real world is currently limited, presumably due to the lack of embodiment. The embodied VR makes it possible to manipulate feedback and apply perturbations to isolate and assess interactions between specific motor learning component mechanisms, thus enabling us to address the current questions of motor learning in real-world tasks.

Working paper

Haar S, Faisal A, 2020, Neural biomarkers of multiple motor-learning mechanisms in a real-world task, Publisher: bioRxiv

Many recent studies found signatures of motor learning in neural beta oscillations (13–30 Hz), and specifically in the post-movement beta rebound (PMBR). All these studies were in simplified laboratory tasks in which learning was either error-based or reward-based. Interestingly, these studies reported opposing dynamics of the PMBR magnitude over learning for the error-based and reward-based tasks (increase versus decrease, respectively). Here we explored the PMBR dynamics during real-world motor-skill learning in a billiards task using mobile brain imaging. Our EEG recordings highlight opposing dynamics of PMBR magnitudes between different subjects performing the same task. The groups of subjects, defined by their neural dynamics, also showed behavioral differences expected for error-based versus reward-based learning. Our results suggest that, when faced with the complexity of the real world, different subjects might use different learning mechanisms for the same complex task. We speculate that all subjects combine multi-modal mechanisms of learning, but different subjects have different predominant learning mechanisms.

Working paper

Shafti A, Tjomsland J, Dudley W, Faisal AA et al., 2020, Real-world human-robot collaborative reinforcement learning, Publisher: arXiv

The intuitive collaboration of humans and intelligent robots (embodied AI) in the real world is an essential objective for many desirable applications of robotics. Whilst there is much research regarding explicit communication, we focus on how humans and robots interact implicitly, at the motor adaptation level. We present a real-world setup of a human-robot collaborative maze game, designed to be non-trivial and only solvable through collaboration, by limiting the actions to rotations of two orthogonal axes, and assigning each axis to one player. This results in neither the human nor the agent being able to solve the game on their own. We use a state-of-the-art reinforcement learning algorithm for the robotic agent, and achieve results within 30 minutes of real-world play, without any type of pre-training. We then use this system to perform systematic experiments on human/agent behaviour and adaptation when co-learning a policy for the collaborative game. We present results on how co-policy learning occurs over time between the human and the robotic agent, resulting in each participant's agent serving as a representation of how they would play the game. This allows us to relate a person's success when playing with different agents than their own, by comparing the policy of the agent with that of their own agent.

Working paper

Bachtiger P, Plymen CM, Pabari PA, Howard JP, Whinnett ZI, Opoku F, Janering S, Faisal AA, Francis DP, Peters NS et al., 2020, Artificial intelligence, data sensors and interconnectivity: future opportunities for heart failure, Cardiac Failure Review, Vol: 6, Pages: e11-e11, ISSN: 2057-7540

A higher proportion of patients with heart failure have benefitted from a wide and expanding variety of sensor-enabled implantable devices than any other patient group. These patients can now also take advantage of the ever-increasing availability and affordability of consumer electronics. Wearable, on- and near-body sensor technologies, much like implantable devices, generate massive amounts of data. The connectivity of all these devices has created opportunities for pooling data from multiple sensors - so-called interconnectivity - and for artificial intelligence to provide new diagnostic, triage, risk-stratification and disease management insights for the delivery of better, more personalised and cost-effective healthcare. Artificial intelligence is also bringing important and previously inaccessible insights from our conventional cardiac investigations. The aim of this article is to review the convergence of artificial intelligence, sensor technologies and interconnectivity and the way in which this combination is set to change the care of patients with heart failure.

Journal article

Abbott W, Harston J, Faisal A, 2020, Linear Embodied Saliency: a Model of Full-Body Kinematics-based Visual Attention, bioRxiv


Journal article

Deisenroth MP, Faisal AA, Ong CS, 2020, Mathematics for Machine Learning, Publisher: Cambridge University Press, ISBN: 9781108455145

Book

Beyret B, Shafti SA, Faisal A, 2020, Dot-to-dot: explainable hierarchical reinforcement learning for robotic manipulation, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 1-6, ISSN: 2153-0866

Robotic systems are ever more capable of automation and fulfilment of complex tasks, particularly with reliance on recent advances in intelligent systems, deep learning and artificial intelligence in general. However, as robots and humans come closer together in their interactions, the matter of interpretability, or explainability of robot decision-making processes for the human grows in importance. A successful interaction and collaboration would only be possible through mutual understanding of underlying representations of the environment and the task at hand. This is currently a challenge in deep learning systems. We present a hierarchical deep reinforcement learning system, consisting of a low-level agent handling the large actions/states space of a robotic system efficiently, by following the directives of a high-level agent which is learning the high-level dynamics of the environment and task. This high-level agent forms a representation of the world and task at hand that is interpretable for a human operator. The method, which we call Dot-to-Dot, is tested on a MuJoCo-based model of the Fetch Robotics Manipulator, as well as a Shadow Hand, to test its performance. Results show efficient learning of complex actions/states spaces by the low-level agent, and an interpretable representation of the task and decision-making process learned by the high-level agent.

Conference paper

Lima IR, Haar S, Di Grassi L, Faisal A et al., 2019, Neurobehavioural signatures in race car driving

Recent technological developments in mobile brain and body imaging are enabling new frontiers of real-world neuroscience. Simultaneous recordings of body movement and brain activity from highly skilled individuals as they demonstrate their exceptional skills in real-world settings can shed new light on the neurobehavioural structure of human expertise. Driving is a real-world skill which many of us acquire to different levels of expertise. Here we ran a case study on a subject with the highest level of driving expertise, a Formula E Champion. We studied the expert driver’s neural and motor patterns while he drove a sports car on the “Top Gear” race track under extreme conditions (high speed, low visibility, low temperature, wet track). His brain activity, eye movements and hand/foot movements were recorded. Brain activity in the delta, alpha, and beta frequency bands showed a causal relation to hand movements. In summary, we demonstrate, even in extreme situations (race track driving), a method for collecting human ethomic (Ethology + Omics) data that encompasses information on the sensory inputs and motor outputs of the brain, as well as brain state, to characterise complex human skills.

Working paper

Tjomsland J, Shafti A, Faisal AA, 2019, Human-robot collaboration via deep reinforcement learning of real-world interactions, Publisher: arXiv

We present a robotic setup for real-world testing and evaluation of human-robot and human-human collaborative learning. Leveraging the sample-efficiency of the Soft Actor-Critic algorithm, we have implemented a robotic platform able to learn a non-trivial collaborative task with a human partner, without pre-training in simulation, and using only 30 minutes of real-world interactions. This enables us to study Human-Robot and Human-Human collaborative learning through real-world interactions. We present preliminary results, showing that state-of-the-art deep learning methods can take human-robot collaborative learning a step closer to that of humans interacting with each other.

Working paper

Faisal A, Hermano K, Antonio P, 2019, Proceedings of the 3rd International Congress on Neurotechnology, Electronics and Informatics, Setúbal, Publisher: Scitepress, ISBN: 978-989-758-161-8

Book

Hermano K, Pedotti A, Faisal A, 2019, Proceedings of the 4th International Congress on Neurotechnology, Electronics and Informatics 2016, ISBN: 978-989-758-204-2

Book

Subramanian M, Songur N, Adjei D, Orlov P, Faisal A et al., 2019, A.Eye Drive: gaze-based semi-autonomous wheelchair interface, 41st International Engineering in Medicine & Biology Society (EMBC 2019), Publisher: IEEE

Existing wheelchair control interfaces, such as sip & puff or screen-based gaze-controlled cursors, are challenging for the severely disabled to navigate safely and independently, as users continuously need to interact with an interface during navigation. This puts a significant cognitive load on users and prevents them from interacting with the environment in other forms during navigation. We have combined eye tracking/gaze-contingent intention decoding with computer vision context-aware algorithms and autonomous navigation drawn from self-driving vehicles to allow paralysed users to drive by eye, simply by decoding natural gaze about where the user wants to go: A.Eye Drive. Our “Zero UI” driving platform allows users to look at and interact visually with an object or destination of interest in their visual scene, and the wheelchair autonomously takes the user to the intended destination, while continuously updating the computed path for static and dynamic obstacles. This intention decoding technology empowers the end-user by promising more independence through their own agency.

Conference paper

Khwaja M, Ferrer M, Jesus I, Faisal A, Matic A et al., 2019, Aligning daily activities with personality: towards a recommender system for improving wellbeing, ACM Conference on Recommender Systems (RecSys), Publisher: ACM, Pages: 368-372

Recommender Systems have not been explored to a great extent for improving health and subjective wellbeing. Recent advances in mobile technologies and user modelling present the opportunity for delivering such systems; however, the key issue is understanding the drivers of subjective wellbeing at an individual level. In this paper we propose a novel approach for deriving personalized activity recommendations to improve subjective wellbeing by maximizing the congruence between activities and personality traits. To evaluate the model, we leveraged a rich dataset collected in a smartphone study, which contains three weeks of daily activity probes, the Big-Five personality questionnaire and subjective wellbeing surveys. We show that the model correctly infers a range of activities that are 'good' or 'bad' (i.e. that are positively or negatively related to subjective wellbeing) for a given user and that the derived recommendations greatly match outcomes in the real world.

Conference paper

Khwaja M, Vaid SS, Zannone S, Harari GM, Faisal A, Matic A et al., 2019, Modeling personality vs. modeling personalidad: In-the-wild mobile data analysis in five countries suggests cultural impact on personality models, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Vol: 3, Pages: 1-24, ISSN: 2474-9567

Sensor data collected from smartphones provides the possibility to passively infer a user’s personality traits. Such models can be used to enable technology personalization, while contributing to our substantive understanding of how human behavior manifests in daily life. A significant challenge in personality modeling involves improving the accuracy of personality inferences; however, research has yet to assess and consider the cultural impact of users’ country of residence on model replicability. We collected mobile sensing data and self-reported Big Five traits from 166 participants (54 women and 112 men) recruited in five different countries (UK, Spain, Colombia, Peru, and Chile) for 3 weeks. We developed machine learning based personality models using culturally diverse datasets - representing different countries - and we show that such models can achieve state-of-the-art accuracy when tested in new countries, ranging from 63% (Agreeableness) to 71% (Extraversion) of classification accuracy. Our results indicate that using country-specific datasets can improve the classification accuracy between 3% and 7% for Extraversion, Agreeableness, and Conscientiousness. We show that these findings hold regardless of gender and age balance in the dataset. Interestingly, using gender- or age-balanced datasets as well as gender-separated datasets improves trait prediction by up to 17%. We unpack differences in personality models across the five countries, highlight the most predictive data categories (location, noise, unlocks, accelerometer), and provide takeaways to technologists and social scientists interested in passive personality assessment.

Journal article

Shafti SA, Orlov P, Faisal A, 2019, Gaze-based, context-aware robotic system for assisted reaching and grasping, International Conference on Robotics and Automation 2019, Publisher: IEEE, ISSN: 2152-4092

Assistive robotic systems endeavour to support those with movement disabilities, enabling them to move again and regain functionality. The main issue with these systems is the complexity of their low-level control, and how to translate this to simpler, higher-level commands that are easy and intuitive for a human user to interact with. We have created a multi-modal system, consisting of different sensing, decision making and actuating modalities, to create intuitive, human-in-the-loop assistive robotics. The system takes its cue from the user’s gaze, to decode their intentions and implement lower-level motion actions and achieve higher-level tasks. This results in the user simply having to look at the objects of interest, for the robotic system to assist them in reaching for those objects, grasping them, and using them to interact with other objects. We present our method for 3D gaze estimation, and action grammars-based implementation of sequences of action through the robotic system. The 3D gaze estimation is evaluated with 8 subjects, showing an overall accuracy of 4.68±0.14 cm. The full system is tested with 5 subjects, showing successful implementation of 100% of reach to gaze point actions and full implementation of pick and place tasks in 96%, and pick and pour tasks in 76% of cases. Finally we present a discussion on our results and what future work is needed to improve the system.

Conference paper

Haar S, van Assel C, Faisal A, 2019, Neurobehavioural signatures of learning that emerge in a real-world motor skill task

The behavioral and neural processes of real-world motor learning remain largely unknown. We demonstrate the feasibility of real-world neuroscience, using wearables for naturalistic full-body motion tracking and mobile brain imaging, to study motor learning in billiards. We highlight the similarities between motor learning in-the-wild and classic toy-tasks in well-known features, such as multiple learning rates, and the relationship between task-related variability and motor learning. However, we found that real-world motor learning affects the whole body, changing motor control from head to toe. Moreover, with a data-driven approach, based on the relationship between variability and learning, we found the arm supination to be the task relevant joint angle. Our EEG recordings highlight groups of subjects with opposing dynamics of post-movement Beta rebound (PMBR), not resolved before in toy-tasks. The first group increased PMBR over learning while the second decreased. These opposite trends were previously reported in error-based learning and skill learning tasks respectively. Behaviorally, the PMBR decreasers better controlled task-relevant variability dynamically leading to lower variability and smaller errors in the learning plateau. We speculate that these PMBR dynamics emerge because subjects must combine multi-modal mechanisms of learning in new ways when faced with the complexity of the real-world.

Working paper

Komorowski M, Celi LA, Badawi O, Gordon AC, Faisal AA et al., 2019, Understanding the artificial intelligence clinician and optimal treatment strategies for sepsis in intensive care

In this document, we explore in more detail our published work (Komorowski, Celi, Badawi, Gordon, & Faisal, 2018) for the benefit of the AI in Healthcare research community. In the above paper, we developed the AI Clinician system, which demonstrated how reinforcement learning could be used to make useful recommendations towards optimal treatment decisions from intensive care data. Since publication a number of authors have reviewed our work (e.g. Abbasi, 2018; Bos, Azoulay, & Martin-Loeches, 2019; Saria, 2018). Given the difference of our framework to previous work, the fact that we are bridging two very different academic communities (intensive care and machine learning) and that our work has impact on a number of other areas with more traditional computer-based approaches (biosignal processing and control, biomedical engineering), we are providing here additional details on our recent publication.

Working paper

Beyret B, Shafti A, Faisal AA, 2019, Dot-to-Dot: Explainable Hierarchical Reinforcement Learning for Robotic Manipulation, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 5014-5019, ISSN: 2153-0858

Conference paper

Gottesman O, Johansson F, Komorowski M, Faisal A, Sontag D, Doshi-Velez F, Celi LA et al., 2019, Guidelines for reinforcement learning in healthcare, Nature Medicine, Vol: 25, Pages: 16-18, ISSN: 1078-8956

In this Comment, we provide guidelines for reinforcement learning for decisions about patient treatment that we hope will accelerate the rate at which observational cohorts can inform healthcare practice in a safe, risk-conscious manner.

Journal article

Peng X, Ding Y, Wihl D, Gottesman O, Komorowski M, Lehman L-WH, Ross A, Faisal A, Doshi-Velez F et al., 2018, Improving Sepsis Treatment Strategies by Combining Deep and Kernel-Based Reinforcement Learning, AMIA 2018 Annual Symposium, Pages: 887-896

Sepsis is the leading cause of mortality in the ICU. It is challenging to manage because individual patients respond differently to treatment. Thus, tailoring treatment to the individual patient is essential for the best outcomes. In this paper, we take steps toward this goal by applying a mixture-of-experts framework to personalize sepsis treatment. The mixture model selectively alternates between neighbor-based (kernel) and deep reinforcement learning (DRL) experts depending on the patient's current history. On a large retrospective cohort, this mixture-based approach outperforms physician, kernel-only, and DRL-only experts.

Conference paper

Liu Y, Gottesman O, Raghu A, Komorowski M, Faisal AA, Doshi-Velez F, Brunskill E et al., 2018, Representation Balancing MDPs for Off-Policy Policy Evaluation, Thirty-second Annual Conference on Neural Information Processing Systems (NIPS)

We study the problem of off-policy policy evaluation (OPPE) in RL. In contrast to prior work, we consider how to estimate both the individual policy value and average policy value accurately. We draw inspiration from recent work in causal reasoning, and propose a new finite sample generalization error bound for value estimates from MDP models. Using this upper bound as an objective, we develop a learning algorithm of an MDP model with a balanced representation, and show that our approach can yield substantially lower MSE in a common synthetic domain and on a challenging real-world sepsis management problem.

Conference paper

Parbhoo S, Gottesman O, Ross AS, Komorowski M, Faisal A, Bon I, Roth V, Doshi-Velez F et al., 2018, Improving counterfactual reasoning with kernelised dynamic mixing models, PLoS ONE, Vol: 13, ISSN: 1932-6203

Simulation-based approaches to disease progression allow us to make counterfactual predictions about the effects of an untried series of treatment choices. However, building accurate simulators of disease progression is challenging, limiting the utility of these approaches for real world treatment planning. In this work, we present a novel simulation-based reinforcement learning approach that mixes between models and kernel-based approaches to make its forward predictions. On two real world tasks, managing sepsis and treating HIV, we demonstrate that our approach both learns state-of-the-art treatment policies and can make accurate forward predictions about the effects of treatments on unseen patients.

Journal article

