Imperial College London

Dr Ben Glocker

Faculty of Engineering, Department of Computing

Professor in Machine Learning for Imaging

Contact

 

+44 (0)20 7594 8334, b.glocker

Location

 

377, Huxley Building, South Kensington Campus


Publications

353 results found

Santhirasekaram A, Kori A, Winkler M, Rockall AG, Glocker B et al., 2022, Vector Quantisation for Robust Segmentation, Publisher: Springer, Pages: 663-672

Conference paper

Shehata N, Bain W, Glocker B, 2022, A Comparative Study of Graph Neural Networks for Shape Classification in Neuroimaging, Publisher: PMLR, Pages: 160-171

Conference paper

Rosnati M, Soreq E, Monteiro M, Li LM, Graham NSN, Zimmerman K, Rossi C, Carrara G, Bertolini G, Sharp DJ, Glocker B et al., 2022, Automatic lesion analysis for increased efficiency in outcome prediction of traumatic brain injury, CoRR, Vol: abs/2208.04114

Journal article

Rosnati M, Soreq E, Monteiro M, Li LM, Graham NSN, Zimmerman K, Rossi C, Carrara G, Bertolini G, Sharp DJ, Glocker B et al., 2022, Automatic Lesion Analysis for Increased Efficiency in Outcome Prediction of Traumatic Brain Injury, Publisher: Springer, Pages: 135-146

Conference paper

Santhirasekaram A, Kori A, Winkler M, Rockall A, Glocker B et al., 2022, Vector Quantisation for Robust Segmentation, Publisher: Springer International Publishing AG

Working paper

Islam M, Glocker B, 2022, Frequency Dropout: Feature-Level Regularization via Randomized Filtering, CoRR, Vol: abs/2209.09844

Journal article

Whitehouse DP, Monteiro M, Czeiter E, Vyvere TV, Valerio F, Ye Z, Amrein K, Kamnitsas K, Xu H, Yang Z, Verheyden J, Das T, Kornaropoulos EN, Steyerberg E, Maas AIR, Wang KKW, Büki A, Glocker B, Menon DK, Newcombe VFJ, CENTER-TBI Participants and Investigators et al., 2021, Relationship of admission blood proteomic biomarkers levels to lesion type and lesion burden in traumatic brain injury: A CENTER-TBI study, EBioMedicine, Vol: 75, Pages: 1-15, ISSN: 2352-3964

BACKGROUND: We aimed to understand the relationship between serum biomarker concentration and lesion type and volume found on computed tomography (CT) following all severities of TBI. METHODS: Concentrations of six serum biomarkers (GFAP, NFL, NSE, S100B, t-tau and UCH-L1) were measured in samples obtained <24 hours post-injury from 2869 patients with all severities of TBI, enrolled in the CENTER-TBI prospective cohort study (NCT02210221). Imaging phenotypes were defined as intraparenchymal haemorrhage (IPH), oedema, subdural haematoma (SDH), extradural haematoma (EDH), traumatic subarachnoid haemorrhage (tSAH), diffuse axonal injury (DAI), and intraventricular haemorrhage (IVH). Multivariable polynomial regression was performed to examine the association between biomarker levels and both distinct lesion types and lesion volumes. Hierarchical clustering was used to explore imaging phenotypes; and principal component analysis and k-means clustering of acute biomarker concentrations to explore patterns of biomarker clustering. FINDINGS: 2869 patients were included, 68% (n=1946) male with a median age of 49 years (range 2-96). All severities of TBI (mild, moderate and severe) were included for analysis with the majority (n=1946, 68%) having a mild injury (GCS 13-15). Patients with severe diffuse injury (Marshall III/IV) showed significantly higher levels of all measured biomarkers, with the exception of NFL, than patients with focal mass lesions (Marshall grades V/VI). Patients with either DAI+IVH or SDH+IPH+tSAH had significantly higher biomarker concentrations than patients with EDH. Higher biomarker concentrations were associated with greater volume of IPH (GFAP, S100B, t-tau; adj r2 range: 0·48-0·49; p<0·05), oedema (GFAP, NFL, NSE, t-tau, UCH-L1; adj r2 range: 0·44-0·44; p<0·01), IVH (S100B; adj r2 range: 0.48-0.49; p<0.05). Unsupervised k-means biomarker clustering revealed two clusters explaining 83·9% of varian

Journal article

Popescu SG, Glocker B, Sharp DJ, Cole JH et al., 2021, Local brain-age: A U-Net model, Frontiers in Aging Neuroscience, Vol: 13, Pages: 1-17, ISSN: 1663-4365

We propose a new framework for estimating neuroimaging-derived “brain-age” at a local level within the brain, using deep learning. The local approach, contrary to existing global methods, provides spatial information on anatomical patterns of brain ageing. We trained a U-Net model using brain MRI scans from n = 3,463 healthy people (aged 18–90 years) to produce individualised 3D maps of brain-predicted age. When testing on n = 692 healthy people, we found a median (across participants) mean absolute error (within participant) of 9.5 years. Performance was more accurate (MAE around 7 years) in the prefrontal cortex and periventricular areas. We also introduce a new voxelwise method to reduce the age-bias when predicting local brain-age “gaps.” To validate local brain-age predictions, we tested the model in people with mild cognitive impairment or dementia using data from OASIS3 (n = 267). Different local brain-age patterns were evident between healthy controls and people with mild cognitive impairment or dementia, particularly in subcortical regions such as the accumbens, putamen, pallidum, hippocampus, and amygdala. Comparing groups based on mean local brain-age over regions-of-interest resulted in large effect sizes, with Cohen's d values >1.5, for example when comparing people with stable and progressive mild cognitive impairment. Our local brain-age framework has the potential to provide spatial information leading to a more mechanistic understanding of individual differences in patterns of brain ageing in health and disease.

Journal article

Folgoc LL, Baltatzis V, Alansary A, Desai S, Devaraj A, Ellis S, Manzanera OEM, Kanavati F, Nair A, Schnabel J, Glocker B et al., 2021, Bayesian analysis of the prevalence bias: learning and predicting from imbalanced data, Publisher: ArXiv

Datasets are rarely a realistic approximation of the target population. Say, prevalence is misrepresented, image quality is above clinical standards, etc. This mismatch is known as sampling bias. Sampling biases are a major hindrance for machine learning models. They cause significant gaps between model performance in the lab and in the real world. Our work is a solution to prevalence bias. Prevalence bias is the discrepancy between the prevalence of a pathology and its sampling rate in the training dataset, introduced upon collecting data or due to the practitioner rebalancing the training batches. This paper lays the theoretical and computational framework for training models, and for prediction, in the presence of prevalence bias. Concretely, a bias-corrected loss function, as well as bias-corrected predictive rules, are derived under the principles of Bayesian risk minimization. The loss exhibits a direct connection to the information gain. It offers a principled alternative to heuristic training losses and complements test-time procedures based on selecting an operating point from summary curves. It integrates seamlessly in the current paradigm of (deep) learning using stochastic backpropagation and naturally with Bayesian models.

Working paper
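
The abstract above describes prevalence bias as a mismatch between how often a pathology occurs in the target population and how often it is sampled for training. As a rough illustration of what that mismatch does to a predicted probability, the sketch below applies the textbook Bayes prior-shift adjustment to a binary classifier's output. It is not the bias-corrected loss or predictive rule derived in the paper; the function name `adjust_for_prevalence` and the prevalence values are assumptions chosen for illustration.

```python
def adjust_for_prevalence(p_train, train_prev, target_prev):
    """Re-weight a binary posterior for a different class prevalence.

    p_train     -- P(disease | image) from a model trained at train_prev
    train_prev  -- fraction of positive cases in the training data
    target_prev -- fraction of positive cases in the target population

    Standard prior-shift correction: multiply each class posterior by the
    ratio of target prior to training prior, then renormalise.
    """
    pos = p_train * target_prev / train_prev
    neg = (1.0 - p_train) * (1.0 - target_prev) / (1.0 - train_prev)
    return pos / (pos + neg)


# Hypothetical numbers: a model trained on balanced batches (50% positive)
# is deployed where only 5% of cases are positive.
p_balanced = 0.5
p_deployed = adjust_for_prevalence(p_balanced, train_prev=0.5, target_prev=0.05)

# With no prevalence mismatch the prediction is left unchanged.
p_same = adjust_for_prevalence(0.7, train_prev=0.3, target_prev=0.3)
```

Under these assumed numbers, an uninformative prediction of 0.5 from the balanced model corresponds to the 5% base rate once the target prevalence is taken into account.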

Chen X, Pawlowski N, Glocker B, Konukoglu E et al., 2021, Normative ascent with local gaussians for unsupervised lesion detection, Medical Image Analysis, Vol: 74, ISSN: 1361-8415

Journal article

Glocker B, Jones C, Bernhardt M, Winzeck S et al., 2021, Algorithmic encoding of protected characteristics in image-based models for disease detection

It has been rightfully emphasized that the use of AI for clinical decision making could amplify health disparities. An algorithm may encode protected characteristics, and then use this information for making predictions due to undesirable correlations in the (historical) training data. It remains unclear how we can establish whether such information is actually used. Besides the scarcity of data from underserved populations, very little is known about how dataset biases manifest in predictive models and how this may result in disparate performance. This article aims to shed some light on these issues by exploring new methodology for subgroup analysis in image-based disease detection models. We utilize two publicly available chest X-ray datasets, CheXpert and MIMIC-CXR, to study performance disparities across race and biological sex in deep learning models. We explore test set resampling, transfer learning, multitask learning, and model inspection to assess the relationship between the encoding of protected characteristics and disease detection performance across subgroups. We confirm subgroup disparities in terms of shifted true and false positive rates which are partially removed after correcting for population and prevalence shifts in the test sets. We further find a previously used transfer learning method to be insufficient for establishing whether specific patient information is used for making predictions. The proposed combination of test-set resampling, multitask learning, and model inspection reveals valuable new insights about the way protected characteristics are encoded in the feature representations of deep neural networks.

Working paper

Santhirasekaram A, Pinto K, Winkler M, Aboagye E, Glocker B, Rockall A et al., 2021, Multi-scale hybrid transformer networks: application to prostate disease classification, 11th Workshop on Multimodal Learning and Fusion Across Scales for Clinical Decision Support (ML-CDS) held at 24th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Publisher: Springer International Publishing AG, Pages: 12-21, ISSN: 0302-9743

Automated disease classification could significantly improve the accuracy of prostate cancer diagnosis on MRI, which is a difficult task even for trained experts. Convolutional neural networks (CNNs) have shown some promising results for disease classification on multi-parametric MRI. However, CNNs struggle to extract robust global features about the anatomy which may provide important contextual information for further improving classification accuracy. Here, we propose a novel multi-scale hybrid CNN/transformer architecture with the ability of better contextualising local features at different scales. In our application, we found this to significantly improve performance compared to using CNNs. Classification accuracy is even further improved with a stacked ensemble yielding promising results for binary classification of prostate lesions into clinically significant or non-significant.

Conference paper

Folgoc LL, Baltatzis V, Desai S, Devaraj A, Ellis S, Manzanera OEM, Nair A, Qiu H, Schnabel J, Glocker B et al., 2021, Is MC Dropout Bayesian?

MC Dropout is a mainstream "free lunch" method in medical imaging for approximate Bayesian computations (ABC). Its appeal is to solve out-of-the-box the daunting task of ABC and uncertainty quantification in Neural Networks (NNs); to fall within the variational inference (VI) framework; and to propose a highly multimodal, faithful predictive posterior. We question the properties of MC Dropout for approximate inference, as in fact MC Dropout changes the Bayesian model; its predictive posterior assigns zero probability to the true model on closed-form benchmarks; the multimodality of its predictive posterior is not a property of the true predictive posterior but a design artefact. To address the need for VI on arbitrary models, we share a generic VI engine within the PyTorch framework. The code includes a carefully designed implementation of structured (diagonal plus low-rank) multivariate normal variational families, and mixtures thereof. It is intended as a go-to no-free-lunch approach, addressing shortcomings of mean-field VI with an adjustable trade-off between expressivity and computational complexity.

Working paper

Sounderajah V, Ashrafian H, Rose S, Shah NH, Ghassemi M, Golub R, Kahn CE, Esteva A, Karthikesalingam A, Mateen B, Webster D, Milea D, Ting D, Treanor D, Cushnan D, King D, McPherson D, Glocker B, Greaves F, Harling L, Ordish J, Cohen JF, Deeks J, Leeflang M, Diamond M, McInnes MDF, McCradden M, Abramoff MD, Normahani P, Markar SR, Chang S, Liu X, Mallett S, Shetty S, Denniston A, Collins GS, Moher D, Whiting P, Bossuyt PM, Darzi A et al., 2021, A quality assessment tool for artificial intelligence-centered diagnostic test accuracy studies: QUADAS-AI, Nature Medicine, Vol: 27, Pages: 1663-1665, ISSN: 1078-8956

Journal article

Baltatzis V, Bintsi K-M, Folgoc LL, Manzanera OEM, Ellis S, Nair A, Desai S, Glocker B, Schnabel JA et al., 2021, The pitfalls of sample selection: a case study on lung nodule classification, Predictive Intelligence in Medicine at MICCAI, Publisher: Springer, Pages: 201-211

Using publicly available data to determine the performance of methodological contributions is important as it facilitates reproducibility and allows scrutiny of the published results. In lung nodule classification, for example, many works report results on the publicly available LIDC dataset. In theory, this should allow a direct comparison of the performance of proposed methods and assess the impact of individual contributions. When analyzing seven recent works, however, we find that each employs a different data selection process, leading to largely varying total number of samples and ratios between benign and malignant cases. As each subset will have different characteristics with varying difficulty for classification, a direct comparison between the proposed methods is thus not always possible, nor fair. We study the particular effect of truthing when aggregating labels from multiple experts. We show that specific choices can have severe impact on the data distribution where it may be possible to achieve superior performance on one sample distribution but not on another. While we show that we can further improve on the state-of-the-art on one sample selection, we also find that on a more challenging sample selection, on the same database, the more advanced models underperform with respect to very simple baseline methods, highlighting that the selected data distribution may play an even more important role than the model architecture. This raises concerns about the validity of claimed methodological contributions. We believe the community should be aware of these pitfalls and make recommendations on how these can be avoided in future work.

Conference paper

Kamnitsas K, Winzeck S, Kornaropoulos EN, Whitehouse D, Englman C, Phyu P, Pao N, Menon DK, Rueckert D, Das T, Newcombe VFJ, Glocker B et al., 2021, Transductive image segmentation: Self-training and effect of uncertainty estimation, MICCAI Workshop on Domain Adaptation and Representation Transfer, Publisher: Springer, Pages: 79-89

Semi-supervised learning (SSL) uses unlabeled data during training to learn better models. Previous studies on SSL for medical image segmentation focused mostly on improving model generalization to unseen data. In some applications, however, our primary interest is not generalization but to obtain optimal predictions on a specific unlabeled database that is fully available during model development. Examples include population studies for extracting imaging phenotypes. This work investigates an often overlooked aspect of SSL, transduction. It focuses on the quality of predictions made on the unlabeled data of interest when they are included for optimization during training, rather than improving generalization. We focus on the self-training framework and explore its potential for transduction. We analyze it through the lens of Information Gain and reveal that learning benefits from the use of calibrated or under-confident models. Our extensive experiments on a large MRI database for multi-class segmentation of traumatic brain lesions show promising results when comparing transductive with inductive predictions. We believe this study will inspire further research on transductive learning, a well-suited paradigm for medical image analysis.

Conference paper

Qaiser T, Winzeck S, Barfoot T, Barwick T, Doran SJ, Kaiser MF, Wedlake L, Tunariu N, Koh D-M, Messiou C, Rockall A, Glocker B et al., 2021, Multiple instance learning with auxiliary task weighting for multiple myeloma classification, International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Publisher: Springer, Pages: 786-796, ISSN: 0302-9743

Whole body magnetic resonance imaging (WB-MRI) is the recommended modality for diagnosis of multiple myeloma (MM). WB-MRI is used to detect sites of disease across the entire skeletal system, but it requires significant expertise and is time-consuming to report due to the great number of images. To aid radiological reading, we propose an auxiliary task-based multiple instance learning approach (ATMIL) for MM classification with the ability to localize sites of disease. This approach is appealing as it only requires patient-level annotations where an attention mechanism is used to identify local regions with active disease. We borrow ideas from multi-task learning and define an auxiliary task with adaptive reweighting to support and improve learning efficiency in the presence of data scarcity. We validate our approach on both synthetic and real multi-center clinical data. We show that the MIL attention module provides a mechanism to localize bone regions while the adaptive reweighting of the auxiliary task considerably improves the performance.

Conference paper

Budd S, Sinclair M, Day T, Vlontzos A, Tan J, Liu T, Matthew J, Skelton E, Simpson J, Razavi R, Glocker B, Rueckert D, Robinson EC, Kainz B et al., 2021, Detecting hypo-plastic left heart syndrome in fetal ultrasound via disease-specific atlas maps, 24th International Conference on Medical Image Computing and Computer Assisted Intervention, Publisher: Springer, Pages: 207-217, ISSN: 0302-9743

Fetal ultrasound screening during pregnancy plays a vital role in the early detection of fetal malformations which have potential long-term health impacts. The level of skill required to diagnose such malformations from live ultrasound during examination is high and resources for screening are often limited. We present an interpretable, atlas-learning segmentation method for automatic diagnosis of Hypo-plastic Left Heart Syndrome (HLHS) from a single ‘4 Chamber Heart’ view image. We propose to extend the recently introduced Image-and-Spatial Transformer Networks (Atlas-ISTN) into a framework that enables sensitising atlas generation to disease. In this framework we can jointly learn image segmentation, registration, atlas construction and disease prediction while providing a maximum level of clinical interpretability compared to direct image classification methods. As a result our segmentation allows diagnoses competitive with expert-derived manual diagnosis and yields an AUC-ROC of 0.978 (1043 cases for training, 260 for validation and 325 for testing).

Conference paper

Baltatzis V, Folgoc LL, Ellis S, Manzanera OEM, Bintsi K-M, Nair A, Desai S, Glocker B, Schnabel JA et al., 2021, The effect of the loss on generalization: empirical study on synthetic lung nodule data, Interpretability of Machine Intelligence in Medical Image Computing at MICCAI 2021, Publisher: Springer, Pages: 56-64

Convolutional Neural Networks (CNNs) are widely used for image classification in a variety of fields, including medical imaging. While most studies deploy cross-entropy as the loss function in such tasks, a growing number of approaches have turned to a family of contrastive learning-based losses. Even though performance metrics such as accuracy, sensitivity and specificity are regularly used for the evaluation of CNN classifiers, the features that these classifiers actually learn are rarely identified and their effect on the classification performance on out-of-distribution test samples is insufficiently explored. In this paper, motivated by the real-world task of lung nodule classification, we investigate the features that a CNN learns when trained and tested on different distributions of a synthetic dataset with controlled modes of variation. We show that different loss functions lead to different features being learned and consequently affect the generalization ability of the classifier on unseen data. This study provides some important insights into the design of deep learning solutions for medical imaging tasks.

Conference paper

Filbrandt G, Kamnitsas K, Bernstein D, Taylor A, Glocker B et al., 2021, Learning from Partially Overlapping Labels: Image Segmentation under Annotation Shift, MICCAI Workshop on Domain Adaptation and Representation Transfer

Scarcity of high quality annotated images remains a limiting factor for training accurate image segmentation models. While more and more annotated datasets become publicly available, the number of samples in each individual database is often small. Combining different databases to create larger amounts of training data is appealing yet challenging due to the heterogeneity as a result of differences in data acquisition and annotation processes, often yielding incompatible or even conflicting information. In this paper, we investigate and propose several strategies for learning from partially overlapping labels in the context of abdominal organ segmentation. We find that combining a semi-supervised approach with an adaptive cross entropy loss can successfully exploit heterogeneously annotated data and substantially improve segmentation accuracy compared to baseline and alternative approaches.

Conference paper

Glocker B, Musolesi M, Richens J, Uhler C et al., 2021, Causality in digital medicine, Nature Communications, Vol: 12, Pages: 1-6, ISSN: 2041-1723

Ben Glocker (an expert in machine learning for medical imaging, Imperial College London), Mirco Musolesi (a data science and digital health expert, University College London), Jonathan Richens (an expert in diagnostic machine learning models, Babylon Health) and Caroline Uhler (a computational biology expert, MIT) talked to Nature Communications about their research interests in causality inference and how this can provide a robust framework for digital medicine studies and their implementation, across different fields of application.

Journal article

Usynin D, Ziller A, Makowski M, Braren R, Rueckert D, Glocker B, Kaissis G, Passerat-Palmbach J et al., 2021, Adversarial interference and its mitigations in privacy-preserving collaborative machine learning, Nature Machine Intelligence, Vol: 3, Pages: 749-758, ISSN: 2522-5839

Despite the rapid increase of data available to train machine-learning algorithms in many domains, several applications suffer from a paucity of representative and diverse data. The medical and financial sectors are, for example, constrained by legal, ethical, regulatory and privacy concerns preventing data sharing between institutions. Collaborative learning systems, such as federated learning, are designed to circumvent such restrictions and provide a privacy-preserving alternative by eschewing data sharing and relying instead on the distributed remote execution of algorithms. However, such systems are susceptible to malicious adversarial interference attempting to undermine their utility or divulge confidential information. Here we present an overview and analysis of current adversarial attacks and their mitigations in the context of collaborative machine learning. We discuss the applicability of attack vectors to specific learning contexts and attempt to formulate a generic foundation for adversarial influence and mitigation mechanisms. We moreover show that a number of context-specific learning conditions are exploited in similar fashion across all settings. Lastly, we provide a focused perspective on open challenges and promising areas of future research in the field.

Journal article

Li J, Pimentel P, Szengel A, Ehlke M, Lamecker H, Zachow S, Estacio L, Doenitz C, Ramm H, Shi H, Chen X, Matzkin F, Newcombe V, Ferrante E, Jin Y, Ellis DG, Aizenberg MR, Kodym O, Spanel M, Herout A, Mainprize JG, Fishman Z, Hardisty MR, Bayat A, Shit S, Wang B, Liu Z, Eder M, Pepe A, Gsaxner C, Alves V, Zefferer U, von Campe G, Pistracher K, Schaefer U, Schmalstieg D, Menze BH, Glocker B, Egger J et al., 2021, AutoImplant 2020-First MICCAI Challenge on Automatic Cranial Implant Design, IEEE Transactions on Medical Imaging, Vol: 40, Pages: 2329-2342, ISSN: 0278-0062

Journal article

Co KT, Muñoz-González L, Kanthan L, Glocker B, Lupu EC et al., 2021, Universal adversarial robustness of texture and shape-biased models, IEEE International Conference on Image Processing (ICIP)

Increasing shape-bias in deep neural networks has been shown to improve robustness to common corruptions and noise. In this paper we analyze the adversarial robustness of texture and shape-biased models to Universal Adversarial Perturbations (UAPs). We use UAPs to evaluate the robustness of DNN models with varying degrees of shape-based training. We find that shape-biased models do not markedly improve adversarial robustness, and we show that ensembles of texture and shape-biased models can improve universal adversarial robustness while maintaining strong performance.

Conference paper

Sekuboyina A, Husseini ME, Bayat A, Löffler M, Liebl H, Li H, Tetteh G, Kukačka J, Payer C, Štern D, Urschler M, Chen M, Cheng D, Lessmann N, Hu Y, Wang T, Yang D, Xu D, Ambellan F, Amiranashvili T, Ehlke M, Lamecker H, Lehnert S, Lirio M, Olaguer NPD, Ramm H, Sahu M, Tack A, Zachow S, Jiang T, Ma X, Angerman C, Wang X, Brown K, Wolf M, Kirszenberg A, Puybareau É, Chen D, Bai Y, Rapazzo BH, Yeah T, Zhang A, Xu S, Hou F, He Z, Zeng C, Xiangshang Z, Liming X, Netherton TJ, Mumme RP, Court LE, Huang Z, He C, Wang L-W, Ling SH, Huynh LD, Boutry N, Jakubicek R, Chmelik J, Mulay S, Sivaprakasam M, Paetzold JC, Shit S, Ezhov I, Wiestler B, Glocker B, Valentinitsch A, Rempfler M, Menze BH, Kirschke JS et al., 2021, VerSe: a vertebrae labelling and segmentation benchmark for multi-detector CT images, Medical Image Analysis, ISSN: 1361-8415

Vertebral labelling and segmentation are two fundamental tasks in an automated spine processing pipeline. Reliable and accurate processing of spine images is expected to benefit clinical decision-support systems for diagnosis, surgery planning, and population-based analysis on spine and bone health. However, designing automated algorithms for spine processing is challenging predominantly due to considerable variations in anatomy and acquisition protocols and due to a severe shortage of publicly available data. Addressing these limitations, the Large Scale Vertebrae Segmentation Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020, with a call for algorithms towards labelling and segmentation of vertebrae. Two datasets containing a total of 374 multi-detector CT scans from 355 patients were prepared and 4505 vertebrae have individually been annotated at voxel-level by a human-machine hybrid algorithm (https://osf.io/nqjyw/, https://osf.io/t98fz/). A total of 25 algorithms were benchmarked on these datasets. In this work, we present the results of this evaluation and further investigate the performance-variation at vertebra-level, scan-level, and at different fields-of-view. We also evaluate the generalisability of the approaches to an implicit domain shift in data by evaluating the top performing algorithms of one challenge iteration on data from the other iteration. The principal takeaway from VerSe: the performance of an algorithm in labelling and segmenting a spine scan hinges on its ability to correctly identify vertebrae in cases of rare anatomical variations. The content and code concerning VerSe can be accessed at: https://github.com/anjany/verse.

Journal article

Osuala R, Kushibar K, Garrucho L, Linardos A, Szafranowska Z, Klein S, Glocker B, Diaz O, Lekadir K et al., 2021, A review of generative adversarial networks in cancer imaging: new applications, new solutions, Publisher: arXiv

Despite technological and medical advances, the detection, interpretation, and treatment of cancer based on imaging data continue to pose significant challenges. These include high inter-observer variability, difficulty of small-sized lesion detection, nodule interpretation and malignancy determination, inter- and intra-tumour heterogeneity, class imbalance, segmentation inaccuracies, and treatment effect uncertainty. The recent advancements in Generative Adversarial Networks (GANs) in computer vision as well as in medical imaging may provide a basis for enhanced capabilities in cancer detection and analysis. In this review, we assess the potential of GANs to address a number of key challenges of cancer imaging, including data scarcity and imbalance, domain and dataset shifts, data access and privacy, data annotation and quantification, as well as cancer detection, tumour profiling and treatment planning. We provide a critical appraisal of the existing literature of GANs applied to cancer imagery, together with suggestions on future research directions to address these challenges. We analyse and discuss 163 papers that apply adversarial training techniques in the context of cancer imaging and elaborate their methodologies, advantages and limitations. With this work, we strive to bridge the gap between the needs of the clinical cancer imaging community and the current and prospective research on GANs in the artificial intelligence community.

Working paper

Islam M, Seenivasan L, Ren H, Glocker B et al., 2021, Class-distribution-aware calibration for long-tailed visual recognition, ICML Workshop on Uncertainty and Robustness in Deep Learning

Despite impressive accuracy, deep neural networks are often miscalibrated and tend to produce overly confident predictions. Recent techniques like temperature scaling (TS) and label smoothing (LS) show effectiveness in obtaining a well-calibrated model by smoothing logits and hard labels with scalar factors, respectively. However, the use of a uniform TS or LS factor may not be optimal for calibrating models trained on a long-tailed dataset where the model produces overly confident probabilities for high-frequency classes. In this study, we propose class-distribution-aware TS (CDA-TS) and LS (CDA-LS) by incorporating class frequency information in model calibration in the context of long-tailed distribution. In CDA-TS, the scalar temperature value is replaced with the CDA temperature vector encoded with class frequency to compensate for the over-confidence. Similarly, CDA-LS uses a vector smoothing factor and flattens the hard labels according to their corresponding class distribution. We also integrate the CDA optimal temperature vector with distillation loss, which reduces miscalibration in self-distillation (SD). We empirically show that class-distribution-aware TS and LS can accommodate the imbalanced data distribution yielding superior performance in both calibration error and predictive accuracy. We also observe that SD with an extremely imbalanced dataset is less effective in terms of calibration performance. Code is available at https://github.com/mobarakol/Class-Distribution-Aware-TS-LS.

Conference paper
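
The abstract above contrasts scalar temperature scaling and label smoothing with per-class variants. A minimal sketch of the two scalar baselines, plus a softmax that also accepts one temperature per class, may help make the distinction concrete. The per-class temperatures used here are made-up values, not the class-frequency encoding proposed in the paper, and the helper names `softmax` and `smooth_labels` are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature.

    temperature may be a single number (standard temperature scaling)
    or a list with one value per class (a per-class variant).
    """
    if isinstance(temperature, (int, float)):
        temperature = [float(temperature)] * len(logits)
    scaled = [z / t for z, t in zip(logits, temperature)]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def smooth_labels(one_hot, alpha):
    """Uniform label smoothing: (1 - alpha) * y + alpha / K for K classes."""
    k = len(one_hot)
    return [(1.0 - alpha) * y + alpha / k for y in one_hot]


logits = [4.0, 1.0, 0.5]

probs_plain = softmax(logits)                    # no scaling
probs_scaled = softmax(logits, temperature=2.0)  # scalar temperature > 1 flattens
# Hypothetical per-class temperatures: a larger value for the first class.
probs_vector = softmax(logits, temperature=[3.0, 1.5, 1.0])

soft_target = smooth_labels([1.0, 0.0, 0.0], alpha=0.1)
```

A temperature above 1 lowers the top-class probability without changing the ranking of classes under scalar scaling; with a per-class vector the ranking can change, which is one reason the choice of vector matters.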

Islam M, Glocker B, 2021, Spatially varying label smoothing: capturing uncertainty from expert annotations, Information Processing in Medical Imaging (IPMI) 2021, Publisher: Springer Verlag, Pages: 677-688, ISSN: 0302-9743

The task of image segmentation is inherently noisy due to ambiguities regarding the exact location of boundaries between anatomical structures. We argue that this information can be extracted from the expert annotations at no extra cost, and when integrated into state-of-the-art neural networks, it can lead to improved calibration between soft probabilistic predictions and the underlying uncertainty. We build upon label smoothing (LS) where a network is trained on 'blurred' versions of the ground truth labels which has been shown to be effective for calibrating output predictions. However, LS does not take the local structure into account and results in overly smoothed predictions with low confidence even for non-ambiguous regions. Here, we propose Spatially Varying Label Smoothing (SVLS), a soft labeling technique that captures the structural uncertainty in semantic segmentation. SVLS also naturally lends itself to incorporate inter-rater uncertainty when multiple labelmaps are available. The proposed approach is extensively validated on four clinical segmentation tasks with different imaging modalities, number of classes and single and multi-rater expert annotations. The results demonstrate that SVLS, despite its simplicity, obtains superior boundary prediction with improved uncertainty and model calibration.

Conference paper

Kart T, Fischer M, Kuestner T, Hepp T, Bamberg F, Winzeck S, Glocker B, Rueckert D, Gatidis S et al., 2021, Deep Learning-Based Automated Abdominal Organ Segmentation in the UK Biobank and German National Cohort Magnetic Resonance Imaging Studies, Investigative Radiology, Vol: 56, Pages: 401-408, ISSN: 0020-9996

Journal article

Popescu SG, Sharp DJ, Cole JH, Kamnitsas K, Glocker B et al., 2021, Distributional gaussian process layers for outlier detection in image segmentation, Information Processing in Medical Imaging (IPMI) 2021, Publisher: arXiv

We propose a parameter efficient Bayesian layer for hierarchical convolutional Gaussian Processes that incorporates Gaussian Processes operating in Wasserstein-2 space to reliably propagate uncertainty. This directly replaces convolving Gaussian Processes with a distance-preserving affine operator on distributions. Our experiments on brain tissue-segmentation show that the resulting architecture approaches the performance of well-established deterministic segmentation algorithms (U-Net), which has never been achieved with previous hierarchical Gaussian Processes. Moreover, by applying the same segmentation model to out-of-distribution data (i.e., images with pathology such as brain tumors), we show that our uncertainty estimates result in out-of-distribution detection that outperforms the capabilities of previous Bayesian networks and reconstruction-based approaches that learn normative distributions.

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
