Imperial College London

DR BERNHARD KAINZ

Faculty of Engineering, Department of Computing

Reader in Medical Image Computing
 
 
 

Contact

 

+44 (0)20 7594 8349 | b.kainz | Website | CV

 
 

Location

 

372, Huxley Building, South Kensington Campus


Summary

 

Publications


201 results found

Oppenheimer J, Mandegaran R, Staabs F, Adler A, Singöhl S, Kainz B, Heinrich M, Geroulakos G, Spiliopoulos S, Avgerinos E et al., 2024, Remote Expert DVT Triaging of Novice-User Compression Sonography with AI-Guidance, Ann Vasc Surg, Vol: 99, Pages: 272-279

BACKGROUND: Compression ultrasonography of the leg is established for triaging proximal lower extremity deep vein thrombosis (DVT). AutoDVT, a machine-learning software, provides a tool for nonspecialists in acquiring compression sequences to be reviewed by an expert for patient triage. The purpose of this study was to test image acquisition and remote triaging in a clinical setting. METHODS: Patients with a suspected DVT were recruited at 2 centers in Germany and Greece. Enrolled patients underwent an artificial intelligence-guided two-point compression examination by a nonspecialist using a handheld ultrasound device prior to a standard scan. Images collected by the software were uploaded for blind review by 5 qualified physicians. All reviewers rated the quality of all sequences on the American College of Emergency Physicians (ACEP) image quality scale (score 1-5, ≥ 3 defined as adequate imaging quality) and for an ACEP score ≥3, chose "Compressible", "Incompressible", or "Other". Sensitivity and specificity were calculated for adequate quality scans with an assessment as "Compressible" or "Incompressible". We define this group as diagnostic quality. To simulate a triaging clinical algorithm, a post hoc analysis was performed merging the "incomplete", the "low quality", and the "Incompressible" into a high-risk group for proximal DVT. RESULTS: Seventy-three patients (average age 64.2 years, 44% females) were eligible for inclusion and scanned by 3 nonultrasound-qualified healthcare professionals. Three patients were excluded from further analysis due to incomplete scans. Sixty two of 70 (88.57%) of the completed scans were judged to be of adequate image quality with an average ACEP score of 3.35. Forty seven of 62 adequate AutoDVT scans were assessed as diagnostic quality, of which 8 were interpreted as positive for proximal DVT by the reviewers resulting in a sensitivity o

Journal article
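
The sensitivity and specificity mentioned in the abstract above follow the standard confusion-matrix definitions. A minimal sketch, assuming hypothetical counts (the function name and the numbers are illustrative and are not the study's results):

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp/fn: reference-positive cases rated positive/negative by the reviewer.
    tn/fp: reference-negative cases rated negative/positive by the reviewer.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    sensitivity = tp / (tp + fn)  # share of true DVT cases that were flagged
    specificity = tn / (tn + fp)  # share of DVT-free cases that were cleared
    return sensitivity, specificity


# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=9, fn=1, tn=30, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In a triage setting, sensitivity is usually the figure to protect, since a missed proximal DVT is costlier than an unnecessary expert scan.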

Maier-Hein L, Reinke A, Godau P, Tizabi MD, Buettner F, Christodoulou E, Glocker B, Isensee F, Kleesiek J, Kozubek M, Reyes M, Riegler MA, Wiesenfarth M, Kavur AE, Sudre CH, Baumgartner M, Eisenmann M, Heckmann-Nötzel D, Rädsch T, Acion L, Antonelli M, Arbel T, Bakas S, Benis A, Blaschko MB, Cardoso MJ, Cheplygina V, Cimini BA, Collins GS, Farahani K, Ferrer L, Galdran A, van Ginneken B, Haase R, Hashimoto DA, Hoffman MM, Huisman M, Jannin P, Kahn CE, Kainmueller D, Kainz B, Karargyris A, Karthikesalingam A, Kofler F, Kopp-Schneider A, Kreshuk A, Kurc T, Landman BA, Litjens G, Madani A, Maier-Hein K, Martel AL, Mattson P, Meijering E, Menze B, Moons KGM, Müller H, Nichyporuk B, Nickel F, Petersen J, Rajpoot N, Rieke N, Saez-Rodriguez J, Sánchez CI, Shetty S, van Smeden M, Summers RM, Taha AA, Tiulpin A, Tsaftaris SA, Van Calster B, Varoquaux G, Jäger PF et al., 2024, Metrics reloaded: recommendations for image analysis validation, Nature Methods, Vol: 21, Pages: 195-212, ISSN: 1548-7091

Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint, a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.

Journal article

Reinke A, Tizabi MD, Baumgartner M, Eisenmann M, Heckmann-Nötzel D, Kavur AE, Rädsch T, Sudre CH, Acion L, Antonelli M, Arbel T, Bakas S, Benis A, Buettner F, Cardoso MJ, Cheplygina V, Chen J, Christodoulou E, Cimini BA, Farahani K, Ferrer L, Galdran A, van Ginneken B, Glocker B, Godau P, Hashimoto DA, Hoffman MM, Huisman M, Isensee F, Jannin P, Kahn CE, Kainmueller D, Kainz B, Karargyris A, Kleesiek J, Kofler F, Kooi T, Kopp-Schneider A, Kozubek M, Kreshuk A, Kurc T, Landman BA, Litjens G, Madani A, Maier-Hein K, Martel AL, Meijering E, Menze B, Moons KGM, Müller H, Nichyporuk B, Nickel F, Petersen J, Rafelski SM, Rajpoot N, Reyes M, Riegler MA, Rieke N, Saez-Rodriguez J, Sánchez CI, Shetty S, Summers RM, Taha AA, Tiulpin A, Tsaftaris SA, Van Calster B, Varoquaux G, Yaniv ZR, Jäger PF, Maier-Hein L et al., 2024, Understanding metric-related pitfalls in image analysis validation, Nature Methods, Vol: 21, Pages: 182-194, ISSN: 1548-7091

Validation metrics are key for tracking scientific progress and bridging the current chasm between artificial intelligence research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately. Although taking into account the individual strengths, weaknesses and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multistage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides a reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Although focused on biomedical image analysis, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. The work serves to enhance global comprehension of a key topic in image analysis validation.

Journal article

Day TG, Matthew J, Budd SF, Venturini L, Wright R, Farruggia A, Vigneswaran TV, Zidere V, Hajnal JV, Razavi R, Simpson JM, Kainz B et al., 2024, Interaction between clinicians and artificial intelligence to detect fetal atrioventricular septal defects on ultrasound: how can we optimize collaborative performance?, Ultrasound Obstet Gynecol

OBJECTIVES: Artificial intelligence (AI) has shown promise in improving the performance of fetal ultrasound screening in detecting congenital heart disease (CHD). The effect of giving AI advice to human operators has not been studied in this context. Giving additional information about AI model workings, such as confidence scores for AI predictions, may be a way of improving performance further. Our aims were to investigate whether AI advice improved overall diagnostic accuracy (using a single CHD lesion as an exemplar), and to see what, if any, additional information given to clinicians optimized the overall performance of the clinician-AI team. METHODS: An AI model was trained to classify a single fetal CHD lesion (atrioventricular septal defect, AVSD), using a retrospective cohort of 121,130 cardiac four chamber images extracted from 173 ultrasound scan videos (98 with normal hearts, 75 with AVSD). A ResNet50 model architecture was used. Temperature scaling of model prediction probability was performed on a validation set, and gradient-weighted class activation maps (grad-CAMs) produced. Ten clinicians (two consultant fetal cardiologists, three trainees in pediatric cardiology, and five fetal cardiac sonographers) were recruited from a center of fetal cardiology to participate. Each participant was shown 2,000 fetal four chamber images in a random order (1,000 normal and 1,000 AVSD). The dataset was comprised of 500 images, each shown in four conditions: 1) image alone without AI output; 2) image with binary AI classification; 3) image with AI model confidence; 4) image with gradient-weighted class activation map image overlays. The clinicians were asked to classify each image as normal or AVSD. RESULTS: 20,000 image classifications were recorded from 10 clinicians. The AI model alone achieved an accuracy of 0.798 (95% CI 0.760 - 0.832), sensitivity of 0.868 (95% CI 0.834 - 0.902) and specificity of 0.728 (95% CI 0.702 - 0.754), and the clinicians without AI achiev

Journal article
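
The abstract above mentions temperature scaling of prediction probabilities on a validation set. A minimal sketch of the idea, assuming a simple grid search over candidate temperatures instead of the gradient-based fit that is more common in practice; the logits, labels and function names are hypothetical and not taken from the study:

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a positive temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def mean_nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y])
              for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, candidates):
    """Pick the candidate temperature with the lowest validation NLL."""
    return min(candidates, key=lambda t: mean_nll(logit_rows, labels, t))


# Hypothetical validation logits (classes: 0 = normal, 1 = AVSD).
# The last sample is confidently wrong, so a temperature above 1 is expected.
val_logits = [[4.0, 0.0], [3.0, 0.0], [0.0, 5.0], [4.0, 0.0]]
val_labels = [0, 0, 1, 1]
best_t = fit_temperature(val_logits, val_labels, [0.5, 1.0, 2.0, 4.0])
calibrated = softmax(val_logits[0], best_t)
```

Dividing the logits by a temperature above 1 flattens the predicted probabilities without changing which class is ranked first, so accuracy is unaffected while confidence becomes less extreme.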

Cechnicka S, Ball J, Reynaud H, Arthurs C, Roufosse C, Kainz B et al., 2024, Realistic Data Enrichment for Robust Image Segmentation in Histopathology, Pages: 63-72, ISSN: 0302-9743

Poor performance of quantitative analysis in histopathological Whole Slide Images (WSI) has been a significant obstacle in clinical practice. Annotating large-scale WSIs manually is a demanding and time-consuming task, unlikely to yield the expected results when used for fully supervised learning systems. Rarely observed disease patterns and large differences in object scales are difficult to model through conventional patient intake. Prior methods either fall back to direct disease classification, which only requires learning a few factors per image, or report on average image segmentation performance, which is highly biased towards majority observations. Geometric image augmentation is commonly used to improve robustness for average case predictions and to enrich limited datasets. So far no method provided sampling of a realistic posterior distribution to improve stability, e.g. for the segmentation of imbalanced objects within images. Therefore, we propose a new approach, based on diffusion models, which can enrich an imbalanced dataset with plausible examples from underrepresented groups by conditioning on segmentation maps. Our method can simply expand limited clinical datasets making them suitable to train machine learning pipelines, and provides an interpretable and human-controllable way of generating histopathology images that are indistinguishable from real ones to human experts. We validate our findings on two datasets, one from the public domain and one from a Kidney Transplant study. (The source code and trained models will be publicly available at the time of the conference, on Hugging Face and GitHub.)

Conference paper

Sarapata G, Dushin Y, Morinan G, Ong J, Budhdeo S, Kainz B, O'Keeffe J et al., 2023, Video-Based Activity Recognition for Automated Motor Assessment of Parkinson's Disease, IEEE Journal of Biomedical and Health Informatics, Vol: 27, Pages: 5032-5041, ISSN: 2168-2194

Journal article

Wright R, Gomez A, Zimmer VA, Toussaint N, Khanal B, Matthew J, Skelton E, Kainz B, Rueckert D, Hajnal JV, Schnabel JA et al., 2023, Fast fetal head compounding from multi-view 3D ultrasound, Medical Image Analysis, Vol: 89, ISSN: 1361-8415

Journal article

Day TG, Budd S, Tan J, Matthew J, Skelton E, Jowett V, Lloyd D, Gomez A, Hajnal JV, Razavi R, Kainz B, Simpson JM et al., 2023, Prenatal diagnosis of hypoplastic left heart syndrome on ultrasound using artificial intelligence: How does performance compare to a current screening programme?, Prenatal Diagnosis, ISSN: 0197-3851

Journal article

Cao Y, Monod A, Vlontzos A, Schmidtke L, Kainz B et al., 2023, Topological information retrieval with dilation-invariant bottleneck comparative measures, Information and Inference: a Journal of the IMA, Vol: 12, Pages: 1964-1996, ISSN: 2049-8772

Appropriately representing elements in a database so that queries may be accurately matched is a central task in information retrieval; recently, this has been achieved by embedding the graphical structure of the database into a manifold in a hierarchy-preserving manner using a variety of metrics. Persistent homology is a tool commonly used in topological data analysis that is able to rigorously characterize a database in terms of both its hierarchy and connectivity structure. Computing persistent homology on a variety of embedded datasets reveals that some commonly used embeddings fail to preserve the connectivity. We show that those embeddings which successfully retain the database topology coincide in persistent homology by introducing two dilation-invariant comparative measures to capture this effect: in particular, they address the issue of metric distortion on manifolds. We provide an algorithm for their computation that exhibits greatly reduced time complexity over existing methods. We use these measures to perform the first instance of topology-based information retrieval and demonstrate its increased performance over the standard bottleneck distance for persistent homology. We showcase our approach on databases of different data varieties including text, videos and medical images.

Journal article

Day TG, Matthew J, Budd S, Hajnal JV, Simpson JM, Razavi R, Kainz B et al., 2023, Sonographer interaction with artificial intelligence: collaboration or conflict?, Ultrasound in Obstetrics & Gynecology, Vol: 62, Pages: 167-174, ISSN: 0960-7692

Journal article

Barcroft J, Lebbos C, Linton-Reid K, Landolfo C, Al Memar M, Bharwani N, Parker N, Kyriacou C, Cooper N, Murugesu S, Pikovsky M, Novak A, Bourne T, Kainz B, Saso S et al., 2023, EP.0248 Automated ultrasound feature extraction in adnexal masses, RCOG World Congress 2023, Publisher: Wiley, Pages: 98-98, ISSN: 1470-0328

Conference paper

Barcroft J, Lebbos C, Linton-Reid K, Landolfo C, Al Memar M, Bharwani N, Parker N, Kyriacou C, Cooper N, Murugesu S, Pikovsky M, Novak A, Bourne T, Kainz B, Saso S et al., 2023, Automated ultrasound feature extraction in adnexal masses, Publisher: Wiley, Pages: 98-98, ISSN: 1470-0328

Conference paper

Mitchell S, Bailey F, Smith A, Sayasneh A et al., 2023, Category - Imaging in Gynaecology, Publisher: Wiley, Pages: 97-97, ISSN: 1470-0328

Conference paper

Hinterreiter A, Humer C, Kainz B, Streit M et al., 2023, ParaDime: A Framework for Parametric Dimensionality Reduction, Computer Graphics Forum, Vol: 42, Pages: 337-348, ISSN: 0167-7055

Journal article

Day TG, Simpson JM, Razavi R, Kainz B et al., 2023, Improving image labelling quality, Nature Machine Intelligence, Vol: 5, Pages: 335-336

Journal article

Day TG, Kainz B, Razavi R, Simpson J et al., 2023, RE: Wang et al. Diagnosis of fetal total anomalous pulmonary venous connection based on the post-left atrium space ratio using artificial intelligence, Prenatal Diagnosis, Vol: 43, Pages: 400-401, ISSN: 0197-3851

Journal article

Vlontzos A, Kainz B, Gilligan-Lee CM, 2023, Estimating categorical counterfactuals via deep twin networks, Nature Machine Intelligence, Vol: 5, Pages: 159-168, ISSN: 2522-5839

Counterfactual inference is a powerful tool, capable of solving challenging problems in high-profile sectors. To perform counterfactual inference, we require knowledge of the underlying causal mechanisms. However, causal mechanisms cannot be uniquely determined from observations and interventions alone. This raises the question of how to choose the causal mechanisms so that the resulting counterfactual inference is trustworthy in a given domain. This question has been addressed in causal models with binary variables, but for the case of categorical variables, it remains unanswered. We address this challenge by introducing for causal models with categorical variables the notion of counterfactual ordering, a principle positing desirable properties that causal mechanisms should possess and prove that it is equivalent to specific functional constraints on the causal mechanisms. To learn causal mechanisms satisfying these constraints, and perform counterfactual inference with them, we introduce deep twin networks. These are deep neural networks that, when trained, are capable of twin network counterfactual inference—an alternative to the abduction–action–prediction method. We empirically test our approach on diverse real-world and semisynthetic data from medicine, epidemiology and finance, reporting accurate estimation of counterfactual probabilities while demonstrating the issues that arise with counterfactual reasoning when counterfactual ordering is not enforced.

Journal article

Ma Q, Li L, Robinson EC, Kainz B, Rueckert D, Alansary A et al., 2023, CortexODE: learning cortical surface reconstruction by neural ODEs, IEEE Transactions on Medical Imaging, Vol: 42, Pages: 430-443, ISSN: 0278-0062

We present CortexODE, a deep learning framework for cortical surface reconstruction. CortexODE leverages neural ordinary differential equations (ODEs) to deform an input surface into a target shape by learning a diffeomorphic flow. The trajectories of the points on the surface are modeled as ODEs, where the derivatives of their coordinates are parameterized via a learnable Lipschitz-continuous deformation network. This provides theoretical guarantees for the prevention of self-intersections. CortexODE can be integrated to an automatic learning-based pipeline, which reconstructs cortical surfaces efficiently in less than 5 seconds. The pipeline utilizes a 3D U-Net to predict a white matter segmentation from brain Magnetic Resonance Imaging (MRI) scans, and further generates a signed distance function that represents an initial surface. Fast topology correction is introduced to guarantee homeomorphism to a sphere. Following the isosurface extraction step, two CortexODE models are trained to deform the initial surface to white matter and pial surfaces respectively. The proposed pipeline is evaluated on large-scale neuroimage datasets in various age groups including neonates (25-45 weeks), young adults (22-36 years) and elderly subjects (55-90 years). Our experiments demonstrate that the CortexODE-based pipeline can achieve less than 0.2mm average geometric error while being orders of magnitude faster compared to conventional processing pipelines.

Journal article

Baugh M, Tan J, Müller JP, Dombrowski M, Batten J, Kainz B et al., 2023, Many Tasks Make Light Work: Learning to Localise Medical Anomalies from Multiple Synthetic Tasks, Pages: 162-172, ISBN: 9783031439063

There is a growing interest in single-class modelling and out-of-distribution detection as fully supervised machine learning models cannot reliably identify classes not included in their training. The long tail of infinitely many out-of-distribution classes in real-world scenarios, e.g., for screening, triage, and quality control, means that it is often necessary to train single-class models that represent an expected feature distribution, e.g., from only strictly healthy volunteer data. Conventional supervised machine learning would require the collection of datasets that contain enough samples of all possible diseases in every imaging modality, which is not realistic. Self-supervised learning methods with synthetic anomalies are currently amongst the most promising approaches, alongside generative auto-encoders that analyse the residual reconstruction error. However, all methods suffer from a lack of structured validation, which makes calibration for deployment difficult and dataset-dependent. Our method alleviates this by making use of multiple visually-distinct synthetic anomaly learning tasks for both training and validation. This enables more robust training and generalisation. With our approach we can readily outperform state-of-the-art methods, which we demonstrate on exemplars in brain MRI and chest X-rays. Code is available at https://github.com/matt-baugh/many-tasks-make-light-work.

Book chapter

Shkëmbi G, Müller JP, Li Z, Breininger K, Schüffler P, Kainz B et al., 2023, Whole Slide Multiple Instance Learning for Predicting Axillary Lymph Node Metastasis, Pages: 11-20, ISBN: 9783031449918

Breast cancer is a major concern for women’s health globally, with axillary lymph node (ALN) metastasis identification being critical for prognosis evaluation and treatment guidance. This paper presents a deep learning (DL) classification pipeline for quantifying clinical information from digital core-needle biopsy (CNB) images, with one step less than existing methods. A publicly available dataset of 1058 patients was used to evaluate the performance of different baseline state-of-the-art (SOTA) DL models in classifying ALN metastatic status based on CNB images. An extensive ablation study of various data augmentation techniques was also conducted. Finally, the manual tumor segmentation and annotation step performed by the pathologists was assessed. Our proposed training scheme outperformed SOTA by 3.73%. Source code is available here.

Book chapter

Jehn C, Müller JP, Kainz B, 2023, Learnable Slice-to-volume Reconstruction for Motion Compensation in Fetal Magnetic Resonance Imaging, Pages: 25-31, ISSN: 1431-472X

Reconstructing motion-free 3D magnetic resonance imaging (MRI) volumes of fetal organs comes with the challenge of motion artefacts due to fetal motion and maternal respiration. Current methods rely on iterative procedures of outlier removal, super-resolution (SR) and slice-to-volume registration (SVR). Long runtimes and missing volume preservation over multiple iterations are still challenges for widespread clinical implementation. We envision an end-to-end learnable reconstruction framework that enables faster inference times and that can be steered by downstream tasks like segmentation. Therefore, we propose a new hybrid architecture for fetal brain reconstruction, consisting of a fully differentiable pre-registration module and a CycleGAN model for 3D image-to-image translation that is pretrained on our custom-generated dataset of 209 pairs of low-resolution (LoRes) and high-resolution (HiRes) fetal brain volumes. Our results are evaluated quantitatively with respect to five different similarity metrics. We incorporate the learned perceptual image patch similarity (LPIPS) metric and apply it to quantify volumetric image similarity for the first time in literature. Furthermore, we evaluate the model outputs qualitatively and conduct an expert survey to compare our method’s reconstruction quality to an established approach.

Conference paper

Kainz B, 2023, Keynote: Beyond Supervised Learning Exploring Novel Machine Learning Approaches for Robust Medical Image Analysis, ISSN: 1431-472X

Machine learning has been widely regarded as a solution for diagnostic automation in medical image analysis, but there are still unsolved problems in robust modelling of normal appearance and identification of features pointing into the long tail of population data. In this talk, I will explore the fitness of machine learning for applications at the front line of care and high throughput population health screening, specifically in prenatal health screening with ultrasound and MRI, cardiac imaging, and bedside diagnosis of deep vein thrombosis. I will discuss the requirements for such applications and how quality control can be achieved through robust estimation of algorithmic uncertainties and automatic robust modelling of expected anatomical structures. I will also explore the potential for improving models through active learning and the accuracy of nonexpert labelling workforces. However, I will argue that supervised machine learning might not be fit for purpose, as it cannot handle the unknown and requires a lot of annotated examples from well-defined pathological appearance. This categorization paradigm cannot be deployed earlier in the diagnostic pathway or for health screening, where a growing number of potentially hundred-thousands of medically catalogued illnesses may be relevant for diagnosis. Therefore, I introduce the idea of normative representation learning as a new machine learning paradigm for medical imaging. This paradigm can provide patient-specific computational tools for robust confirmation of normality, image quality control, health screening, and prevention of disease before onset. I will present novel deep learning approaches that can learn without manual labels from healthy patient data only. Our initial success with single class learning and self-supervised learning will be discussed, along with an outlook into the future with causal machine learning methods and the potential of advanced generative models [1].

Conference paper

Zimmerer D, Full P, Isensee F, Jäger P, Adler T, Petersen J, Köhler G, Ross T, Reinke A, Kascenas A, Jensen BS, O’Neil AQ, Tan J, Hou B, Batten J, Qiu H, Kainz B, Shvetsova N, Fedulova I, Dylov DV, Yu B, Zhai J, Hu J, Si R, Zhou S, Wang S, Li X, Chen X, Zhao Y, Marimont SN, Tarroni G, Saase V, Maier-Hein L, Maier-Hein K et al., 2023, Abstract: MOOD 2020: A Public Benchmark for Out-of-distribution Detection and Localization on Medical Images, ISSN: 1431-472X

Detecting out-of-distribution (OoD) data is one of the greatest challenges in safe and robust deployment of machine learning algorithms in medicine. When the algorithms encounter cases that deviate from the distribution of the training data, they often produce incorrect and over-confident predictions. OoD detection algorithms aim to catch erroneous predictions in advance by analysing the data distribution and detecting potential instances of failure. Moreover, flagging OoD cases may support human readers in identifying incidental findings. Due to the increased interest in OoD algorithms, benchmarks for different domains have recently been established. In the medical imaging domain, for which reliable predictions are often essential, an open benchmark has been missing. We introduce the Medical-Out-Of-Distribution-Analysis-Challenge (MOOD) as an open, fair, and unbiased benchmark for OoD methods in the medical imaging domain. The analysis of the submitted algorithms shows that performance has a strong positive correlation with the perceived difficulty, and that all algorithms show a high variance for different anomalies, making it yet hard to recommend them for clinical practice. We also see a strong correlation between challenge ranking and performance on a simple toy test set, indicating that this might be a valuable addition as a proxy dataset during anomaly detection algorithm development [1].

Conference paper

Schmidtke L, Hou B, Vlontzos A, Kainz B et al., 2023, Self-supervised 3D Human Pose Estimation in Static Video via Neural Rendering, Pages: 704-713, ISSN: 0302-9743

Inferring 3D human pose from 2D images is a challenging and long-standing problem in the field of computer vision with many applications including motion capture, virtual reality, surveillance or gait analysis for sports and medicine. We present preliminary results for a method to estimate 3D pose from 2D video containing a single person and a static background without the need for any manual landmark annotations. We achieve this by formulating a simple yet effective self-supervision task: our model is required to reconstruct a random frame of a video given a frame from another timepoint and a rendered image of a transformed human shape template. Crucially for optimisation, our ray casting based rendering pipeline is fully differentiable, enabling end to end training solely based on the reconstruction task.

Conference paper

Dombrowski M, Reynaud H, Baugh M, Kainz B et al., 2023, Foreground-Background Separation through Concept Distillation from Generative Image Foundation Models, Pages: 988-998, ISSN: 1550-5499

Curating datasets for object segmentation is a difficult task. With the advent of large-scale pre-trained generative models, conditional image generation has been given a significant boost in result quality and ease of use. In this paper, we present a novel method that enables the generation of general foreground-background segmentation models from simple textual descriptions, without requiring segmentation labels. We leverage and explore pre-trained latent diffusion models, to automatically generate weak segmentation masks for concepts and objects. The masks are then used to fine-tune the diffusion model on an inpainting task, which enables fine-grained removal of the object, while at the same time providing a synthetic foreground and background dataset. We demonstrate that using this method beats previous methods in both discriminative and generative performance and closes the gap with fully supervised training while requiring no pixel-wise object labels. We show results on the task of segmenting four different objects (humans, dogs, cars, birds) and a use case scenario in medical image analysis. The code is available at https://github.com/MischaD/fobadiffusion.

Conference paper

Zimmer VA, Gomez A, Skelton E, Wright R, Wheeler G, Deng S, Ghavami N, Lloyd K, Matthew J, Kainz B, Rueckert D, Hajnal JV, Schnabel JA et al., 2023, Placenta segmentation in ultrasound imaging: Addressing sources of uncertainty and limited field-of-view, Medical Image Analysis, Vol: 83, ISSN: 1361-8415

Automatic segmentation of the placenta in fetal ultrasound (US) is challenging due to the (i) high diversity of placenta appearance, (ii) the restricted quality in US resulting in highly variable reference annotations, and (iii) the limited field-of-view of US prohibiting whole placenta assessment at late gestation. In this work, we address these three challenges with a multi-task learning approach that combines the classification of placental location (e.g., anterior, posterior) and semantic placenta segmentation in a single convolutional neural network. Through the classification task the model can learn from larger and more diverse datasets while improving the accuracy of the segmentation task in particular in limited training set conditions. With this approach we investigate the variability in annotations from multiple raters and show that our automatic segmentations (Dice of 0.86 for anterior and 0.83 for posterior placentas) achieve human-level performance as compared to intra- and inter-observer variability. Lastly, our approach can deliver whole placenta segmentation using a multi-view US acquisition pipeline consisting of three stages: multi-probe image acquisition, image fusion and image segmentation. This results in high quality segmentation of larger structures such as the placenta in US with reduced image artifacts which are beyond the field-of-view of single probes.

Journal article
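
The Dice values quoted in the placenta segmentation abstract above measure the overlap between a predicted and a reference mask. A minimal sketch of the standard definition, assuming binary masks stored as nested lists; the example masks are made up for illustration:

```python
def dice_score(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    flat_pred = [v for row in pred for v in row]
    flat_target = [v for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    size_sum = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    if size_sum == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / size_sum


# Made-up 2x3 masks: 2 overlapping foreground pixels, 3 foreground pixels each.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
ref_mask = [[1, 0, 0],
            [0, 1, 1]]
score = dice_score(pred_mask, ref_mask)  # 2 * 2 / (3 + 3)
```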

Li L, Ma Q, Ouyang C, Li Z, Meng Q, Zhang W, Qiao M, Kyriakopoulou V, Hajnal JV, Rueckert D, Kainz B et al., 2023, Robust Segmentation via Topology Violation Detection and Feature Synthesis, Pages: 67-77, ISSN: 0302-9743

Despite recent progress of deep learning-based medical image segmentation techniques, fully automatic results often fail to meet clinically acceptable accuracy, especially when topological constraints should be observed, e.g., closed surfaces. Although modern image segmentation methods show promising results when evaluated based on conventional metrics such as the Dice score or Intersection-over-Union, these metrics do not reflect the correctness of a segmentation in terms of a required topological genus. Existing approaches estimate and constrain the topological structure via persistent homology (PH). However, these methods are not computationally efficient as calculating PH is not differentiable. To overcome this problem, we propose a novel approach for topological constraints based on the multi-scale Euler Characteristic (EC). To mitigate computational complexity, we propose a fast formulation for the EC that can inform the learning process of arbitrary segmentation networks via topological violation maps. Topological performance is further facilitated through a corrective convolutional network block. Our experiments on two datasets show that our method can significantly improve topological correctness.

Conference paper
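
To illustrate why the Euler Characteristic captures topology that Dice misses, the sketch below computes the classical EC of a 2D binary mask as vertices minus edges plus faces, treating each foreground pixel as a closed unit square. This is the textbook single-scale quantity, not the multi-scale, learning-compatible formulation the abstract proposes; the function name is hypothetical.

```python
import numpy as np

def euler_characteristic_2d(mask):
    """EC = V - E + F of the cubical complex formed by foreground pixels.

    Pixels are closed squares, so diagonal neighbours share a vertex
    (8-connectivity). For such a mask, EC = components - holes.
    """
    # Pad with background so border pixels have well-defined neighbours.
    m = np.pad(np.asarray(mask, dtype=bool), 1)
    faces = m.sum()
    # An edge belongs to the complex if either pixel sharing it is foreground.
    h_edges = (m[:-1, :] | m[1:, :]).sum()
    v_edges = (m[:, :-1] | m[:, 1:]).sum()
    # A vertex belongs to the complex if any of its four pixels is foreground.
    vertices = (m[:-1, :-1] | m[:-1, 1:] | m[1:, :-1] | m[1:, 1:]).sum()
    return int(vertices - (h_edges + v_edges) + faces)

# A 3x3 ring (one component, one hole) and the same shape with the hole filled.
ring = np.ones((3, 3), dtype=bool)
ring[1, 1] = False
filled = np.ones((3, 3), dtype=bool)
ec_ring = euler_characteristic_2d(ring)
ec_filled = euler_characteristic_2d(filled)
```

The two toy masks differ by a single pixel, so their Dice overlap is high, yet their EC values differ because one has a hole and the other does not.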

Reynaud H, Qiao M, Dombrowski M, Day T, Razavi R, Gomez A, Leeson P, Kainz B et al., 2023, Feature-Conditioned Cascaded Video Diffusion Models for Precise Echocardiogram Synthesis, Pages: 142-152, ISSN: 0302-9743

Image synthesis is expected to provide value for the translation of machine learning methods into clinical practice. Fundamental problems like model robustness, domain transfer, causal modelling, and operator training become approachable through synthetic data. In particular, heavily operator-dependent modalities like ultrasound imaging require robust frameworks for image and video generation. So far, video generation has only been possible by providing input data that is as rich as the output data, e.g., image sequence plus conditioning in → video out. However, clinical documentation is usually scarce and only single images are reported and stored, so retrospective patient-specific analysis or the generation of rich training data is impossible with current approaches. In this paper, we extend elucidated diffusion models for video modelling to generate plausible video sequences from single images and arbitrary conditioning with clinical parameters. We explore this idea within the context of echocardiograms by looking into the variation of the Left Ventricle Ejection Fraction, the most essential clinical metric gained from these examinations. We use the publicly available EchoNet-Dynamic dataset for all our experiments. Our image-to-sequence approach achieves an R² score of 93%, which is 38 points higher than recently proposed sequence-to-sequence generation methods. Code and weights are available at https://github.com/HReynaud/EchoDiffusion.

Conference paper

Ma Q, Li L, Kyriakopoulou V, Hajnal JV, Robinson EC, Kainz B, Rueckert D et al., 2023, Conditional Temporal Attention Networks for Neonatal Cortical Surface Reconstruction, Pages: 312-322, ISSN: 0302-9743

Cortical surface reconstruction plays a fundamental role in modeling the rapid brain development during the perinatal period. In this work, we propose Conditional Temporal Attention Network (CoTAN), a fast end-to-end framework for diffeomorphic neonatal cortical surface reconstruction. CoTAN predicts multi-resolution stationary velocity fields (SVF) from neonatal brain magnetic resonance images (MRI). Instead of integrating multiple SVFs, CoTAN introduces attention mechanisms to learn a conditional time-varying velocity field (CTVF) by computing the weighted sum of all SVFs at each integration step. The importance of each SVF, which is estimated by learned attention maps, is conditioned on the age of the neonates and varies with the time step of integration. The proposed CTVF defines a diffeomorphic surface deformation, which reduces mesh self-intersection errors effectively. It only requires 0.21 s to deform an initial template mesh to cortical white matter and pial surfaces for each brain hemisphere. CoTAN is validated on the Developing Human Connectome Project (dHCP) dataset with 877 3D brain MR images acquired from preterm and term born neonates. Compared to state-of-the-art baselines, CoTAN achieves superior performance with only 0.12 ± 0.03 mm geometric error and 0.07 ± 0.03% self-intersecting faces. The visualization of our attention maps illustrates that CoTAN indeed learns coarse-to-fine surface deformations automatically without intermediate supervision.

Conference paper
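
The idea in the abstract above of forming a time-varying velocity field as a weighted sum of stationary velocity fields at each integration step can be sketched as follows. This is a simplified illustration under stated assumptions, not the CoTAN implementation: the velocity fields are plain callables rather than network outputs, the weights are one scalar per field (the paper uses learned attention maps conditioned on age), integration is forward Euler, and all names are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1D array of logits."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def integrate_weighted_fields(points, fields, weight_logits, n_steps=10):
    """Forward-Euler integration of dx/dt = sum_k w_k(t) * v_k(x) on t in [0, 1].

    points:        (N, 3) array of vertex coordinates.
    fields:        list of K callables mapping (N, 3) positions to (N, 3) velocities.
    weight_logits: callable t -> (K,) logits; softmax turns them into weights.
    """
    x = np.array(points, dtype=float)
    h = 1.0 / n_steps
    for step in range(n_steps):
        t = step * h
        w = softmax(np.asarray(weight_logits(t), dtype=float))
        # Weighted sum of all stationary fields evaluated at the current positions.
        v = sum(wk * f(x) for wk, f in zip(w, fields))
        x = x + h * v
    return x

# Toy example: two constant fields with equal weights at every time step.
move_x = lambda x: np.tile([1.0, 0.0, 0.0], (len(x), 1))
move_y = lambda x: np.tile([0.0, 1.0, 0.0], (len(x), 1))
start = np.zeros((2, 3))
end = integrate_weighted_fields(start, [move_x, move_y], lambda t: [0.0, 0.0])
```

With equal weights, each vertex moves half a unit along x and half a unit along y over the unit time interval. Making `weight_logits` depend on `t` (and, in a fuller model, on age) changes which field dominates at each step.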

Kainz B, Schnabel J, Noble JA, Khanal B, Müller JP, Day T et al., 2023, Preface, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol: 14337 LNCS, Pages: v-vii, ISSN: 0302-9743

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
