Dr John Lees

Faculty of Medicine, School of Public Health

Visiting Researcher

Contact

+44 (0)20 7594 2939 j.lees Website

Location

UG4, Sir Alexander Fleming Building, South Kensington Campus

Summary

Publications

Xie O, Morris JM, Hayes AJ, Towers RJ, Jespersen MG, Lees JA, Ben Zakour NL, Berking O, Baines SL, Carter GP, Tonkin-Hill G, Schrieber L, McIntyre L, Lacey JA, James TB, Sriprakash KS, Beatson SA, Hasegawa T, Giffard P, Steer AC, Batzloff MR, Beall BW, Pinho MD, Ramirez M, Bessen DE, Dougan G, Bentley SD, Walker MJ, Currie BJ, Tong SYC, McMillan DJ, Davies MR et al., 2024, Inter-species gene flow drives ongoing evolution of Streptococcus pyogenes and Streptococcus dysgalactiae subsp. equisimilis, Nat Commun, Vol: 15

Streptococcus dysgalactiae subsp. equisimilis (SDSE) is an emerging cause of human infection with invasive disease incidence and clinical manifestations comparable to the closely related species, Streptococcus pyogenes. Through systematic genomic analyses of 501 disseminated SDSE strains, we demonstrate extensive overlap between the genomes of SDSE and S. pyogenes. More than 75% of core genes are shared between the two species with one third demonstrating evidence of cross-species recombination. Twenty-five percent of mobile genetic element (MGE) clusters and 16 of 55 SDSE MGE insertion regions were shared across species. Assessing potential cross-protection from leading S. pyogenes vaccine candidates on SDSE, 12/34 preclinical vaccine antigen genes were shown to be present in >99% of isolates of both species. Relevant to possible vaccine evasion, six vaccine candidate genes demonstrated evidence of inter-species recombination. These findings demonstrate previously unappreciated levels of genomic overlap between these closely related pathogens with implications for streptococcal pathobiology, disease surveillance and prevention.

Journal article

Batisti Biffignandi G, Chindelevitch L, Corbella M, Feil EJ, Sassera D, Lees JA et al., 2024, Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae, Microb Genom, Vol: 10

Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, albeit the interpretation of MIC is still needed. Nevertheless, precisely how we should handle MIC data when dealing with predictive models remains unclear, since they are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely, Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. Then we assessed how model prediction accuracy was affected when MICs were framed as regression and classification. Our results showed that treating the MICs differently depending on the number of concentration levels of antibiotic available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification. Our findings also underline how predicti

Journal article

Chapman LAC, Aubry M, Maset N, Russell TW, Knock ES, Lees JA, Mallet H-P, Cao-Lormeau V-M, Kucharski AJ et al., 2023, Impact of vaccinations, boosters and lockdowns on COVID-19 waves in French Polynesia, Nat Commun, Vol: 14

Estimating the impact of vaccination and non-pharmaceutical interventions on COVID-19 incidence is complicated by several factors, including successive emergence of SARS-CoV-2 variants of concern and changing population immunity from vaccination and infection. We develop an age-structured multi-strain COVID-19 transmission model and inference framework to estimate vaccination and non-pharmaceutical intervention impact accounting for these factors. We apply this framework to COVID-19 waves in French Polynesia and estimate that the vaccination programme averted 34.8% (95% credible interval: 34.5-35.2%) of 223,000 symptomatic cases, 49.6% (48.7-50.5%) of 5830 hospitalisations and 64.2% (63.1-65.3%) of 1540 hospital deaths that would have occurred in a scenario without vaccination up to May 2022. We estimate the booster campaign contributed 4.5%, 1.9%, and 0.4% to overall reductions in cases, hospitalisations, and deaths. Our results suggest that removing lockdowns during the first two waves would have had non-linear effects on incidence by altering accumulation of population immunity. Our estimates of vaccination and booster impact differ from those for other countries due to differences in age structure, previous exposure levels and timing of variant introduction relative to vaccination, emphasising the importance of detailed analysis that accounts for these factors.

Journal article

Derelle R, Lees J, Phelan J, Lalvani A, Arinaminpathy N, Chindelevitch L et al., 2023, fastlin: an ultra-fast program for Mycobacterium tuberculosis complex lineage typing, Bioinformatics, Vol: 39, ISSN: 1367-4803

SUMMARY: Fastlin is a bioinformatics tool designed for rapid Mycobacterium tuberculosis complex (MTBC) lineage typing. It utilizes an ultra-fast alignment-free approach to detect previously identified barcode single nucleotide polymorphisms associated with specific MTBC lineages. In a comprehensive benchmarking against existing tools, fastlin demonstrated high accuracy and significantly faster running times. AVAILABILITY AND IMPLEMENTATION: fastlin is freely available at https://github.com/rderelle/fastlin and can easily be installed using Conda.

Journal article

Horsfield ST, Tonkin-Hill G, Croucher NJ, Lees JA et al., 2023, Accurate and fast graph-based pangenome annotation and clustering with ggCaller, Genome Res, Vol: 33, Pages: 1622-1637

Bacterial genomes differ in both gene content and sequence mutations, which underlie extensive phenotypic diversity, including variation in susceptibility to antimicrobials or vaccine-induced immunity. To identify and quantify important variants, all genes within a population must be predicted, functionally annotated, and clustered, representing the "pangenome." Despite the volume of genome data available, gene prediction and annotation are currently conducted in isolation on individual genomes, which is computationally inefficient and frequently inconsistent across genomes. Here, we introduce the open-source software graph-gene-caller (ggCaller). ggCaller combines gene prediction, functional annotation, and clustering into a single workflow using population-wide de Bruijn graphs, removing redundancy in gene annotation and resulting in more accurate gene predictions and orthologue clustering. We applied ggCaller to simulated and real-world bacterial data sets containing hundreds or thousands of genomes, comparing it to current state-of-the-art tools. ggCaller has considerable speed-ups with equivalent or greater accuracy, particularly with data sets containing complex sources of error, such as assembly contamination or fragmentation. ggCaller is also an important extension to bacterial genome-wide association studies, enabling querying of annotated graphs for functional analyses. We highlight this application by functionally annotating DNA sequences with significant associations to tetracycline and macrolide resistance in Streptococcus pneumoniae, identifying key resistance determinants that were missed when using only a single reference genome. ggCaller is a novel bacterial genome analysis tool with applications in bacterial evolution and epidemiology.

Journal article

Horsfield ST, Croucher NJ, Lees JA, 2023, Accurate and fast graph-based pangenome annotation and clustering with ggCaller

Bacterial genomes differ in both gene content and sequence mutations, which can cause important clinical phenotypic differences such as vaccine escape or antimicrobial resistance. To identify and quantify important variants, all genes within a population must be predicted, functionally annotated and clustered, representing the ‘pangenome’. Despite the volume of genome data available, gene prediction and annotation are currently conducted in isolation on individual genomes, which is computationally inefficient and frequently inconsistent across genomes. Here, we introduce the open-source software graph-gene-caller (ggCaller; https://github.com/samhorsfield96/ggCaller). ggCaller combines gene prediction, functional annotation and clustering into a single step using population-wide de Bruijn Graphs, removing redundancy in gene annotation, and resulting in more accurate gene predictions and orthologue clustering. We applied ggCaller to simulated and real-world bacterial genome datasets, comparing it to current state-of-the-art tools. ggCaller is ~50x faster with equivalent or greater accuracy, particularly in datasets with complex sources of error, such as assembly contamination or fragmentation. ggCaller is also an important extension to bacterial genome-wide association studies, enabling querying of annotated graphs for functional analyses. We highlight this application by functionally annotating DNA sequences with significant associations to tetracycline and macrolide resistance in Streptococcus pneumoniae, identifying key resistance determinants that were missed when using only a single reference genome. ggCaller is a novel bacterial genome analysis tool with applications

Journal article

Mai TT, Lees JA, Gladstone RA, Corander J et al., 2023, Inferring the heritability of bacterial traits in the era of machine learning, Bioinform Adv, Vol: 3

Quantification of heritability is a fundamental desideratum in genetics, which allows an assessment of the contribution of additive genetic variation to the variability of a trait of interest. The traditional computational approaches for assessing the heritability of a trait have been developed in the field of quantitative genetics. However, the rise of modern population genomics with large sample sizes has led to the development of several new machine learning-based approaches to inferring heritability. In this article, we systematically summarize recent advances in machine learning which can be used to infer heritability. We focus on an application of these methods to bacterial genomes, where heritability plays a key role in understanding phenotypes such as antibiotic resistance and virulence, which are particularly important due to the rising frequency of antimicrobial resistance. By designing a heritability model incorporating realistic patterns of genome-wide linkage disequilibrium for a frequently recombining bacterial pathogen, we test the performance of a wide spectrum of different inference methods, including GCTA. In addition to the synthetic data benchmark, we present a comparison of the methods for antibiotic resistance traits for multiple bacterial pathogens. Insights from the benchmarking and real data analyses indicate a highly variable performance of the different methods and suggest that heritability inference would likely benefit from tailoring of the methods to the specific genetic architecture of the target organism. AVAILABILITY AND IMPLEMENTATION: The R codes and data used in the numerical experiments are available at: https://github.com/tienmt/her_MLs.

Journal article

Lees JA, Tonkin-Hill G, Yang Z, Corander J et al., 2022, Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation, Philosophical Transactions of the Royal Society B: Biological Sciences, Vol: 377, ISSN: 0962-8436

Journal article

Aggarwal SD, Lees JA, Jacobs NT, Bee GCW, Abruzzo AR, Weiser JN et al., 2022, BlpC-mediated selfish program leads to rapid loss of Streptococcus pneumoniae clonal diversity during infection

Chromosomal barcoding and high-throughput sequencing were used to investigate the population dynamics of Streptococcus pneumoniae. During infant mouse colonization, >35-fold reduction in diversity and expansion of a single clonal lineage was observed within 1 day post-inoculation. This loss of diversity was not due to immune factors, host microbiota or exclusively because of genetic drift. Rather, it required the expression of blp bacteriocins induced by the BlpC-quorum sensing pheromone. This points towards the role of intra-strain competition whereby the subpopulation reaching a quorum eliminates others that have yet to activate the blp locus. We show that this loss of diversity also restricts the number of unique clones that could establish colonization during transmission between hosts. Moreover, we show that genetic variation in the blp locus is associated with transmissibility in the human population. We posit this is due to its importance in clonal selection and its role as a selfish element.

Journal article

Kremer PHC, Ferwerda B, Bootsma HJ, Rots NY, Wijmenga-Monsuur AJ, Sanders EAM, Trzcinski K, Wyllie AL, Turner P, van der Ende A, Brouwer MC, Bentley SD, van de Beek D, Lees JA et al., 2022, Pneumococcal genetic variability in age-dependent bacterial carriage, eLife, Vol: 11, ISSN: 2050-084X

Journal article

Stevens EJ, Morse DJ, Bonini D, Duggan S, Brignoli T, Recker M, Lees JA, Croucher NJ, Bentley S, Wilson DJ, Earle SG, Dixon R, Nobbs A, Jenkinson H, van Opijnen T, Thibault D, Wilkinson OJ, Dillingham MS, Carlile S, McLoughlin RM, Massey RC et al., 2022, Targeted control of pneumolysin production by a mobile genetic element in Streptococcus pneumoniae, Microbial Genomics, Vol: 8, ISSN: 2057-5858

Journal article

Gladstone RA, Siira L, Brynildsrud OB, Vestrheim DF, Turner P, Clarke SC, Srifuengfung S, Ford R, Lehmann D, Egorova E, Voropaeva E, Haraldsson G, Kristinsson KG, McGee L, Breiman RF, Bentley SD, Sheppard CL, Fry NK, Corander J, Toropainen M, Steens A et al., 2022, International links between Streptococcus pneumoniae vaccine serotype 4 sequence type (ST) 801 in Northern European shipyard outbreaks of invasive pneumococcal disease, Vaccine, Vol: 40, Pages: 1054-1060, ISSN: 0264-410X

Journal article

Zangari T, Zafar MA, Lees JA, Abruzzo AR, Bee GCW, Weiser JN et al., 2021, Pneumococcal capsule blocks protection by immunization with conserved surface proteins, npj Vaccines, Vol: 6

Journal article

Sonabend R, Whittles LK, Imai N, Perez Guzman PN, Knock E, Rawson T, Gaythorpe KA, Djaafara A, Hinsley W, Fitzjohn R, Lees JA, Thekke Kanapram D, Volz E, Ghani A, Ferguson NM, Baguelin M, Cori A et al., 2021, Non-pharmaceutical interventions, vaccination, and the SARS-CoV-2 delta variant in England: a mathematical modelling study, The Lancet, Vol: 398, Pages: 1825-1835, ISSN: 0140-6736

Background: England's COVID-19 roadmap out of lockdown policy set out the timeline and conditions for the stepwise lifting of non-pharmaceutical interventions (NPIs) as vaccination roll-out continued, with step one starting on March 8, 2021. In this study, we assess the roadmap, the impact of the delta (B.1.617.2) variant of SARS-CoV-2, and potential future epidemic trajectories. Methods: This mathematical modelling study was done to assess the UK Government's four-step process to easing lockdown restrictions in England, UK. We extended a previously described model of SARS-CoV-2 transmission to incorporate vaccination and multi-strain dynamics to explicitly capture the emergence of the delta variant. We calibrated the model to English surveillance data, including hospital admissions, hospital occupancy, seroprevalence data, and population-level PCR testing data using a Bayesian evidence synthesis framework, then modelled the potential trajectory of the epidemic for a range of different schedules for relaxing NPIs. We estimated the resulting number of daily infections and hospital admissions, and daily and cumulative deaths. Three scenarios spanning a range of optimistic to pessimistic vaccine effectiveness, waning natural immunity, and cross-protection from previous infections were investigated. We also considered three levels of mixing after the lifting of restrictions. Findings: The roadmap policy was successful in offsetting the increased transmission resulting from lifting NPIs starting on March 8, 2021, with increasing population immunity through vaccination. However, because of the emergence of the delta variant, with an estimated transmission advantage of 76% (95% credible interval [95% CrI] 69–83) over alpha, fully lifting NPIs on June 21, 2021, as originally planned might have led to 3900 (95% CrI 1500–5700) peak daily hospital admissions under our central parameter scenario. Delaying until July 19, 2021, reduced peak hospital admissions by three fol

Journal article

Lees JA, Tonkin-Hill G, Yang Z, Corander J et al., 2021, Mandrake: visualising microbial population structure by embedding millions of genomes into a low-dimensional representation

In less than a decade, population genomics of microbes has progressed from the effort of sequencing dozens of strains to thousands, or even tens of thousands of strains in a single study. There are now hundreds of thousands of genomes available even for a single bacterial species and the number of genomes is expected to continue to increase at an accelerated pace given the advances in sequencing technology and widespread genomic surveillance initiatives. This explosion of data calls for innovative methods to enable rapid exploration of the structure of a population based on different data modalities, such as multiple sequence alignments, assemblies and estimates of gene content across different genomes. Here we present Mandrake, an efficient implementation of a dimensional reduction method tailored for the needs of large-scale population genomics. Mandrake is capable of visualising population structure from millions of whole genomes and we illustrate its usefulness with several data sets representing major pathogens. Our method is freely available both as an analysis pipeline (https://github.com/johnlees/mandrake) and as a browser-based interactive application (https://gtonkinhill.github.io/mandrake-web/).

Journal article

Knock ES, Whittles LK, Lees JA, Perez-Guzman PN, Verity R, FitzJohn RG, Gaythorpe KAM, Imai N, Hinsley W, Okell LC, Rosello A, Kantas N, Walters CE, Bhatia S, Watson OJ, Whittaker C, Cattarino L, Boonyasiri A, Djaafara BA, Fraser K, Fu H, Wang H, Xi X, Donnelly CA, Jauneikaite E, Laydon DJ, White PJ, Ghani AC, Ferguson NM, Cori A, Baguelin M et al., 2021, Key epidemiological drivers and impact of interventions in the 2020 SARS-CoV-2 epidemic in England, Science Translational Medicine, Vol: 13, Pages: 1-12, ISSN: 1946-6234

We fitted a model of SARS-CoV-2 transmission in care homes and the community to regional surveillance data for England. Compared with other approaches, our model provides a synthesis of multiple surveillance data streams into a single coherent modelling framework allowing transmission and severity to be disentangled from features of the surveillance system. Of the control measures implemented, only national lockdown brought the reproduction number (R_t^eff) below 1 consistently; if introduced one week earlier it could have reduced deaths in the first wave from an estimated 48,600 to 25,600 (95% credible interval [95%CrI]: 15,900-38,400). The infection fatality ratio decreased from 1.00% (95%CrI: 0.85%-1.21%) to 0.79% (95%CrI: 0.63%-0.99%), suggesting improved clinical care. The infection fatality ratio was higher in the elderly residing in care homes (23.3%, 95%CrI: 14.7%-35.2%) than those residing in the community (7.9%, 95%CrI: 5.9%-10.3%). On 2nd December 2020 England was still far from herd immunity, with regional cumulative infection incidence between 7.6% (95%CrI: 5.4%-10.2%) and 22.3% (95%CrI: 19.4%-25.4%) of the population. Therefore, any vaccination campaign will need to achieve high coverage and a high degree of protection in vaccinated individuals to allow non-pharmaceutical interventions to be lifted without a resurgence of transmission.

Journal article

D'Aeth JC, van der Linden MPG, McGee L, De Lencastre H, Turner P, Song J-H, Lo SW, Gladstone RA, Sa-Leao R, Ko KS, Hanage WP, Breiman RF, Beall B, Bentley SD, Croucher NJ, GPS Consortium et al., 2021, The role of interspecies recombinations in the evolution of antibiotic-resistant pneumococci, eLife, Vol: 10, ISSN: 2050-084X

The evolutionary histories of the antibiotic-resistant Streptococcus pneumoniae lineages PMEN3 and PMEN9 were reconstructed using global collections of genomes. In PMEN3, one resistant clade spread worldwide, and underwent 25 serotype switches, enabling evasion of vaccine-induced immunity. In PMEN9, only 9 switches were detected, and multiple resistant lineages emerged independently and circulated locally. In Germany, PMEN9’s expansion correlated significantly with the macrolide:penicillin consumption ratio. These isolates were penicillin sensitive but macrolide resistant, through a homologous recombination that integrated Tn1207.1 into a competence gene, preventing further diversification via transformation. Analysis of a species-wide dataset found 183 acquisitions of macrolide resistance, and multiple gains of the tetracycline-resistant transposon Tn916, through homologous recombination, often originating in other streptococcal species. Consequently, antibiotic selection preserves atypical recombination events that cause sequence divergence and structural variation throughout the S. pneumoniae chromosome. These events reveal the genetic exchanges between species normally counter-selected until perturbed by clinical interventions.

Journal article

Gladstone RA, McNally A, Pontinen AK, Tonkin-Hill G, Lees JA, Skyten K, Cleon F, Christensen MOK, Haldorsen BC, Bye KK, Gammelsrud KW, Hjetland R, Kummel A, Larsen HE, Lindemann PC, Lohr IH, Marvik A, Nilsen E, Noer MT, Simonsen GS, Steinbakk M, Tofteland S, Vattoy M, Bentley SD, Croucher NJ, Parkhill J, Johnsen PJ, Samuelsen O, Corander J et al., 2021, Emergence and dissemination of antimicrobial resistance in Escherichia coli causing bloodstream infections in Norway in 2002-17: a nationwide, longitudinal, microbial population genomic study, The Lancet Microbe, Vol: 2, Pages: E331-E341, ISSN: 2666-5247

Background: The clonal diversity underpinning trends in multidrug resistant Escherichia coli causing bloodstream infections remains uncertain. We aimed to determine the contribution of individual clones to resistance over time, using large-scale genomics-based molecular epidemiology. Methods: This was a longitudinal, E coli population, genomic, cohort study that sampled isolates from 22 512 E coli bloodstream infections included in the Norwegian surveillance programme on resistant microbes (NORM) from 2002 to 2017. 15 of 22 laboratories were able to share their isolates, and the first 22·5% of isolates from each year were requested. We used whole genome sequencing to infer the population structure (PopPUNK), and we investigated the clade composition of the dominant multidrug resistant clonal complex (CC)131 using genetic markers previously reported for sequence type (ST)131, effective population size (BEAST), and presence of determinants of antimicrobial resistance (ARIBA, PointFinder, and ResFinder databases) over time. We compared these features between the 2002–10 and 2011–17 time periods. We also compared our results with those of a longitudinal study from the UK done between 2001 and 2011. Findings: Of the 3500 isolates requested from the participating laboratories, 3397 (97·1%) were received, of which 3254 (95·8%) were successfully sequenced and included in the analysis. A significant increase in the number of multidrug resistant CC131 isolates from 71 (5·6%) of 1277 in 2002–10 to 207 (10·5%) of 1977 in 2011–17 (p<0·0001), was the largest clonal expansion. CC131 was the most common clone in extended-spectrum β-lactamase (ESBL)-positive isolates (75 [58·6%] of 128) and fluoroquinolone non-susceptible isolates (148 [39·2%] of 378). Within CC131, clade A increased in prevalence from 2002, whereas the global multidrug resistant clade C2 was not observed until 2007. Multiple de-n

Journal article

FitzJohn RG, Knock ES, Whittles LK, Perez-Guzman PN, Bhatia S, Guntoro F, Watson OJ, Whittaker C, Ferguson NM, Cori A, Baguelin M, Lees JA et al., 2021, Reproducible parallel inference and simulation of stochastic state space models using odin, dust, and mcstate [version 2; peer review: 2 approved], Wellcome Open Research, Vol: 5, ISSN: 2398-502X

State space models, including compartmental models, are used to model physical, biological and social phenomena in a broad range of scientific fields. A common way of representing the underlying processes in these models is as a system of stochastic processes which can be simulated forwards in time. Inference of model parameters based on observed time-series data can then be performed using sequential Monte Carlo techniques. However, using these methods for routine inference problems can be made difficult due to various engineering considerations: allowing model design to change in response to new data and ideas, writing model code which is highly performant, and incorporating all of this with up-to-date statistical techniques. Here, we describe a suite of packages in the R programming language designed to streamline the design and deployment of state space models, targeted at infectious disease modellers but suitable for other domains. Users describe their model in a familiar domain-specific language, which is converted into parallelised C++ code. A fast, parallel, reproducible random number generator is then used to run large numbers of model simulations in an efficient manner. We also provide standard inference and prediction routines, though the model simulator can be used directly if these do not meet the user's needs. These packages provide guarantees on reproducibility and performance, allowing the user to focus on the model itself, rather than the underlying computation. The ability to automatically generate high-performance code that would be tedious and time-consuming to write and verify manually, particularly when adding further structure to compartments, is crucial for infectious disease modellers. Our packages have been critical to the development cycle of our ongoing real-time modelling efforts in the COVID-19 pandemic, and have the potential to do the same for models used in a number of different domains.

Journal article

McCabe R, Kont M, Schmit N, Whittaker C, Lochen A, Baguelin M, Knock E, Whittles L, Lees J, Brazeau N, Walker P, Ghani A, Ferguson N, White P, Donnelly C, Hauck K, Watson O et al., 2021, Modelling ICU capacity under different epidemiological scenarios of the COVID-19 pandemic in three western European countries, International Journal of Epidemiology, Vol: 50, Pages: 753-767, ISSN: 0300-5771

Background: The coronavirus disease 2019 (COVID-19) pandemic has placed enormous strain on intensive care units (ICUs) in Europe. Ensuring access to care, irrespective of COVID-19 status, in winter 2020/21 is essential. Methods: An integrated model of hospital capacity planning and epidemiological projections of COVID-19 patients is used to estimate the demand for and resultant spare capacity of ICU beds, staff, and ventilators under different epidemic scenarios in France, Germany, and Italy across the 2020/21 winter period. The effect of implementing lockdowns triggered by different numbers of COVID-19 patients in ICU under varying levels of effectiveness is examined, using a ‘dual-demand’ (COVID-19 and non-COVID-19) patient model. Results: Without sufficient mitigation, we estimate that COVID-19 ICU patient numbers will exceed those seen in the first peak, resulting in substantial capacity deficits, with beds being consistently found to be the most constrained resource. Reactive lockdowns could lead to large improvements in ICU capacity during the winter season, with pressure being most effectively alleviated when lockdown is triggered early and sustained under a higher level of suppression. The success of such interventions also depends on baseline bed numbers and average non-COVID-19 patient occupancy. Conclusions: Reductions in capacity deficits under different scenarios must be weighed against the feasibility and drawbacks of further lockdowns. Careful, continuous decision-making by national policymakers will be required across the winter period 2020/21.

Journal article

Hogan AB, Winskill P, Watson OJ, Walker PGT, Whittaker C, Baguelin M, Brazeau NF, Charles GD, Gaythorpe KAM, Hamlet A, Knock E, Laydon DJ, Lees JA, Løchen A, Verity R, Whittles LK, Muhib F, Hauck K, Ferguson NM, Ghani AC et al., 2021, Within-country age-based prioritisation, global allocation, and public health impact of a vaccine against SARS-CoV-2: a mathematical modelling analysis, Vaccine, Vol: 39, Pages: 2995-3006, ISSN: 0264-410X

The worldwide endeavour to develop safe and effective COVID-19 vaccines has been extraordinary, and vaccination is now underway in many countries. However, the doses available in 2021 are likely to be limited. We extended a mathematical model of SARS-CoV-2 transmission across different country settings to evaluate the public health impact of potential vaccines using WHO-developed target product profiles. We identified optimal vaccine allocation strategies within- and between-countries to maximise averted deaths under constraints on dose supply. We found that the health impact of SARS-CoV-2 vaccination depends on the cumulative population-level infection incidence when vaccination begins, the duration of natural immunity, the trajectory of the epidemic prior to vaccination, and the level of healthcare available to effectively treat those with disease. Within a country we find that for a limited supply (doses for <20% of the population) the optimal strategy is to target the elderly. However, with a larger supply, if vaccination can occur while other interventions are maintained, the optimal strategy switches to targeting key transmitters to indirectly protect the vulnerable. As supply increases, vaccines that reduce or block infection have a greater impact than those that prevent disease alone due to the indirect protection provided to high-risk groups. Given a 2 billion global dose supply in 2021, we find that a strategy in which doses are allocated to countries proportional to population size is close to optimal in averting deaths and aligns with the ethical principles agreed in pandemic preparedness planning.

Journal article

Watson O, Alhaffar M, Mehchy Z, Whittaker C, Akil Z, Brazeau N, Cuomo-Dannenburg G, Hamlet A, Thompson H, Baguelin M, Fitzjohn R, Knock E, Lees J, Whittles L, Mellan T, Winskill P, COVID-19 Response Team IC, Howard N, Clapham H, Checchi F, Ferguson N, Ghani A, Walker P, Beals E et al., 2021, Leveraging community mortality indicators to infer COVID-19 mortality and transmission dynamics in Damascus, Syria, Nature Communications, Vol: 12, Pages: 1-10, ISSN: 2041-1723

The COVID-19 pandemic has resulted in substantial mortality worldwide. However, to date, countries in the Middle East and Africa have reported considerably lower mortality rates than in Europe and the Americas. Motivated by reports of an overwhelmed health system, we estimate the likely under-ascertainment of COVID-19 mortality in Damascus, Syria. Using all-cause mortality data, we fit a mathematical model of COVID-19 transmission to reported mortality, estimating that 1.25% of COVID-19 deaths (sensitivity range 1.00% – 3.00%) have been reported as of 2 September 2020. By 2 September, we estimate that 4,380 (95% CI: 3,250 – 5,550) COVID-19 deaths in Damascus may have been missed, with 39.0% (95% CI: 32.5% – 45.0%) of the population in Damascus estimated to have been infected. Accounting for under-ascertainment corroborates reports of exceeded hospital bed capacity and is validated by community-uploaded obituary notifications, which confirm extensive unreported mortality in Damascus.

Journal article

Lees J, Croucher NJ, Anderson D, Stajich J et al., 2021, johnlees/PopPUNK: PopPUNK v2.4.0

Minimum sketchlib version for this release is v1.7.0

Software

Pontinen AK, Top J, Arredondo-Alonso S, Tonkin-Hill G, Freitas AR, Novais C, Gladstone RA, Pesonen M, Meneses R, Pesonen H, Lees JA, Jamrozy D, Bentley SD, Lanza VF, Torres C, Peixe L, Coque TM, Parkhill J, Schurch AC, Willems RJL, Corander J et al., 2021, Apparent nosocomial adaptation of Enterococcus faecalis predates the modern hospital era, Nature Communications, Vol: 12

Citations: 39

Journal article

Kremer PHC, Ferwerda B, Bootsma HJ, Rots NY, Wijmega-Monsuur AJ, Sanders EAM, Trzciński K, Wyllie AL, Turner P, van der Ende A, Brouwer MC, Bentley SD, van de Beek D, Lees JA et al., 2021, Pneumococcal genetic variability influences age-dependent bacterial carriage

The pneumococcal conjugate vaccine (PCV) primarily reduces disease burden in adults through a reduction in carriage prevalence of invasive serotypes in children. Current vaccine formulations are the same for both adults and children, but tailoring these formulations to age category could optimize vaccine efficacy. Identification of specific pneumococcal genetic factors associated with carriage in younger or older age groups may suggest alternative formulations and contribute to a better mechanistic understanding of immunity. Here, we used whole genome sequencing to dissect pneumococcal variation associated with age. We performed genome sequencing in a large carriage cohort, and conducted a meta-analysis with an existing carriage study. We compiled a dictionary of pathogen genetic variation including serotype, sequence cluster, sequence elements, SNPs, burden combined rare variants, and clusters of orthologous genes (COGs) for each cohort, all of which were used in a genome-wide association with host age. Age-dependent colonization had some heritability, though this varied between cohorts (h² = 0.10, 0.00 – 0.69 95% CI in the first; h² = 0.46, 0.33 – 0.60 95% CI in the second cohort). We found that serotypes and genetic background (strain) explained most of the heritability in each cohort (h²_serotype = 0.06 and h²_GPSC = 0.04 in the first; h²_serotype = 0.20 and h²_GPSC = 0.23 in the second cohort). We found one candidate association (p = 1.2×10⁻⁹) upstream of an accessory Sec-dependent serine-rich glycoprotein adhesin. Overall, a

Journal article

Nouvellet P, Bhatia S, Cori A, Ainslie K, Baguelin M, Bhatt S, Boonyasiri A, Brazeau N, Cattarino L, Cooper L, Coupland H, Cucunuba Perez Z, Cuomo-Dannenburg G, Dighe A, Djaafara A, Dorigatti I, Eales O, van Elsland S, Nascimento F, Fitzjohn R, Gaythorpe K, Geidelberg L, Green W, Hamlet A, Hauck K, Hinsley W, Imai N, Jeffrey B, Knock E, Laydon D, Lees J, Mangal T, Mellan T, Nedjati Gilani G, Parag K, Pons Salort M, Ragonnet-Cronin M, Riley S, Unwin H, Verity R, Vollmer M, Volz E, Walker P, Walters C, Wang H, Watson O, Whittaker C, Whittles L, Xi X, Ferguson N, Donnelly C et al., 2021, Reduction in mobility and COVID-19 transmission, Nature Communications, Vol: 12, ISSN: 2041-1723

In response to the COVID-19 pandemic, countries have sought to control SARS-CoV-2 transmission by restricting population movement through social distancing interventions, thus reducing the number of contacts. Mobility data represent an important proxy measure of social distancing, and here, we characterise the relationship between transmission and mobility for 52 countries around the world. Transmission significantly decreased with the initial reduction in mobility in 73% of the countries analysed, but we found evidence of decoupling of transmission and mobility following the relaxation of strict control measures for 80% of countries. For the majority of countries, mobility explained a substantial proportion of the variation in transmissibility (median adjusted R-squared: 48%, interquartile range - IQR - across countries [27-77%]). Where a change in the relationship occurred, predictive ability decreased after the relaxation; from a median adjusted R-squared of 74% (IQR across countries [49-91%]) pre-relaxation, to a median adjusted R-squared of 30% (IQR across countries [12-48%]) post-relaxation. In countries with a clear relationship between mobility and transmission both before and after strict control measures were relaxed, mobility was associated with lower transmission rates after control measures were relaxed, indicating that the beneficial effects of ongoing social distancing behaviours were substantial.

Journal article

Croucher N, Harrow G, Lees J, Hanage W, Lipsitch M, Corander J, Colijn C et al., 2021, Negative frequency-dependent selection and asymmetrical transformation stabilise multi-strain bacterial population structures, The ISME Journal: multidisciplinary journal of microbial ecology, Vol: 15, Pages: 1523-1538, ISSN: 1751-7362

Streptococcus pneumoniae can be divided into many strains, each a distinct set of isolates sharing similar core and accessory genomes, which co-circulate within the same hosts. Previous analyses suggested the short-term vaccine-associated dynamics of S. pneumoniae strains may be mediated through multi-locus negative frequency-dependent selection (NFDS), which maintains accessory loci at equilibrium frequencies. Long-term simulations demonstrated NFDS stabilised clonally-evolving multi-strain populations through preventing the loss of variation through drift, based on polymorphism frequencies, pairwise genetic distances and phylogenies. However, allowing symmetrical recombination between isolates evolving under multi-locus NFDS generated unstructured populations of diverse genotypes. Replication of the observed data improved when multi-locus NFDS was combined with recombination that was instead asymmetrical, favouring deletion of accessory loci over insertion. This combination separated populations into strains through outbreeding depression, resulting from recombinants with reduced accessory genomes having lower fitness than their parental genotypes. Although simplistic modelling of recombination likely limited these simulations’ ability to maintain some properties of genomic data as accurately as those lacking recombination, the combination of asymmetrical recombination and multi-locus NFDS could restore multi-strain population structures from randomised initial populations. As many bacteria inhibit insertions into their chromosomes, this combination may commonly underlie the co-existence of strains within a niche.

Journal article

Fu H, Wang H, Xi X, Boonyasiri A, Wang Y, Hinsley W, Fraser KJ, McCabe R, Olivera Mesa D, Skarp J, Ledda A, Dewé T, Dighe A, Winskill P, van Elsland SL, Ainslie KEC, Baguelin M, Bhatt S, Boyd O, Brazeau NF, Cattarino L, Charles G, Coupland H, Cucunubá ZM, Cuomo-Dannenburg G, Donnelly CA, Dorigatti I, Eales OD, Fitzjohn RG, Flaxman S, Gaythorpe KAM, Ghani AC, Green WD, Hamlet A, Hauck K, Haw DJ, Jeffrey B, Laydon DJ, Lees JA, Mellan T, Mishra S, Nedjati Gilani G, Nouvellet P, Okell L, Parag KV, Ragonnet-Cronin M, Riley S, Schmit N, Thompson HA, Unwin HJT, Verity R, Vollmer MAC, Volz E, Walker PGT, Walters CE, Watson OJ, Whittaker C, Whittles LK, Imai N, Bhatia S, Ferguson NM et al., 2021, A database for the epidemic trends and control measures during the first wave of COVID-19 in mainland China, International Journal of Infectious Diseases, Vol: 102, Pages: 463-471, ISSN: 1201-9712

Objectives: This data collation effort aims to provide a comprehensive database to describe the epidemic trends and responses during the first wave of coronavirus disease 2019 (COVID-19) across main provinces in China. Methods: From mid-January to March 2020, we extracted publicly available data on the spread and control of COVID-19 from 31 provincial health authorities and major media outlets in mainland China. Based on these data, we conducted a descriptive analysis of the epidemics in the six most-affected provinces. Results: School closures, travel restrictions, community-level lockdown, and contact tracing were introduced concurrently around late January but subsequent epidemic trends were different across provinces. Compared to Hubei, the other five most-affected provinces reported a lower crude case fatality ratio and proportion of critical and severe hospitalised cases. From March 2020, as local transmission of COVID-19 declined, switching the focus of measures to testing and quarantine of inbound travellers could help to sustain the control of the epidemic. Conclusions: Aggregated indicators of case notifications and severity distributions are essential for monitoring an epidemic. A publicly available database with these indicators and information on control measures provides a useful source for exploring further research and policy planning for response to the COVID-19 epidemic.

Journal article

Knock E, Whittles L, Lees J, Perez Guzman P, Verity R, Fitzjohn R, Gaythorpe K, Imai N, Hinsley W, Okell L, Rosello A, Kantas N, Walters C, Bhatia S, Watson O, Whittaker C, Cattarino L, Boonyasiri A, Djaafara A, Fraser K, Fu H, Wang H, Xi X, Donnelly C, Jauneikaite E, Laydon D, White P, Ghani A, Ferguson N, Cori A, Baguelin M et al., 2020, Report 41: The 2020 SARS-CoV-2 epidemic in England: key epidemiological drivers and impact of interventions

England has been severely affected by COVID-19. We fitted a model of SARS-CoV-2 transmission in care homes and the community to regional 2020 surveillance data. Only national lockdown brought the reproduction number below 1 consistently; introduced one week earlier in the first wave it could have reduced mortality by 23,300 deaths on average. The mean infection fatality ratio was initially ~1.3% across all regions except London and halved following clinical care improvements. The infection fatality ratio was two-fold lower throughout in London, even when adjusting for demographics. The infection fatality ratio in care homes was 2.5-times that in the elderly in the community. Population-level infection-induced immunity in England is still far from herd immunity, with regional mean cumulative attack rates ranging between 4.4% and 15.8%.

Report

Unwin H, Mishra S, Bradley V, Gandy A, Mellan T, Coupland H, Ish-Horowicz J, Vollmer M, Whittaker C, Filippi S, Xi X, Monod M, Ratmann O, Hutchinson M, Valka F, Zhu H, Hawryluk I, Milton P, Ainslie K, Baguelin M, Boonyasiri A, Brazeau N, Cattarino L, Cucunuba Z, Cuomo-Dannenburg G, Dorigatti I, Eales O, Eaton J, van Elsland S, Fitzjohn R, Gaythorpe K, Green W, Hinsley W, Jeffrey B, Knock E, Laydon D, Lees J, Nedjati-Gilani G, Nouvellet P, Okell L, Parag K, Siveroni I, Thompson H, Walker P, Walters C, Watson O, Whittles L, Ghani A, Ferguson N, Riley S, Donnelly C, Bhatt S, Flaxman S et al., 2020, State-level tracking of COVID-19 in the United States, Nature Communications, Vol: 11, Pages: 1-9, ISSN: 2041-1723

As of 1st June 2020, the US Centers for Disease Control and Prevention reported 104,232 confirmed or probable COVID-19-related deaths in the US. This was more than twice the number of deaths reported in the next most severely impacted country. We jointly model the US epidemic at the state-level, using publicly available death data within a Bayesian hierarchical semi-mechanistic framework. For each state, we estimate the number of individuals that have been infected, the number of individuals that are currently infectious and the time-varying reproduction number (the average number of secondary infections caused by an infected person). We use changes in mobility to capture the impact that non-pharmaceutical interventions and other behaviour changes have on the rate of transmission of SARS-CoV-2. We estimate that Rt was only below one in 23 states on 1st June. We also estimate that 3.7% [3.4%-4.0%] of the total population of the US had been infected, with wide variation between states, and approximately 0.01% of the population was infectious. We demonstrate good 3-week model forecasts of deaths with low error and good coverage of our credible intervals.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
