Binsker U, Lees JA, Hammond AJ, et al., 2020, Immune exclusion by naturally acquired secretory IgA against pneumococcal pilus-1, J Clin Invest, Vol: 130, Pages: 927-941
Successful infection by mucosal pathogens requires overcoming the mucus barrier. To better understand this key step, we performed a survey of the interactions between human respiratory mucus and the human pathogen Streptococcus pneumoniae. Pneumococcal adherence to adult human nasal fluid was seen only by isolates expressing pilus-1. Robust binding was independent of pilus-1 adhesive properties but required Fab-dependent recognition of RrgB, the pilus shaft protein, by naturally acquired secretory IgA (sIgA). Pilus-1 binding by specific sIgA led to bacterial agglutination, but adherence required interaction of agglutinated pneumococci and entrapment in mucus particles. To test the effect of these interactions in vivo, pneumococci were preincubated with human sIgA before intranasal challenge in a mouse model of colonization. sIgA treatment resulted in rapid immune exclusion of pilus-expressing pneumococci. Our findings predict that immune exclusion would select for nonpiliated isolates in individuals who acquired RrgB-specific sIgA from prior episodes of colonization with piliated strains. Accordingly, genomic data comparing isolates carried by mothers and their children showed that mothers are less likely to be colonized with pilus-expressing strains. Our study provides a specific example of immune exclusion involving naturally acquired antibody in the human host, a major factor driving pneumococcal adaptation.
Lees JA, Tien Mai T, Galardini M, et al., 2019, Improved inference and prediction of bacterial genotype-phenotype associations using pangenome-spanning regressions
Discovery of influential genetic variants and prediction of phenotypes such as antibiotic resistance are becoming routine tasks in bacterial genomics. Genome-wide association study (GWAS) methods can be applied to study bacterial populations, with a particular emphasis on alignment-free approaches, which are necessitated by the more plastic nature of bacterial genomes. Here we advance bacterial GWAS by introducing a computationally scalable joint modeling framework, where genetic variants covering the entire pangenome are compactly represented by unitigs, and the model fitting is achieved using elastic net penalization. In contrast to current leading GWAS approaches, which test each genotype-phenotype association separately for each variant, our joint modelling approach is shown to lead to increased statistical power while maintaining control of the false positive rate. Our inference procedure also delivers an estimate of the narrow-sense heritability, which is gaining considerable interest in studies of bacteria. Using an extensive set of state-of-the-art bacterial population genomic datasets we demonstrate that our approach performs accurate phenotype prediction, comparable to popular machine learning methods, while retaining both interpretability and computational efficiency. We expect that these advances will pave the way for the next generation of high-powered association and prediction studies for an increasing number of bacterial species.
Pensar J, Puranen S, Arnold B, et al., 2019, Genome-wide epistasis and co-selection study using mutual information, NUCLEIC ACIDS RESEARCH, Vol: 47, ISSN: 0305-1048
The Handbook of Statistical Genomics is an excellent introductory text for advanced graduate students and early-career researchers involved in statistical genetics.
Lo SW, Gladstone RA, van Tonder AJ, et al., 2019, Pneumococcal lineages associated with serotype replacement and antibiotic resistance in childhood invasive pneumococcal disease in the post-PCV13 era: an international whole-genome sequencing study, LANCET INFECTIOUS DISEASES, Vol: 19, Pages: 759-769, ISSN: 1473-3099
Tonkin-Hill G, Lees JA, Bentley SD, et al., 2019, Fast hierarchical Bayesian analysis of population structure, NUCLEIC ACIDS RESEARCH, Vol: 47, Pages: 5539-5549, ISSN: 0305-1048
Davies MR, McIntyre L, Mutreja A, et al., 2019, Atlas of group A streptococcal vaccine candidates compiled using large-scale comparative genomics, Nature Genetics, Vol: 51, Pages: 1035-1043, ISSN: 1061-4036
Lees JA, Ferwerda B, Kremer PHC, et al., 2019, Joint sequencing of human and pathogen genomes reveals the genetics of pneumococcal meningitis, NATURE COMMUNICATIONS, Vol: 10, ISSN: 2041-1723
Lehtinen S, Blanquart F, Lipsitch M, et al., 2019, On the evolutionary ecology of multidrug resistance in bacteria, PLoS Pathogens, Vol: 15, ISSN: 1553-7366
Resistance against different antibiotics appears on the same bacterial strains more often than expected by chance, leading to high frequencies of multidrug resistance. There are multiple explanations for this observation, but these tend to be specific to subsets of antibiotics and/or bacterial species, whereas the trend is pervasive. Here, we consider the question in terms of strain ecology: explaining why resistance to different antibiotics is often seen on the same strain requires an understanding of the competition between strains with different resistance profiles. This work builds on models originally proposed to explain another aspect of strain competition: the stable coexistence of antibiotic sensitivity and resistance observed in a number of bacterial species. We first identify a partial structural similarity in these models: either strain or host population structure stratifies the pathogen population into evolutionarily independent sub-populations and introduces variation in the fitness effect of resistance between these sub-populations, thus creating niches for sensitivity and resistance. We then generalise this unified underlying model to multidrug resistance and show that models with this structure predict high levels of association between resistance to different drugs and high multidrug resistance frequencies. We test predictions from this model in six bacterial datasets and find them to be qualitatively consistent with observed trends. The higher than expected frequencies of multidrug resistance are often interpreted as evidence that these strains are out-competing strains with lower resistance multiplicity. Our work provides an alternative explanation that is compatible with long-term stability in resistance frequencies.
Gladstone RA, Lo SW, Lees JA, et al., 2019, International genomic definition of pneumococcal lineages, to contextualise disease, antibiotic resistance and vaccine impact, EBioMedicine, Vol: 43, Pages: 338-346, ISSN: 2352-3964
Background: Pneumococcal conjugate vaccines have reduced the incidence of invasive pneumococcal disease caused by vaccine serotypes, but non-vaccine serotypes remain a concern. We used whole genome sequencing to study pneumococcal serotype, antibiotic resistance and invasiveness, in the context of genetic background. Methods: Our dataset of 13,454 genomes, combined with four published genomic datasets, represented Africa (40%), Asia (25%), Europe (19%), North America (12%), and South America (5%). These 20,027 pneumococcal genomes were clustered into lineages using PopPUNK, and named Global Pneumococcal Sequence Clusters (GPSCs). From our dataset, we additionally derived serotype and sequence type, and predicted antibiotic sensitivity. We then measured invasiveness using odds ratios relating prevalence in invasive pneumococcal disease to carriage. Findings: The combined collections (n = 20,027) were clustered into 621 GPSCs. Thirty-five GPSCs observed in our dataset were represented by >100 isolates, and subsequently classed as dominant-GPSCs. In 22/35 (63%) of dominant-GPSCs both non-vaccine serotypes and vaccine serotypes were observed in the years up until, and including, the first year of pneumococcal conjugate vaccine introduction. Penicillin and multidrug resistance were higher (p < .05) in a subset of dominant-GPSCs (14/35 and 9/35, respectively), and resistance to an increasing number of antibiotic classes was associated with increased recombination (R^2 = 0.27, p < .0001). In 28/35 dominant-GPSCs, the country of isolation was a significant predictor (p < .05) of its antibiogram (mean misclassification error 0.28, SD ± 0.13). We detected increased invasiveness of six genetic backgrounds, when compared to other genetic backgrounds expressing the same serotype. Up to 1.6-fold changes in invasiveness odds ratio were observed. Interpretation: We define GPSCs that can be assigned to any pneumococcal genomic dataset, to aid international comparisons. Existing n
Copin R, Sause WE, Fulmer Y, et al., 2019, Sequential evolution of virulence and resistance during clonal spread of community-acquired methicillin-resistant Staphylococcus aureus, Proceedings of the National Academy of Sciences, Vol: 116, Pages: 1745-1754, ISSN: 0027-8424
The past two decades have witnessed an alarming expansion of staphylococcal disease caused by community-acquired methicillin-resistant Staphylococcus aureus (CA-MRSA). The factors underlying the epidemic expansion of CA-MRSA lineages such as USA300, the predominant CA-MRSA clone in the United States, are largely unknown. Previously described virulence and antimicrobial resistance genes that promote the dissemination of CA-MRSA are carried by mobile genetic elements, including phages and plasmids. Here, we used high-resolution genomics and experimental infections to characterize the evolution of a USA300 variant plaguing a patient population at increased risk of infection to understand the mechanisms underlying the emergence of genetic elements that facilitate clonal spread of the pathogen. Genetic analyses provided conclusive evidence that fitness (manifest as emergence of a dominant clone) changed coincidently with the stepwise emergence of (i) a unique prophage and mutation of the regulator of the pyrimidine nucleotide biosynthetic operon that promoted abscess formation and colonization, respectively, thereby priming the clone for success; and (ii) a unique plasmid that conferred resistance to two topical microbiocides, mupirocin and chlorhexidine, frequently used for decolonization and infection prevention. The resistance plasmid evolved through successive incorporation of DNA elements from non-S. aureus spp. into an indigenous cryptic plasmid, suggesting a mechanism for interspecies genetic exchange that promotes antimicrobial resistance. Collectively, the data suggest that clonal spread in a vulnerable population resulted from extensive clinical intervention and intense selection pressure toward a pathogen lifestyle that involved the evolution of consequential mutations and mobile genetic eleme
Lees JA, Harris SR, Tonkin-Hill G, et al., 2019, Fast and flexible bacterial genomic epidemiology with PopPUNK, Genome Research, Vol: 29, Pages: 304-316, ISSN: 1088-9051
The routine use of genomics for disease surveillance provides the opportunity for high-resolution bacterial epidemiology. Current whole-genome clustering and multilocus typing approaches do not fully exploit core and accessory genomic variation, and they cannot both automatically identify, and subsequently expand, clusters of significantly similar isolates in large data sets spanning entire species. Here, we describe PopPUNK (Population Partitioning Using Nucleotide K-mers), a software implementing scalable and expandable annotation- and alignment-free methods for population analysis and clustering. Variable-length k-mer comparisons are used to distinguish isolates' divergence in shared sequence and gene content, which we demonstrate to be accurate over multiple orders of magnitude using data from both simulations and genomic collections representing 10 taxonomically widespread species. Connections between closely related isolates of the same strain are robustly identified, despite interspecies variation in the pairwise distance distributions that reflects species' diverse evolutionary patterns. PopPUNK can process 10^3 to 10^4 genomes in a single batch, with minimal memory use and runtimes up to 200-fold faster than existing model-based methods. Clusters of strains remain consistent as new batches of genomes are added, which is achieved without needing to reanalyze all genomes de novo. This facilitates real-time surveillance with consistent cluster naming between studies and allows for outbreak detection using hundreds of genomes in minutes. Interactive visualization and online publication is streamlined through the automatic output of results to multiple platforms. PopPUNK has been designed as a flexible platform that addresses important issues with currently used whole-genome clustering and typing methods, and has potential uses across bacterial genetics and public health research.
Shen P, Lees JA, Bee GCW, et al., 2019, Pneumococcal quorum sensing drives an asymmetric owner-intruder competitive strategy during carriage via the competence regulon, NATURE MICROBIOLOGY, Vol: 4, Pages: 198-+, ISSN: 2058-5276
Lees JA, Galardini M, Bentley SD, et al., 2018, pyseer: a comprehensive tool for microbial pangenome-wide association studies, BIOINFORMATICS, Vol: 34, Pages: 4310-4312, ISSN: 1367-4803
Lees JA, Kendall M, Parkhill J, et al., 2018, Evaluation of phylogenetic reconstruction methods using bacterial whole genomes: a simulation based study, Wellcome Open Research, Vol: 3, Pages: 33-33, ISSN: 2398-502X
Background: Phylogenetic reconstruction is a necessary first step in many analyses which use whole genome sequence data from bacterial populations. There are many available methods to infer phylogenies, and these have various advantages and disadvantages, but few unbiased comparisons of the range of approaches have been made. Methods: We simulated data from a defined "true tree" using a realistic evolutionary model. We built phylogenies from this data using a range of methods, and compared reconstructed trees to the true tree using two measures, noting the computational time needed for different phylogenetic reconstructions. We also used real data from Streptococcus pneumoniae alignments to compare individual core gene trees to a core genome tree. Results: We found that, as expected, maximum likelihood trees from good quality alignments were the most accurate, but also the most computationally intensive. Using less accurate phylogenetic reconstruction methods, we were able to obtain results of comparable accuracy; we found that approximate results can rapidly be obtained using genetic distance based methods. In real data we found that highly conserved core genes, such as those involved in translation, gave an inaccurate tree topology, whereas genes involved in recombination events gave inaccurate branch lengths. We also show a tree-of-trees, relating the results of different phylogenetic reconstructions to each other. Conclusions: We recommend three approaches, depending on requirements for accuracy and computational time. Quicker approaches that do not perform full maximum likelihood optimisation may be useful for many analyses requiring a phylogeny, as generating a high quality input alignment is likely to be the major limiting factor of accurate tree topology. We have publicly released our simulated data and code to enable further comparisons.
Tonkin-Hill G, Lees JA, Bentley SD, et al., 2018, RhierBAPS: An R implementation of the population clustering algorithm hierBAPS, Wellcome Open Res, Vol: 3, ISSN: 2398-502X
Identifying structure in collections of sequence data sets remains a common problem in genomics. hierBAPS, a popular algorithm for identifying population structure in haploid genomes, has previously only been available as a MATLAB binary. We provide an R implementation which is both easier to install and use, automating the entire pipeline. Additionally, we allow for the use of multiple processors, improve on the default settings of the algorithm, and provide an interface with the ggtree library to enable informative illustration of the clustering results. Our aim is that this package aids in the understanding and dissemination of the method, as well as enhancing the reproducibility of population structure analyses.
Lees JA, Tonkin-Hill G, Bentley SD, 2017, GENOME WATCH Stronger together, NATURE REVIEWS MICROBIOLOGY, Vol: 15, Pages: 516-516, ISSN: 1740-1526
Lees JA, Croucher NJ, Goldblatt D, et al., 2017, Genome-wide identification of lineage and locus specific variation associated with pneumococcal carriage duration, eLife, Vol: 6, ISSN: 2050-084X
Streptococcus pneumoniae is a leading cause of invasive disease in infants, especially in low-income settings. Asymptomatic carriage in the nasopharynx is a prerequisite for disease, but variability in its duration is currently only understood at the serotype level. Here we developed a model to calculate the duration of carriage episodes from longitudinal swab data, and combined these results with whole genome sequence data. We estimated that pneumococcal genomic variation accounted for 63% of the phenotype variation, whereas the host traits considered here (age and previous carriage) accounted for less than 5%. We further partitioned this heritability into both lineage and locus effects, and quantified the amount attributable to the largest sources of variation in carriage duration: serotype (17%), drug-resistance (9%) and other significant locus effects (7%). A pan-genome-wide association study identified prophage sequences as being associated with decreased carriage duration independent of serotype, potentially by disruption of the competence mechanism. These findings support theoretical models of pneumococcal competition and antibiotic resistance.
Kremer PHC, Lees JA, Koopmans MM, et al., 2017, Benzalkonium tolerance genes and outcome in Listeria monocytogenes meningitis, CLINICAL MICROBIOLOGY AND INFECTION, Vol: 23, Pages: 265E1-265E7, ISSN: 1198-743X
Lees JA, Brouwer M, van der Ende A, et al., 2017, Within-Host Sampling of a Natural Population Shows Signs of Selection on Pde1 during Bacterial Meningitis, INFECTION AND IMMUNITY, Vol: 85, ISSN: 0019-9567
Lees JA, Kremer PHC, Manso AS, et al., 2017, Large scale genomic analysis shows no evidence for pathogen adaptation between the blood and cerebrospinal fluid niches during bacterial meningitis, Microbial Genomics, Vol: 3, ISSN: 2057-5858
Recent studies have provided evidence for rapid pathogen genome diversification, some of which could potentially affect the course of disease. We have previously described such variation seen between isolates infecting the blood and cerebrospinal fluid (CSF) of a single patient during a case of bacterial meningitis. Here, we performed whole-genome sequencing of paired isolates from the blood and CSF of 869 meningitis patients to determine whether such variation frequently occurs between these two niches in cases of bacterial meningitis. Using a combination of reference-free variant calling approaches, we show that no genetic adaptation occurs in either invaded niche during bacterial meningitis for two major pathogen species, Streptococcus pneumoniae and Neisseria meningitidis. This study therefore shows that the bacteria capable of causing meningitis are already able to do this upon entering the blood, and no further sequence change is necessary to cross the blood-brain barrier. Our findings place the focus back on bacterial evolution between nasopharyngeal carriage and invasion, or diversity of the host, as likely mechanisms for determining invasiveness.
Khatib U, van de Beek D, Lees JA, et al., 2017, Adults with suspected central nervous system infection: A prospective study of diagnostic accuracy, JOURNAL OF INFECTION, Vol: 74, Pages: 1-9, ISSN: 0163-4453
David S, Rusniok C, Mentasti M, et al., 2016, Multiple major disease-associated clones of Legionella pneumophila have emerged recently and independently, GENOME RESEARCH, Vol: 26, Pages: 1555-1564, ISSN: 1088-9051
Lees JA, Vehkala M, Välimäki N, et al., 2016, Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes, Nature Communications, Vol: 7, ISSN: 2041-1723
Bacterial genomes vary extensively in terms of both gene content and gene sequence. This plasticity hampers the use of traditional SNP-based methods for identifying all genetic associations with phenotypic variation. Here we introduce a computationally scalable and widely applicable statistical method (SEER) for the identification of sequence elements that are significantly enriched in a phenotype of interest. SEER is applicable to tens of thousands of genomes by counting variable-length k-mers using a distributed string-mining algorithm. Robust options are provided for association analysis that also correct for the clonal population structure of bacteria. Using large collections of genomes of the major human pathogens Streptococcus pneumoniae and Streptococcus pyogenes, SEER identifies relevant previously characterized resistance determinants for several antibiotics and discovers potential novel factors related to the invasiveness of S. pyogenes. We thus demonstrate that our method can answer important biologically and medically relevant questions.
Lees JA, Bentley SD, 2016, Bacterial GWAS: not just gilding the lily, NATURE REVIEWS MICROBIOLOGY, Vol: 14, Pages: 1-1, ISSN: 1740-1526
Cain AK, Lees JA, 2015, Using genomics to combat infectious diseases on a global scale, GENOME BIOLOGY, Vol: 16, ISSN: 1465-6906
Lees J, Gladstone RA, 2015, GENOME WATCH R-M systems go on the offensive, NATURE REVIEWS MICROBIOLOGY, Vol: 13, ISSN: 1740-1526
Tonkin-Hill G, MacAlasdair N, Ruis C, et al., Producing Polished Prokaryotic Pangenomes with the Panaroo Pipeline
Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content, resulting from frequent horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here we introduce Panaroo, a graph based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. We verified our approach through extensive simulations of de novo assemblies using the infinitely many genes model and by analysing a number of publicly available large bacterial genome datasets. Using a highly clonal Mycobacterium tuberculosis dataset as a negative control case, we show that failing to account for annotation errors can lead to pangenome estimates that are dominated by error. We additionally demonstrate the utility of the improved graphical output provided by Panaroo by performing a pan-genome wide association study in Neisseria gonorrhoeae and by analysing gene gain and loss rates across 51 of the major global pneumococcal sequence clusters. Panaroo is freely available under an open source MIT licence at https://github.com/gtonkinhill/panaroo.
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.