Imperial College London

Dr John Lees

Faculty of Medicine, School of Public Health

MRC Centre GIDA Research Fellow
 
 
 

Contact

 

+44 (0)20 7594 2939 | j.lees | Website

 
 

Location

 

UG4, Sir Alexander Fleming Building, South Kensington Campus



 

Publications


29 results found

Binsker U, Lees JA, Hammond AJ, Weiser JN et al., 2020, Immune exclusion by naturally acquired secretory IgA against pneumococcal pilus-1, J Clin Invest, Vol: 130, Pages: 927-941

Successful infection by mucosal pathogens requires overcoming the mucus barrier. To better understand this key step, we performed a survey of the interactions between human respiratory mucus and the human pathogen Streptococcus pneumoniae. Pneumococcal adherence to adult human nasal fluid was seen only by isolates expressing pilus-1. Robust binding was independent of pilus-1 adhesive properties but required Fab-dependent recognition of RrgB, the pilus shaft protein, by naturally acquired secretory IgA (sIgA). Pilus-1 binding by specific sIgA led to bacterial agglutination, but adherence required interaction of agglutinated pneumococci and entrapment in mucus particles. To test the effect of these interactions in vivo, pneumococci were preincubated with human sIgA before intranasal challenge in a mouse model of colonization. sIgA treatment resulted in rapid immune exclusion of pilus-expressing pneumococci. Our findings predict that immune exclusion would select for nonpiliated isolates in individuals who acquired RrgB-specific sIgA from prior episodes of colonization with piliated strains. Accordingly, genomic data comparing isolates carried by mothers and their children showed that mothers are less likely to be colonized with pilus-expressing strains. Our study provides a specific example of immune exclusion involving naturally acquired antibody in the human host, a major factor driving pneumococcal adaptation.

Journal article

Lees JA, Tien Mai T, Galardini M, Wheeler NE, Corander J et al., 2019, Improved inference and prediction of bacterial genotype-phenotype associations using pangenome-spanning regressions

Discovery of influential genetic variants and prediction of phenotypes such as antibiotic resistance are becoming routine tasks in bacterial genomics. Genome-wide association study (GWAS) methods can be applied to study bacterial populations, with a particular emphasis on alignment-free approaches, which are necessitated by the more plastic nature of bacterial genomes. Here we advance bacterial GWAS by introducing a computationally scalable joint modeling framework, where genetic variants covering the entire pangenome are compactly represented by unitigs, and the model fitting is achieved using elastic net penalization. In contrast to current leading GWAS approaches, which test each genotype-phenotype association separately for each variant, our joint modelling approach is shown to lead to increased statistical power while maintaining control of the false positive rate. Our inference procedure also delivers an estimate of the narrow-sense heritability, which is gaining considerable interest in studies of bacteria. Using an extensive set of state-of-the-art bacterial population genomic datasets we demonstrate that our approach performs accurate phenotype prediction, comparable to popular machine learning methods, while retaining both interpretability and computational efficiency. We expect that these advances will pave the way for the next generation of high-powered association and prediction studies for an increasing number of bacterial species.

Journal article

Pensar J, Puranen S, Arnold B, MacAlasdair N, Kuronen J, Tonkin-Hill G, Pesonen M, Xu Y, Sipola A, Sanchez-Buso L, Lees JA, Chewapreecha C, Bentley SD, Harris SR, Parkhill J, Croucher NJ, Corander J et al., 2019, Genome-wide epistasis and co-selection study using mutual information, NUCLEIC ACIDS RESEARCH, Vol: 47, ISSN: 0305-1048

Journal article

Corander J, Croucher N, Harris S, Lees J, Tonkin-Hill G et al., 2019, Bacterial Population Genomics, Handbook of Statistical Genomics, Editors: Balding, Moltke, Marioni, Publisher: John Wiley & Sons, ISBN: 9781119429142

The Handbook of Statistical Genomics is an excellent introductory text for advanced graduate students and early-career researchers involved in statistical genetics.

Book chapter

Lo SW, Gladstone RA, van Tonder AJ, Lees JA, du Plessis M, Benisty R, Givon-Lavi N, Hawkins PA, Cornick JE, Kwambana-Adams B, Law PY, Ho PL, Antonio M, Everett DB, Dagan R, von Gottberg A, Klugman KP, McGee L, Breiman RF, Bentley SD, Brooks AW, Corso A, Davydov A, Maguire A, Pollard A, Kiran A, Skoczynska A, Moiane B, Beall B, Sigauque B, Aanensen D, Lehmann D, Faccone D, Foster-Nyarko E, Bojang E, Egorova E, Voropaeva E, Sampane-Donkor E, Sadowy E, Bigogo G, Mucavele H, Belabbes H, Diawara I, Moisi J, Verani J, Keenan J, Nair JN, Bhai T, Ndlangisa KM, Zerouali K, Ravikumar KL, Titov L, De Gouveia L, Alaerts M, Ip M, Brandileone MCDC, Hasanuzzaman M, Paragi M, Nurse-Lucas M, Ali M, Elmdaghri N, Croucher N, Wolter N, Porat N, Eser OK, Akpaka PE, Turner P, Gagetti P, Tientcheu P-E, Carter PE, Mostowy R, Kandasamy R, Ford R, Henderson R, Malaker R, Shakoor S, Almeida SCG, Saha SK, Doiphode S, Madhi SA, Sekaran SD, Srifuengfung S, Obaro S, Clarke SC, Nzenze SA, Kastrin T, Ochoa TJ, Balaji V, Hryniewicz W, Urban Y et al., 2019, Pneumococcal lineages associated with serotype replacement and antibiotic resistance in childhood invasive pneumococcal disease in the post-PCV13 era: an international whole-genome sequencing study, LANCET INFECTIOUS DISEASES, Vol: 19, Pages: 759-769, ISSN: 1473-3099

Journal article

Tonkin-Hill G, Lees JA, Bentley SD, Frost SDW, Corander J et al., 2019, Fast hierarchical Bayesian analysis of population structure, NUCLEIC ACIDS RESEARCH, Vol: 47, Pages: 5539-5549, ISSN: 0305-1048

Journal article

Davies MR, McIntyre L, Mutreja A, Lacey JA, Lees JA, Towers RJ, Duchêne S, Smeesters PR, Frost HR, Price DJ, Holden MTG, David S, Giffard PM, Worthing KA, Seale AC, Berkley JA, Harris SR, Rivera-Hernandez T, Berking O, Cork AJ, Torres RSLA, Lithgow T, Strugnell RA, Bergmann R, Nitsche-Schmitz P, Chhatwal GS, Bentley SD, Fraser JD, Moreland NJ, Carapetis JR, Steer AC, Parkhill J, Saul A, Williamson DA, Currie BJ, Tong SYC, Dougan G, Walker MJ et al., 2019, Atlas of group A streptococcal vaccine candidates compiled using large-scale comparative genomics, Nature Genetics, Vol: 51, Pages: 1035-1043, ISSN: 1061-4036

Journal article

Lees JA, Ferwerda B, Kremer PHC, Wheeler NE, Seron MV, Croucher NJ, Gladstone RA, Bootsma HJ, Rots NY, Wijmenga-Monsuur AJ, Sanders EAM, Trzcinski K, Wyllie AL, Zwinderman AH, van den Berg LH, van Rheenen W, Veldink JH, Harboe ZB, Lundbo LF, de Groot LCPGM, van Schoor NM, van der Velde N, Angquist LH, Sorensen TIA, Nohr EA, Mentzer AJ, Mills TC, Knight JC, du Plessis M, Nzenze S, Weiser JN, Parkhill J, Madhi S, Benfield T, von Gottberg A, van der Ende A, Brouwer MC, Barrett JC, Bentley SD, van de Beek D et al., 2019, Joint sequencing of human and pathogen genomes reveals the genetics of pneumococcal meningitis, NATURE COMMUNICATIONS, Vol: 10, ISSN: 2041-1723

Journal article

Lehtinen S, Blanquart F, Lipsitch M, Fraser C, Bentley SD, Croucher NJ, Lees JA, Turner P et al., 2019, On the evolutionary ecology of multidrug resistance in bacteria, PLoS Pathogens, Vol: 15, ISSN: 1553-7366

Resistance against different antibiotics appears on the same bacterial strains more often than expected by chance, leading to high frequencies of multidrug resistance. There are multiple explanations for this observation, but these tend to be specific to subsets of antibiotics and/or bacterial species, whereas the trend is pervasive. Here, we consider the question in terms of strain ecology: explaining why resistance to different antibiotics is often seen on the same strain requires an understanding of the competition between strains with different resistance profiles. This work builds on models originally proposed to explain another aspect of strain competition: the stable coexistence of antibiotic sensitivity and resistance observed in a number of bacterial species. We first identify a partial structural similarity in these models: either strain or host population structure stratifies the pathogen population into evolutionarily independent sub-populations and introduces variation in the fitness effect of resistance between these sub-populations, thus creating niches for sensitivity and resistance. We then generalise this unified underlying model to multidrug resistance and show that models with this structure predict high levels of association between resistance to different drugs and high multidrug resistance frequencies. We test predictions from this model in six bacterial datasets and find them to be qualitatively consistent with observed trends. The higher than expected frequencies of multidrug resistance are often interpreted as evidence that these strains are out-competing strains with lower resistance multiplicity. Our work provides an alternative explanation that is compatible with long-term stability in resistance frequencies.

Journal article

Gladstone RA, Lo SW, Lees JA, Croucher NJ, van Tonder AJ, Corander J, Page AJ, Marttinen P, Bentley LJ, Ochoa TJ, Ho PL, du Plessis M, Cornick JE, Kwambana-Adams B, Benisty R, Nzenze SA, Madhi SA, Hawkins PA, Everett DB, Antonio M, Dagan R, Klugman KP, von Gottberg A, McGee L, Breiman RF, Bentley SD et al., 2019, International genomic definition of pneumococcal lineages, to contextualise disease, antibiotic resistance and vaccine impact, EBioMedicine, Vol: 43, Pages: 338-346, ISSN: 2352-3964

Background: Pneumococcal conjugate vaccines have reduced the incidence of invasive pneumococcal disease caused by vaccine serotypes, but non-vaccine serotypes remain a concern. We used whole genome sequencing to study pneumococcal serotype, antibiotic resistance and invasiveness, in the context of genetic background. Methods: Our dataset of 13,454 genomes, combined with four published genomic datasets, represented Africa (40%), Asia (25%), Europe (19%), North America (12%), and South America (5%). These 20,027 pneumococcal genomes were clustered into lineages using PopPUNK, and named Global Pneumococcal Sequence Clusters (GPSCs). From our dataset, we additionally derived serotype and sequence type, and predicted antibiotic sensitivity. We then measured invasiveness using odds ratios relating prevalence in invasive pneumococcal disease to carriage. Findings: The combined collections (n = 20,027) were clustered into 621 GPSCs. Thirty-five GPSCs observed in our dataset were represented by >100 isolates, and subsequently classed as dominant-GPSCs. In 22/35 (63%) of dominant-GPSCs both non-vaccine serotypes and vaccine serotypes were observed in the years up until, and including, the first year of pneumococcal conjugate vaccine introduction. Penicillin and multidrug resistance were higher (p < .05) in a subset of dominant-GPSCs (14/35 and 9/35, respectively), and resistance to an increasing number of antibiotic classes was associated with increased recombination (R² = 0.27, p < .0001). In 28/35 dominant-GPSCs, the country of isolation was a significant predictor (p < .05) of its antibiogram (mean misclassification error 0.28, SD ± 0.13). We detected increased invasiveness of six genetic backgrounds, when compared to other genetic backgrounds expressing the same serotype. Up to 1.6-fold changes in invasiveness odds ratio were observed. Interpretation: We define GPSCs that can be assigned to any pneumococcal genomic dataset, to aid international comparisons. Existing n…

Journal article

Copin R, Sause WE, Fulmer Y, Balasubramanian D, Dyzenhaus S, Ahmed JM, Kumar K, Lees J, Stachel A, Fisher JC, Drlica K, Phillips M, Weiser JN, Planet PJ, Uhlemann A-C, Altman DR, Sebra R, van Bakel H, Lighter J, Torres VJ, Shopsin B et al., 2019, Sequential evolution of virulence and resistance during clonal spread of community-acquired methicillin-resistant Staphylococcus aureus, Proceedings of the National Academy of Sciences, Vol: 116, Pages: 1745-1754, ISSN: 0027-8424

The past two decades have witnessed an alarming expansion of staphylococcal disease caused by community-acquired methicillin-resistant Staphylococcus aureus (CA-MRSA). The factors underlying the epidemic expansion of CA-MRSA lineages such as USA300, the predominant CA-MRSA clone in the United States, are largely unknown. Previously described virulence and antimicrobial resistance genes that promote the dissemination of CA-MRSA are carried by mobile genetic elements, including phages and plasmids. Here, we used high-resolution genomics and experimental infections to characterize the evolution of a USA300 variant plaguing a patient population at increased risk of infection to understand the mechanisms underlying the emergence of genetic elements that facilitate clonal spread of the pathogen. Genetic analyses provided conclusive evidence that fitness (manifest as emergence of a dominant clone) changed coincidently with the stepwise emergence of (i) a unique prophage and mutation of the regulator of the pyrimidine nucleotide biosynthetic operon that promoted abscess formation and colonization, respectively, thereby priming the clone for success; and (ii) a unique plasmid that conferred resistance to two topical microbiocides, mupirocin and chlorhexidine, frequently used for decolonization and infection prevention. The resistance plasmid evolved through successive incorporation of DNA elements from non-S. aureus spp. into an indigenous cryptic plasmid, suggesting a mechanism for interspecies genetic exchange that promotes antimicrobial resistance. Collectively, the data suggest that clonal spread in a vulnerable population resulted from extensive clinical intervention and intense selection pressure toward a pathogen lifestyle that involved the evolution of consequential mutations and mobile genetic eleme…

Journal article

Lees JA, Harris SR, Tonkin-Hill G, Gladstone RA, Lo SW, Weiser JN, Corander J, Bentley SD, Croucher NJ et al., 2019, Fast and flexible bacterial genomic epidemiology with PopPUNK, Genome Research, Vol: 29, Pages: 304-316, ISSN: 1088-9051

The routine use of genomics for disease surveillance provides the opportunity for high-resolution bacterial epidemiology. Current whole-genome clustering and multilocus typing approaches do not fully exploit core and accessory genomic variation, and they cannot both automatically identify, and subsequently expand, clusters of significantly similar isolates in large data sets spanning entire species. Here, we describe PopPUNK (Population Partitioning Using Nucleotide K-mers), a software implementing scalable and expandable annotation- and alignment-free methods for population analysis and clustering. Variable-length k-mer comparisons are used to distinguish isolates' divergence in shared sequence and gene content, which we demonstrate to be accurate over multiple orders of magnitude using data from both simulations and genomic collections representing 10 taxonomically widespread species. Connections between closely related isolates of the same strain are robustly identified, despite interspecies variation in the pairwise distance distributions that reflects species' diverse evolutionary patterns. PopPUNK can process 10³-10⁴ genomes in a single batch, with minimal memory use and runtimes up to 200-fold faster than existing model-based methods. Clusters of strains remain consistent as new batches of genomes are added, which is achieved without needing to reanalyze all genomes de novo. This facilitates real-time surveillance with consistent cluster naming between studies and allows for outbreak detection using hundreds of genomes in minutes. Interactive visualization and online publication is streamlined through the automatic output of results to multiple platforms. PopPUNK has been designed as a flexible platform that addresses important issues with currently used whole-genome clustering and typing methods, and has potential uses across bacterial genetics and public health research.

Journal article

Shen P, Lees JA, Bee GCW, Brown SP, Weiser JN et al., 2019, Pneumococcal quorum sensing drives an asymmetric owner-intruder competitive strategy during carriage via the competence regulon, NATURE MICROBIOLOGY, Vol: 4, Pages: 198-+, ISSN: 2058-5276

Journal article

Lees JA, Galardini M, Bentley SD, Weiser JN, Corander J et al., 2018, pyseer: a comprehensive tool for microbial pangenome-wide association studies, BIOINFORMATICS, Vol: 34, Pages: 4310-4312, ISSN: 1367-4803

Journal article

Puranen S, Pesonen M, Pensar J, Xu YY, Lees JA, Bentley SD, Croucher NJ, Corander J et al., 2018, SuperDCA for genome-wide epistasis analysis, MICROBIAL GENOMICS, Vol: 4, ISSN: 2057-5858

Journal article

Lees JA, Kendall M, Parkhill J, Colijn C, Bentley SD, Harris SR et al., 2018, Evaluation of phylogenetic reconstruction methods using bacterial whole genomes: a simulation based study, Wellcome Open Research, Vol: 3, Pages: 33-33, ISSN: 2398-502X

Background: Phylogenetic reconstruction is a necessary first step in many analyses which use whole genome sequence data from bacterial populations. There are many available methods to infer phylogenies, and these have various advantages and disadvantages, but few unbiased comparisons of the range of approaches have been made. Methods: We simulated data from a defined "true tree" using a realistic evolutionary model. We built phylogenies from this data using a range of methods, and compared reconstructed trees to the true tree using two measures, noting the computational time needed for different phylogenetic reconstructions. We also used real data from Streptococcus pneumoniae alignments to compare individual core gene trees to a core genome tree. Results: We found that, as expected, maximum likelihood trees from good quality alignments were the most accurate, but also the most computationally intensive. Using less accurate phylogenetic reconstruction methods, we were able to obtain results of comparable accuracy; we found that approximate results can rapidly be obtained using genetic distance based methods. In real data we found that highly conserved core genes, such as those involved in translation, gave an inaccurate tree topology, whereas genes involved in recombination events gave inaccurate branch lengths. We also show a tree-of-trees, relating the results of different phylogenetic reconstructions to each other. Conclusions: We recommend three approaches, depending on requirements for accuracy and computational time. Quicker approaches that do not perform full maximum likelihood optimisation may be useful for many analyses requiring a phylogeny, as generating a high quality input alignment is likely to be the major limiting factor of accurate tree topology. We have publicly released our simulated data and code to enable further comparisons.

Journal article

Tonkin-Hill G, Lees JA, Bentley SD, Frost SDW, Corander J et al., 2018, RhierBAPS: An R implementation of the population clustering algorithm hierBAPS, Wellcome Open Res, Vol: 3, ISSN: 2398-502X

Identifying structure in collections of sequence data sets remains a common problem in genomics. hierBAPS, a popular algorithm for identifying population structure in haploid genomes, has previously only been available as a MATLAB binary. We provide an R implementation which is both easier to install and use, automating the entire pipeline. Additionally, we allow for the use of multiple processors, improve on the default settings of the algorithm, and provide an interface with the ggtree library to enable informative illustration of the clustering results. Our aim is that this package aids in the understanding and dissemination of the method, as well as enhancing the reproducibility of population structure analyses.

Journal article

Lees JA, Tonkin-Hill G, Bentley SD, 2017, GENOME WATCH Stronger together, NATURE REVIEWS MICROBIOLOGY, Vol: 15, Pages: 516-516, ISSN: 1740-1526

Journal article

Lees JA, Croucher NJ, Goldblatt D, Nosten F, Parkhill J, Turner C, Turner P, Bentley SD et al., 2017, Genome-wide identification of lineage and locus specific variation associated with pneumococcal carriage duration, eLife, Vol: 6, ISSN: 2050-084X

Streptococcus pneumoniae is a leading cause of invasive disease in infants, especially in low-income settings. Asymptomatic carriage in the nasopharynx is a prerequisite for disease, but variability in its duration is currently only understood at the serotype level. Here we developed a model to calculate the duration of carriage episodes from longitudinal swab data, and combined these results with whole genome sequence data. We estimated that pneumococcal genomic variation accounted for 63% of the phenotype variation, whereas the host traits considered here (age and previous carriage) accounted for less than 5%. We further partitioned this heritability into both lineage and locus effects, and quantified the amount attributable to the largest sources of variation in carriage duration: serotype (17%), drug-resistance (9%) and other significant locus effects (7%). A pan-genome-wide association study identified prophage sequences as being associated with decreased carriage duration independent of serotype, potentially by disruption of the competence mechanism. These findings support theoretical models of pneumococcal competition and antibiotic resistance.

Journal article

Kremer PHC, Lees JA, Koopmans MM, Ferwerda B, Arends AWM, Feller MM, Schipper K, Seron MV, van der Ende A, Brouwer MC, van de Beek D, Bentley SD et al., 2017, Benzalkonium tolerance genes and outcome in Listeria monocytogenes meningitis, CLINICAL MICROBIOLOGY AND INFECTION, Vol: 23, Pages: 265E1-265E7, ISSN: 1198-743X

Journal article

Lees JA, Brouwer M, van der Ende A, Parkhill J, van de Beek D, Bentley SD et al., 2017, Within-Host Sampling of a Natural Population Shows Signs of Selection on Pde1 during Bacterial Meningitis, INFECTION AND IMMUNITY, Vol: 85, ISSN: 0019-9567

Journal article

Lees JA, Kremer PHC, Manso AS, Croucher NJ, Ferwerda B, Serón MV, Oggioni MR, Parkhill J, Brouwer MC, van der Ende A, van de Beek D, Bentley SD et al., 2017, Large scale genomic analysis shows no evidence for pathogen adaptation between the blood and cerebrospinal fluid niches during bacterial meningitis, Microbial Genomics, Vol: 3, ISSN: 2057-5858

Recent studies have provided evidence for rapid pathogen genome diversification, some of which could potentially affect the course of disease. We have previously described such variation seen between isolates infecting the blood and cerebrospinal fluid (CSF) of a single patient during a case of bacterial meningitis. Here, we performed whole-genome sequencing of paired isolates from the blood and CSF of 869 meningitis patients to determine whether such variation frequently occurs between these two niches in cases of bacterial meningitis. Using a combination of reference-free variant calling approaches, we show that no genetic adaptation occurs in either invaded niche during bacterial meningitis for two major pathogen species, Streptococcus pneumoniae and Neisseria meningitidis. This study therefore shows that the bacteria capable of causing meningitis are already able to do this upon entering the blood, and no further sequence change is necessary to cross the blood-brain barrier. Our findings place the focus back on bacterial evolution between nasopharyngeal carriage and invasion, or diversity of the host, as likely mechanisms for determining invasiveness.

Journal article

Khatib U, van de Beek D, Lees JA, Brouwer MC et al., 2017, Adults with suspected central nervous system infection: A prospective study of diagnostic accuracy, JOURNAL OF INFECTION, Vol: 74, Pages: 1-9, ISSN: 0163-4453

Journal article

David S, Rusniok C, Mentasti M, Gomez-Valero L, Harris SR, Lechat P, Lees J, Ginevra C, Glaser P, Ma L, Bouchier C, Underwood A, Jarraud S, Harrison TG, Parkhill J, Buchrieser C et al., 2016, Multiple major disease-associated clones of Legionella pneumophila have emerged recently and independently, GENOME RESEARCH, Vol: 26, Pages: 1555-1564, ISSN: 1088-9051

Journal article

Lees JA, Vehkala M, Välimäki N, Harris SR, Chewapreecha C, Croucher NJ, Marttinen P, Davies MR, Steer AC, Tong SY, Honkela A, Parkhill J, Bentley SD, Corander J et al., 2016, Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes, Nature Communications, Vol: 7, ISSN: 2041-1723

Bacterial genomes vary extensively in terms of both gene content and gene sequence. This plasticity hampers the use of traditional SNP-based methods for identifying all genetic associations with phenotypic variation. Here we introduce a computationally scalable and widely applicable statistical method (SEER) for the identification of sequence elements that are significantly enriched in a phenotype of interest. SEER is applicable to tens of thousands of genomes by counting variable-length k-mers using a distributed string-mining algorithm. Robust options are provided for association analysis that also correct for the clonal population structure of bacteria. Using large collections of genomes of the major human pathogens Streptococcus pneumoniae and Streptococcus pyogenes, SEER identifies relevant previously characterized resistance determinants for several antibiotics and discovers potential novel factors related to the invasiveness of S. pyogenes. We thus demonstrate that our method can answer important biologically and medically relevant questions.

Journal article

Lees JA, Bentley SD, 2016, Bacterial GWAS: not just gilding the lily, NATURE REVIEWS MICROBIOLOGY, Vol: 14, Pages: 1-1, ISSN: 1740-1526

Journal article

Cain AK, Lees JA, 2015, Using genomics to combat infectious diseases on a global scale, GENOME BIOLOGY, Vol: 16, ISSN: 1465-6906

Journal article

Lees J, Gladstone RA, 2015, GENOME WATCH R-M systems go on the offensive, NATURE REVIEWS MICROBIOLOGY, Vol: 13, ISSN: 1740-1526

Journal article

Tonkin-Hill G, MacAlasdair N, Ruis C, Weimann A, Horesh G, Lees JA, Gladstone RA, Lo S, Beaudoin C, Floto RA, Frost SDW, Corander J, Bentley SD, Parkhill J et al., Producing Polished Prokaryotic Pangenomes with the Panaroo Pipeline

Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content, resulting from frequent horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here we introduce Panaroo, a graph based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. We verified our approach through extensive simulations of de novo assemblies using the infinitely many genes model and by analysing a number of publicly available large bacterial genome datasets. Using a highly clonal Mycobacterium tuberculosis dataset as a negative control case, we show that failing to account for annotation errors can lead to pangenome estimates that are dominated by error. We additionally demonstrate the utility of the improved graphical output provided by Panaroo by performing a pan-genome wide association study in Neisseria gonorrhoeae and by analysing gene gain and loss rates across 51 of the major global pneumococcal sequence clusters. Panaroo is freely available under an open source MIT licence at https://github.com/gtonkinhill/panaroo.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
