Unwin H, Mishra S, Bradley VC, et al., 2020, Report 23: State-level tracking of COVID-19 in the United States
our estimates show that the percentage of individuals that have been infected is 4.1% [3.7%-4.5%], with wide variation between states. For all states, even for the worst affected states, we estimate that less than a quarter of the population has been infected; in New York, for example, we estimate that 16.6% [12.8%-21.6%] of individuals have been infected to date. Our attack rates for New York are in line with those from recent serological studies broadly supporting our choice of infection fatality rate. There is variation in the initial reproduction number, which is likely due to a range of factors; we find a strong association between the initial reproduction number with both population density (measured at the state level) and the chronological date when 10 cumulative deaths occurred (a crude estimate of the date of locally sustained transmission). Our estimates suggest that the epidemic is not under control in much of the US: as of 17 May 2020 the reproduction number is above the critical threshold (1.0) in 24 [95% CI: 20-30] states. Higher reproduction numbers are geographically clustered in the South and Midwest, where epidemics are still developing, while we estimate lower reproduction numbers in states that have already suffered high COVID-19 mortality (such as the Northeast). These estimates suggest that caution must be taken in loosening current restrictions if effective additional measures are not put in place. We predict that increased mobility following relaxation of social distancing will lead to resurgence of transmission, keeping all else constant. We predict that deaths over the next two-month period could exceed current cumulative deaths by greater than two-fold, if the relationship between mobility and transmission remains unchanged. Our results suggest that factors modulating transmission such as rapid testing, contact tracing and behavioural precautions are crucial to offset the rise of transmission associated with loosening of social distancing. Overall, we
Mellan T, Hoeltgebaum H, Mishra S, et al., 2020, Report 21: Estimating COVID-19 cases and reproduction number in Brazil
Brazil is an epicentre for COVID-19 in Latin America. In this report we describe the Brazilian epidemic using three epidemiological measures: the number of infections, the number of deaths and the reproduction number. Our modelling framework requires sufficient death data to estimate trends, and we therefore limit our analysis to 16 states that have experienced a total of more than fifty deaths. The distribution of deaths among states is highly heterogeneous, with 5 states (São Paulo, Rio de Janeiro, Ceará, Pernambuco and Amazonas) accounting for 81% of deaths reported to date. In these states, we estimate that the percentage of people that have been infected with SARS-CoV-2 ranges from 3.3% (95% CI: 2.8%-3.7%) in São Paulo to 10.6% (95% CI: 8.8%-12.1%) in Amazonas. The reproduction number (a measure of transmission intensity) at the start of the epidemic meant that an infected individual would infect three or four others on average. Following non-pharmaceutical interventions such as school closures and decreases in population mobility, we show that the reproduction number has dropped substantially in each state. However, for all 16 states we study, we estimate with high confidence that the reproduction number remains above 1. A reproduction number above 1 means that the epidemic is not yet controlled and will continue to grow. These trends are in stark contrast to other major COVID-19 epidemics in Europe and Asia where enforced lockdowns have successfully driven the reproduction number below 1. While the Brazilian epidemic is still relatively nascent on a national scale, our results suggest that further action is needed to limit spread and prevent health system overload.
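Illustrative note (not part of the report): the statement that a reproduction number above 1 implies continued growth can be sketched with a deliberately simplified calculation. The sketch assumes a constant reproduction number, discrete generations and no depletion of susceptibles; the function name `infections_by_generation` is hypothetical and this is not the semi-mechanistic model used in the report.

```python
from typing import List


def infections_by_generation(r: float, generations: int, seed: float = 1.0) -> List[float]:
    """Expected new infections per generation under a constant reproduction number.

    Simplifying assumptions (illustration only): every infected individual
    infects r others on average, generations do not overlap, and the pool of
    susceptible individuals is never depleted.
    """
    return [seed * r ** g for g in range(generations + 1)]


# r > 1: each generation is larger than the previous one (epidemic grows).
growing = infections_by_generation(2.0, 3)
# r < 1: each generation is smaller than the previous one (epidemic declines).
shrinking = infections_by_generation(0.5, 2)
print(growing, shrinking)
```

Under these assumptions the generation sizes form a geometric sequence, so the value 1 is the threshold between growth and decline.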
Vollmer M, Mishra S, Unwin H, et al., 2020, Report 20: A sub-national analysis of the rate of transmission of Covid-19 in Italy
Italy was the first European country to experience sustained local transmission of COVID-19. As of 1st May 2020, the Italian health authorities reported 28,238 deaths nationally. To control the epidemic, the Italian government implemented a suite of non-pharmaceutical interventions (NPIs), including school and university closures, social distancing and full lockdown involving banning of public gatherings and non-essential movement. In this report, we model the effect of NPIs on transmission using data on average mobility. We estimate that the average reproduction number (a measure of transmission intensity) is currently below one for all Italian regions, and significantly so for the majority of the regions. Despite the large number of deaths, the proportion of population that has been infected by SARS-CoV-2 (the attack rate) is far from the herd immunity threshold in all Italian regions, with the highest attack rate observed in Lombardy (13.18% [10.66%-16.70%]). Italy is set to relax the currently implemented NPIs from 4th May 2020. Given the control achieved by NPIs, we consider three scenarios for the next 8 weeks: a scenario in which mobility remains the same as during the lockdown, a scenario in which mobility returns to pre-lockdown levels by 20%, and a scenario in which mobility returns to pre-lockdown levels by 40%. The scenarios explored assume that mobility is scaled evenly across all dimensions, that behaviour stays the same as before NPIs were implemented, that no pharmaceutical interventions are introduced, and it does not include transmission reduction from contact tracing, testing and the isolation of confirmed or suspected cases. We find that, in the absence of additional interventions, even a 20% return to pre-lockdown mobility could lead to a resurgence in the number of deaths far greater than experienced in the current wave in several regions. Future increases in the number of deaths will lag behind the increase in transmission intensity and so a
Hoornenborg E, Coyer L, Boyd A, et al., 2020, High incidence of HCV in HIV-negative men who have sex with men using pre-exposure prophylaxis, JOURNAL OF HEPATOLOGY, Vol: 72, Pages: 855-864, ISSN: 0168-8278
Bbosa N, Ssemwanga D, Ssekagiri A, et al., 2020, Phylogenetic and demographic characterization of directed HIV-1 transmission using deep sequences from high-risk and general population cohorts/groups in Uganda, Viruses, Vol: 12, ISSN: 1999-4915
Across sub-Saharan Africa, key populations with elevated HIV-1 incidence and/or prevalence have been identified, but their contribution to disease spread remains unclear. We performed viral deep-sequence phylogenetic analyses to quantify transmission dynamics between the general population (GP), fisherfolk communities (FF), and women at high risk of infection and their clients (WHR) in central and southwestern Uganda. Between August 2014 and August 2017, 6185 HIV-1 positive individuals were enrolled in 3 GP and 10 FF communities, 3 WHR enrollment sites. A total of 2531 antiretroviral therapy (ART) naïve participants with plasma viral load >1000 copies/mL were deep-sequenced. One hundred and twenty-three transmission networks were reconstructed, including 105 phylogenetically highly supported source-recipient pairs. Only one pair involved a WHR and male participant, suggesting that improved population sampling is needed to assess empirically the role of WHR to the transmission dynamics. More transmissions were observed from the GP communities to FF communities than vice versa, with an estimated flow ratio of 1.56 (95% CrI 0.68-3.72), indicating that fishing communities on Lake Victoria are not a net source of transmission flow to neighboring communities further inland. Men contributed disproportionally to HIV-1 transmission flow regardless of age, suggesting that prevention efforts need to better aid men to engage with and stay in care.
Capoferri AA, Lamers SL, Grabowski MK, et al., 2020, Recombination analysis of near full-length HIV-1 sequences and the identification of a potential new circulating recombinant form from Rakai, Uganda, AIDS Research and Human Retroviruses, ISSN: 0889-2229
The Phylogenetics And Networks for Generalized HIV Epidemics in Africa (PANGEA-HIV) consortium has been vital in the generation and examination of near full-length HIV-1 sequences generated from Sub-Saharan Africa. In this study, we examined a subset (n = 275) of sequences from Rakai, Uganda, collected between August 2011 and January 2015. Sequences were initially screened with COMET for subtyping and then evaluated using bootscanning and phylogenetic inference. Among 275 sequences, 38.6% were subtype D, 19.3% were subtype A, 2.9% were subtype C, and 39.3% were recombinant. The recombinants were structurally diverse in the number of breakpoints observed, the location of recombinant segments, and represented subtypes, with AD recombinants accounting for the majority of all recombinants (29.8%). Within the AD subpopulation, we identified a potential new circulating recombinant form in five individuals where the polymerase gene was subtype D and most of env was subtype A (D-A junctures at HXB2 6760 and 8709). While the breakpoints were identical for the viruses from these individuals, the viral fragments did not cluster together. These results suggest selection for a viral strain where properties of the subtype A and subtype D portions of the virus confer a survival advantage. The continued study of recombinants will increase our breadth of knowledge for the genetic diversity and evolution of HIV-1, which can further contribute to our understanding toward a universal HIV-1 vaccine.
Ratmann O, Kagaayi J, Hall M, et al., 2020, Quantifying HIV transmission flow between high-prevalence hotspots and surrounding communities: a population-based study in Rakai, Uganda, The Lancet HIV, Vol: 7, Pages: e173-e183, ISSN: 2352-3018
Background: International and global organisations advocate targeting interventions to areas of high HIV prevalence (ie, hotspots). To better understand the potential benefits of geo-targeted control, we assessed the extent to which HIV hotspots along Lake Victoria sustain transmission in neighbouring populations in south-central Uganda. Methods: We did a population-based survey in Rakai, Uganda, using data from the Rakai Community Cohort Study. The study surveyed all individuals aged 15–49 years in four high-prevalence Lake Victoria fishing communities and 36 neighbouring inland communities. Viral RNA was deep sequenced from participants infected with HIV who were antiretroviral therapy-naive during the observation period. Phylogenetic analysis was used to infer partial HIV transmission networks, including direction of transmission. Reconstructed networks were interpreted through data for current residence and migration history. HIV transmission flows within and between high-prevalence and low-prevalence areas were quantified adjusting for incomplete sampling of the population. Findings: Between Aug 10, 2011, and Jan 30, 2015, data were collected for the Rakai Community Cohort Study. 25 882 individuals participated, including an estimated 75·7% of the lakeside population and 16·2% of the inland population in the Rakai region of Uganda. 5142 participants were HIV-positive (2703 [13·7%] in inland and 2439 [40·1%] in fishing communities). 3878 (75·4%) people who were HIV-positive did not report antiretroviral therapy use, of whom 2652 (68·4%) had virus deep-sequenced at sufficient quality for phylogenetic analysis. 446 transmission networks were reconstructed, including 293 linked pairs with inferred direction of transmission. Adjusting for incomplete sampling, an estimated 5·7% (95% credibility interval 4·4–7·3) of transmissions occurred within lakeside areas, 89·2% (86·0–91·
Grant HE, Hodcroft EB, Ssemwanga D, et al., 2020, Pervasive and non-random recombination in near full-length HIV genomes from Uganda, Virus Evolution, Vol: 6, ISSN: 2057-1577
Recombination is an important feature of HIV evolution, occurring both within and between the major branches of diversity (subtypes). The Ugandan epidemic is primarily composed of two subtypes, A1 and D, that have been co-circulating for 50 years, frequently recombining in dually infected patients. Here, we investigate the frequency of recombinants in this population and the location of breakpoints along the genome. As part of the PANGEA-HIV consortium, 1,472 consensus genome sequences over 5 kb have been obtained from 1,857 samples collected by the MRC/UVRI & LSHTM Research unit in Uganda, 465 (31.6 per cent) of which were near full-length sequences (>8 kb). Using the subtyping tool SCUEAL, we find that of the near full-length dataset, 233 (50.1 per cent) genomes contained only one subtype, 30.8 per cent A1 (n = 143), 17.6 per cent D (n = 82), and 1.7 per cent C (n = 8), while 49.9 per cent (n = 232) contained more than one subtype (including A1/D (n = 164), A1/C (n = 13), C/D (n = 9); A1/C/D (n = 13), and 33 complex types). K-means clustering of the recombinant A1/D genomes revealed a section of envelope (C2gp120-TMgp41) is often inherited intact, whilst a generalized linear model was used to demonstrate significantly fewer breakpoints in the gag-pol and envelope C2-TM regions compared with accessory gene regions. Despite similar recombination patterns in many recombinants, no clearly supported circulating recombinant form (CRF) was found, there was limited evidence of the transmission of breakpoints, and the vast majority (153/164; 93 per cent) of the A1/D recombinants appear to be unique recombinant forms. Thus, recombination is pervasive with clear biases in breakpoint location, but CRFs are not a significant feature, characteristic of a complex, and diverse epidemic.
Chatzilena A, van Leeuwen E, Ratmann O, et al., 2019, Contemporary statistical inference for infectious disease models using Stan, Epidemics: the journal of infectious disease dynamics, Vol: 29, ISSN: 1755-4365
This paper is concerned with the application of recent statistical advances to inference of infectious disease dynamics. We describe the fitting of a class of epidemic models using Hamiltonian Monte Carlo and variational inference as implemented in the freely available Stan software. We apply the two methods to real data from outbreaks as well as routinely collected observations. Our results suggest that both inference methods are computationally feasible in this context, and show a trade-off between statistical efficiency versus computational speed. The latter appears particularly relevant for real-time applications.
Le Vu S, Ratmann O, Delpech V, et al., 2019, HIV-1 transmission patterns in men who have sex with men: insights from genetic source attribution analysis, AIDS Research and Human Retroviruses, Vol: 39, Pages: 805-813, ISSN: 0889-2229
BACKGROUND: Nearly 60% of new HIV infections in the United Kingdom are estimated to occur in men who have sex with men (MSM). Age-disassortative partnerships in MSM have been suggested to spread the HIV epidemics in many Western developed countries and to contribute to ethnic disparities in infection rates. Understanding these mixing patterns in transmission can help to determine which groups are at a greater risk and guide public health interventions. METHODS: We analyzed combined epidemiologic data and viral sequences from MSM diagnosed with HIV at the national level. We applied a phylodynamic source attribution model to infer patterns of transmission between groups of patients. RESULTS: From pair probabilities of transmission between 14 603 MSM patients, we found that potential transmitters of HIV subtype B were on average 8 months older than recipients. We also found a moderate overall assortativity of transmission by ethnic group and a stronger assortativity by region. CONCLUSIONS: Our findings suggest that there is only a modest net flow of transmissions from older to young MSM in subtype B epidemics and that young MSM, in both Black and White groups, are more likely to be infected by one another than expected in a sexual network with random mixing.
Le Vu S, Ratmann O, Delpech V, et al., HIV-1 Transmission Patterns in Men Who Have Sex with Men: Insights from Genetic Source Attribution Analysis, AIDS Research and Human Retroviruses, ISSN: 0889-2229
Abeler-Dorner L, Grabowski MK, Rambaut A, et al., 2019, PANGEA-HIV 2: Phylogenetics and networks for generalised epidemics in Africa, Current Opinion in HIV and AIDS, Vol: 14, Pages: 173-180, ISSN: 1746-630X
Purpose of review: The HIV epidemic in sub-Saharan Africa is far from being under control and the ambitious UNAIDS targets are unlikely to be met by 2020 as declines in per-capita incidence are being largely offset by demographic trends. There is an increasing number of proven and specific HIV prevention tools, but little consensus on how best to deploy them. Recent findings: Traditionally, phylogenetics has been used in HIV research to reconstruct the history of the epidemic and date zoonotic infections, whereas more recent publications focus on HIV diversity and drug resistance. However, it is also the most powerful method of source attribution available for the study of HIV transmission. The PANGEA (Phylogenetics And Networks for Generalized Epidemics in Africa) consortium has generated over 18 000 NGS HIV sequences from five countries in sub-Saharan Africa. Using phylogenetic methods, we will identify characteristics of individuals or groups, which are most likely to be at risk of infection or at risk of infecting others. Summary: Combining phylogenetics, phylodynamics and epidemiology will allow PANGEA to highlight where prevention efforts should be focussed to reduce the HIV epidemic most effectively. To maximise the public health benefit of the data, PANGEA offers accreditation to external researchers, allowing them to access the data and join the consortium. We also welcome submissions of other HIV sequences from sub-Saharan Africa to the database.
Ratmann O, Grabowski MK, Hall M, et al., 2019, Inferring HIV-1 transmission networks and sources of epidemic spread in Africa with deep-sequence phylogenetic analysis, Nature Communications, Vol: 10, ISSN: 2041-1723
To prevent new infections with human immunodeficiency virus type 1 (HIV-1) in sub-Saharan Africa, UNAIDS recommends targeting interventions to populations that are at high risk of acquiring and passing on the virus. Yet it is often unclear who and where these ‘source’ populations are. Here we demonstrate how viral deep-sequencing can be used to reconstruct HIV-1 transmission networks and to infer the direction of transmission in these networks. We are able to deep-sequence virus from a large population-based sample of infected individuals in Rakai District, Uganda, reconstruct partial transmission networks, and infer the direction of transmission within them at an estimated error rate of 16.3% [8.8–28.3%]. With this error rate, deep-sequence phylogenetics cannot be used against individuals in legal contexts, but is sufficiently low for population-level inferences into the sources of epidemic spread. The technique presents new opportunities for characterizing source populations and for targeting of HIV-1 prevention interventions in Africa.
The relationship between the underlying contact network over which a pathogen spreads and the pathogen phylogenetic trees that are obtained presents an opportunity to use sequence data to learn about contact networks that are difficult to study empirically. However, this relationship is not explicitly known and is usually studied in simulations, often with the simplifying assumption that the contact network is static in time, though human contact networks are dynamic. We simulate pathogen phylogenetic trees on dynamic Erdős-Rényi random networks and on two dynamic networks with skewed degree distribution, of which one is additionally clustered. We use tree shape features to explore how adding dynamics changes the relationships between the overall network structure and phylogenies. Our tree features include the number of small substructures (cherries, pitchforks) in the trees, measures of tree imbalance (Sackin index, Colless index), features derived from network science (diameter, closeness), as well as features using the internal branch lengths from the tip to the root. Using principal component analysis we find that the network dynamics influence the shapes of phylogenies, as does the network type. We also compare dynamic and time-integrated static networks. We find, in particular, that static network models like the widely used Barabási-Albert model can be poor approximations for dynamic networks. We explore the effects of mis-specifying the network on the performance of classifiers trained to identify the transmission rate (using supervised learning methods). We find that both mis-specification of the underlying network and its parameters (mean degree, turnover rate) have a strong adverse effect on the ability to estimate the transmission parameter. We illustrate these results by classifying HIV trees with a classifier that we trained on simulated trees from different networks, infection rates and turnover rates. Our results point to the importance of correctly est
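Illustrative note (not part of the abstract): three of the tree shape features named above (number of cherries, Sackin index, Colless index) can be computed with a short recursion. The sketch below assumes a rooted, strictly binary tree encoded as nested 2-tuples with non-tuple labels as tips; this encoding and the function name `tree_shape_features` are hypothetical choices for illustration, not the authors' implementation, and pitchforks and branch-length features are omitted.

```python
def tree_shape_features(tree):
    """Return (n_tips, cherries, sackin, colless) for a rooted binary tree.

    Assumed encoding: an internal node is a 2-tuple (left, right); a tip is
    any non-tuple label.
      cherries: internal nodes whose two children are both tips
      sackin:   sum over tips of the number of edges between tip and root
      colless:  sum over internal nodes of |tips on left - tips on right|
    """
    def walk(node, depth):
        if not isinstance(node, tuple):
            # A tip contributes its own depth to the Sackin index.
            return 1, 0, depth, 0
        left, right = node
        l_tips, l_cher, l_sack, l_coll = walk(left, depth + 1)
        r_tips, r_cher, r_sack, r_coll = walk(right, depth + 1)
        is_cherry = 1 if (l_tips == 1 and r_tips == 1) else 0
        return (
            l_tips + r_tips,
            l_cher + r_cher + is_cherry,
            l_sack + r_sack,
            l_coll + r_coll + abs(l_tips - r_tips),
        )

    return walk(tree, 0)


# A maximally imbalanced ("caterpillar") tree and a balanced tree on 4 tips.
caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))
print(tree_shape_features(caterpillar))  # (4, 1, 9, 3)
print(tree_shape_features(balanced))     # (4, 2, 8, 0)
```

The two example trees have the same number of tips, but the caterpillar tree has fewer cherries and larger Sackin and Colless values, which is the kind of contrast in imbalance that such features are meant to capture.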
Lachlan RF, Ratmann O, Nowicki S, 2018, Cultural conformity generates extremely stable traditions in bird song, Nature Communications, Vol: 9, ISSN: 2041-1723
Cultural traditions have been observed in a wide variety of animal species. It remains unclear, however, what is required for social learning to give rise to stable traditions: what level of precision and what learning strategies are required. We address these questions by fitting models of cultural evolution to learned bird song. We recorded 615 swamp sparrow (Melospiza georgiana) song repertoires, and compared syllable frequency distributions to the output of individual-based simulations. We find that syllables are learned with an estimated error rate of 1.85% and with a conformist bias in learning. This bias is consistent with a simple mechanism of overproduction and selective attrition. Finally, we estimate that syllable types could frequently persist for more than 500 years. Our results demonstrate conformist bias in natural animal behaviour and show that this, along with moderately precise learning, may support traditions whose stability rivals those of humans.
Le Vu SOK, Ratmann O, Delpech V, et al., 2018, Comparison of cluster-based and source-attribution methods for estimating transmission risk using large HIV sequence databases, Epidemics, Vol: 23, Pages: 1-10, ISSN: 1755-4365
Phylogenetic clustering of HIV sequences from a random sample of patients can reveal epidemiological transmission patterns, but interpretation is hampered by limited theoretical support and statistical properties of clustering analysis remain poorly understood. Alternatively, source attribution methods allow fitting of HIV transmission models and thereby quantify aspects of disease transmission. A simulation study was conducted to assess error rates of clustering methods for detecting transmission risk factors. We modeled HIV epidemics among men having sex with men and generated phylogenies comparable to those that can be obtained from HIV surveillance data in the UK. Clustering and source attribution approaches were applied to evaluate their ability to identify patient attributes as transmission risk factors. We find that commonly used methods show a misleading association between cluster size or odds of clustering and covariates that are correlated with time since infection, regardless of their influence on transmission. Clustering methods usually have higher error rates and lower sensitivity than source attribution methods for identifying transmission risk factors. But neither method provides robust estimates of transmission risk ratios. Source attribution methods can alleviate drawbacks from phylogenetic clustering but formal population genetic modeling may be required to estimate quantitative transmission risk factors.
Volz E, Le Vu S, Ratmann O, et al., 2018, Molecular epidemiology of HIV-1 subtype B reveals heterogeneous transmission risk: Implications for intervention and control, Journal of Infectious Diseases, Vol: 217, Pages: 1522-1529, ISSN: 0022-1899
Background: The impact of HIV pre-exposure prophylaxis (PrEP) depends on infections averted by protecting vulnerable individuals as well as infections averted by preventing transmission by those who would have been infected if not receiving PrEP. Analysis of HIV phylogenies reveals risk factors for transmission, which we examine as potential criteria for allocating PrEP. Methods: We analyzed 6912 HIV-1 partial pol sequences from men who have sex with men (MSM) in the United Kingdom combined with global reference sequences and patient-level metadata. Population genetic models were developed that adjust for stage of infection, global migration of HIV lineages, and changing incidence of infection through time. Models were extended to simulate the effects of providing susceptible MSM with PrEP. Results: We found that young age <25 years confers higher risk of HIV transmission (relative risk = 2.52 [95% confidence interval, 2.32–2.73]) and that young MSM are more likely to transmit to one another than expected by chance. Simulated interventions indicate that 4-fold more infections can be averted over 5 years by focusing PrEP on young MSM. Conclusions: Concentrating PrEP doses on young individuals can avert more infections than random allocation.
Wymant C, Hall M, Ratmann O, et al., 2018, PHYLOSCANNER: Inferring Transmission from Within- and Between-Host Pathogen Genetic Diversity, Molecular Biology and Evolution, Vol: 35, Pages: 719-733
A central feature of pathogen genomics is that different infectious particles (virions and bacterial cells) within an infected individual may be genetically distinct, with patterns of relatedness among infectious particles being the result of both within-host evolution and transmission from one host to the next. Here, we present a new software tool, phyloscanner, which analyses pathogen diversity from multiple infected hosts. phyloscanner provides unprecedented resolution into the transmission process, allowing inference of the direction of transmission from sequence data alone. Multiply infected individuals are also identified, as they harbor subpopulations of infectious particles that are not connected by within-host evolution, except where recombinant types emerge. Low-level contamination is flagged and removed. We illustrate phyloscanner on both viral and bacterial pathogens, namely HIV-1 sequenced on Illumina and Roche 454 platforms, HCV sequenced with the Oxford Nanopore MinION platform, and Streptococcus pneumoniae with sequences from multiple colonies per individual. phyloscanner is available from https://github.com/BDI-pathogens/phyloscanner.
Wymant C, Blanquart F, Golubchik T, et al., 2018, Easy and accurate reconstruction of whole HIV genomes from short-read sequence data with shiver, Virus Evolution, Vol: 4, ISSN: 2057-1577
Studying the evolution of viruses and their molecular epidemiology relies on accurate viral sequence data, so that small differences between similar viruses can be meaningfully interpreted. Despite its higher throughput and more detailed minority variant data, next-generation sequencing has yet to be widely adopted for HIV. The difficulty of accurately reconstructing the consensus sequence of a quasispecies from reads (short fragments of DNA) in the presence of large between- and within-host diversity, including frequent indels, may have presented a barrier. In particular, mapping (aligning) reads to a reference sequence leads to biased loss of information; this bias can distort epidemiological and evolutionary conclusions. De novo assembly avoids this bias by aligning the reads to themselves, producing a set of sequences called contigs. However contigs provide only a partial summary of the reads, misassembly may result in their having an incorrect structure, and no information is available at parts of the genome where contigs could not be assembled. To address these problems we developed the tool shiver to pre-process reads for quality and contamination, then map them to a reference tailored to the sample using corrected contigs supplemented with the user's choice of existing reference sequences. Run with two commands per sample, it can easily be used for large heterogeneous data sets. We used shiver to reconstruct the consensus sequence and minority variant information from paired-end short-read whole-genome data produced with the Illumina platform, for sixty-five existing publicly available samples and fifty new samples. We show the systematic superiority of mapping to shiver's constructed reference compared with mapping the same reads to the closest of 3,249 real references: median values of 13 bases called differently and more accurately, 0 bases called differently and less accurately, and 205 bases of missing sequence recovered. We also successfully applied sh
Ratmann O, Ha Minh Lam, Boni MF, 2017, Improved algorithmic complexity for the 3SEQ recombination detection algorithm, Molecular Biology and Evolution, Vol: 35, Pages: 247-251, ISSN: 1537-1719
Identifying recombinant sequences in an era of large genomic databases is challenging as it requires an efficient algorithm to identify candidate recombinants and parents, as well as appropriate statistical methods to correct for the large number of comparisons performed. In 2007, a computation was introduced for an exact nonparametric mosaicism statistic that gave high-precision p-values for putative recombinants. This exact computation meant that multiple-comparisons corrected p-values also had high precision, which is crucial when performing millions or billions of tests in large databases. Here, we introduce an improvement to the algorithmic complexity of this computation from O(mn^3) to O(mn^2), where m and n are the numbers of recombination-informative sites in the candidate recombinant. This new computation allows for recombination analysis to be performed in alignments with thousands of polymorphic sites. Benchmark runs are presented on viral genome sequence alignments, new features are introduced, and applications outside recombination analysis are discussed.
Ratmann O, Wymant C, Colijn C, et al., 2017, HIV-1 full-genome phylogenetics of generalized epidemics in sub-Saharan Africa: impact of missing nucleotide characters in next-generation sequences, AIDS Research and Human Retroviruses, Vol: 33, Pages: 1083-1098, ISSN: 1931-8405
To characterize HIV-1 transmission dynamics in regions where the burden of HIV-1 is greatest, the “Phylogenetics and Networks for Generalised HIV Epidemics in Africa” consortium (PANGEA-HIV) is sequencing full-genome viral isolates from across sub-Saharan Africa. We report the first 3,985 PANGEA-HIV consensus sequences from four cohort sites (Rakai Community Cohort Study, n = 2,833; MRC/UVRI Uganda, n = 701; Mochudi Prevention Project, n = 359; Africa Health Research Institute Resistance Cohort, n = 92). Next-generation sequencing success rates varied: more than 80% of the viral genome from the gag to the nef genes could be determined for all sequences from South Africa, 75% of sequences from Mochudi, 60% of sequences from MRC/UVRI Uganda, and 22% of sequences from Rakai. Partial sequencing failure was primarily associated with low viral load, increased for amplicons closer to the 3′ end of the genome, was not associated with subtype diversity except HIV-1 subtype D, and remained significantly associated with sampling location after controlling for other factors. We assessed the impact of the missing data patterns in PANGEA-HIV sequences on phylogeny reconstruction in simulations. We found a threshold in terms of taxon sampling below which the patchy distribution of missing characters in next-generation sequences (NGS) has an excess negative impact on the accuracy of HIV-1 phylogeny reconstruction, which is attributable to tree reconstruction artifacts that accumulate when branches in viral trees are long. The large number of PANGEA-HIV sequences provides unprecedented opportunities for evaluating HIV-1 transmission dynamics across sub-Saharan Africa and identifying prevention opportunities. Molecular epidemiological analyses of these data must proceed cautiously because sequence sampling remains below the identified threshold and a considerable negative impact of missing characters on phyloge
Ratmann O, Hodcroft EB, Pickles M, et al., 2017, Phylogenetic tools for generalized HIV-1 epidemics: findings from the PANGEA-HIV methods comparison, Molecular Biology and Evolution, Vol: 34, Pages: 185-203, ISSN: 1537-1719
Viral phylogenetic methods contribute to understanding how HIV spreads in populations, and thereby help guide the design of prevention interventions. So far, most analyses have been applied to well-sampled concentrated HIV-1 epidemics in wealthy countries. To direct the use of phylogenetic tools to where the impact of HIV-1 is greatest, the Phylogenetics And Networks for Generalized HIV Epidemics in Africa (PANGEA-HIV) consortium generates full-genome viral sequences from across sub-Saharan Africa. Analyzing these data presents new challenges, since epidemics are principally driven by heterosexual transmission and a smaller fraction of cases is sampled. Here, we show that viral phylogenetic tools can be adapted and used to estimate epidemiological quantities of central importance to HIV-1 prevention in sub-Saharan Africa. We used a community-wide methods comparison exercise on simulated data, where participants were blinded to the true dynamics they were inferring. Two distinct simulations captured generalized HIV-1 epidemics, before and after a large community-level intervention that reduced infection levels. Five research groups participated. Structured coalescent modeling approaches were most successful: phylogenetic estimates of HIV-1 incidence, incidence reductions, and the proportion of transmissions from individuals in their first 3 months of infection correlated with the true values (Pearson correlation > 90%), with small bias. However, on some simulations, true values were markedly outside reported confidence or credibility intervals. The blinded comparison revealed current limits and strengths in using HIV phylogenetics in challenging settings, provided benchmarks for future methods’ development, and supports using the latest generation of phylogenetic tools to advance HIV surveillance and prevention.
Lamers SL, Barbier A, Ratmann O, et al., 2016, HIV-1 Sequence Data Coverage in Central East Africa from 1959-2013, AIDS Research and Human Retroviruses, ISSN: 0889-2229
Central and Eastern African HIV sequence data have been most critical in understanding the establishment and evolution of the global HIV pandemic. Here we report on the extent of publicly available HIV genetic sequence data in the Los Alamos National Laboratory Sequence Database sampled between 1959 and 2013 from six African countries: Uganda, Kenya, Tanzania, Burundi, the Democratic Republic of Congo, and Rwanda. We have summarized these data, including HIV subtypes, the years sampled, and the genomic regions sequenced. We also provide curated alignments for this important geographic area in five HIV genomic regions with substantial coverage.
Wilkinson E, Rasmussen D, Ratmann O, et al., 2016, Origin, imports and exports of HIV-1 subtype C in South Africa: a historical perspective, Infection, Genetics and Evolution, Vol: 46, Pages: 200-208, ISSN: 1567-7257
BACKGROUND: While the HIV epidemic in South Africa had a later onset than epidemics in other southern African countries, prevalence grew rapidly during the 1990s when the country was going through socio-political changes with the end of Apartheid. South Africa currently has the largest number of people living with HIV in the world and the epidemic is dominated by a unique subtype, HIV-1 subtype C. This large epidemic is also characterized by a high level of genetic diversity. We hypothesize that this diversity is due to multiple introductions of the virus during the period of change. In this paper, we apply novel phylogeographic methods to estimate the number of viral imports and exports from the start of the epidemic to the present. METHODS: We assembled 11,289 unique subtype C pol sequences from southern Africa. These represent one of the largest sequence datasets ever analyzed in the region. Sequences were stratified based on country of sampling and levels of genetic diversity were estimated for each country. Sequences were aligned and a maximum-likelihood evolutionary tree was inferred. Least-Squares Dating was then used to obtain a dated phylogeny from which we estimated the number of introductions into and exports out of South Africa using parsimony-based ancestral location reconstructions. RESULTS: Our results identified 189 viral introductions into South Africa with the largest number of introductions attributed to Zambia (n=109), Botswana (n=32), Malawi (n=26) and Zimbabwe (n=13). South Africa also exported many viral lineages to its neighbours. The bulk of viral imports and exports appear to have occurred between 1985 and 2000, coincident with the period of socio-political transition. CONCLUSION: The high level of subtype C genetic diversity in South Africa is related to multiple introductions of the virus to the country. While the number of viral imports and exports we identified was highly sensitive to the number of samples included from each country, they mo
Nakagawa F, van Sighem A, Thiebaut R, et al., 2016, A method to estimate the size and characteristics of HIV-positive populations using an individual-based stochastic simulation model, Epidemiology, Vol: 27, Pages: 247-256, ISSN: 1531-5487
It is important not only to collect epidemiologic data on HIV but to also fully utilize such information to understand the epidemic over time and to help inform and monitor the impact of policies and interventions. We describe and apply a novel method to estimate the size and characteristics of HIV-positive populations. The method was applied to data on men who have sex with men living in the UK and to a pseudo dataset to assess performance for different data availability. The individual-based simulation model was calibrated using an approximate Bayesian computation-based approach. In 2013, 48,310 (90% plausibility range: 39,900–45,560) men who have sex with men were estimated to be living with HIV in the UK, of whom 10,400 (6,160–17,350) were undiagnosed. There were an estimated 3,210 (1,730–5,350) infections per year on average between 2010 and 2013. Sixty-two percent of the total HIV-positive population are thought to have viral load <500 copies/ml. In the pseudo-epidemic example, HIV estimates have narrower plausibility ranges and are closer to the true number, the greater the data availability to calibrate the model. We demonstrate that our method can be applied to settings with less data, however plausibility ranges for estimates will be wider to reflect greater uncertainty of the data used to fit the model.
Ratmann O, van Sighem A, Bezemer D, et al., 2016, Sources of HIV infection among men having sex with men and implications for prevention, Science Translational Medicine, Vol: 8, ISSN: 1946-6242
Bezemer D, Cori A, Ratmann O, et al., 2015, Dispersion of the HIV-1 Epidemic in Men Who Have Sex with Men in the Netherlands: A Combined Mathematical Model and Phylogenetic Analysis, PLOS Medicine, Vol: 12, Pages: e1001898-e1001898, ISSN: 1549-1277
BACKGROUND: The HIV-1 subtype B epidemic amongst men who have sex with men (MSM) is resurgent in many countries despite the widespread use of effective combination antiretroviral therapy (cART). In this combined mathematical and phylogenetic study of observational data, we aimed to find out the extent to which the resurgent epidemic is the result of newly introduced strains or of growth of already circulating strains. METHODS AND FINDINGS: As of November 2011, the ATHENA observational HIV cohort of all patients in care in the Netherlands since 1996 included HIV-1 subtype B polymerase sequences from 5,852 patients. Patients who were diagnosed between 1981 and 1995 were included in the cohort if they were still alive in 1996. The ten most similar sequences to each ATHENA sequence were selected from the Los Alamos HIV Sequence Database, and a phylogenetic tree was created of a total of 8,320 sequences. Large transmission clusters that included ≥10 ATHENA sequences were selected, with a local support value ≥ 0.9 and median pairwise patristic distance below the fifth percentile of distances in the whole tree. Time-varying reproduction numbers of the large MSM-majority clusters were estimated through mathematical modeling. We identified 106 large transmission clusters, including 3,061 (52%) ATHENA and 652 Los Alamos sequences. Half of the HIV sequences from MSM registered in the cohort in the Netherlands (2,128 of 4,288) were included in 91 large MSM-majority clusters. Strikingly, at least 54 (59%) of these 91 MSM-majority clusters were already circulating before 1996, when cART was introduced, and have persisted to the present. Overall, 1,226 (35%) of the 3,460 diagnoses among MSM since 1996 were found in these 54 long-standing clusters. The reproduction numbers of all large MSM-majority clusters were around the epidemic threshold value of one over the whole study period. A tendency towards higher numbers was visible in recent years, especially in the more recently
Pillay D, Herbeck J, Cohen MS, et al., 2015, PANGEA-HIV: phylogenetics for generalised epidemics in Africa, Lancet Infectious Diseases, Vol: 15, Pages: 259-261, ISSN: 1473-3099
Jombart T, Aanensen DM, Baguelin M, et al., 2014, OutbreakTools: A new platform for disease outbreak analysis using the R software, Epidemics, Vol: 7, Pages: 28-34, ISSN: 1755-4365
The investigation of infectious disease outbreaks relies on the analysis of increasingly complex and diverse data, which offer new prospects for gaining insights into disease transmission processes and informing public health policies. However, the potential of such data can only be harnessed using a number of different, complementary approaches and tools, and a unified platform for the analysis of disease outbreaks is still lacking. In this paper, we present the new R package OutbreakTools, which aims to provide a basis for outbreak data management and analysis in R. OutbreakTools is developed by a community of epidemiologists, statisticians, modellers and bioinformaticians, and implements classes and methods for storing, handling and visualizing outbreak data. It includes real and simulated outbreak datasets. Together with a number of tools for infectious disease epidemiology recently made available in R, OutbreakTools contributes to the emergence of a new, free and open-source platform for the analysis of disease outbreaks.
Ratmann O, Donker G, Meijer A, et al., 2012, Phylodynamic Inference and Model Assessment with Approximate Bayesian Computation: Influenza as a Case Study, PLoS Computational Biology, Vol: 8, ISSN: 1553-7358
A key priority in infectious disease research is to understand the ecological and evolutionary drivers of viral diseases from data on disease incidence as well as viral genetic and antigenic variation. We propose using a simulation-based, Bayesian method known as Approximate Bayesian Computation (ABC) to fit and assess phylodynamic models that simulate pathogen evolution and ecology against summaries of these data. We illustrate the versatility of the method by analyzing two spatial models describing the phylodynamics of interpandemic human influenza virus subtype A(H3N2). The first model captures antigenic drift phenomenologically with continuously waning immunity, and the second epochal evolution model describes the replacement of major, relatively long-lived antigenic clusters. Combining features of long-term surveillance data from the Netherlands with features of influenza A (H3N2) hemagglutinin gene sequences sampled in northern Europe, key phylodynamic parameters can be estimated with ABC. Goodness-of-fit analyses reveal that the irregularity in interannual incidence and H3N2's ladder-like hemagglutinin phylogeny are quantitatively only reproduced under the epochal evolution model within a spatial context. However, the concomitant incidence dynamics result in a very large reproductive number and are not consistent with empirical estimates of H3N2's population level attack rate. These results demonstrate that the interactions between the evolutionary and ecological processes impose multiple quantitative constraints on the phylodynamic trajectories of influenza A(H3N2), so that sequence and surveillance data can be used synergistically. ABC, one of several data synthesis approaches, can easily interface a broad class of phylodynamic models with various types of data but requires careful calibration of the summaries and tolerance parameters.