Imperial College London

Dr Thibaut Jombart

Faculty of Medicine, School of Public Health

Senior Lecturer
 
 
 

Contact

 

+44 (0)20 7594 3658, t.jombart

 
 

Location

 

UG11, Norfolk Place, St Mary's Campus


 

Publications


90 results found

Thompson R, Stockwin J, van Gaalen R, Polonsky J, Kamvar Z, Demarsh A, Dahlqwist E, Miguel E, Jombart T, Lessler J, Cauchemez S, Cori A et al., 2019, Improved inference of time-varying reproduction numbers during infectious disease outbreaks, Epidemics, Vol: 29, Pages: 1-11, ISSN: 1755-4365

Accurate estimation of the parameters characterising infectious disease transmission is vital for optimising control interventions during epidemics. A valuable metric for assessing the current threat posed by an outbreak is the time-dependent reproduction number, i.e. the expected number of secondary cases caused by each infected individual. This quantity can be estimated using data on the numbers of observed new cases at successive times during an epidemic and the distribution of the serial interval (the time between symptomatic cases in a transmission chain). Some methods for estimating the reproduction number rely on pre-existing estimates of the serial interval distribution and assume that the entire outbreak is driven by local transmission. Here we show that accurate inference of current transmissibility, and the uncertainty associated with this estimate, requires: (i) up-to-date observations of the serial interval to be included, and; (ii) cases arising from local transmission to be distinguished from those imported from elsewhere. We demonstrate how pathogen transmissibility can be inferred appropriately using datasets from outbreaks of H1N1 influenza, Ebola virus disease and Middle-East Respiratory Syndrome. We present a tool for estimating the reproduction number in real-time during infectious disease outbreaks accurately, which is available as an R software package (EpiEstim 2.2). It is also accessible as an interactive, user-friendly online interface (EpiEstim App), permitting its use by non-specialists. Our tool is easy to apply for assessing the transmission potential, and hence informing control, during future outbreaks of a wide range of invading pathogens.
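The idea of estimating the reproduction number from incidence counts and a serial interval distribution can be illustrated with a deliberately naive base R sketch. It is not the EpiEstim implementation: the function name `estimate_rt`, its arguments and the toy data are assumptions made for illustration, and it returns only a point ratio (new local cases divided by the serial-interval-weighted sum of past cases) with no uncertainty.

```r
# Naive point estimate of the time-varying reproduction number.
# Hypothetical helper, not part of EpiEstim.
#   incid: total new cases per time step (local + imported)
#   si:    discretised serial interval, si[s] = P(interval == s steps)
#   local: new cases arising from local transmission (defaults to all cases)
estimate_rt <- function(incid, si, local = incid) {
  stopifnot(abs(sum(si) - 1) < 1e-8, length(local) == length(incid))
  n <- length(incid)
  rt <- rep(NA_real_, n)
  for (t in 2:n) {
    s <- seq_len(min(t - 1, length(si)))  # lags for which history exists
    lambda <- sum(incid[t - s] * si[s])   # total infectiousness at time t
    if (lambda > 0) rt[t] <- local[t] / lambda
  }
  rt
}

incid <- c(1, 2, 4, 8, 16)  # toy epidemic doubling at every step
si <- 1                     # serial interval of exactly one step
estimate_rt(incid, si)
```

Passing a separate `local` vector keeps imported cases out of the numerator while still letting them contribute to onward transmission, which mirrors the distinction the abstract draws between local and imported cases.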

Journal article

Dighe A, Jombart T, Van Kerkhove MD, Ferguson N et al., 2019, A systematic review of MERS-CoV seroprevalence and RNA prevalence in dromedary camels: implications for animal vaccination, Epidemics, Vol: 29, ISSN: 1755-4365

Human infection with Middle East Respiratory Syndrome Coronavirus (MERS-CoV) is driven by recurring dromedary-to-human spill-over events, leading decision-makers to consider dromedary vaccination. Dromedary vaccine candidates in the development pipeline are showing hopeful results, but gaps in our understanding of the epidemiology of MERS-CoV in dromedaries must be addressed to design and evaluate potential vaccination strategies. We aim to bring together existing measures of MERS-CoV infection in dromedary camels to assess the distribution of infection, highlighting knowledge gaps and implications for animal vaccination. We systematically reviewed the published literature on MEDLINE, EMBASE and Web of Science that reported seroprevalence and/or prevalence of active MERS-CoV infection in dromedary camels from both cross-sectional and longitudinal studies. 60 studies met our eligibility criteria. Qualitative syntheses determined that MERS-CoV seroprevalence increased with age up to 80–100% in adult dromedaries supporting geographically widespread endemicity of MERS-CoV in dromedaries in both the Arabian Peninsula and countries exporting dromedaries from Africa. The high prevalence of active infection measured in juveniles and at sites where dromedary populations mix should guide further investigation – particularly of dromedary movement – and inform vaccination strategy design and evaluation through mathematical modelling.

Journal article

Moraga P, Dorigatti I, Kamvar ZN, Piatkowski P, Toikkanen SE, Nagraj VP, Donnelly CA, Jombart T et al., 2019, epiflows: an R package for risk assessment of travel-related spread of disease, F1000Research, Vol: 7, Pages: 1374-1374

As international travel increases worldwide, new surveillance tools are needed to help identify locations where diseases are most likely to be spread and prevention measures need to be implemented. In this paper we present epiflows, an R package for risk assessment of travel-related spread of disease. epiflows produces estimates of the expected number of symptomatic and/or asymptomatic infections that could be introduced to other locations from the source of infection. Estimates (average and confidence intervals) of the number of infections introduced elsewhere are obtained by integrating data on the cumulative number of cases reported, population movement, length of stay and information on the distributions of the incubation and infectious periods of the disease. The package also provides tools for geocoding and visualization. We illustrate the use of epiflows by assessing the risk of travel-related spread of yellow fever cases in Southeast Brazil in December 2016 to May 2017.

Journal article

Cori A, Kamvar ZN, Stockwin J, Jombart T, Thompson R, Dahlqwist E et al., 2019, annecori/EpiEstim: EpiEstim Cran 2.2-1

new CRAN version of EpiEstim including all new features described in Thompson et al. (currently in review in Epidemics journal).

Software

Stockwin J, Thompson R, Cori A, Jombart T, Kamvar ZN, Fitzjohn R et al., 2019, jstockwin/EpiEstimApp: v1.0.0

Source code for the EpiEstim app.

Software

Polonsky JA, Baidjoe A, Kamvar ZN, Cori A, Durski K, Edmunds WJ, Eggo RM, Funk S, Kaiser L, Keating P, de Waroux OLP, Marks M, Moraga P, Morgan O, Nouvellet P, Ratnayake R, Roberts CH, Whitworth J, Jombart T et al., 2019, Outbreak analytics: a developing data science for informing the response to emerging pathogens, Philosophical Transactions B: Biological Sciences, Vol: 374, ISSN: 0962-8436

Despite continued efforts to improve health systems worldwide, emerging pathogen epidemics remain a major public health concern. Effective response to such outbreaks relies on timely intervention, ideally informed by all available sources of data. The collection, visualization and analysis of outbreak data are becoming increasingly complex, owing to the diversity in types of data, questions and available methods to address them. Recent advances have led to the rise of outbreak analytics, an emerging data science focused on the technological and methodological aspects of the outbreak data pipeline, from collection to analysis, modelling and reporting to inform outbreak response. In this article, we assess the current state of the field. After laying out the context of outbreak response, we critically review the most common analytics components, their inter-dependencies, data requirements and the type of information they can provide to inform operations in real time. We discuss some challenges and opportunities and conclude on the potential role of outbreak analytics for improving our understanding of, and response to, outbreaks of emerging pathogens. This article is part of the theme issue ‘Modelling infectious disease outbreaks in humans, animals and plants: epidemic forecasting and control’. This theme issue is linked with the earlier issue ‘Modelling infectious disease outbreaks in humans, animals and plants: approaches and important themes’.

Journal article

Leiva C, Taboada S, Kenny NJ, Combosch D, Giribet G, Jombart T, Riesgo A et al., 2019, Population substructure and signals of divergent adaptive selection despite admixture in the sponge Dendrilla antarctica from shallow waters surrounding the Antarctic Peninsula, Molecular Ecology, Vol: 28, Pages: 3151-3170, ISSN: 0962-1083

Journal article

Sewell T, Zhu J, Rhodes J, Hagen F, Mels JF, Fisher M, Jombart T et al., 2019, Non-random distribution of azole resistance across the global population of Aspergillus fumigatus, mBio, Vol: 10, ISSN: 2150-7511

The emergence of azole resistance in the pathogenic fungus Aspergillus fumigatus has continued to increase, with the dominant resistance mechanisms, consisting of a 34-nucleotide tandem repeat (TR34)/L98H and TR46/Y121F/T289A, now showing a structured global distribution. Using hierarchical clustering and multivariate analysis of 4,049 A. fumigatus isolates collected worldwide and genotyped at nine microsatellite loci using analysis of short tandem repeats of A. fumigatus (STRAf), we show that A. fumigatus can be subdivided into two broad clades and that cyp51A alleles TR34/L98H and TR46/Y121F/T289A are unevenly distributed across these two populations. Diversity indices show that azole-resistant isolates are genetically depauperate compared to their wild-type counterparts, compatible with selective sweeps accompanying the selection of beneficial mutations. Strikingly, we found that azole-resistant clones with identical microsatellite profiles were globally distributed and sourced from both clinical and environmental locations, confirming that azole resistance is an international public health concern. Our work provides a framework for the analysis of A. fumigatus isolates based on their microsatellite profile, which we have incorporated into a freely available, user-friendly R Shiny application (AfumID) that provides clinicians and researchers with a method for the fast, automated characterization of A. fumigatus genetic relatedness. Our study highlights the effect that azole drug resistance is having on the genetic diversity of A. fumigatus and emphasizes its global importance upon this medically important pathogenic fungus. IMPORTANCE Azole drug resistance in the human-pathogenic fungus Aspergillus fumigatus continues to emerge, potentially leading to untreatable aspergillosis in immunosuppressed hosts. Two dominant, environmentally associated resistance mechanisms, which are thought to have evolved through selection by the agricultural application of azole fungic

Journal article

Campbell F, Cori A, Ferguson N, Jombart T et al., 2019, Bayesian inference of transmission chains using timing of symptoms, pathogen genomes and contact data, PLoS Computational Biology, Vol: 15, ISSN: 1553-734X

There exists significant interest in developing statistical and computational tools for inferring ‘who infected whom’ in an infectious disease outbreak from densely sampled case data, with most recent studies focusing on the analysis of whole genome sequence data. However, genomic data can be poorly informative of transmission events if mutations accumulate too slowly to resolve individual transmission pairs or if there exist multiple pathogen lineages within-host, and there has been little focus on incorporating other types of outbreak data. We present here a methodology that uses contact data for the inference of transmission trees in a statistically rigorous manner, alongside genomic data and temporal data. Contact data is frequently collected in outbreaks of pathogens spread by close contact, including Ebola virus (EBOV), severe acute respiratory syndrome coronavirus (SARS-CoV) and Mycobacterium tuberculosis (TB), and routinely used to reconstruct transmission chains. As an improvement over previous, ad-hoc approaches, we developed a probabilistic model that relates a set of contact data to an underlying transmission tree and integrated this in the outbreaker2 inference framework. By analyzing simulated outbreaks under various contact tracing scenarios, we demonstrate that contact data significantly improves our ability to reconstruct transmission trees, even under realistic limitations on the coverage of the contact tracing effort and the amount of non-infectious mixing between cases. Indeed, contact data is equally or more informative than fully sampled whole genome sequence data in certain scenarios. We then use our method to analyze the early stages of the 2003 SARS outbreak in Singapore and describe the range of transmission scenarios consistent with contact data and genetic sequence in a probabilistic manner for the first time. This simple yet flexible model can easily be incorporated into existing tools for outbreak reconstruction and should

Journal article

Jombart T, Kamvar ZN, Cai J, Pulliam J, Chisholm S, Fitzjohn R, Schumacher J, Bhatia S et al., 2019, reconhub/incidence: Incidence version 1.7.0

Incidence can now handle standardised weeks starting on any day thanks to the aweek package:
    library(incidence)
    library(ggplot2)
    library(cowplot)
    d <- as.Date("2019-03-11") + -7:6
    setNames(d, weekdays(d))
    #>       Monday      Tuesday    Wednesday     Thursday       Friday
    #> "2019-03-04" "2019-03-05" "2019-03-06" "2019-03-07" "2019-03-08"
    #>     Saturday       Sunday       Monday      Tuesday    Wednesday
    #> "2019-03-09" "2019-03-10" "2019-03-11" "2019-03-12" "2019-03-13"
    #>     Thursday       Friday     Saturday       Sunday
    #> "2019-03-14" "2019-03-15" "2019-03-16" "2019-03-17"
    imon <- incidence(d, "mon week") # also ISO week
    itue <- incidence(d, "tue week")
    iwed <- incidence(d, "wed week")
    ithu <- incidence(d, "thu week")
    ifri <- incidence(d, "fri week")
    isat <- incidence(d, "sat week")
    isun <- incidence(d, "sun week") # also MMWR week and EPI week
    pmon <- plot(imon, show_cases = TRUE, labels_week = FALSE)
    ptue <- plot(itue, show_cases = TRUE, labels_week = FALSE)
    pwed <- plot(iwed, show_cases = TRUE, labels_week = FALSE)
    pthu <- plot(ithu, show_cases = TRUE, labels_week = FALSE)
    pfri <- plot(ifri, show_cases = TRUE, labels_week = FALSE)
    psat <- plot(isat, show_cases = TRUE, labels_week = FALSE)
    psun <- plot(isun, show_cases = TRUE, labels_week = FALSE)
    s <- scale_x_date(limits = c(as.Date("2019-02-26"), max(d) + 7L))
    plot_grid(pmon + s, ptue + s, pwed + s, pthu + s, pfri + s, psat + s, psun + s)
Multi-weeks/months/years can now be handled:
    library(incidence)
    library(outbreaks)
    d <- ebola_sim_clean$linelist$date_of_onset
    h <- ebola_sim_clean$linelist$hospital
    plot(incidence(d, interval = "1 epiweek", group = h))
    plot(incidence(d, interval = "2 epiweeks", group = h))
    plot(incide

Software

Dighe A, Jombart T, van Kerkhove M, Ferguson N et al., 2019, A mathematical model of the transmission of Middle East respiratory syndrome coronavirus in dromedary camels (Camelus dromedarius), Publisher: ELSEVIER SCI LTD, Pages: 1-1, ISSN: 1201-9712

Conference paper

Kamvar ZN, Cai J, Pulliam JRC, Schumacher J, Jombart T et al., 2019, Epidemic curves made easy using the R package incidence, F1000Research, Vol: 8, ISSN: 2046-1402

The epidemiological curve (epicurve) is one of the simplest yet most useful tools used by field epidemiologists, modellers, and decision makers for assessing the dynamics of infectious disease epidemics. Here, we present the free, open-source package incidence for the R programming language, which allows users to easily compute, handle, and visualise epicurves from unaggregated linelist data. This package was built in accordance with the development guidelines of the R Epidemics Consortium (RECON), which aim to ensure robustness and reliability through extensive automated testing, documentation, and good coding practices. As such, it fills an important gap in the toolbox for outbreak analytics using the R software, and provides a solid building block for further developments in infectious disease modelling. incidence is available from https://www.repidemicsconsortium.org/incidence.
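Computing an epicurve from unaggregated linelist data amounts to binning case dates into time intervals and counting. The base R sketch below shows that step only; it does not use the incidence package, and the `onset` dates are made up for illustration.

```r
# Hypothetical linelist column: one symptom onset date per case
onset <- as.Date(c("2019-03-04", "2019-03-05", "2019-03-11",
                   "2019-03-12", "2019-03-12"))

# Bin dates into calendar weeks; cut() on Date objects starts weeks
# on Monday by default and labels each bin by its first day
weekly <- table(cut(onset, breaks = "week"))
print(weekly)
```

The resulting table can be passed to `barplot(weekly)` for a basic epicurve. The package described above adds, among other things, grouping and the flexible week definitions that this sketch leaves out.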

Journal article

Kamvar Z, Cai J, Pulliam JRC, Schumacher J, Jombart T et al., Epidemic curves made easy using the R package incidence [version 1; referees: awaiting peer review], F1000Research, Vol: 8, ISSN: 2046-1402

The epidemiological curve (epicurve) is one of the simplest yet most useful tools used by field epidemiologists, modellers, and decision makers for assessing the dynamics of infectious disease epidemics. Here, we present the free, open-source package incidence for the R programming language, which allows users to easily compute, handle, and visualise epicurves from unaggregated linelist data. This package was built in accordance with the development guidelines of the R Epidemics Consortium (RECON), which aim to ensure robustness and reliability through extensive automated testing, documentation, and good coding practices. As such, it fills an important gap in the toolbox for outbreak analytics using the R software, and provides a solid building block for further developments in infectious disease modelling. incidence is available from https://www.repidemicsconsortium.org/incidence.

Journal article

Jombart T, Kamvar ZN, Cai J, Pulliam J, Chisholm S, Fitzjohn R, Schumacher J, Bhatia S et al., 2019, reconhub/incidence 1.5

Compute and visualise incidence

Software

Cori A, Nouvellet P, Garske T, Bourhy H, Nakouné E, Jombart T et al., 2018, A graph-based evidence synthesis approach to detecting outbreak clusters: An application to dog rabies, PLoS Computational Biology, Vol: 14, ISSN: 1553-734X

Early assessment of infectious disease outbreaks is key to implementing timely and effective control measures. In particular, rapidly recognising whether infected individuals stem from a single outbreak sustained by local transmission, or from repeated introductions, is crucial to adopt effective interventions. In this study, we introduce a new framework for combining several data streams, e.g. temporal, spatial and genetic data, to identify clusters of related cases of an infectious disease. Our method explicitly accounts for underreporting, and allows incorporating preexisting information about the disease, such as its serial interval, spatial kernel, and mutation rate. We define, for each data stream, a graph connecting all cases, with edges weighted by the corresponding pairwise distance between cases. Each graph is then pruned by removing distances greater than a given cutoff, defined based on preexisting information on the disease and assumptions on the reporting rate. The pruned graphs corresponding to different data streams are then merged by intersection to combine all data types; connected components define clusters of cases related for all types of data. Estimates of the reproduction number (the average number of secondary cases infected by an infectious individual in a large population), and the rate of importation of the disease into the population, are also derived. We test our approach on simulated data and illustrate it using data on dog rabies in Central African Republic. We show that the outbreak clusters identified using our method are consistent with structures previously identified by more complex, computationally intensive approaches.
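The graph construction described in this abstract (one distance graph per data stream, pruning by a cutoff, merging by intersection, clusters as connected components) can be sketched in base R. This is an illustration under stated assumptions, not the authors' implementation: the toy data, both cutoffs and the helper `find_clusters` are hypothetical, and the cutoffs are simply fixed by hand rather than derived from the serial interval, spatial kernel or reporting rate.

```r
# Toy data for five cases: onset day and a 1-D location in km (made up)
onset <- c(0, 2, 3, 30, 31)
location <- c(0, 1, 50, 2, 3)

# One graph per data stream, pruned by a hand-picked cutoff:
# an edge survives only if the pairwise distance is within the cutoff
adj_time <- as.matrix(dist(onset)) <= 5
adj_space <- as.matrix(dist(location)) <= 10

# Merge the pruned graphs by intersection: cases stay linked only if
# they are close in every data stream
adj <- adj_time & adj_space

# Connected components by breadth-first search (hypothetical helper)
find_clusters <- function(adj) {
  n <- nrow(adj)
  cluster <- integer(n)  # 0 means "not yet assigned"
  current <- 0L
  for (i in seq_len(n)) {
    if (cluster[i] != 0L) next
    current <- current + 1L
    queue <- i
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      if (cluster[node] != 0L) next
      cluster[node] <- current
      queue <- c(queue, which(adj[node, ] & cluster == 0L))
    }
  }
  cluster
}

find_clusters(adj)
```

In this toy example cases 1 and 2 form one cluster and cases 4 and 5 another, while case 3 is close in time to the first pair but too far away in space, so it ends up on its own.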

Journal article

Jombart T, Kamvar Z, Cai J, Chisholm S, Fitzjohn R, Schumacher J, Bhatia S et al., 2018, reconhub/incidence: Incidence version 1.5.3

This is a patch release that fixes an issue with handling single-group incidence curves. You can install this version like so:
    remotes::install_github("reconhub/incidence@1.5.3")

Software

Campbell F, Didelot X, Fitzjohn R, Ferguson N, Cori A, Jombart T et al., 2018, outbreaker2: a modular platform for outbreak reconstruction, BMC Bioinformatics, Vol: 19, ISSN: 1471-2105

Background: Reconstructing individual transmission events in an infectious disease outbreak can provide valuable information and help inform infection control policy. Recent years have seen considerable progress in the development of methodologies for reconstructing transmission chains using both epidemiological and genetic data. However, only a few of these methods have been implemented in software packages, and with little consideration for customisability and interoperability. Users are therefore limited to a small number of alternatives, incompatible tools with fixed functionality, or forced to develop their own algorithms at considerable personal effort. Results: Here we present outbreaker2, a flexible framework for outbreak reconstruction. This R package re-implements and extends the original model introduced with outbreaker, but most importantly also provides a modular platform allowing users to specify custom models within an optimised inferential framework. As a proof of concept, we implement the within-host evolutionary model introduced with TransPhylo, which is very distinct from the original genetic model in outbreaker, and demonstrate how even complex model results can be successfully included with minimal effort. Conclusions: outbreaker2 provides a valuable starting point for future outbreak reconstruction tools, and represents a unifying platform that promotes customisability and interoperability. Implemented in the R software, outbreaker2 joins a growing body of tools for outbreak analysis

Journal article

Nagraj VP, Randhawa N, Campbell F, Crellen T, Sudre B, Jombart T et al., 2018, epicontacts: Handling, visualisation and analysis of epidemiological contacts, f1000research Open for Science

Epidemiological outbreak data is often captured in line list and contact format to facilitate contact tracing for outbreak control. epicontacts is an R package that provides a unique data structure for combining these data into a single object in order to facilitate more efficient visualisation and analysis. The package incorporates interactive visualisation functionality as well as network analysis techniques. Originally developed as part of the Hackout3 event, it is now developed, maintained and featured as part of the R Epidemics Consortium (RECON). The package is available for download from the Comprehensive R Archive Network (CRAN) and GitHub.
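The idea of holding a line list and a contact list in a single object can be sketched with plain base R data frames. The sketch below does not use or reproduce the epicontacts API: the function `combine_linelist_contacts`, the column names and the data are all hypothetical.

```r
# Hypothetical line list: one row per case
linelist <- data.frame(
  id = c("a", "b", "c", "d"),
  onset = as.Date(c("2018-01-01", "2018-01-04", "2018-01-05", "2018-01-09")),
  stringsAsFactors = FALSE
)

# Hypothetical contact list: one row per directed contact
contacts <- data.frame(
  from = c("a", "a", "b"),
  to = c("b", "c", "d"),
  stringsAsFactors = FALSE
)

# Combine both tables into one object, keeping only contacts whose
# two ends are both present in the line list
combine_linelist_contacts <- function(linelist, contacts) {
  known <- contacts$from %in% linelist$id & contacts$to %in% linelist$id
  list(linelist = linelist, contacts = contacts[known, , drop = FALSE])
}

x <- combine_linelist_contacts(linelist, contacts)

# A simple network summary: number of outgoing contacts per case
out_degree <- table(factor(x$contacts$from, levels = x$linelist$id))
print(out_degree)
```

Keeping both tables behind one object means that a summary such as the out-degree can always be reported for every case in the line list, including cases with no recorded contacts.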

Journal article

Nagraj VP, Randhawa N, Campbell F, Crellen T, Sudre B, Jombart T et al., 2018, epicontacts: Handling, visualisation and analysis of epidemiological contacts, F1000Research, ISSN: 2046-1402

Epidemiological outbreak data is often captured in line list and contact format to facilitate contact tracing for outbreak control. epicontacts is an R package that provides a unique data structure for combining these data into a single object in order to facilitate more efficient visualisation and analysis. The package incorporates interactive visualisation functionality as well as network analysis techniques. Originally developed as part of the Hackout3 event, it is now developed, maintained and featured as part of the R Epidemics Consortium (RECON). The package is available for download from the Comprehensive R Archive Network (CRAN) and GitHub.

Journal article

Moraga P, Dorigatti I, Kamvar Z, Piatkowski P, Toikkanen S, Nagraj VP, Donnelly C, Jombart T et al., 2018, epiflows: an R package for risk assessment of travel-related spread of disease [version 1; referees: 2 approved with reservations], F1000Research, Vol: 7, ISSN: 2046-1402

As international travel increases worldwide, new surveillance tools are needed to help identify locations where diseases are most likely to be spread and prevention measures need to be implemented. In this paper we present epiflows, an R package for risk assessment of travel-related spread of disease. epiflows produces estimates of the expected number of symptomatic and/or asymptomatic infections that could be introduced to other locations from the source of infection. Estimates (average and confidence intervals) of the number of infections introduced elsewhere are obtained by integrating data on the cumulative number of cases reported, population movement, length of stay and information on the distributions of the incubation and infectious periods of the disease. The package also provides tools for geocoding and visualization. We illustrate the use of epiflows by assessing the risk of travel-related spread of yellow fever cases in Southeast Brazil in December 2016 to May 2017.

Journal article

Beugin M-P, Gayet T, Pontier D, Devillard S, Jombart T et al., 2018, A fast likelihood solution to the genetic clustering problem, Methods in Ecology and Evolution, Vol: 9, Pages: 1006-1016, ISSN: 2041-210X

The investigation of genetic clusters in natural populations is an ubiquitous problem in a range of fields relying on the analysis of genetic data, such as molecular ecology, conservation biology and microbiology. Typically, genetic clusters are defined as distinct panmictic populations, or parental groups in the context of hybridisation. Two types of methods have been developed for identifying such clusters: model-based methods, which are usually computer-intensive but yield results which can be interpreted in the light of an explicit population genetic model, and geometric approaches, which are less interpretable but remarkably faster. Here, we introduce snapclust, a fast maximum-likelihood solution to the genetic clustering problem, which allies the advantages of both model-based and geometric approaches. Our method relies on maximising the likelihood of a fixed number of panmictic populations, using a combination of geometric approach and fast likelihood optimisation, using the Expectation-Maximisation (EM) algorithm. It can be used for assigning genotypes to populations and optionally identify various types of hybrids between two parental populations. Several goodness-of-fit statistics can also be used to guide the choice of the retained number of clusters. Using extensive simulations, we show that snapclust performs comparably to current gold standards for genetic clustering as well as hybrid detection, with some advantages for identifying hybrids after several backcrosses, while being orders of magnitude faster than other model-based methods. We also illustrate how snapclust can be used for identifying the optimal number of clusters, and subsequently assign individuals to various hybrid classes simulated from an empirical microsatellite dataset. snapclust is implemented in the package adegenet for the free software R, and is therefore easily integrated into existing pipelines for genetic data analysis. It can be applied to any kind of co-dominant markers, and ca

Journal article

Dupuis JR, Bremer FT, Jombart T, Sim SB, Geib SM et al., 2018, mvmapper: Interactive spatial mapping of genetic structures, Molecular Ecology Resources, Vol: 18, Pages: 362-367, ISSN: 1755-098X

Characterizing genetic structure across geographic space is a fundamental challenge in population genetics. Multivariate statistical analyses are powerful tools for summarizing genetic variability, but geographic information and accompanying metadata are not always easily integrated into these methods in a user-friendly fashion. Here, we present a deployable Python-based web-tool, mvmapper, for visualizing and exploring results of multivariate analyses in geographic space. This tool can be used to map results of virtually any multivariate analysis of georeferenced data, and routines for exporting results from a number of standard methods have been integrated in the R package adegenet, including principal components analysis (PCA), spatial PCA, discriminant analysis of principal components, principal coordinates analysis, nonmetric dimensional scaling and correspondence analysis. mvmapper's greatest strength is facilitating dynamic and interactive exploration of the statistical and geographic frameworks side by side, a task that is difficult and time-consuming with currently available tools. Source code and deployment instructions, as well as a link to a hosted instance of mvmapper, can be found at https://popphylotools.github.io/mvMapper/.

Journal article

Campbell F, Strang C, Ferguson N, Cori A, Jombart T et al., 2018, When are pathogen genome sequences informative of transmission events?, PLoS Pathogens, Vol: 14, ISSN: 1553-7366

Recent years have seen the development of numerous methodologies for reconstructing transmission trees in infectious disease outbreaks from densely sampled whole genome sequence data. However, a fundamental and as of yet poorly addressed limitation of such approaches is the requirement for genetic diversity to arise on epidemiological timescales. Specifically, the position of infected individuals in a transmission tree can only be resolved by genetic data if mutations have accumulated between the sampled pathogen genomes. To quantify and compare the useful genetic diversity expected from genetic data in different pathogen outbreaks, we introduce here the concept of ‘transmission divergence’, defined as the number of mutations separating whole genome sequences sampled from transmission pairs. Using parameter values obtained by literature review, we simulate outbreak scenarios alongside sequence evolution using two models described in the literature to describe transmission divergence of ten major outbreak-causing pathogens. We find that while mean values vary significantly between the pathogens considered, their transmission divergence is generally very low, with many outbreaks characterised by large numbers of genetically identical transmission pairs. We describe the impact of transmission divergence on our ability to reconstruct outbreaks using two outbreak reconstruction tools, the R packages outbreaker and phybreak, and demonstrate that, in agreement with previous observations, genetic sequence data of rapidly evolving pathogens such as RNA viruses can provide valuable information on individual transmission events. Conversely, sequence data of pathogens with lower mean transmission divergence, including Streptococcus pneumoniae, Shigella sonnei and Clostridium difficile, provide little to no information about individual transmission events. Our results highlight the informational limitations of genetic sequence data in certain outbreak scenarios, and
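Since transmission divergence is defined above as the number of mutations separating sequences sampled from a transmission pair, the quantity itself reduces to a count of differing sites between two aligned sequences. The base R sketch below computes that count for a single pair; the helper `count_mutations` and the two short sequences are hypothetical, and the simulation of outbreaks and sequence evolution used in the study is not reproduced.

```r
# Number of sites at which two aligned sequences differ
# (hypothetical helper; assumes equal-length, gap-free sequences)
count_mutations <- function(seq_a, seq_b) {
  a <- strsplit(seq_a, "")[[1]]
  b <- strsplit(seq_b, "")[[1]]
  stopifnot(length(a) == length(b))
  sum(a != b)
}

# Made-up sequences sampled from an infector and the case they infected
infector <- "ACGTACGTAC"
infectee <- "ACGTACGAAC"

count_mutations(infector, infectee)
```

A pair with a count of zero is genetically identical, which is the situation in which sequence data alone cannot indicate who infected whom.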

Journal article

Moraga P, Dorigatti I, Kamvar ZN, Piatkowski P, Toikkanen SE, Nagraj VP, Donnelly CA, Jombart T et al., 2018, epiflows: an R package for risk assessment of travel-related spread of disease, F1000Research, Vol: 7, ISSN: 2046-1402

As international travel increases worldwide, new surveillance tools are needed to help identify locations where diseases are most likely to be spread and prevention measures need to be implemented. In this paper we present epiflows, an R package for risk assessment of travel-related spread of disease.  epiflows produces estimates of the expected number of symptomatic and/or asymptomatic infections that could be introduced to other locations from the source of infection. Estimates (average and confidence intervals) of the number of infections introduced elsewhere are obtained by integrating data on the cumulative number of cases reported, population movement, length of stay and information on the distributions of the incubation and infectious periods of the disease. The package also provides tools for geocoding and visualization. We illustrate the use of epiflows by assessing the risk of travel-related spread of yellow fever cases in Southeast Brazil in December 2016 to May 2017.

Journal article

Nagraj VP, Randhawa N, Campbell F, Crellen T, Sudre B, Jombart T et al., 2018, epicontacts: Handling, visualisation and analysis of epidemiological contacts, F1000Research, Vol: 7, ISSN: 2046-1402

Epidemiological outbreak data is often captured in line list and contact format to facilitate contact tracing for outbreak control. epicontacts is an R package that provides a unique data structure for combining these data into a single object in order to facilitate more efficient visualisation and analysis. The package incorporates interactive visualisation functionality as well as network analysis techniques. Originally developed as part of the Hackout3 event, it is now developed, maintained and featured as part of the R Epidemics Consortium (RECON). The package is available for download from the Comprehensive R Archive Network (CRAN) and GitHub.

Journal article

Montano V, Jombart T, 2017, An Eigenvalue Test for spatial Principal Component Analysis, BMC Bioinformatics, Vol: 18, ISSN: 1471-2105

Background: The spatial Principal Component Analysis (sPCA; Jombart, Heredity 101:92-103, 2008) is designed to investigate non-random spatial distributions of genetic variation. Unfortunately, the associated tests used for assessing the existence of spatial patterns (global and local tests; Heredity 101:92-103, 2008) lack statistical power and may fail to reveal existing spatial patterns. Here, we present a non-parametric test for the significance of specific patterns recovered by sPCA. Results: We compared the performance of this new test to the original global and local tests using datasets simulated under classical population genetic models. Results show that our test outperforms the original global and local tests, exhibiting improved statistical power while retaining similar, and reliable, type I errors. Moreover, by allowing various sets of axes to be tested, it can be used to guide the selection of retained sPCA components. Conclusions: As such, our test represents a valuable complement to the original analysis, and should prove useful for the investigation of spatial genetic patterns.

Journal article

Paradis E, Gosselin T, Grunwald NJ, Jombart T, Manel S, Lapp H et al., 2017, Towards an integrated ecosystem of R packages for the analysis of population genetic data, Molecular Ecology Resources, Vol: 17, Pages: 1-4, ISSN: 1755-098X

Journal article

Jombart T, Kendall M, Almagro-Garcia J, Colijn C et al., 2017, Treespace: statistical exploration of landscapes of phylogenetic trees, Molecular Ecology Resources, Vol: 17, Pages: 1385-1392, ISSN: 1755-0998

The increasing availability of large genomic data sets as well as the advent of Bayesian phylogenetics facilitates the investigation of phylogenetic incongruence, which can result in the impossibility of representing phylogenetic relationships using a single tree. While sometimes considered as a nuisance, phylogenetic incongruence can also reflect meaningful biological processes as well as relevant statistical uncertainty, both of which can yield valuable insights in evolutionary studies. We introduce a new tool for investigating phylogenetic incongruence through the exploration of phylogenetic tree landscapes. Our approach, implemented in the R package treespace, combines tree metrics and multivariate analysis to provide low-dimensional representations of the topological variability in a set of trees, which can be used for identifying clusters of similar trees and group-specific consensus phylogenies. treespace also provides a user-friendly web interface for interactive data analysis and is integrated alongside existing standards for phylogenetics. It fills a gap in the current phylogenetics toolbox in R and will facilitate the investigation of phylogenetic results.

Journal article

Cori A, Donnelly CA, Dorigatti, Ferguson NM, Fraser, Garske, Jombart, Nedjati-Gilani G, Nouvellet, Riley, Van Kerkhove, Mills, Blake IM et al., 2017, Key data for outbreak evaluation: building on the Ebola experience, Philosophical Transactions of the Royal Society B: Biological Sciences, Vol: 372, ISSN: 1471-2970

Following the detection of an infectious disease outbreak, rapid epidemiological assessment is critical to guide an effective public health response. To understand the transmission dynamics and potential impact of an outbreak, several types of data are necessary. Here we build on experience gained in the West African Ebola epidemic and prior emerging infectious disease outbreaks to set out a checklist of data needed to: 1) quantify severity and transmissibility; 2) characterise heterogeneities in transmission and their determinants; and 3) assess the effectiveness of different interventions. We differentiate data needs into individual-level data (e.g. a detailed list of reported cases), exposure data (e.g. identifying where/how cases may have been infected) and population-level data (e.g. size/demographics of the population(s) affected and when/where interventions were implemented). A remarkable amount of individual-level and exposure data was collected during the West African Ebola epidemic, which allowed the assessment of (1) and (2). However, gaps in population-level data (particularly around which interventions were applied when and where) posed challenges to the assessment of (3). Here we highlight recurrent data issues, give practical suggestions for addressing these issues and discuss priorities for improvements in data collection in future outbreaks.

Journal article

