Imperial College London

Professor Axel Gandy

Faculty of Natural Sciences, Department of Mathematics

Chair in Statistics



+44 (0)20 7594 8518

a.gandy




530 Huxley Building, South Kensington Campus






73 results found

Lamprinakou S, Barahona M, Flaxman S, Filippi S, Gandy A, McCoy EJ et al., 2023, BART-based inference for Poisson processes, Computational Statistics & Data Analysis, Vol: 180, Pages: 107658-107658, ISSN: 0167-9473

Journal article

Gandy A, Jana K, Veraart A, 2022, Scoring predictions at extreme quantiles, AStA Advances in Statistical Analysis, Vol: 106, Pages: 527-544, ISSN: 0002-6018

Prediction of quantiles at extreme tails is of interest in numerous applications. Extreme value modelling provides various competing predictors for this point prediction problem. A common method of assessment of a set of competing predictors is to evaluate their predictive performance in a given situation. However, due to the extreme nature of this inference problem, it can be possible that the predicted quantiles are not seen in the historical records, particularly when the sample size is small. This situation poses a problem to the validation of the prediction with its realisation. In this article, we propose two non-parametric scoring approaches to assess extreme quantile prediction mechanisms. The proposed assessment methods are based on predicting a sequence of equally extreme quantiles on different parts of the data. We then use the quantile scoring function to evaluate the competing predictors. The performance of the scoring methods is compared with the conventional scoring method and the superiority of the former methods is demonstrated in a simulation study. The methods are then applied to reanalyse cyber Netflow data from Los Alamos National Laboratory and daily precipitation data at a station in California available from Global Historical Climatology Network.

Journal article
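As an illustration of the quantile scoring function mentioned in the abstract above, the following is a minimal sketch of the standard quantile (pinball) score. The function name `quantile_score`, the exponential test data and the choice of a doubled quantile as the "bad" predictor are assumptions made for this example and are not taken from the paper; the paper's two proposed scoring approaches are not reproduced here.

```python
import numpy as np


def quantile_score(pred, obs, tau):
    """Average quantile (pinball) score of predictions `pred` at level `tau`.

    Lower is better. Hypothetical helper written for this illustration.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    # Indicator that the observation fell at or below the predicted quantile.
    indicator = (obs <= pred).astype(float)
    # Standard form: (1{y <= q} - tau) * (q - y), averaged over observations.
    return float(np.mean((indicator - tau) * (pred - obs)))


# Illustrative data (assumption): unit-rate exponential observations.
rng = np.random.default_rng(0)
obs = rng.exponential(size=5000)
tau = 0.99
true_q = -np.log(1 - tau)  # exact 0.99 quantile of the unit exponential

# Score the true quantile against a deliberately poor predictor.
score_true = quantile_score(np.full_like(obs, true_q), obs, tau)
score_bad = quantile_score(np.full_like(obs, 2 * true_q), obs, tau)
```

In expectation the true quantile minimises this score, which is why it can be used to rank competing quantile predictors; at extreme levels few observations exceed the prediction, which is the difficulty the abstract describes.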

Faria NR, Mellan TA, Whittaker C, Claro IM, Candido DDS, Mishra S, Crispim MAE, Sales FC, Hawryluk I, McCrone JT, Hulswit RJG, Franco LAM, Ramundo MS, de Jesus JG, Andrade PS, Coletti TM, Ferreira GM, Silva CAM, Manuli ER, Pereira RHM, Peixoto PS, Kraemer MU, Gaburo N, Camilo CDC, Hoeltgebaum H, Souza WM, Rocha EC, de Souza LM, de Pinho MC, Araujo LJT, Malta FS, de Lima AB, Silva JDP, Zauli DAG, Ferreira ACDS, Schnekenberg RP, Laydon DJ, Walker PGT, Schlueter HM, dos Santos ALP, Vidal MS, Del Caro VS, Filho RMF, dos Santos HM, Aguiar RS, Proenca-Modena JLP, Nelson B, Hay JA, Monod M, Miscouridou X, Coupland H, Sonabend R, Vollmer M, Gandy A, Prete CA, Nascimento VH, Suchard MA, Bowden TA, Pond SLK, Wu C-H, Ratmann O, Ferguson NM, Dye C, Loman NJ, Lemey P, Rambaut A, Fraiji NA, Carvalho MDPSS, Pybus OG, Flaxman S, Bhatt S, Sabino EC et al., 2021, Genomics and epidemiology of the P.1 SARS-CoV-2 lineage in Manaus, Brazil, Science, Vol: 372, Pages: 815-821, ISSN: 0036-8075

Cases of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in Manaus, Brazil, resurged in late 2020 despite previously high levels of infection. Genome sequencing of viruses sampled in Manaus between November 2020 and January 2021 revealed the emergence and circulation of a novel SARS-CoV-2 variant of concern. Lineage P.1 acquired 17 mutations, including a trio in the spike protein (K417T, E484K, and N501Y) associated with increased binding to the human ACE2 (angiotensin-converting enzyme 2) receptor. Molecular clock analysis shows that P.1 emergence occurred around mid-November 2020 and was preceded by a period of faster molecular evolution. Using a two-category dynamical model that integrates genomic and mortality data, we estimate that P.1 may be 1.7- to 2.4-fold more transmissible and that previous (non-P.1) infection provides 54 to 79% of the protection against infection with P.1 that it provides against non-P.1 lineages. Enhanced global genomic surveillance of variants of concern, which may exhibit increased transmissibility and/or immune evasion, is critical to accelerate pandemic responsiveness.

Journal article

Volz E, Mishra S, Chand M, Barrett JC, Johnson R, Geidelberg L, Hinsley WR, Laydon DJ, Dabrera G, O'Toole Á, Amato R, Ragonnet-Cronin M, Harrison I, Jackson B, Ariani CV, Boyd O, Loman NJ, McCrone JT, Gonçalves S, Jorgensen D, Myers R, Hill V, Jackson DK, Gaythorpe K, Groves N, Sillitoe J, Kwiatkowski DP, COVID-19 Genomics UK COG-UK consortium, Flaxman S, Ratmann O, Bhatt S, Hopkins S, Gandy A, Rambaut A, Ferguson NM et al., 2021, Assessing transmissibility of SARS-CoV-2 lineage B.1.1.7 in England, Nature, Vol: 593, Pages: 266-269, ISSN: 0028-0836

The SARS-CoV-2 lineage B.1.1.7, designated a Variant of Concern 202012/01 (VOC) by Public Health England [1], originated in the UK in late Summer to early Autumn 2020 [2]. Whole genome SARS-CoV-2 sequence data collected from community-based diagnostic testing shows an unprecedentedly rapid expansion of the B.1.1.7 lineage during Autumn 2020, suggesting a selective advantage. We find that changes in VOC frequency inferred from genetic data correspond closely to changes inferred by S-gene target failures (SGTF) in community-based diagnostic PCR testing. Analysis of trends in SGTF and non-SGTF case numbers in local areas across England shows that the VOC has higher transmissibility than non-VOC lineages, even if the VOC has a different latent period or generation time. The SGTF data indicate a transient shift in the age composition of reported cases, with a larger share of under 20 year olds among reported VOC than non-VOC cases. Time-varying reproduction numbers for the VOC and cocirculating lineages were estimated using SGTF and genomic data. The best supported models did not indicate a substantial difference in VOC transmissibility among different age groups. There is a consensus among all analyses that the VOC has a substantial transmission advantage with a 50% to 100% higher reproduction number.

Journal article

Hilbers AP, Brayshaw DJ, Gandy A, 2021, Efficient quantification of the impact of demand and weather uncertainty in power system models, IEEE Transactions on Power Systems, Vol: 36, Pages: 1771-1779, ISSN: 0885-8950

This paper introduces a novel approach to quantify the effect of forward propagated demand and weather uncertainty on power system planning and operation model outputs. Recent studies indicate that such sampling uncertainty, originating from demand and weather time series inputs, should not be ignored. However, established uncertainty quantification approaches fail in this context due to the computational resources and additional data required for Monte Carlo-based analysis. The method introduced here quantifies uncertainty on model outputs using a bootstrap scheme with shorter time series than the original, enhancing computational efficiency and avoiding the need for any additional data. It both quantifies output uncertainty and determines the sample length required for desired confidence levels. Simulations performed on two generation and transmission expansion planning models and one unit commitment and economic dispatch model illustrate the method's efficacy. A test is introduced allowing users to determine whether estimated uncertainty bounds are valid. The models, data and code applying the method are provided as open-source software.

Journal article

Laydon D, Mishra S, Hinsley W, Samartsidis P, Flaxman S, Gandy A, Ferguson N, Bhatt S et al., 2021, Modelling the impact of the Tier system on SARS-CoV-2 transmission in the UK between the first and second national lockdowns, BMJ Open, Vol: 11, ISSN: 2044-6055

Objective: To measure the effects of the tier system on the COVID-19 pandemic in the UK between the first and second national lockdowns, before the emergence of the B.1.1.7 variant of concern. Design: This is a modelling study combining estimates of real-time reproduction number Rt (derived from UK case, death and serological survey data) with publicly available data on regional non-pharmaceutical interventions. We fit a Bayesian hierarchical model with latent factors using these quantities to account for broader national trends in addition to subnational effects from tiers. Setting: The UK at lower tier local authority (LTLA) level. 310 LTLAs were included in the analysis. Primary and secondary outcome measures: Reduction in real-time reproduction number Rt. Results: Nationally, transmission increased between July and late September, regional differences notwithstanding. Immediately prior to the introduction of the tier system, Rt averaged 1.3 (0.9–1.6) across LTLAs, but declined to an average of 1.1 (0.86–1.42) 2 weeks later. Decline in transmission was not solely attributable to tiers. Tier 1 had negligible effects. Tiers 2 and 3, respectively, reduced transmission by 6% (5%–7%) and 23% (21%–25%). 288 LTLAs (93%) would have begun to suppress their epidemics if every LTLA had gone into tier 3 by the second national lockdown, whereas only 90 (29%) did so in reality. Conclusions: The relatively small effect sizes found in this analysis demonstrate that interventions at least as stringent as tier 3 are required to suppress transmission, especially considering more transmissible variants, at least until effective vaccination is widespread or much greater population immunity has amassed.

Journal article

Flaxman S, Mishra S, Scott J, Ferguson N, Gandy A, Bhatt S et al., 2020, The effect of interventions on COVID-19 Reply, Nature, Vol: 588, Pages: E29-E32, ISSN: 0028-0836

Journal article

Unwin H, Mishra S, Bradley V, Gandy A, Mellan T, Coupland H, Ish-Horowicz J, Vollmer M, Whittaker C, Filippi S, Xi X, Monod M, Ratmann O, Hutchinson M, Valka F, Zhu H, Hawryluk I, Milton P, Ainslie K, Baguelin M, Boonyasiri A, Brazeau N, Cattarino L, Cucunuba Z, Cuomo-Dannenburg G, Dorigatti I, Eales O, Eaton J, van Elsland S, Fitzjohn R, Gaythorpe K, Green W, Hinsley W, Jeffrey B, Knock E, Laydon D, Lees J, Nedjati-Gilani G, Nouvellet P, Okell L, Parag K, Siveroni I, Thompson H, Walker P, Walters C, Watson O, Whittles L, Ghani A, Ferguson N, Riley S, Donnelly C, Bhatt S, Flaxman S et al., 2020, State-level tracking of COVID-19 in the United States, Nature Communications, Vol: 11, Pages: 1-9, ISSN: 2041-1723

As of 1st June 2020, the US Centers for Disease Control and Prevention reported 104,232 confirmed or probable COVID-19-related deaths in the US. This was more than twice the number of deaths reported in the next most severely impacted country. We jointly model the US epidemic at the state-level, using publicly available death data within a Bayesian hierarchical semi-mechanistic framework. For each state, we estimate the number of individuals that have been infected, the number of individuals that are currently infectious and the time-varying reproduction number (the average number of secondary infections caused by an infected person). We use changes in mobility to capture the impact that non-pharmaceutical interventions and other behaviour changes have on the rate of transmission of SARS-CoV-2. We estimate that Rt was only below one in 23 states on 1st June. We also estimate that 3.7% [3.4%-4.0%] of the total population of the US had been infected, with wide variation between states, and approximately 0.01% of the population was infectious. We demonstrate good 3 week model forecasts of deaths with low error and good coverage of our credible intervals.

Journal article

Mishra S, Scott J, Zhu H, Ferguson NM, Bhatt S, Flaxman S, Gandy A et al., 2020, A COVID-19 Model for Local Authorities of the United Kingdom

We propose a new framework to model the COVID-19 epidemic of the United Kingdom at the level of local authorities. The model fits within a general framework for semi-mechanistic Bayesian models of the epidemic, with some important innovations: we model the proportion of infections that result in reported deaths and cases as random variables. This is in contrast to standard frameworks that model the latent infection as a deterministic function of time varying reproduction number, Rt. The model is tailored and designed to be updated daily based on publicly available data. We envisage the model to be useful for now-casting and short-term projections of the epidemic as well as estimating historical trends. The model fits are available on a public website. The model is currently being used by the Scottish government in their decisions on interventions within Scotland [1, issue 24 to now].

Journal article

Okell LC, Verity R, Katzourakis A, Volz EM, Watson OJ, Mishra S, Walker P, Whittaker C, Donnelly CA, Riley S, Ghani AC, Gandy A, Flaxman S, Ferguson NM, Bhatt S et al., 2020, Host or pathogen-related factors in COVID-19 severity? Reply, The Lancet, Vol: 396, Pages: 1397-1397, ISSN: 0140-6736

Journal article

Monod M, Blenkinsop A, Xi X, Hebert D, Bershan S, Tietze S, Bradley VC, Chen Y, Coupland H, Filippi S, Ish-Horowicz J, McManus M, Mellan T, Gandy A, Hutchinson M, Unwin HJT, Vollmer MAC, Weber S, Zhu H, Bezancon A, Ferguson NM, Mishra S, Flaxman S, Bhatt S, Ratmann O et al., 2020, Report 32: Age groups that sustain resurging COVID-19 epidemics in the United States

Following initial declines, in mid 2020, a resurgence in transmission of novel coronavirus disease (COVID-19) has occurred in the United States and parts of Europe. Despite the wide implementation of non-pharmaceutical interventions, it is still not known how they are impacted by changing contact patterns, age and other demographics. As COVID-19 disease control becomes more localised, understanding the age demographics driving transmission and how these impact the loosening of interventions such as school reopening is crucial. Considering dynamics for the United States, we analyse aggregated, age-specific mobility trends from more than 10 million individuals and link these mechanistically to age-specific COVID-19 mortality data. In contrast to previous approaches, we link mobility to mortality via age specific contact patterns and use this rich relationship to reconstruct accurate transmission dynamics. Contrary to anecdotal evidence, we find little support for age-shifts in contact and transmission dynamics over time. We estimate that, until August, 63.4% [60.9%-65.5%] of SARS-CoV-2 infections in the United States originated from adults aged 20-49, while 1.2% [0.8%-1.8%] originated from children aged 0-9. In areas with continued, community-wide transmission, our transmission model predicts that re-opening kindergartens and elementary schools could facilitate spread and lead to additional COVID-19 attributable deaths over a 90-day period. These findings indicate that targeting interventions to adults aged 20-49 is an important consideration in halting resurgent epidemics and preventing COVID-19-attributable deaths when kindergartens and elementary schools reopen. One sentence summary: Adults aged 20-49 are a main driver of the COVID-19 epidemic in the United States; yet, in areas with resurging epidemics, opening schools will lea

Journal article

Monod M, Blenkinsop A, Xi X, Herbert D, Bershan S, Tietze S, Bradley V, Chen Y, Coupland H, Filippi S, Ish-Horowicz J, McManus M, Mellan T, Gandy A, Hutchinson M, Unwin H, Vollmer M, Weber S, Zhu H, Bezancon A, Ferguson N, Mishra S, Flaxman S, Bhatt S, Ratmann O, Ainslie K, Baguelin M, Boonyasiri A, Boyd O, Cattarino L, Cooper L, Cucunuba Perez Z, Cuomo-Dannenburg G, Djaafara A, Dorigatti I, van Elsland S, Fitzjohn R, Gaythorpe K, Geidelberg L, Green W, Hamlet A, Jeffrey B, Knock E, Laydon D, Nedjati Gilani G, Nouvellet P, Parag K, Siveroni I, Thompson H, Verity R, Walters C, Donnelly C, Okell L, Bhatia S, Brazeau N, Eales O, Haw D, Imai N, Jauneikaite E, Lees J, Mousa A, Olivera Mesa D, Skarp J, Whittles L et al., 2020, Report 32: Targeting interventions to age groups that sustain COVID-19 transmission in the United States, Pages: 1-32

Following initial declines, in mid 2020, a resurgence in transmission of novel coronavirus disease (COVID-19) has occurred in the United States and parts of Europe. Despite the wide implementation of non-pharmaceutical interventions, it is still not known how they are impacted by changing contact patterns, age and other demographics. As COVID-19 disease control becomes more localised, understanding the age demographics driving transmission and how these impact the loosening of interventions such as school reopening is crucial. Considering dynamics for the United States, we analyse aggregated, age-specific mobility trends from more than 10 million individuals and link these mechanistically to age-specific COVID-19 mortality data. In contrast to previous approaches, we link mobility to mortality via age specific contact patterns and use this rich relationship to reconstruct accurate transmission dynamics. Contrary to anecdotal evidence, we find little support for age-shifts in contact and transmission dynamics over time. We estimate that, until August, 63.4% [60.9%-65.5%] of SARS-CoV-2 infections in the United States originated from adults aged 20-49, while 1.2% [0.8%-1.8%] originated from children aged 0-9. In areas with continued, community-wide transmission, our transmission model predicts that re-opening kindergartens and elementary schools could facilitate spread and lead to considerable excess COVID-19 attributable deaths over a 90-day period. These findings indicate that targeting interventions to adults aged 20-49 is an important consideration in halting resurgent epidemics, and preventing COVID-19-attributable deaths when kindergartens and elementary schools reopen.

Journal article

Hilbers A, Brayshaw D, Gandy A, 2020, Importance subsampling for power system planning under multi-year demand and weather uncertainty, PMAPS 2020 (the 16th International Conference on Probabilistic Methods Applied to Power Systems), Publisher: IEEE, Pages: 1-6

This paper introduces a generalised version of importance subsampling for time series reduction/aggregation in optimisation-based power system planning models. Recent studies indicate that reliably determining optimal electricity (investment) strategy under climate variability requires the consideration of multiple years of demand and weather data. However, solving planning models over long simulation lengths is typically computationally unfeasible, and established time series reduction approaches induce significant errors. The importance subsampling method reliably estimates long-term planning model outputs at greatly reduced computational cost, allowing the consideration of multi-decadal samples. The key innovation is a systematic identification and preservation of relevant extreme events in modeling subsamples. Simulation studies on generation and transmission expansion planning models illustrate the method’s enhanced performance over established “representative days” clustering approaches. The models, data and sample code are made available as open-source software.

Conference paper

Ding D, Gandy A, Hahn G, 2020, A simple method for implementing Monte Carlo tests, Computational Statistics, Vol: 35, Pages: 1373-1392, ISSN: 0943-4062

We consider a statistical test whose p value can only be approximated using Monte Carlo simulations. We are interested in deciding whether the p value for an observed data set lies above or below a given threshold such as 5%. We want to ensure that the resampling risk, the probability of the (Monte Carlo) decision being different from the true decision, is uniformly bounded. This article introduces a simple open-ended method with this property, the confidence sequence method (CSM). We compare our approach to another algorithm, SIMCTEST, which also guarantees an (asymptotic) uniform bound on the resampling risk, as well as to other Monte Carlo procedures without a uniform bound. CSM is free of tuning parameters and conservative. It has the same theoretical guarantee as SIMCTEST and, in many settings, similar stopping boundaries. As it is much simpler than other methods, CSM is a useful method for practical applications.

Journal article
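To make the idea in the Ding, Gandy and Hahn abstract above concrete, here is a rough sketch of an open-ended sequential Monte Carlo test. The abstract does not spell out the construction, so the stopping rule below is an assumption: it uses a Robbins-type binomial confidence sequence and stops once the threshold `alpha` falls outside it. The function name `csm_decision`, its parameters and the `max_steps` cap are hypothetical choices for this illustration, not the authors' implementation.

```python
import math
import random


def csm_decision(sample_exceeds, alpha=0.05, eps=1e-3, max_steps=100_000):
    """Decide whether a Monte Carlo p value lies above or below `alpha`.

    `sample_exceeds()` must return True when a freshly simulated test
    statistic is at least as extreme as the observed one.
    Assumed rule: stop at the first n with (n + 1) * b(n, alpha, S_n) <= eps,
    where b is the binomial probability mass function and S_n counts
    exceedances so far. Returns (decision, number of simulations used).
    """
    exceed = 0
    for n in range(1, max_steps + 1):
        exceed += bool(sample_exceeds())
        # Log binomial pmf at `exceed`, computed via lgamma for stability.
        log_b = (
            math.lgamma(n + 1)
            - math.lgamma(exceed + 1)
            - math.lgamma(n - exceed + 1)
            + exceed * math.log(alpha)
            + (n - exceed) * math.log(1 - alpha)
        )
        if math.log(n + 1) + log_b <= math.log(eps):
            # alpha has left the confidence sequence, so a decision is made.
            return ("below" if exceed / n < alpha else "above"), n
    # The cap is a practical safeguard added here, not part of the method.
    return "undecided", max_steps


# Illustrative use (assumption): the true p value is 0.2, well above 5%.
rng = random.Random(1)
decision, n_used = csm_decision(lambda: rng.random() < 0.2)
```

Under this assumed rule, `eps` plays the role of the bound on the resampling risk, and the number of simulations adapts to how far the true p value is from the threshold.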

Flaxman S, Mishra S, Gandy A, Unwin HJT, Mellan TA, Coupland H, Whittaker C, Zhu H, Berah T, Eaton JW, Monod M, Perez Guzman PN, Schmit N, Cilloni L, Ainslie K, Baguelin M, Boonyasiri A, Boyd O, Cattarino L, Cucunuba Perez Z, Cuomo-Dannenburg G, Dighe A, Djaafara A, Dorigatti I, van Elsland S, Fitzjohn R, Gaythorpe K, Geidelberg L, Grassly N, Green W, Hallett T, Hamlet A, Hinsley W, Jeffrey B, Knock E, Laydon D, Nedjati Gilani G, Nouvellet P, Parag K, Siveroni I, Thompson H, Verity R, Volz E, Walters C, Wang H, Watson O, Winskill P, Xi X, Walker P, Ghani AC, Donnelly CA, Riley SM, Vollmer MAC, Ferguson NM, Okell LC, Bhatt S et al., 2020, Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe, Nature, Vol: 584, Pages: 257-261, ISSN: 0028-0836

Following the emergence of a novel coronavirus [1] (SARS-CoV-2) and its spread outside of China, Europe has experienced large epidemics. In response, many European countries have implemented unprecedented non-pharmaceutical interventions such as closure of schools and national lockdowns. We study the impact of major interventions across 11 European countries for the period from the start of COVID-19 until the 4th of May 2020 when lockdowns started to be lifted. Our model calculates backwards from observed deaths to estimate transmission that occurred several weeks prior, allowing for the time lag between infection and death. We use partial pooling of information between countries with both individual and shared effects on the reproduction number. Pooling allows more information to be used, helps overcome data idiosyncrasies, and enables more timely estimates. Our model relies on fixed estimates of some epidemiological parameters such as the infection fatality rate, does not include importation or subnational variation and assumes that changes in the reproduction number are an immediate response to interventions rather than gradual changes in behavior. Amidst the ongoing pandemic, we rely on death data that is incomplete, with systematic biases in reporting, and subject to future consolidation. We estimate that, for all the countries we consider, current interventions have been sufficient to drive the reproduction number Rt below 1 (probability Rt < 1.0 is 99.9%) and achieve epidemic control. We estimate that, across all 11 countries, between 12 and 15 million individuals have been infected with SARS-CoV-2 up to 4th May, representing between 3.2% and 4.0% of the population. Our results show that major non-pharmaceutical interventions and lockdown in particular have had a large effect on reducing transmission. Continued intervention should be considered to keep transmission of SARS-CoV-2 under control.

Journal article

Okell LC, Verity R, Watson OJ, Mishra S, Walker P, Whittaker C, Katzourakis A, Donnelly CA, Riley S, Ghani AC, Gandy A, Flaxman S, Ferguson NM, Bhatt S et al., 2020, Have deaths from COVID-19 in Europe plateaued due to herd immunity?, The Lancet, Vol: 395, Pages: E110-E111, ISSN: 0140-6736

Journal article

Gandy A, Veraart LAM, 2020, Compound Poisson models for weighted networks with applications in finance, Mathematics and Financial Economics, Vol: 15, Pages: 131-153, ISSN: 1862-9660

We develop a modelling framework for estimating and predicting weighted network data. The edge weights in weighted networks often arise from aggregating some individual relationships between the nodes. Motivated by this, we introduce a modelling framework for weighted networks based on the compound Poisson distribution. To allow for heterogeneity between the nodes, we use a regression approach for the model parameters. We test the new modelling framework on two types of financial networks: a network of financial institutions in which the edge weights represent exposures from trading Credit Default Swaps and a network of countries in which the edge weights represent cross-border lending. The compound Poisson Gamma distributions with regression fit the data well in both situations. We illustrate how this modelling framework can be used for predicting unobserved edges and their weights in an only partially observed network. This is for example relevant for assessing systemic risk in financial networks.

Journal article

Scott J, Gandy A, 2020, State-dependent kernel selection for conditional sampling of graphs, Journal of Computational and Graphical Statistics, Vol: 29, Pages: 847-858, ISSN: 1061-8600

This article introduces new efficient algorithms for two problems: sampling conditional on vertex degrees in unweighted graphs, and conditional on vertex strengths in weighted graphs. The resulting conditional distributions provide the basis for exact tests on social networks and two-way contingency tables. The algorithms are able to sample conditional on the presence or absence of an arbitrary set of edges. Existing samplers based on MCMC or sequential importance sampling are generally not scalable; their efficiency can degrade in large graphs with complex patterns of known edges. MCMC methods usually require explicit computation of a Markov basis to navigate the state space; this is computationally intensive even for small graphs. Our samplers do not require a Markov basis, and are efficient both in sparse and dense settings. The key idea is to carefully select a Markov kernel on the basis of the current state of the chain. We demonstrate the utility of our methods on a real network and contingency table. Supplementary materials for this article are available online.

Journal article

Mellan T, Hoeltgebaum H, Mishra S, Whittaker C, Schnekenberg R, Gandy A, Unwin H, Vollmer M, Coupland H, Hawryluk I, Rodrigues Faria N, Vesga J, Zhu H, Hutchinson M, Ratmann O, Monod M, Ainslie K, Baguelin M, Bhatia S, Boonyasiri A, Brazeau N, Charles G, Cooper L, Cucunuba Perez Z, Cuomo-Dannenburg G, Dighe A, Djaafara A, Eaton J, van Elsland S, Fitzjohn R, Fraser K, Gaythorpe K, Green W, Hayes S, Imai N, Jeffrey B, Knock E, Laydon D, Lees J, Mangal T, Mousa A, Nedjati Gilani G, Nouvellet P, Olivera Mesa D, Parag K, Pickles M, Thompson H, Verity R, Walters C, Wang H, Wang Y, Watson O, Whittles L, Xi X, Okell L, Dorigatti I, Walker P, Ghani A, Riley S, Ferguson N, Donnelly C, Flaxman S, Bhatt S et al., 2020, Report 21: Estimating COVID-19 cases and reproduction number in Brazil

Brazil is an epicentre for COVID-19 in Latin America. In this report we describe the Brazilian epidemic using three epidemiological measures: the number of infections, the number of deaths and the reproduction number. Our modelling framework requires sufficient death data to estimate trends, and we therefore limit our analysis to 16 states that have experienced a total of more than fifty deaths. The distribution of deaths among states is highly heterogeneous, with 5 states (São Paulo, Rio de Janeiro, Ceará, Pernambuco and Amazonas) accounting for 81% of deaths reported to date. In these states, we estimate that the percentage of people that have been infected with SARS-CoV-2 ranges from 3.3% (95% CI: 2.8%-3.7%) in São Paulo to 10.6% (95% CI: 8.8%-12.1%) in Amazonas. The reproduction number (a measure of transmission intensity) at the start of the epidemic meant that an infected individual would infect three or four others on average. Following non-pharmaceutical interventions such as school closures and decreases in population mobility, we show that the reproduction number has dropped substantially in each state. However, for all 16 states we study, we estimate with high confidence that the reproduction number remains above 1. A reproduction number above 1 means that the epidemic is not yet controlled and will continue to grow. These trends are in stark contrast to other major COVID-19 epidemics in Europe and Asia where enforced lockdowns have successfully driven the reproduction number below 1. While the Brazilian epidemic is still relatively nascent on a national scale, our results suggest that further action is needed to limit spread and prevent health system overload.


Vollmer M, Mishra S, Unwin H, Gandy A, Mellan T, Bradley V, Zhu H, Coupland H, Hawryluk I, Hutchinson M, Ratmann O, Monod M, Walker P, Whittaker C, Cattarino L, Ciavarella C, Cilloni L, Ainslie K, Baguelin M, Bhatia S, Boonyasiri A, Brazeau N, Charles G, Cooper L, Cucunuba Perez Z, Cuomo-Dannenburg G, Dighe A, Djaafara A, Eaton J, van Elsland S, Fitzjohn R, Gaythorpe K, Green W, Hayes S, Imai N, Jeffrey B, Knock E, Laydon D, Lees J, Mangal T, Mousa A, Nedjati Gilani G, Nouvellet P, Olivera Mesa D, Parag K, Pickles M, Thompson H, Verity R, Walters C, Wang H, Wang Y, Watson O, Whittles L, Xi X, Ghani A, Riley S, Okell L, Donnelly C, Ferguson N, Dorigatti I, Flaxman S, Bhatt S et al., 2020, Report 20: A sub-national analysis of the rate of transmission of Covid-19 in Italy

Italy was the first European country to experience sustained local transmission of COVID-19. As of 1st May 2020, the Italian health authorities reported 28,238 deaths nationally. To control the epidemic, the Italian government implemented a suite of non-pharmaceutical interventions (NPIs), including school and university closures, social distancing and full lockdown involving banning of public gatherings and non-essential movement. In this report, we model the effect of NPIs on transmission using data on average mobility. We estimate that the average reproduction number (a measure of transmission intensity) is currently below one for all Italian regions, and significantly so for the majority of the regions. Despite the large number of deaths, the proportion of population that has been infected by SARS-CoV-2 (the attack rate) is far from the herd immunity threshold in all Italian regions, with the highest attack rate observed in Lombardy (13.18% [10.66%-16.70%]). Italy is set to relax the currently implemented NPIs from 4th May 2020. Given the control achieved by NPIs, we consider three scenarios for the next 8 weeks: a scenario in which mobility remains the same as during the lockdown, a scenario in which mobility returns to pre-lockdown levels by 20%, and a scenario in which mobility returns to pre-lockdown levels by 40%. The scenarios explored assume that mobility is scaled evenly across all dimensions, that behaviour stays the same as before NPIs were implemented, that no pharmaceutical interventions are introduced, and it does not include transmission reduction from contact tracing, testing and the isolation of confirmed or suspected cases. We find that, in the absence of additional interventions, even a 20% return to pre-lockdown mobility could lead to a resurgence in the number of deaths far greater than experienced in the current wave in several regions. Future increases in the number of deaths will lag behind the increase in transmission intensity and so a


Hawryluk I, Mellan TA, Hoeltgebaum H, Mishra S, Schnekenberg RP, Whittaker C, Zhu H, Gandy A, Donnelly CA, Flaxman S, Bhatt S et al., 2020, Inference of COVID-19 epidemiological distributions from Brazilian hospital data, Journal of The Royal Society Interface, Vol: 17, Pages: 20200596-20200596, ISSN: 1742-5662

Knowing COVID-19 epidemiological distributions, such as the time from patient admission to death, is directly relevant to effective primary and secondary care planning, and moreover, the mathematical modelling of the pandemic generally. We determine epidemiological distributions for patients hospitalized with COVID-19 using a large dataset (N = 21 000 − 157 000) from the Brazilian Sistema de Informação de Vigilância Epidemiológica da Gripe database. A joint Bayesian subnational model with partial pooling is used to simultaneously describe the 26 states and one federal district of Brazil, and shows significant variation in the mean of the symptom-onset-to-death time, with ranges between 11.2 and 17.8 days across the different states, and a mean of 15.2 days for Brazil. We find strong evidence in favour of specific probability density function choices: for example, the gamma distribution gives the best fit for onset-to-death and the generalized lognormal for onset-to-hospital-admission. Our results show that epidemiological distributions have considerable geographical variation, and provide the first estimates of these distributions in a low and middle-income setting. At the subnational level, variation in COVID-19 outcome timings is found to be correlated with poverty, deprivation and segregation levels, and weaker correlation is observed for mean age, wealth and urbanicity.

Journal article

Gandy A, Hahn G, Ding D, 2019, Implementing Monte Carlo tests with p-value buckets, Scandinavian Journal of Statistics, Vol: 47, Pages: 950-967, ISSN: 0303-6898

Journal article

Jin S, Savioli N, Marvao AD, Dawes TJW, Gandy A, Rueckert D, O'Regan DP et al., 2019, Joint analysis of clinical risk factors and 4D cardiac motion for survival prediction using a hybrid deep learning network, Publisher: arXiv

In this work, a novel approach is proposed for joint analysis of high-dimensional time-resolved cardiac motion features obtained from segmented cardiac MRI and low-dimensional clinical risk factors to improve survival prediction in heart failure. Different methods are evaluated to find the optimal way to insert conventional covariates into deep prediction networks. Correlation analysis between autoencoder latent codes and covariate features is used to examine how these predictors interact. We believe that similar approaches could also be used to introduce knowledge of genetic variants to such survival networks to improve outcome prediction by jointly analysing cardiac motion traits with inheritable risk factors.

Working paper

Hilbers A, Brayshaw D, Gandy A, 2019, Importance subsampling: Improving power system planning under climate-based uncertainty, Applied Energy, Vol: 251, Pages: 1-12, ISSN: 0306-2619

Recent studies indicate that the effects of inter-annual climate-based variability in power system planning are significant and that long samples of demand & weather data (spanning multiple decades) should be considered. At the same time, modelling renewable generation such as solar and wind requires high temporal resolution to capture fluctuations in output levels. In many realistic power system models, using long samples at high temporal resolution is computationally unfeasible. This paper introduces a novel subsampling approach, referred to as importance subsampling, allowing the use of multiple decades of demand & weather data in power system planning models at reduced computational cost. The methodology can be applied in a wide class of optimisation-based power system simulations. A test case is performed on a model of the United Kingdom created using the open-source modelling framework Calliope and 36 years of hourly demand and wind data. Standard data reduction approaches such as using individual years or clustering into representative days lead to significant errors in estimates of optimal system design. Furthermore, the resultant power systems lead to supply capacity shortages, raising questions of generation capacity adequacy. In contrast, importance subsampling leads to accurate estimates of optimal system design at greatly reduced computational cost, with resultant power systems able to meet demand across all 36 years of demand & weather scenarios.

Journal article

Veraart LAM, Gandy A, 2019, Adjustable network reconstruction with applications to CDS exposures, Journal of Multivariate Analysis, Vol: 172, Pages: 193-209, ISSN: 0047-259X

This paper is concerned with reconstructing weighted directed networks from the total in- and out-weight of each node. This problem arises for example in the analysis of systemic risk of partially observed financial networks. Typically a wide range of networks is consistent with this partial information. We develop an empirical Bayesian methodology that can be adjusted such that the resulting networks are consistent with the observations and satisfy certain desired global topological properties such as a given mean density, extending the approach by Gandy and Veraart (2017). Furthermore we propose a new fitness-based model within this framework. We provide a case study based on a data set consisting of 89 fully observed financial networks of credit default swap exposures. We reconstruct those networks based on only partial information using the newly proposed as well as existing methods. To assess the quality of the reconstruction, we use a wide range of criteria, including measures on how well the degree distribution can be captured and higher order measures of systemic risk. We find that the empirical Bayesian approach performs best.

Journal article

Noven R, Veraart A, Gandy A, 2018, A latent trawl process model for extreme values, Journal of Energy Markets, Vol: 11, Pages: 1-24, ISSN: 1756-3607

This paper presents a new model for characterising temporal dependence in exceedances above a threshold. The model is based on the class of trawl processes, which are stationary, infinitely divisible stochastic processes. The model for extreme values is constructed by embedding a trawl process in a hierarchical framework, which ensures that the marginal distribution is generalised Pareto, as expected from classical extreme value theory. We also consider a modified version of this model that works with a wider class of generalised Pareto distributions, and has the advantage of separating marginal and temporal dependence properties. The model is illustrated by applications to environmental time series, and it is shown that the model offers considerable flexibility in capturing the dependence structure of extreme value data.

Journal article

Gandy A, Veraart LAM, 2017, A Bayesian methodology for systemic risk assessment in financial networks, Management Science, Vol: 63, Pages: 4428-4446, ISSN: 0025-1909

We develop a Bayesian methodology for systemic risk assessment in financial networks such as the interbank market. Nodes represent participants in the network and weighted directed edges represent liabilities. Often, for every participant, only the total liabilities and total assets within this network are observable. However, systemic risk assessment needs the individual liabilities. We propose a model for the individual liabilities, which, following a Bayesian approach, we then condition on the observed total liabilities and assets and, potentially, on certain observed individual liabilities. We construct a Gibbs sampler to generate samples from this conditional distribution. These samples can be used in stress testing, giving probabilities for the outcomes of interest. As one application we derive default probabilities of individual banks and discuss their sensitivity with respect to prior information included to model the network. An R-package implementing the methodology is provided.

Journal article

Gandy A, Kvaløy JT, 2017, spcadjust: an R package for adjusting for estimation error in control charts, The R Journal, Vol: 9, Pages: 458-476, ISSN: 2073-4859

In practical applications of control charts the in-control state and the corresponding chart parameters are usually estimated based on some past in-control data. The estimation error then needs to be accounted for. In this paper we present an R package, spcadjust, which implements a bootstrap-based method for adjusting monitoring schemes to take into account the estimation error. By bootstrapping the past data this method guarantees, with a certain probability, a conditional performance of the chart. In spcadjust the method is implemented for various types of Shewhart, CUSUM and EWMA charts, various performance criteria, and both parametric and non-parametric bootstrap schemes. In addition to the basic charts, charts based on linear and logistic regression models for risk-adjusted monitoring are included, and it is easy for the user to add further charts. Use of the package is demonstrated by examples.

Journal article

Lau FDH, Gandy A, 2016, Enhancing football league tables, Significance, Vol: 13, Pages: 8-9, ISSN: 1740-9705

League tables are commonly used to represent the current state of a competition, in football and other sports. But they do not tell the full story. F. Din-Houn Lau and Axel Gandy suggest a few improvements.

Journal article

Gandy A, Lau F, 2016, The chopthin algorithm for resampling, IEEE Transactions on Signal Processing, Vol: 64, Pages: 4273-4281, ISSN: 1941-0476

Resampling is a standard step in particle filters and more generally sequential Monte Carlo methods. We present an algorithm, called chopthin, for resampling weighted particles. In contrast to standard resampling methods the algorithm does not produce a set of equally weighted particles; instead it merely enforces an upper bound on the ratio between the weights. Simulation studies show that the chopthin algorithm consistently outperforms standard resampling methods. The algorithm chops up particles with large weight and thins out particles with low weight, hence its name. It implicitly guarantees a lower bound on the effective sample size. The algorithm can be implemented efficiently, making it practically useful. We show that the expected computational effort is linear in the number of particles. Implementations for C++, R (on CRAN), Python and Matlab are available.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
