Modelling an unprecedented pandemic

The vital role of team-based, collaborative epidemiology and disease modelling in managing pandemics

Many people struggle to properly grasp probabilities, risk and concepts like exponential growth, based on intuition alone. We underestimate the likelihood of outlier events - sometimes dubbed 'Black Swans' - and we’re constantly subject to confirmation biases. It’s not all our fault. Our brains and senses likely evolved for what Richard Dawkins called the ‘Middle World’, between the microscopic realm of atoms and germs and the cosmic scale of planets and stars.

With the tools of modern science, we’ve found ways to answer questions about the world. Yet there’s always uncertainty.

During the novel coronavirus pandemic, scientists have tried to understand the behaviour of the virus as well as its effects on very large numbers of human hosts, attempting to bridge these two worlds coherently. But ‘the science’ isn’t a singular entity: it’s highly collaborative, consensus-based – and yes, subject to uncertainty.

Nevertheless, the role of epidemiologists has been central in the pandemic and the mathematical models they produce have been instrumental in understanding how the virus might impact populations, helping to inform government policy around the world.

But epidemiology itself is a rapidly evolving field, and again, highly collaborative, as Professor Azra Ghani, Chair in Infectious Disease Epidemiology and a key member of Imperial’s COVID-19 Response team, explains:

“I think perhaps sometimes we are viewed as mathematical modellers working in isolation from the wider public health community; when actually we’re epidemiologists first, who happen to have a more quantitative background and we work quite closely across disciplines. We’ll talk to an immunologist to understand how the immune response is working, we’ll talk to the clinicians to understand the clinical processes.

"Students in my group have previously spent time working in an immunology lab or out in the field in other epidemics. We’re embedded in that broader public health community. Crucially, some of us have been working in this area for a long time, so we have the language to communicate across disciplines and that’s incredibly important.

“Therefore mathematical modelling isn’t just a case of producing these projections, but it’s a way of formalising a lot of the conversations that we have with our public health colleagues and we work in partnership in that way.”

Mobilising a team

Groups of academic epidemiologists, including those from Imperial’s COVID-19 Response Team, have been working since the early days of the outbreak in January to try to understand it using a variety of tools.

They are embedded within Imperial’s MRC Centre for Global Infectious Disease Analysis (MRC GIDA) and the Abdul Latif Jameel Institute for Disease and Emergency Analytics (J-IDEA), both based in the School of Public Health.

Initially the team’s efforts were focused on gathering data from China, including from the Ministry of Health website and regional websites in the country, then trying to understand how fast the virus was spreading, how many were requiring hospitalisation and ultimately the fatality ratio.

“What we call ‘statistics’, or ‘data science’, or ‘modelling’ – they are not clearly divisible and there are different types of model,” says Professor Neil Ferguson, who leads the COVID-19 Response Team and is Director of both MRC GIDA and the Jameel Institute.

“Early on we had very limited data to go on, so we used very simple models that can be written down mathematically to project the early phase of the epidemic. As we move on, we’re trying to capture patterns of transmission in populations, using classical epidemic models.

"Then there’s the most complex models we use, where we’re interested in modelling specific interventions, using simulations which give the finest scale of representation of disease transmission. It really depends on the application” (See 'Anatomy of a model').

The COVID-19 Response Team itself grew from around 10-15 researchers in the early days of the outbreak to over 50 people at present. It ranges from experienced epidemiologists to postdoctoral researchers and PhD students, who have put their individual projects on hold to work on the pandemic.

“We’ve been running the MRC Centre for 10 years now,” Professor Ferguson says. “We have dealt with a lot of infectious disease crises, clearly none quite on the scale of this, but we exist to provide quality support and we have an established process for creating teams that work together. Also, the team has grown beyond just the centre, we’ve pulled in people from the Departments of Mathematics and Statistics section for example.”

Professor Neil Ferguson

There are around 20 work-streams in total currently running, and four or five key teams that run models and produce outputs that have helped to inform government policy in the UK and around the world. Crucially the teams cross-check each other’s models to ensure that they are producing consistent results.

Professor Azra Ghani and her group have built a relatively simple compartmental model (see ‘Anatomy of a model’), aimed at tracking the epidemic in over 100 Low and Middle Income Countries (LMICs) around the world. In many cases they have leveraged existing connections that were built, for example, through previous collaborative work on malaria and HIV.

The group is now working with the UK’s Department for International Development (DFID) to produce a public, easy-to-use dashboard for short-term disease projections and a scenario analysis tool for longer-term planning in LMICs. These projections can be used by the countries themselves, as well as by global partners such as the World Health Organisation (WHO), to link to their resource planning tools.

“What we are aiming to do is to make very short-term forecasts initially on what demand on healthcare will be,” Professor Ghani says. “That’s a big concern we have in the lower and middle income settings, since health capacity is very constrained; a lot of work by organisations is being done to get protective equipment and facilities in, including oxygen supplies and maybe ventilators. So these very short term forecasts of healthcare demand are helpful for those planning purposes at the country level and global level.”

Another group within the Imperial response team, led by Dr Samir Bhatt, is also working with policy-makers around the world, including in Brazil, Italy, South Africa and the State of New York. They take a slightly different, but complementary, approach to modelling than most of the other groups. By examining publicly available data on COVID-19 deaths, they back-calculate in time what that implies for infections in a certain period, then correlate those infection rates with the timing of interventions or Google mobility data.
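The idea of working backwards from deaths to infections can be illustrated with a deliberately crude sketch. It is not the group's actual method, which is not described here. It assumes a known infection fatality ratio (`ifr`) and a known distribution of delays from infection to death (`delay_pmf`); both, along with every number below, are hypothetical values chosen only to make the arithmetic visible.

```python
def expected_deaths(infections, ifr, delay_pmf):
    """Forward model: daily deaths implied by a daily infection series.

    delay_pmf[k] is the assumed probability that a fatal infection
    leads to death exactly k days after infection.
    """
    deaths = [0.0] * len(infections)
    for day, new_infections in enumerate(infections):
        for lag, prob in enumerate(delay_pmf):
            if day + lag < len(deaths):
                deaths[day + lag] += new_infections * ifr * prob
    return deaths


def naive_back_calculation(deaths, ifr, delay_days):
    """Crude inverse: shift deaths back by a fixed delay and divide by the IFR."""
    return [d / ifr for d in deaths[delay_days:]]


# Hypothetical inputs: infections doubling daily, then stopping.
infections = [100, 200, 400, 800, 0, 0, 0, 0]
ifr = 0.01                       # assumed 1% infection fatality ratio
delay_pmf = [0.0, 0.0, 0.0, 1.0]  # assumed: every death occurs 3 days after infection

deaths = expected_deaths(infections, ifr, delay_pmf)
recovered = naive_back_calculation(deaths, ifr, delay_days=3)
print("Deaths per day:     ", deaths)
print("Implied infections: ", recovered)
```

With a single fixed delay the shift recovers the infection series exactly. Real delays are spread over many days, which is why this kind of inference is usually done with a statistical model rather than a simple shift.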

Then there are two groups that provide scientific advice more formally to the UK government. Dr Marc Baguelin (who has a joint role between Imperial and the London School of Hygiene and Tropical Medicine) and his group produce ‘real-time modelling’ estimates of the all-important effective reproduction number (Rt) of the virus, as well as healthcare capacity and, more recently, antibody prevalence in the population. Dr Baguelin also sits on the UK government’s Scientific Pandemic Influenza Modelling Group (SPI-M), which provides these estimates and data to the Scientific Advisory Group for Emergencies (SAGE).

While overseeing the COVID-19 Response Team, Professor Ferguson also leads a group that runs individual-based simulation models (see ‘Anatomy of a model’), which have helped to inform UK policy. Alongside similar models from other academic institutes, these have been important in understanding the impact of various societal interventions – or ‘lockdown’ measures – as well as hospital capacity.

“There are two forms of consensus: within Imperial we were using consistent sets of parameters for different models and there’s also the broader consensus across all the modelling groups in the UK who are feeding into SAGE, and all the groups were broadly converging on consistent estimates,” says Professor Ferguson, adding that there are always trade-offs.

“If you’re doing things in real time in a very short period of time, and the purpose is to inform policy, inevitably you can’t include everything. There will be uncertainty and you have to prioritise. We can say that these sort of policies will have this sort of effect, broadly on the epidemic trajectory, but it’s up to policy-makers to determine whether the benefits of that in terms of health impact are worth the economic cost.”

Building technical capability

While modellers are ‘epidemiologists first, with a quantitative background’ as Professor Ghani noted previously, they are highly experienced in writing computer code to do science.

Academia and industry both played a central role in the invention of coding languages in the early 1950s, but have since evolved in different directions reflecting very different aims. Generally, scientists use code as a tool to answer questions and understand phenomena – whether that’s how galaxies evolve, the climate changes or diseases progress.

The code is rarely ‘finished’ and evolves as more is known about the underlying science. By contrast, commercial software tends to be a self-contained product, relatively simple and highly automated, so that it can be used by sometimes millions of people, as in the case of, say, Microsoft Word.

As theoretical cosmologist and modeller Dr Phil Bull from Queen Mary University of London explains in a recent blog post: “Programming patterns and norms that work for one may not work for the other … exhortations to keep code simple, might actually interfere with the intended applications of a scientific code.”

Nevertheless, the two fields have been collaborating more closely in recent years, and some of the techniques and approaches from commercial software development lend themselves to some scientific models and ‘data pipeline’ procedures. (The more complex individual-based models still tend to require close scientific/epidemiological supervision.)

Over the past few years the Imperial MRC GIDA has been building this capability, and has a growing team of around six software engineers (soon to be eight), mostly with industry backgrounds. They’ve previously worked on creating software tools to use in epidemics – recently Ebola and now of course COVID-19.

As Professor Ferguson explains: “One of the reasons we invested in building a software engineering group is that most of what we do is analysing the epidemiological data then fitting some transmission models to that data to draw conclusions. Having generic reusable code for that is clearly an advantage, that’s why we've moved in that direction. That trend was already dominant, I expect that will accelerate from a software engineering point of view.”

Dr Rich Fitzjohn leads that MRC GIDA software engineering group and since January has been putting in place a lot of the software infrastructure to support the team, including a data reporting framework and data processing pipelines.

Dr Fitzjohn says: “A lot of what I do working with researchers is trying to write tools so that they can work closer to the maths, closer to the biology, and get less bogged down in software engineering.

“We write our code in a way that people from industry would completely recognise because we recruit people from industry and we all learn from each other. All of the software that sits underneath this work is all tested with unit tests, integration tests, continuous integration, it’s all been version controlled, all been developed, properly licensed and well documented.”

As well as releasing their reports and forecasts, all the Imperial groups working on COVID-19 have released the source code underpinning their models on the online repository GitHub. The teams have also collaborated with Microsoft, GitHub and respected industry programmer John Carmack to verify and cross-check the code, and to build tools so that others can more easily understand and use the code properly. An independent codecheck, led by Cambridge University’s Dr Stephen Eglen, backed Imperial’s work.

Indeed, the team have recently released some files to help anyone to reproduce the results of the influential Report 9 using the source code.

Next steps

Helping the world come out of lockdown, modelling the impact of vaccination programmes and working collaboratively worldwide.

Since the start of the pandemic, the Imperial COVID-19 Response Team has published reports as new data become available for analysis – attempting to strike a balance between the rigorous, methodical scientific method and the pressing public and government need for new information.

This was particularly important during the early stages of the pandemic, when transmission was growing exponentially, with infections doubling every three or four days, and information was needed as to what interventions were going to be effective in flattening the curve and maintaining intensive care capacity.

As well as publishing rapid response reports, the Imperial College COVID-19 Response Team has continued to work on deep analysis, publishing in peer-reviewed journals. Professor Ferguson and colleagues are working with the UK Government to understand the impact of gradually lifting lockdown and introducing ‘test and trace’ measures – and they plan to publish this work soon, after peer review.

Professor Ghani’s group is also working with countries as they come out of lockdown. Most recently they were looking at Nigeria, which had closed state borders within the country and is now relaxing some of those closures, integrating household quarantine and contact tracing as well as locally targeted responses. Capturing this spatial movement has required running quite a complex individual-based model.

“It’s been very hard for lower income settings to do this type of stringent lockdown; it has much wider implications for people’s livelihoods, and that remains a concern as they can’t sustain it and they are moving out of that relatively quickly now,” Professor Ghani says.

“I think the challenge in the next month or two is going to be to work out what is sustainable in each setting, it’s going to be very context dependent.”

Several members of the team also have relevant experience in modelling the impact of vaccination programmes worldwide, through the Vaccine Impact Modelling Consortium. This has already proven beneficial in the current crisis as researchers engage with other groups planning vaccine trials. Vaccine modelling will also be essential in the event of a successful candidate emerging, in order to prioritise roll-out to certain groups.

Prior to the current crisis Professor Ferguson and a number of other well-known scientists around the world were in the early stages of preparing a bid for what essentially represents a global network of early warning systems for emerging infectious disease outbreaks. Then the immediate demands of the current crisis took over, but he hopes that is something that can at least be partly realised in the future.

“What I’d hope to see is those long-term platforms put in place; networks linking leading academic groups together with public health agencies. At the moment I’m rather disheartened by the lack of international cooperation in this epidemic, as well as the politicisation and unfounded criticism of the WHO. Policy-making has mostly been at a national level, there hasn’t been as much coordination as I would have liked to have seen.

“Hopefully once we put this crisis behind us, there will be an ability to bring countries together again.”

Explained: Anatomy of a model

The core engine at the heart of many models of infectious diseases – from HIV to flu to COVID-19 – is the ‘S-I-R model’. It splits the population into three basic groups: Susceptible, Infective and Removed. Everyone is assumed to be born susceptible and capable of being infected. Those who have contracted the disease and are capable of passing it to susceptibles are the ‘infectives’. The ‘removed’ group comprises people who have had the disease and recovered and are now immune, or those who have died.

Even in this basic form, we can run the model and understand how the disease might progress, based on estimates of the theoretical R0 value at the beginning of the epidemic, where one person passes the pathogen to, say, two or three others, each of those passing it on, branching out in the population. It can also, in a general sense, illustrate the importance of social isolation for those infected. By staying at home until fully recovered, people effectively take themselves from the infected class straight to the removed class without spreading the virus. Indeed, simple models like this have proved useful in the work Imperial has done with low- and middle-income countries in tracking the virus.
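For readers who want to see the mechanics, the basic S-I-R model can be sketched in a few lines of Python. This is a textbook illustration, not the code used by the Imperial teams. The population size, R0 of 2.5, five-day infectious period and `isolation_rate` are all assumed values, and isolation is represented simply as an extra flow from the infective group to the removed group.

```python
def simulate_sir(population, initial_infected, r0, infectious_days,
                 days, isolation_rate=0.0, dt=0.1):
    """Euler integration of the S-I-R equations; returns (t, S, I, R) tuples."""
    gamma = 1.0 / infectious_days   # natural removal rate per day
    beta = r0 * gamma               # transmission rate implied by R0
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(0.0, s, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        new_infections = beta * s * i / population * dt
        # Isolation is modelled as an extra flow from infective to removed.
        new_removals = (gamma + isolation_rate) * i * dt
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((step * dt, s, i, r))
    return history


def peak_infected(history):
    """Largest number of people infective at any one time."""
    return max(i for _, _, i, _ in history)


# Assumed, illustrative parameters: one million people, ten initial cases.
baseline = simulate_sir(1_000_000, 10, r0=2.5, infectious_days=5, days=365)
isolating = simulate_sir(1_000_000, 10, r0=2.5, infectious_days=5, days=365,
                         isolation_rate=0.2)
print(f"Peak infected, no isolation:   {peak_infected(baseline):,.0f}")
print(f"Peak infected, with isolation: {peak_infected(isolating):,.0f}")
```

Comparing the two runs shows the general point made above: moving people out of the infective group sooner lowers the peak, even though nothing else about the disease has changed.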

But the S-I-R model doesn’t say anything about the characteristics of the people in the three groups, and you can take it further by creating more compartments, for example recognising that age is an important factor in the spread of the disease.

The most complex form of modelling takes this compartmentalisation to the extreme and attempts to simulate the characteristics, movement and behaviour of every individual in a population. In order to build this ‘synthetic population’, data are required on how many people live in households, the age distribution in a country, how many people are in full-time education or full-time work, school class sizes, the number of teachers and so on. You can then start to ask quite detailed questions about what might happen in the future, and how certain actions might affect the spread of the virus: for example, implementing school closures, or household quarantine, which compels everybody in the household to stay at home if one person gets sick.
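A toy version of the individual-based approach gives a feel for how such questions are asked. The sketch below is far simpler than any model used for policy: the synthetic population has only households, and every parameter (household sizes, transmission probabilities, contacts per day, `quarantine_delay`) is an assumed value for illustration. Household quarantine is idealised as stopping all contact outside the home.

```python
import random


def build_households(n_households, rng, max_size=5):
    """Synthetic population: returns a list giving each person's household id."""
    household_of = []
    for household in range(n_households):
        household_of.extend([household] * rng.randint(1, max_size))
    return household_of


def run_outbreak(household_of, days, p_household, p_community, infectious_days,
                 household_quarantine, rng, n_seeds=5, contacts_per_day=5,
                 quarantine_delay=0):
    """Simulate each person day by day; returns how many were ever infected."""
    n = len(household_of)
    members = {}
    for person, household in enumerate(household_of):
        members.setdefault(household, []).append(person)
    state = ["S"] * n          # S = susceptible, I = infective, R = removed
    days_left = [0] * n
    for person in rng.sample(range(n), n_seeds):
        state[person], days_left[person] = "I", infectious_days
    for day in range(days):
        infectious = [p for p in range(n) if state[p] == "I"]
        quarantined = set()
        if household_quarantine:
            # A household is quarantined once a member has been infective
            # for at least quarantine_delay days.
            quarantined = {household_of[p] for p in infectious
                           if infectious_days - days_left[p] >= quarantine_delay}
        newly_infected = set()
        for p in infectious:
            for q in members[household_of[p]]:        # contacts at home
                if state[q] == "S" and rng.random() < p_household:
                    newly_infected.add(q)
            if household_of[p] in quarantined:
                continue                              # no contacts outside home
            for contact in range(contacts_per_day):   # random community contacts
                q = rng.randrange(n)
                if (state[q] == "S" and household_of[q] not in quarantined
                        and rng.random() < p_community):
                    newly_infected.add(q)
        for p in infectious:
            days_left[p] -= 1
            if days_left[p] == 0:
                state[p] = "R"
        for q in newly_infected:
            state[q], days_left[q] = "I", infectious_days
    return sum(1 for s in state if s != "S")


population = build_households(2000, random.Random(1))
for quarantine in (False, True):
    total = run_outbreak(population, days=120, p_household=0.2, p_community=0.05,
                         infectious_days=5, household_quarantine=quarantine,
                         rng=random.Random(2), quarantine_delay=2)
    print(f"Household quarantine={quarantine}: {total} of {len(population)} ever infected")
```

Because the model is stochastic, each random seed gives a different outbreak; in practice such simulations are run many times and the results summarised.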

Explained: The perils of exponential growth

Exponential growth has been a key theme of the pandemic, and something most people intuitively struggle to imagine. An ancient story about the invention of chess testifies to this. The inventor is offered any reward he desires by his king – so he asks for a single grain of rice to be placed on the first square of the chessboard, two on the second, four on the third, doubling each time across the board. Though a little baffled, the king immediately grants the seemingly trivial request. Yet his mathematicians soon inform him that he doesn’t have anything like the resources available – by the time you get to square 64, there are over 18 quintillion grains of rice on the board.
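The arithmetic behind the story is easy to check. Square n holds 2^(n-1) grains, so the whole board holds 2^64 - 1. The variable names below are just for illustration.

```python
squares = 64

# Square n holds 2**(n - 1) grains: 1, 2, 4, 8, ...
grains_per_square = [2 ** (n - 1) for n in range(1, squares + 1)]
total_grains = sum(grains_per_square)

print(f"Grains on square 64: {grains_per_square[-1]:,}")
print(f"Grains on the board: {total_grains:,}")
# Grains on square 64: 9,223,372,036,854,775,808
# Grains on the board: 18,446,744,073,709,551,615
```

Half of all the rice sits on the final square alone, which is why the request looks harmless for so long.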

All infectious diseases have the potential to grow exponentially for a period in this way. One person passes the pathogen to two or three others (depending on the basic reproduction number, R0) and each of those passes it on to others, branching out in the population exponentially. The number of cases can double every five days or less (like each square of the board, but with the clock ticking). Infections can seem small and manageable in the early days of an outbreak. Then, quite suddenly, they’re not.
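The effect of the doubling time can be made concrete with a short calculation. It assumes uninterrupted exponential growth from a single case, which no real epidemic sustains for long; the function names and the 30-day horizon are illustrative choices.

```python
import math


def cases_after(days, doubling_time_days, initial_cases=1):
    """Cases after `days` of unchecked growth with a fixed doubling time."""
    return initial_cases * 2 ** (days / doubling_time_days)


def days_to_reach(target_cases, doubling_time_days, initial_cases=1):
    """Days of unchecked growth needed to reach `target_cases`."""
    return doubling_time_days * math.log2(target_cases / initial_cases)


for doubling_time in (5, 3):
    print(f"Doubling every {doubling_time} days: "
          f"{cases_after(30, doubling_time):,.0f} cases after 30 days, "
          f"{days_to_reach(1_000_000, doubling_time):.0f} days to reach a million")
```

One case doubling every five days becomes 64 after 30 days; doubling every three days, it becomes 1,024 over the same period.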

That’s why it’s so important to run the maths of the infection in the early days, often projecting some alarming worst-case scenarios, as well as interventions that might help ‘flatten the curve’ and stop the runaway growth. Of course, every disease has different characteristics and responds differently to different interventions.

We’ve had major outbreaks and pandemics in recent memory. But for a multitude of reasons, including interventions that proved effective, we haven’t seen the real explosion of cases and deaths we’ve had with coronavirus. It’s perhaps easy to forget that the exponential potential is always there. That is until we’re reminded by a virus that comes along with a perfect storm of characteristics: high infectivity, silent incubation, relatively high mortality. And crucially, no vaccine and no effective pharmacological treatments. That leaves only public health measures to defend ourselves.

Infographic depicting exponential growth in four images: one person, then three people, then 12 people, then many people


The Forum is Imperial’s policy engagement programme. It connects Imperial researchers with policy makers to discover new thinking on global challenges. Our features provide a shop window into the world-leading research taking place at Imperial and provide insight into how it can inform and contribute to public policy debates.