About us

Imperial College London’s Big Data & Analytical Unit (BDAU) is a multidisciplinary team of data specialists that collaborates with a large network of researchers across the College.

The BDAU is primarily a data services-oriented unit with a remit to provide technical and analytical support to its customers, partners, and collaborators across the data lifecycle, from data application and data access to data analysis and visualisation of outputs. This support is delivered mainly through the BDAU Secure Environment (BDAU SE), which provides a standard platform for researchers to securely hold and analyse data.

BDAU Secure Environment (BDAU SE)

The BDAU SE is an ISO 27001:2013 certified research environment that is compliant with the NHS Data Security and Protection Toolkit (EE133887-BDAU). It provides:
• a standard operating/access model,
• a secure data storage and processing environment, and
• analysis software (including R, Python, Stata, SPSS and MATLAB).


Our team

Mahsa Mazidi
Director, Big Data and Analytical Unit

Davina Tijani
Data Operations Manager, Big Data and Analytical Unit

BDAU users and data sources, 2023

This infographic shows the number of BDAU users in 2023 by department, and the datasets by source.

Can I use the BDAU SE to store and analyse my data?

You may use the BDAU SE if your research project involves personal data that must be handled securely to comply with regulations such as GDPR, or if the data provider for a specific dataset requires their data to be stored in a secure environment that meets a set of data security standards. Data that is publicly available or aggregated typically does not need to be stored in the BDAU SE.

The BDAU SE can be used for data storage and analysis while your project is ongoing. It is not a solution for long-term data storage or archiving.

Under our current policies, personally identifiable data (e.g. name, date of birth, IP address) cannot be hosted in the BDAU SE. Data must therefore be fully de-identified before it is transferred to the BDAU SE.

Imaging, audio, and video data generally cannot be hosted in the BDAU SE.

Our services

  1. Data Acquisition – Initiating, advising on and implementing the process of obtaining the datasets required for individual or multiple studies. This includes applying to bodies such as NHS England and MHRA for use of restricted datasets such as HES and CPRD.
  2. Secure Data Storage – The BDAU has resources for securely storing and analysing de-identified data within its certified research environment, the BDAU SE.
  3. Data Analysis – The BDAU can partner with researchers to support quantitative analysis of data. We have a wide range of expertise in statistical programs, programming languages, and advanced analytical techniques such as natural language processing and machine learning.
  4. Data Visualisation – The BDAU helps with visualisation of data to aid in answering research questions. This is mostly done using interactive data visualisation tools such as Tableau and Power BI, but expertise is also available in other statistical programs.

Our data sources

The data sources below provide the datasets most commonly used by BDAU SE users for specific approved research projects.

1. NHS England

  • Hospital Episode Statistics (HES) – HES is an administrative dataset that records the details of all inpatient, outpatient and A&E attendances at NHS hospitals in England. It has been widely used by researchers to assess usage levels by patients and costs incurred due to hospital treatment within the NHS in England.
  • Data Access – To access NHS England datasets such as HES, researchers need to complete the Data Access Request Service (DARS) process. If you are planning to use the BDAU SE for storage and analysis of HES data, please contact us so we can assist with completing the relevant sections of your DARS application. See more information about the DARS process and NHS England charges on the NHS Digital DARS webpage.

2. Medicines and Healthcare products Regulatory Agency (MHRA)

  • Clinical Practice Research Datalink (CPRD) – CPRD is a clinical dataset that comprises anonymised primary care records. Research using CPRD data has resulted in over 2,800 publications, which have led to improvements in drug safety, best practice, and clinical guidelines. CPRD can be linked to other health-related patient datasets such as HES, the ONS Death Registry and socio-economic measures to provide a fuller picture of the patient care record.
  • Data Access – Imperial has a multi-study annual licence agreement with CPRD, with trained fob-holders who can extract the data for researchers against specific study specifications, following protocol approval via CPRD's Research Data Governance (RDG) process. Please contact us at bdau@imperial.ac.uk when you are planning your project to discuss the process and the associated costs.

3. NHS Improvement

  • National Reporting & Learning System (NRLS) – NRLS is a central database of patient safety incidents reported by healthcare staff in hospitals in England and Wales. NRLS contains 10 million coded records of patient safety incidents from 2004 onwards.
  • Data Access – The BDAU team has experience applying for, storing and analysing NRLS datasets for research purposes. If you are interested in using NRLS data for your research, please contact us at bdau@imperial.ac.uk.

4. Imperial College Healthcare NHS Trust

  • Imperial College Healthcare NHS Trust (ICHT) – ICHT data is drawn from a database of patient health records collected within the Trust. Applicants can request to use ICHT data for research, service evaluation or clinical audit.
  • Data Access – If you are interested in using data from ICHT, please visit the NIHR Imperial Biomedical Research Centre data webpage.

5. Open Data Sets

  • These are open-access datasets, provided by national, regional and local government bodies, that can be used for any purpose. They mostly contain anonymised, aggregate-level data and, when linked with other datasets, can help to provide deeper insights into healthcare and other issues. For example, see the UK Government Open Data Repository.

Please contact us at bdau@imperial.ac.uk for further information or if you have any questions.