Introduction to research data management

What is research data management?

Research data management refers to how you will look after the data you collect or generate during your research. It covers activities such as planning for your data management needs at the start of your project; organising, storing and securing data during your project; and ensuring long-term preservation, data sharing and reuse at the end of your project.

Why manage research data?

Data management is increasingly recognised as an essential part of good research practice. Responsibly managed data is important for research integrity, transparency and open science. Many funders now expect data that supports published findings, or that has potential for reuse in future research, to be made publicly available with as few restrictions as possible, wherever legal and ethical considerations allow.

Good research data management:

Reduces the risk of data loss
Makes it easier to find and understand data
Helps make data authentic, accurate and reliable
Improves research integrity and reproducibility
Facilitates data sharing and reuse
Enables compliance with funder and publisher policies

What are research data?

Research data are any materials that you collect or generate during your research project that can be used to support or verify your research findings. The UKRI Concordat on Open Research Data defines research data as ‘… the evidence that underpins the answer to the research question, and can be used to validate findings regardless of its form (e.g. print, digital, or physical)’. Research data can be generated or collected for different purposes and through different processes:

Observational: data captured in real time, usually irreplaceable, e.g. sensor data, survey data, sample data, neuroimages
Experimental: data from laboratory equipment, often reproducible but potentially expensive to reproduce, e.g. gene sequences, chromatograms, toroid magnetic field data
Simulation: data generated from test models, where the model and metadata are more important than the output data, e.g. climate models, economic models
Derived or compiled: data that is reproducible but expensive to reproduce, e.g. text and data mining, compiled databases, 3D models
Reference or canonical: a (static or organic) collection of smaller (peer-reviewed) datasets, most probably published and curated, e.g. gene sequence databanks, chemical structures, spatial data portals

Examples of research data include:

text documents
spreadsheets
audio and video recordings
photographs, films
collections of digital objects
questionnaires, transcripts of interviews
sensor readings
models, algorithms, scripts

Having a clear understanding of the types of data you will collect or generate will help you make informed decisions about managing your data effectively.

Where can I find additional help and support?

Contact the Research Data Management team by booking a 1-2-1 consultation or by emailing us at rdm-enquiries@imperial.ac.uk.

What does my funder require?

Find out what your funder requires in relation to research data management

What does my publisher require?

Find out what your publisher requires in relation to research data management

Imperial RDM Policy

Read the Imperial College London research data management policy
