Frequently asked questions
Imperial's Research Data Management policy
What is research data management?
Research data management describes the decisions made to ensure that data collected and used during a research project are properly cared for and maintained. It involves planning for data management needs at the start of the project, organising and storing data during the project, and preparing data for archiving and sharing beyond the end of the project. Good management of research data is beneficial to researchers and ensures that data remains available to validate claims made in published research.
What is ‘research data’?
Research data is the evidence that underpins all research conclusions (except those which are purely theoretical) and includes data that has been collected, observed, generated, created or obtained from commercial, government or other sources, for subsequent analysis and synthesis to produce original research results. These results are then used to write research papers, which are submitted for publication.
Does the College have a research data management policy?
Imperial College London is committed to promoting the highest standards of academic research, including excellence in research data management. Our research data management policy is designed to give academics clear guidance on the principles of research data management, while allowing them flexibility to select the tools that best suit their approach.
What are my responsibilities as PI?
Principal investigators have overall responsibility for the effective management of research data generated within or obtained for their research, including by their research groups. This includes developing a data management plan (DMP) at the start of a project, ensuring that data is stored in a way that minimises the risk of loss and making shareable data publicly available where this is required to validate published research. See the College’s research data management policy for details. The Library and ICT will provide training, guidance and services to support PIs.
I am a PhD student. Does the policy apply to me?
PhD students are not required to comply with the College’s data management policy but are advised to follow the data management principles outlined in the College’s research data management policy and on the College web pages. They are also encouraged to archive and, where appropriate, share the data that supports their published thesis findings by depositing it with a trusted repository to improve the visibility of their research.
I have developed software as part of my research project. Does this count as data?
Like data, software can be a valuable research output in its own right. In many cases data cannot be fully understood without the software that was used to generate it. Where software is developed as part of a research project, PIs are required to archive the version of the software that was used to generate/analyse the data and to inform the Library of the location, using Symplectic. Software should be archived even when it cannot be made publicly available. Additional information is available on our web page Making research software open and shareable.
Where can I find out about funder requirements for managing and sharing research data?
Many funding bodies now require or expect researchers to archive and share data which supports published findings. Links to funder data policies are available on our web pages.
What is a data management plan?
A data management plan or DMP is a document that outlines how you will handle your data both during your project and after it is completed. Having a data management plan will help you identify risks to your data, clarify any legal and ethical issues and prepare for any additional costs and resources that might be needed. Many funders require a DMP as part of the grant application. See our web page on How to complete a data management plan for additional information and support.
Where can I store my research data?
The College's recommended platform for storing research data is the Research Data Store (RDS). PIs for projects with active research grants are eligible for 2TB of online storage, with additional storage available at cost. HPC users have access to 1TB of 'ephemeral storage' (30 days max).
Individual researchers and postgraduate students who don't have access to the RDS can use their College OneDrive for Business account to store research data. All staff and students have access to 5TB of online storage.
Where can I store sensitive data?
The College recommends that sensitive data is saved and shared using the cloud service OneDrive for Business. We do not recommend using portable media or personal laptops/PCs to store sensitive data, but if this is unavoidable, devices should be password protected and files encrypted. Make sure you have a back-up copy.
Where can I archive my data?
Data that support published findings and/or are considered to have long-term value should be deposited with an established data repository or community database to aid data preservation, sharing and reuse. Where possible, data should be deposited with a domain- or subject-specific repository. re3data.org is a registry of data repositories that allows you to search by subject.
If no subject repository is available, we recommend depositing with a general-purpose repository such as Zenodo or Figshare. Alternatively, you might consider depositing your data with the College's Research Data Repository.
Where can I publish my data?
If there are no restrictions on data sharing and the data can be made publicly available, we recommend depositing with a data repository (see FAQ ‘Where can I archive my data?’). Depositing with a reputable data repository will also ensure the long-term preservation of your data. Most data repositories will give you a Digital Object Identifier (DOI) for your dataset, enabling it to be cited and tracked like a regular publication.
Do I have to publish all my data?
Where possible, data that is required to validate or reproduce published results, or that has potential value for future research, should be made public. It is important, however, to ensure that publication will not infringe any legal or ethical obligations, for example where the dataset contains personal or patient-identifiable data, commercially sensitive information or third-party copyright material. Even sensitive data can be shared if appropriate safeguards and measures are put in place (see FAQ ‘My data contains sensitive information’).
Furthermore, in some cases - e.g. if the data are generated as a result of running computer simulations - it might be the algorithm or script rather than the data that should be preserved and shared.
My data contains sensitive information. Does that mean I don’t have to share it?
Not all data can be shared openly without restriction. Data sharing must comply with legal and ethical obligations. Data that contain personally identifiable information must be handled in line with the requirements set out by the GDPR. However, even sensitive data can be shared if appropriate safeguards are in place to avoid unwarranted disclosure and protect data confidentiality, for example:
- explain how the data will be shared when obtaining informed consent
- remove direct and indirect personal identifiers by means of anonymisation techniques
- deposit your data with a repository that provides restricted access to datasets (see this list of repositories that enable restricted access)
- if there are no suitable data repositories, use a data sharing agreement to specify who can access the data and under what terms and conditions
See our web page Sharing sensitive data for more information.
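As a rough illustration of the anonymisation step in the list above, the following Python sketch drops direct identifiers from a record and generalises an indirect identifier (exact age) into a band. The field names (`name`, `email`, `age`) and the ten-year band width are assumptions made for this example only; they are not taken from College guidance, and whether such measures are sufficient depends on the dataset and should be checked against the guidance on sharing sensitive data.

```python
# Hypothetical field names, chosen only for this illustration.
DIRECT_IDENTIFIERS = {"name", "email"}


def age_band(age: int, width: int = 10) -> str:
    """Generalise an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def anonymise(record: dict) -> dict:
    """Return a copy of the record without direct identifiers and with
    the exact age replaced by an age band."""
    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS
    }
    if "age" in cleaned:
        cleaned["age"] = age_band(int(cleaned["age"]))
    return cleaned


participant = {
    "name": "Example Person",
    "email": "person@example.org",
    "age": 37,
    "response": 4,
}
shareable = anonymise(participant)
```

Removing obvious identifiers does not guarantee that individuals cannot be re-identified, for example when several indirect identifiers are combined, so a sketch like this is a starting point rather than a complete procedure.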
I’m collaborating with another institution. Who owns the data and who is responsible for sharing it?
The College owns any IP created by College employees in the course of their normal duties unless explicitly stated otherwise. If you are part of a multi-partner research project with academic or commercial partners, agreements or legal contracts should be in place clarifying IP ownership and determining which data can be shared and which cannot. Guidance on contracts and intellectual property is available from the Research Office. Where there is potential for commercialisation, researchers are advised to contact the College’s Industry Partnerships and Commercialisation (IPC) team.
My analysis is based on data sourced from a third party. Am I allowed to publish this data?
If you have obtained data from a third party, you will need to check any licensing or contractual agreements to clarify whether or not the data can be shared.
When should I publish my data?
Data should be made available when the research that it supports is published. In some cases, it may be necessary to embargo the data for a reasonable period, for example when other outputs based on the data have not yet been published. The College recognises the PI’s entitlement to be the first to publish based on data they have generated. If the PI will not be able to publish their findings by the funder’s deadline for data sharing, the PI should request an extension of the embargo period from the funder.
What is a data access statement?
All research publications produced by Imperial authors must include a statement on how the underlying data can be accessed. This is in line with UKRI and individual funder policies. Our web page How to write a data access statement contains examples of data access statements.
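For authors who prepare many publications, a statement of this kind can be generated from the repository name and the dataset's DOI. The Python sketch below is illustrative only: the function name, the wording of the statement and the DOI are hypothetical placeholders, not College-approved text, so check the wording against the examples on the web page mentioned above.

```python
def data_access_statement(repository: str, doi: str) -> str:
    """Compose a simple data access statement.

    The wording is a hypothetical template for illustration,
    not official College or funder text.
    """
    return (
        "The data supporting this publication are available from "
        f"{repository} at https://doi.org/{doi}."
    )


# Placeholder DOI for illustration; it does not identify a real dataset.
statement = data_access_statement("Zenodo", "10.1234/example-dataset")
```

Prefixing a DOI with https://doi.org/ gives a persistent link that resolves to the dataset's landing page.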
What training and support does the College provide to help researchers manage their data?
Links to training and resources can be found on our web pages.