Research data is the evidence that underpins all research conclusions (except those which are purely theoretical) and includes data that have been collected, observed, generated, created or obtained from commercial, government or other sources, for subsequent analysis and synthesis to produce original research results. These results are then written up as research papers and submitted for publication.



Research data can be generated or collected for different purposes and through different processes:

  • Observational: data captured in real-time, usually irreplaceable e.g. sensor data, survey data, sample data, neuroimages
  • Experimental: data from laboratory equipment, often reproducible, but can be expensive to reproduce e.g. gene sequences, chromatograms, toroid magnetic field data
  • Simulation: data generated from test models where the model and metadata are more important than output data e.g. climate models, economic models
  • Derived or compiled: data is reproducible, but expensive e.g. text and data mining, compiled database, 3D models
  • Reference or canonical: a (static or organic) conglomeration or collection of smaller (peer-reviewed) datasets most probably published and curated e.g. gene sequence databanks, chemical structures, spatial data portals


Research data comes in many varied formats:

  • Discipline specific – Flexible Image Transport System (FITS) in astronomy, Crystallographic Information File (CIF) in chemistry 
  • Instrument specific – Olympus Confocal Microscope Data Format, Carl Zeiss Digital Microscopic Image Format (ZVI)
  • Models – 3D, statistical
  • Multimedia – JPEG, TIFF, DICOM, MPEG, QuickTime
  • Numerical – Statistical Package for the Social Sciences (SPSS), Stata, Excel
  • Software – Java, C
  • Text – flat text files, Word, Portable Document Format (PDF), Rich Text Format (RTF), Extensible Markup Language (XML)
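
As a rough illustration of how the format categories above might be applied when taking inventory of a project's files, the following minimal sketch sorts file names into those categories by extension. The `FORMAT_CATEGORIES` table and the `categorise` helper are hypothetical names introduced here, and the extension choices are assumptions made for illustration; they are not an authoritative or complete registry.

```python
from pathlib import Path

# Hypothetical lookup table: the extensions listed are illustrative
# assumptions, not an exhaustive or authoritative registry of formats.
FORMAT_CATEGORIES = {
    ".fits": "Discipline specific",
    ".cif": "Discipline specific",
    ".zvi": "Instrument specific",
    ".jpg": "Multimedia",
    ".jpeg": "Multimedia",
    ".tif": "Multimedia",
    ".tiff": "Multimedia",
    ".dcm": "Multimedia",
    ".mpeg": "Multimedia",
    ".mov": "Multimedia",
    ".sav": "Numerical",
    ".dta": "Numerical",
    ".xlsx": "Numerical",
    ".java": "Software",
    ".c": "Software",
    ".txt": "Text",
    ".docx": "Text",
    ".pdf": "Text",
    ".rtf": "Text",
    ".xml": "Text",
}


def categorise(filename: str) -> str:
    """Return the format category for a file name, or "Unknown"."""
    # Compare extensions case-insensitively, so "SCAN.TIFF" matches ".tiff".
    suffix = Path(filename).suffix.lower()
    return FORMAT_CATEGORIES.get(suffix, "Unknown")


if __name__ == "__main__":
    for name in ["m31.fits", "survey.sav", "SCAN.TIFF", "notes"]:
        print(f"{name}: {categorise(name)}")
```

An extension is only a hint about what a file contains, so a real inventory would likely need to be checked by hand or against file metadata.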


Research data (from both traditional and electronic research) may include any of the following:

  • Audiotapes, videotapes
  • Collection of digital objects acquired and generated during the process of research
  • Contents of a software application (input, output, logfiles for analysis software, source code, schemas)
  • Data files
  • Database contents (video, audio, text, images)
  • Documents (text, Word), spreadsheets 
  • Laboratory notebooks, field notebooks, diaries 
  • Methodologies and workflows
  • Models, algorithms, scripts
  • Photographs, films
  • Questionnaires, transcripts, codebooks 
  • Slides, artefacts, specimens, samples 
  • Standard operating procedures and protocols
  • Test responses

The following research records may also be important to manage during and beyond the life of a project:

  • Correspondence (email and paper-based)
  • Project files
  • Grant applications
  • Ethics applications
  • Technical reports
  • Research reports
  • Master lists
  • Signed consent forms