Folder structure

Well-organised file names and folder structures make it easier to find and keep track of data and other files. Having a logical file structure can prevent your files from becoming disorganised. When collaborating with others, a logical file structure is even more important. There are many ways to organise files; think carefully about the best structure for your project. Some useful tips include:

  • Structure files hierarchically. Start with a small number of folders for broad topics at the top level, and nest folders for more specific topics within them
  • Keep the number of folders and sub-folders manageable. Aim for a balance between breadth and depth
  • Use folders to group files by topic, which makes them easier to browse and retrieve. Examples include grouping by data type (raw vs. processed), file type (data, publications, administrative documents) or research activity (experiments, surveys)
  • Folder names should be meaningful and concisely describe the contents of the folder
  • Keep ongoing and completed work separate, e.g. move final versions into a different folder from drafts

The UK Data Service have an example file structure
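As a rough illustration of the tips above, the following Python sketch creates a small hierarchical folder structure. The folder names and the `create_structure` helper are hypothetical choices made for this example, not the UK Data Service structure; adapt them to your own project.

```python
from pathlib import Path

# Hypothetical layout: broad topics at the top level, specific topics nested below.
LAYOUT = {
    "data": ["raw", "processed"],
    "documentation": ["admin", "methods"],
    "outputs": ["drafts", "final"],
}


def create_structure(root, layout=LAYOUT):
    """Create each top-level folder and its sub-folders under root."""
    created = []
    for top, subs in layout.items():
        for sub in subs:
            path = Path(root) / top / sub
            # parents=True builds the top-level folder when it is missing;
            # exist_ok=True makes the function safe to run more than once.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

Calling `create_structure("my_project")` would create folders such as `my_project/data/raw` and `my_project/outputs/final`, keeping drafts separate from final versions.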

File naming conventions

Good file names can provide useful clues about the content and status of a file, can uniquely identify a file, and can help in classifying files. A good file naming strategy can also help prevent files being accidentally deleted or overwritten.

Consider which elements will help you quickly identify the content of your files. Common elements to consider using in a file naming strategy include:

  • Date (use a year-first format such as YYYY-MM-DD or YYYYMMDD so that files sort chronologically)
  • Researcher initials
  • Project identifier
  • Description of content
  • Version number

Some general principles to follow when naming files include:

  • Be consistent with your file and folder naming
  • Keep file names short but meaningful
  • Avoid using special characters such as ?!@*%{[<>)
  • Avoid using periods, spaces or slashes to separate the elements of a file name. Use hyphens or underscores instead, or capitalise the first letter of each word (CamelCase)
  • Order the elements in a file name in the most appropriate way to retrieve the record
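Some of these principles can be checked automatically. The sketch below is one possible approach, assuming file names consist of letters, digits, hyphens and underscores followed by a single extension; the `is_safe_name` function is a hypothetical helper, and it does not check consistency or length.

```python
import re

# Letters, digits, hyphens and underscores, then one extension such as ".csv".
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+\.[A-Za-z0-9]+")


def is_safe_name(filename):
    """Return True if the name has no spaces, slashes, special characters or extra periods."""
    return SAFE_NAME.fullmatch(filename) is not None
```

Under this assumption, a name such as `meeting notes (final)!.docx` is rejected because it contains spaces and special characters.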

File naming examples:

  • 20130311_interview2_audio.wav
  • 20130311_interview2_trans.rtf
  • 20130311_interview2_image.jpg
  • 20160104_ProjectA_Ex1Test1_SmithE_v1.xlsx
  • 20160104_ProjectA_MeetingNotes_SmithE_v2.docx
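Names like the examples above can also be generated programmatically so that the elements always appear in the same order. This is a minimal sketch; the `build_filename` function and its parameter names are hypothetical, and the element order simply mirrors the last two examples.

```python
from datetime import date


def build_filename(day, project, description, initials, version, extension):
    """Join the naming elements with underscores, date first so files sort chronologically."""
    stamp = day.strftime("%Y%m%d")  # year-first date, e.g. 20160104
    parts = [stamp, project, description, initials, f"v{version}"]
    return "_".join(parts) + "." + extension


name = build_filename(date(2016, 1, 4), "ProjectA", "MeetingNotes", "SmithE", 2, "docx")
print(name)  # 20160104_ProjectA_MeetingNotes_SmithE_v2.docx
```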

Sources: CESSDA Data Management Expert Guide: File naming and folder structure; Harvard Biomedical Data Management: File Naming Conventions

Versioning

Version control is a way of recording changes made to files during the lifetime of a project. It enables you to distinguish between current and older versions of data. Version control also makes it possible for research publications to cite the exact version of the dataset that underpins published results.

There are various ways file versioning can be managed, including:

  • File naming system

A simple method of data versioning is to use consecutive numbering, with whole numbers for major changes and decimals for minor changes (e.g. v1; v1.1; v2; v2.1; v2.2).

  • Version control table

If multiple people are working collaboratively on the same file, decide on a common versioning strategy and consider including a version control table within each file, where versions, dates, authors and details of changes can be recorded. The UK Data Service have an example of a version control table.

  • File sharing services

Some software programs (e.g. Microsoft Office) and file sharing services (e.g. OneDrive for Business) provide automatic version control, which can also be useful when working on files collaboratively.

  • Version control software

Dedicated version control tools are also available, such as Git (described in the Open Knowledge Foundation blog post "Git (and Github) for Data") or TortoiseSVN. These are especially useful for working collaboratively on code or software.
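For the file naming system described earlier in this section, the step from one version label to the next can be sketched in Python. The `next_version` function is a hypothetical helper; it assumes labels of the form `v1` or `v1.1` and that a major change resets the minor number.

```python
def next_version(current, major=False):
    """Return the next label: a minor change adds a decimal step, a major change a whole number."""
    number = current.lstrip("v")
    major_part, _, minor_part = number.partition(".")
    major_num = int(major_part)
    minor_num = int(minor_part) if minor_part else 0
    if major:
        # A major change moves to the next whole number and drops the decimal.
        return f"v{major_num + 1}"
    return f"v{major_num}.{minor_num + 1}"
```

For example, `next_version("v1")` gives `v1.1`, and `next_version("v1.1", major=True)` gives `v2`, following the sequence v1; v1.1; v2; v2.1; v2.2.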

Additional resources