Background

The Research Computing and Data Science Programme at the Graduate School received funding from the Imperial Fund for Learning and Teaching Innovation to build a pilot for a learning resource: Research Computing and Data Science Exemplars (ReCoDE).

ReCoDE addresses a critical need for intermediate-level teaching in research software that goes beyond basic instruction and builds expertise in developing sustainable research software, a skill that underpins nearly all research disciplines. It offers examples and guidance on software engineering techniques and related best practices for making research software robust and maintainable, and for supporting the reproducibility of research outputs.

ReCoDE consists of an online collection of research computing or data science exemplars. These are stand-alone exemplary projects that can be studied through the rich annotation and code hosted on GitHub. They were developed by graduate teaching assistants in collaboration with RSEs from the RCS. The exemplars pull together the variety of skills needed to move from theoretical learning to developing a large software-based project.

Our Contribution

For about two months, each member of the RSE team partnered with one of the postgraduate students enrolled in the programme to help improve both the quality of the code and the student’s software engineering skills. Each project had its own requirements, but they generally involved regular meetings to discuss the implementation of the solutions, reviewing pull requests, and testing the software to validate the installation and usage instructions as well as the results.

The projects were very diverse, both in their scientific background and in the technologies they used. The topics covered included using containers for increased reproducibility, data analysis pipelines with Nextflow, machine learning using PyTorch on Imperial’s HPC, and optimisation of Fortran code for efficient calculations, among others. This variety of topics challenged, to some extent, the skills of the RSE Team itself and encouraged its members to learn new tools in order to give the students appropriate advice.

Outcomes

The result of this pilot initiative was the publication of five exemplar projects that follow software engineering best practices. They were designed as a learning tool for future students, as a reference for other projects that might want to adopt the same approaches, and as valuable tools in their own right, used in active research. The project webpage contains more information about the outcomes of the project.

Testimonials

Dr Katerina Michalickova, Head of the Research Computing and Data Science Programme, Graduate School:

When I started thinking about the ReCoDE proposal, I decided almost immediately to involve the RSEs. Their input was to be essential to the idea behind ReCoDE, which showcases best practices in developing research computing projects. I was also excited about providing an opportunity for doctoral students to directly collaborate with experts and vice versa. Providing this kind of learning experience became the first positive outcome of ReCoDE. Now, when the pilot is completed, I am very happy to report that the idea worked well!

Bethan Cracknell Daniels, Research Postgraduate, School of Public Health:

As a self-taught PhD student, I have inevitably picked up some bad coding habits. Working with Diego was therefore a fantastic opportunity to refine my current programming skills and learn new ones. In particular, he helped me ensure my code is functional and easy to test, as well as provided me with several helpful resources which I have continued to use in my PhD.

Tom Hodson, Research Postgraduate, Physics:

Dan from the RSE Team was really helpful in providing a link to the wider body of best practices in research software engineering. Without him it would have been much more difficult to validate what we were doing with the project and to spot whether we were missing anything obvious.

Emily Muller, Research Postgraduate, School of Public Health:

I enjoyed working on my ReCoDE exemplar project as it meant creating something which can be useful for others. A lot of the content was coming from my research but working with the Graduate School to make it an online resource made my work a thousand times better. Together with Adrian, we dug into the details of formatting, testing, documentation, and modularisation. All of which I have transferred to my research moving forwards. I wish I had learnt of all this in my first year, but alas, such is the way of the PhD.