Supervised by Dr Gonçalo Correia 

I came to UROP through my third-year project at the Institute of Reproductive and Developmental Biology. That project introduced me to the metabolomics of cervicovaginal fluid (CVF) and to the use of liquid chromatography-tandem mass spectrometry (LC-MS/MS) to characterise the vaginal environment. I became familiar with the challenges of analysing complex metabolomics datasets and wanted to continue developing the computational skills needed to address them. I therefore applied for a UROP in the same research group.

My previous project combined laboratory sample preparation with computational analysis, whereas the UROP focused more on data processing and annotation. I had already received basic training in LC-MS data processing, which prepared me to take on further computational tasks. My preparation involved reviewing the datasets I had previously worked with and revising R-based workflows for data handling. I also familiarised myself with annotation software, including SIRIUS and NIST Mass Spectral Search, which were central to my UROP work.

During the UROP, I carried out several tasks in computational metabolomics. Through this work I gained specific technical skills. I became more proficient in handling large annotation tables in R, using packages such as dplyr and ggplot2 for filtering and visualisation. I learned how to critically evaluate annotation scores from different software tools and how to integrate results from multiple pipelines into a single dataset. I also developed an approach for setting confidence thresholds that balances annotation coverage against the risk of false positives.

I also developed many dry-lab skills and benefited from discussions with my supervisor, which taught me how to interpret the results of computational analysis biologically. We examined pie charts of chemical classes, which reinforced the importance of lipids, saccharides, and steroids as key groups of metabolites in CVF. We also compared inclusion and non-inclusion datasets, which demonstrated how acquisition design can influence metabolite coverage, a consideration directly relevant to future data-dependent acquisition strategies.

This UROP has influenced how I will approach research in the future. It gave me direct experience of advanced computational workflows for metabolomics and reinforced my interest in data-driven biomedical research. I aim to build on this experience in further study, combining computational analysis with experimental approaches to investigate host–microbiome interactions.