Bioinformatics

The bioinformatics team can provide routine support for your project as a service, and undertake more complex analyses on a collaborative basis.

Experimental design

We can advise on experimental design and appropriate choice of assays and technology. Please contact us when planning your study if you would like to discuss suitable NGS platforms, numbers of replicates, sample size estimates or options for downstream analysis.
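As a rough illustration of the kind of estimate involved in choosing a sequencing depth, the sketch below uses the standard mean-coverage relation C = N × L / G (reads × bases per read ÷ genome size). The function names and the example numbers (a 3.1 Gb genome, 2 × 150 bp read pairs, 30x target) are assumptions chosen for illustration, not facility recommendations; real designs also need to allow for duplicates, unmapped reads and uneven coverage.

```python
import math


def expected_coverage(n_reads: int, bases_per_read: int, genome_size: int) -> float:
    """Mean depth of coverage, C = N * L / G."""
    return n_reads * bases_per_read / genome_size


def reads_needed(target_coverage: float, bases_per_read: int, genome_size: int) -> int:
    """Smallest number of reads (or read pairs) that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size / bases_per_read)


if __name__ == "__main__":
    genome_size = 3_100_000_000   # assumed genome size in bases (illustrative)
    bases_per_pair = 2 * 150      # one 2 x 150 bp read pair
    pairs = reads_needed(30, bases_per_pair, genome_size)
    print(f"Read pairs needed for 30x: {pairs:,}")
    print(f"Coverage check: {expected_coverage(pairs, bases_per_pair, genome_size):.1f}x")
```

This gives an idealised lower bound only; it is no substitute for discussing the design with the team.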

Standard data processing and QC pipelines

Sequencing data generated at the IGF are automatically processed with our standard NGS pipelines. These are primarily intended for quality control, but they also provide output files, including genomic alignments, that may be useful in downstream analyses. Briefly, we demultiplex the raw Illumina data with bcl2fastq, remove generic adapters and generate FASTQ files. These files are then assessed against standard quality metrics using FastQC, FastQ Screen and MultiQC, and the results are summarised in an online report.
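The order of these steps can be sketched as a list of command lines. This is a minimal illustration, not the facility's actual pipeline: the directory layout, the function name `qc_commands` and the choice of options are assumptions, and adapter removal is not shown. The commands are only assembled and printed, not executed.

```python
import shlex


def qc_commands(run_dir, sample_sheet, out_dir, fastqs):
    """Build (but do not run) demultiplexing and QC commands in pipeline order."""
    fastq_dir = f"{out_dir}/fastq"    # hypothetical output layout
    qc_dir = f"{out_dir}/qc"
    report_dir = f"{out_dir}/report"

    # Demultiplex the raw Illumina run folder into FASTQ files.
    cmds = [[
        "bcl2fastq",
        "--runfolder-dir", run_dir,
        "--output-dir", fastq_dir,
        "--sample-sheet", sample_sheet,
    ]]

    # Per-file quality metrics and contamination screening.
    for fq in fastqs:
        cmds.append(["fastqc", "--outdir", qc_dir, fq])
        cmds.append(["fastq_screen", "--outdir", qc_dir, fq])

    # Collate all QC outputs into a single summary report.
    cmds.append(["multiqc", qc_dir, "--outdir", report_dir])
    return cmds


if __name__ == "__main__":
    example = qc_commands(
        "run_folder", "SampleSheet.csv", "results",
        ["results/fastq/sampleA_R1.fastq.gz"],
    )
    for cmd in example:
        print(shlex.join(cmd))
```

Building the commands as lists keeps each argument separate, so they could later be passed to `subprocess.run` without shell quoting problems.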

RNA-seq datasets are further trimmed with Fastp and aligned to the genome with STAR; gene counts are generated with FeatureCounts and transcripts are quantified with RSEM. Post-alignment quality metrics are collated from the corresponding log files with MultiQC. For 10X single-cell data we run the standard Cellranger pipeline, collate appropriate Picard and Samtools quality metrics, and provide Scanpy QC and clustering data.
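For paired-end bulk RNA-seq, the sequence of tools described above might look like the following sketch. File names, thread counts, the function name `rnaseq_commands` and the specific options are assumptions for illustration; the parameters actually used by the facility are not stated here. Commands are assembled and printed only.

```python
import shlex


def rnaseq_commands(sample, r1, r2, star_index, gtf, rsem_ref, threads=8):
    """Build (but do not run) trim, align, count and quantify commands."""
    t1, t2 = f"{sample}_R1.trimmed.fastq.gz", f"{sample}_R2.trimmed.fastq.gz"
    prefix = f"{sample}."
    # STAR derives these output names from --outFileNamePrefix.
    genome_bam = f"{prefix}Aligned.sortedByCoord.out.bam"
    tx_bam = f"{prefix}Aligned.toTranscriptome.out.bam"

    return [
        # Adapter and quality trimming.
        ["fastp", "-i", r1, "-I", r2, "-o", t1, "-O", t2],
        # Genome alignment; TranscriptomeSAM also emits the BAM that RSEM needs.
        ["STAR", "--runThreadN", str(threads), "--genomeDir", star_index,
         "--readFilesIn", t1, t2, "--readFilesCommand", "zcat",
         "--outSAMtype", "BAM", "SortedByCoordinate",
         "--quantMode", "TranscriptomeSAM",
         "--outFileNamePrefix", prefix],
        # Gene-level counts; -p marks the data as paired-end (recent Subread
        # releases additionally need --countReadPairs to count fragments).
        ["featureCounts", "-T", str(threads), "-p", "-a", gtf,
         "-o", f"{sample}.gene_counts.txt", genome_bam],
        # Transcript-level quantification from the transcriptome alignments.
        ["rsem-calculate-expression", "--alignments", "--paired-end",
         tx_bam, rsem_ref, sample],
    ]


if __name__ == "__main__":
    for cmd in rnaseq_commands("sampleA", "sampleA_R1.fastq.gz", "sampleA_R2.fastq.gz",
                               "star_index", "genes.gtf", "rsem_ref/genome"):
        print(shlex.join(cmd))
```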

Mammalian DNA-seq datasets are further trimmed with Fastp, aligned to the genome with BWA and processed with Picard and Samtools to mark duplicates, add read groups and generate quality metrics that are summarised with MultiQC. For ChIP-seq and similar epigenomic sequencing, we also include metrics from Phantompeakqualtools and deepTools.
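The DNA-seq steps after trimming can be sketched in the same way. This assumes paired-end reads, a `picard` wrapper script on the PATH (as installed by some package managers; otherwise `java -jar picard.jar`), Picard's `KEY=value` argument style, and placeholder read-group values. The function name `dnaseq_commands` and all file names are hypothetical. Commands are assembled and printed only.

```python
import shlex


def dnaseq_commands(sample, r1, r2, reference, threads=8):
    """Build (but do not run) align, sort, read-group and duplicate-marking commands."""
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    rg_bam = f"{sample}.rg.bam"
    dedup_bam = f"{sample}.markdup.bam"

    return [
        # Align trimmed reads to the reference genome.
        ["bwa", "mem", "-t", str(threads), "-o", sam, reference, r1, r2],
        # Coordinate-sort the alignments.
        ["samtools", "sort", "-@", str(threads), "-o", sorted_bam, sam],
        # Add read groups (library, platform and unit values are placeholders).
        ["picard", "AddOrReplaceReadGroups", f"I={sorted_bam}", f"O={rg_bam}",
         f"RGID={sample}", "RGLB=lib1", "RGPL=ILLUMINA", "RGPU=unit1",
         f"RGSM={sample}"],
        # Mark duplicate reads and write duplication metrics.
        ["picard", "MarkDuplicates", f"I={rg_bam}", f"O={dedup_bam}",
         f"M={sample}.dup_metrics.txt"],
        # Summary alignment statistics (written to standard output).
        ["samtools", "flagstat", dedup_bam],
    ]


if __name__ == "__main__":
    for cmd in dnaseq_commands("sampleA", "sampleA_R1.trimmed.fastq.gz",
                               "sampleA_R2.trimmed.fastq.gz", "genome.fa"):
        print(shlex.join(cmd))
```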

Data analysis

We carry out custom analyses on NGS and microarray data generated at the IGF and elsewhere. Common examples include differential expression and pathway analysis of RNA-seq data, variant calling from whole-genome, exome or panel sequencing data, detection of somatic mutations in cancer data, and peak finding in ChIP-seq experiments. We can also help with downloading large datasets from public repositories, submitting your own data to them, and applying for and managing controlled-access data. Please contact us for further details.

Training

We can advise on the bioinformatics tools, databases, online resources and pipelines appropriate to your research. We also offer more formal training in commonly used NGS analysis techniques through periodic taught courses.

General enquiries

For general enquiries, please email igf@imperial.ac.uk

Useful links

Resources Guidelines Coverage Calculator 10X Genomics Video Tutorials

Funded by NIHR