Abstract:

This talk will show how statistical methods enable noise modeling and enhance the interpretation of 16S rRNA data. We show some examples of reproducible research we performed to predict preterm birth using data from a longitudinal analysis of the vaginal microbiome. A multiplicity of choices and a lack of consistent documentation at each stage of the sequential processing pipeline for microbiome data can lead to spurious results. We propose replacing this process with reproducible, documented iterations using the R packages dada2, knitr, phyloseq, ggplot2, ade4 and lme4. We were able to find specific microbial biomarkers of preterm birth, which were validated on a separate set of patients. This is joint work with Ben J Callahan, PJ McMurdie and David Relman’s group at Stanford.