Correcting Batch Effects in Microbiome Data

Batch Effects in 16S Datasets Complicate Cross-Study Comparisons

High-throughput data generation platforms, such as mass spectrometry, microarrays, and second-generation sequencing, are susceptible to batch effects caused by run-to-run variation in reagents, equipment, protocols, or personnel. Batch-correction methods are currently not commonly applied to microbiome sequencing datasets. In this paper, we compare different batch-correction methods applied to microbiome case-control studies. We introduce a model-free normalization procedure in which features (i.e., bacterial taxa) in case samples are converted to percentiles of the equivalent features in control samples within a study, prior to pooling data across studies. We compare this percentile-normalization method to traditional meta-analysis methods for combining independent p-values and to limma and ComBat, two widely used batch-correction models developed for RNA microarray data. Overall, we show that percentile normalization is a simple, non-parametric approach for correcting batch effects and improving sensitivity in case-control meta-analyses.

You can read more about this work in our recent PLOS Computational Biology article.

The code for running percentile normalization is available on GitHub and can be applied as a QIIME 2 plugin.
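
The percentile-normalization idea described above can be illustrated with a minimal Python sketch. This is not the released code or the QIIME 2 plugin; the function names (`percentile_of_score`, `percentile_normalize`), the input layout, and the toy data are all hypothetical. Two choices here are assumptions rather than details stated in this post: ties are handled by averaging the strict and weak ranks, and control samples are also expressed as percentiles of the control distribution so that cases and controls end up on the same scale.

```python
from bisect import bisect_left, bisect_right


def percentile_of_score(reference, value):
    """Percentile (0-100) of `value` within `reference`.

    Averages the strict (<) and weak (<=) ranks so tied values land
    mid-rank. This tie rule is an assumption made for this sketch.
    """
    ref = sorted(reference)
    below = bisect_left(ref, value)          # reference values < value
    at_or_below = bisect_right(ref, value)   # reference values <= value
    return 100.0 * (below + at_or_below) / (2 * len(ref))


def percentile_normalize(abundances, is_case):
    """Percentile-normalize one study against its own controls.

    abundances: dict mapping taxon name -> list of per-sample abundances
    is_case:    list of bools in the same sample order (True = case)
    Returns a dict with the same shape, holding percentiles.
    """
    normalized = {}
    for taxon, values in abundances.items():
        controls = [v for v, case in zip(values, is_case) if not case]
        if not controls:
            raise ValueError("each study needs at least one control sample")
        # Every sample is scored against the control distribution only.
        normalized[taxon] = [percentile_of_score(controls, v) for v in values]
    return normalized


# Hypothetical toy data: two studies measuring the same taxon, where
# study B's abundances are inflated 10x by a batch effect.
is_case = [False, False, False, False, True, True]
study_a = {"taxon_1": [0.10, 0.20, 0.30, 0.40, 0.35, 0.05]}
study_b = {"taxon_1": [1.0, 2.0, 3.0, 4.0, 3.5, 0.5]}

norm_a = percentile_normalize(study_a, is_case)
norm_b = percentile_normalize(study_b, is_case)

# After normalization the two studies are on a common 0-100 scale and
# can be pooled sample-by-sample.
pooled = norm_a["taxon_1"] + norm_b["taxon_1"]
```

Because each value is replaced by its rank relative to the controls from the same study, any monotonic, study-wide distortion (such as the 10x scaling in the toy data) drops out, which is why the normalized values from the two toy studies are identical.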
