Correcting Batch Effects in Microbiome Data

Batch Effects in 16S Datasets Complicate Cross-Study Comparisons

High-throughput data generation platforms, such as mass spectrometry, microarrays, and second-generation sequencing, are susceptible to batch effects due to run-to-run variation in reagents, equipment, protocols, or personnel. Currently, batch-correction methods are not commonly applied to microbiome sequencing datasets. In this paper, we compare different batch-correction methods applied to microbiome case-control studies. We introduce a model-free normalization procedure in which features (i.e., bacterial taxa) in case samples are converted to percentiles of the equivalent features in control samples within a study, prior to pooling data across studies. We compare this percentile-normalization method with traditional meta-analysis methods for combining independent p-values, and with limma and ComBat, two widely used batch-correction models developed for RNA microarray data. Overall, we show that percentile-normalization is a simple, non-parametric approach for correcting batch effects and improving sensitivity in case-control meta-analyses.
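
The percentile-normalization idea described above can be illustrated with a small sketch. This is not the released code; it is a minimal NumPy example under stated assumptions: the function names (`percentile_of_score`, `percentile_normalize`), the toy abundance tables, and the tie-handling rule (averaging the strict and weak ranks) are all illustrative choices. Here every sample in a study, controls included, is scored against that study's control distribution, which is one possible reading of the procedure.

```python
import numpy as np


def percentile_of_score(reference, value):
    """Percentile rank (0-100) of `value` within `reference`.

    Ties are handled by averaging the strict (<) and weak (<=) ranks.
    This tie rule is an assumption made for the sketch.
    """
    reference = np.asarray(reference, dtype=float)
    below = np.count_nonzero(reference < value)
    at_or_below = np.count_nonzero(reference <= value)
    return 100.0 * (below + at_or_below) / (2.0 * reference.size)


def percentile_normalize(abundances, is_control):
    """Convert each taxon's abundance to a percentile of the control samples.

    abundances: (samples x taxa) array for ONE study.
    is_control: boolean array, True for control samples.
    """
    abundances = np.asarray(abundances, dtype=float)
    is_control = np.asarray(is_control, dtype=bool)
    if not is_control.any():
        raise ValueError("each study needs at least one control sample")

    controls = abundances[is_control]
    normalized = np.empty_like(abundances)
    for taxon in range(abundances.shape[1]):
        reference = controls[:, taxon]
        for sample in range(abundances.shape[0]):
            normalized[sample, taxon] = percentile_of_score(
                reference, abundances[sample, taxon]
            )
    return normalized


# Hypothetical toy data: two studies, four samples each, two taxa.
# Study B is study A scaled by 10, mimicking a crude batch effect.
labels = np.array([True, True, False, False])  # first two samples are controls
study_a = np.array([[0.10, 0.50],
                    [0.20, 0.40],
                    [0.30, 0.30],
                    [0.40, 0.10]])
study_b = study_a * 10.0

# Normalize within each study first, then pool across studies.
normalized_a = percentile_normalize(study_a, labels)
normalized_b = percentile_normalize(study_b, labels)
pooled = np.vstack([normalized_a, normalized_b])
```

Because each study is normalized against its own controls before pooling, the tenfold scale difference between the two toy studies disappears: both yield the same percentile table. Real batch effects are rarely a simple rescaling, so this only illustrates the mechanism.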

You can read more about this work in our recent PLOS Computational Biology article.

The code for running percentile normalization is available on GitHub and can be applied as a QIIME 2 plugin.
