News

Correcting Batch Effects in Microbiome Data

Batch Effects in 16S Datasets Complicate Cross-Study Comparisons

High-throughput data generation platforms, such as mass spectrometry, microarrays, and second-generation sequencing, are susceptible to batch effects due to run-to-run variation in reagents, equipment, protocols, or personnel. Currently, batch-correction methods are not commonly applied to microbiome sequencing datasets. In this paper, we compare different batch-correction methods applied to microbiome case-control studies. We introduce a model-free normalization procedure in which features (i.e., bacterial taxa) in case samples are converted to percentiles of the equivalent features in control samples within a study before data are pooled across studies. We compare this percentile-normalization method with traditional meta-analysis methods for combining independent p-values and with limma and ComBat, widely used batch-correction models developed for RNA microarray data. Overall, we show that percentile normalization is a simple, non-parametric approach for correcting batch effects and improving sensitivity in case-control meta-analyses.
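To make the idea concrete, here is a minimal sketch of the core step for a single study, written in plain Python. It is an illustration only, not the published implementation or the QIIME 2 plugin: the function names, the input layout (a dict of taxon to per-sample abundances plus a list of control flags), and the choice to count ties as half are all assumptions made for this example. Control samples are scored against their own distribution here so that every sample ends up on the same 0 to 100 scale.

```python
from bisect import bisect_left, bisect_right


def percentile_of_score(sorted_controls, value):
    """Percentile (0-100) of `value` within the sorted control values.

    Ties are counted as half (an assumption made for this sketch).
    """
    n = len(sorted_controls)
    below = bisect_left(sorted_controls, value)         # controls strictly below
    at_or_below = bisect_right(sorted_controls, value)  # controls below or equal
    return 100.0 * (below + at_or_below) / (2 * n)


def percentile_normalize(abundances, is_control):
    """Convert each taxon's abundances to percentiles of that study's controls.

    abundances: dict mapping taxon name -> list of per-sample abundances
    is_control: list of bools, True where the sample is a control
    (both are hypothetical input formats chosen for illustration)
    """
    normalized = {}
    for taxon, values in abundances.items():
        controls = sorted(v for v, c in zip(values, is_control) if c)
        if not controls:
            raise ValueError("each study needs at least one control sample")
        normalized[taxon] = [percentile_of_score(controls, v) for v in values]
    return normalized


# Toy study: four controls followed by one case sample (made-up numbers).
study_a = {"Bacteroides": [0.10, 0.20, 0.30, 0.40, 0.35]}
labels_a = [True, True, True, True, False]
result = percentile_normalize(study_a, labels_a)
print(result["Bacteroides"])  # [12.5, 37.5, 62.5, 87.5, 75.0]
```

Each study would be normalized separately in this way, and only the resulting percentile values would be pooled across studies before testing for case-control differences.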

You can read more about this work in our recent PLoS Computational Biology article.

The code for running percentile normalization is available on GitHub and can be applied as a QIIME 2 plugin.

Recent Articles

  • Using Blood to Predict Gut Microbiome Diversity

    The alpha diversity of an individual’s gut microbiome can be predicted by examining metabolites in the blood. The robust relationship between the host metabolome and gut microbiome diversity opens the door for a fast, cheap, and reliable blood test to identify individuals with low gut diversity.

  • Use and abuse of correlations

    We recently published a Perspective Article in the ISME Journal on the ‘Use and abuse of correlation analyses in microbial ecology.’ In this piece, we highlight the pitfalls of inferring microbe-microbe interactions from sequencing data. The lead author, Alex Carr, wrote a blog post titled ‘Inferring microbial interactions from relative abundance: not as easy as you would think’ detailing his inspiration for writing this perspective. You can check out the…

  • Seeing the microbiome through a host lens

    Sean recently published a commentary in the journal mSystems that outlines a vision of defining ‘microbiome health’ through a host lens, i.e., determining exactly which components of the variation in the microbiota influence host phenotypes. Much of the variation in the microbiome likely has nothing to do with the health state of the host, but loss or gain of critical diversity and/or functionality can have a major impact on host health. To…