Paper intensive reading (20): Batch effects correction for microbiome data

Paper title: Batch effects correction for microbiome data with Dirichlet-multinomial regression

Scholar citations: 6

Pages: 8

Published: 23 August 2018

Journal: Bioinformatics

Authors: Zhenwei Dai, Sunny H. Wong, Jun Yu and Yingying Wei (The Chinese University of Hong Kong)

Abstract:

Motivation: Metagenomic sequencing techniques enable quantitative analyses of the microbiome. However, combining the microbial data from these experiments is challenging due to the variations between experiments. The existing methods for correcting batch effects do not consider the interactions between variables—microbial taxa in microbial studies—and the overdispersion of the microbiome data. Therefore, they are not applicable to microbiome data.
Results: We develop a new method, Bayesian Dirichlet-multinomial regression meta-analysis (BDMMA), to simultaneously model the batch effects and detect the microbial taxa associated with phenotypes. BDMMA automatically models the dependence among microbial taxa and is robust to the high dimensionality of the microbiome and their association sparsity. Simulation studies and real data analysis show that BDMMA can successfully adjust batch effects and substantially reduce false discoveries in microbial meta-analyses.
(Keyword: microbial meta-analysis.)

Discussion:

  • BDMMA captures the characteristics of metagenomic data and considers the dependence between microbial taxa. (Key feature of the method.)
  • As a result, BDMMA dramatically reduces the number of false discoveries and substantially improves the detection of associations compared with existing meta-analysis approaches. (Advantage over existing methods.)
  • As shown by both the simulation studies and the application to CRC metagenome studies, BDMMA is able to identify the small set of taxa that are truly associated with the phenotypes with very low false discovery rates and high recalls. (Tested on both simulated and real datasets.)
  • In our project, we focused on the shotgun metagenomic sequencing data for analyses. (16S data should presumably behave similarly.)
  • Previous research has demonstrated that the DM distribution is suitable for the genus-level data. (Already established.)
  • The DM distribution may not be a good model for the OTU-level 16S data because of its high sparsity. (So the method may not suit OTU-level 16S data.)
  • Therefore, we would not recommend applying BDMMA to gene annotation profiles. (Not recommended for gene annotation profiles.)
  • BDMMA requires the batch information to be known for all the samples. (The authors may later study the case where batch information is missing.)
  • We envision that BDMMA will greatly facilitate meta-analysis of microbiome studies, especially for large consortium projects such as the American Gut Project and the MetaHIT project, which will ultimately improve disease diagnostics and treatments. (The authors' outlook for the method.)
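
To make the overdispersion that the DM distribution is meant to capture more concrete, here is a minimal sketch in Python with NumPy. It is not the BDMMA model itself: it only draws counts from a Dirichlet-multinomial and from a plain multinomial with the same mean proportions, then compares the per-taxon variances. The concentration parameters, read depth, and sample count are made-up values for illustration.

```python
import numpy as np


def sample_dirichlet_multinomial(rng, alpha, n_reads, n_samples):
    """Draw p ~ Dirichlet(alpha) per sample, then counts ~ Multinomial(n_reads, p)."""
    proportions = rng.dirichlet(alpha, size=n_samples)
    return np.array([rng.multinomial(n_reads, p) for p in proportions])


rng = np.random.default_rng(0)
alpha = np.array([2.0, 1.0, 0.5, 0.5])  # hypothetical concentration parameters
n_reads = 1000                          # hypothetical sequencing depth per sample
n_samples = 2000

# Dirichlet-multinomial counts: proportions vary from sample to sample.
dm_counts = sample_dirichlet_multinomial(rng, alpha, n_reads, n_samples)

# Plain multinomial counts with the same mean proportions, fixed across samples.
mn_counts = rng.multinomial(n_reads, alpha / alpha.sum(), size=n_samples)

dm_var = dm_counts.var(axis=0)
mn_var = mn_counts.var(axis=0)
print("DM variance per taxon:         ", np.round(dm_var))
print("Multinomial variance per taxon:", np.round(mn_var))
```

Under these assumed settings the DM variances come out far larger than the multinomial ones, which is the extra sample-to-sample variability that a plain multinomial cannot represent. Every row still sums to the same total, so the taxa counts remain dependent on each other.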

Introduction:

  • The different reagents, labs, platforms, or even just personnel, can all cause variations between batches. 
  • Commonly used batch-effect correction methods such as ComBat and SVA are not suitable for microbiome data, as they assume that different microbial taxa are independent.
  • Reason: microbiome sequencing techniques generate count data that represent compositions. As a result, the read counts for different taxa are dependent.
  • Researchers usually handle this in one of two ways: they either convert the raw read counts to species proportions or rarefy the read counts of all of the samples to the same total read counts.
  • The DM model addresses the overdispersion in microbial count data and considers the dependence among microbial taxa. (Many studies have demonstrated the effectiveness of the DM model for microbiome data.)
  • BDMMA is compared against three methods: the mixed effect model (Laird and Ware, 1982), combined P-values based on the weighted Z-test (Zaykin, 2011), and ComBat (Johnson et al., 2007).
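
The two common preprocessing options mentioned in the notes above (converting counts to proportions, or rarefying every sample to the same depth) can be sketched as follows. This is only an illustration in Python with NumPy, not code from the paper; the function names `to_proportions` and `rarefy`, the count table, and the depth are hypothetical.

```python
import numpy as np


def to_proportions(counts):
    """Convert each sample's (row's) read counts to relative abundances."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)


def rarefy(counts, depth, rng):
    """Subsample every sample, without replacement, down to `depth` reads."""
    counts = np.asarray(counts, dtype=np.int64)
    if (counts.sum(axis=1) < depth).any():
        raise ValueError("every sample needs at least `depth` reads")
    # Drawing reads without replacement is a multivariate hypergeometric draw.
    return np.array(
        [rng.multivariate_hypergeometric(row, depth) for row in counts]
    )


# Hypothetical table: 2 samples (rows) x 3 taxa (columns).
counts = np.array([[120, 30, 50],
                   [900, 600, 1500]])

props = to_proportions(counts)
rarefied = rarefy(counts, depth=200, rng=np.random.default_rng(0))

print(props)
print(rarefied)
```

Both outputs have rows with a fixed total (1 for proportions, `depth` for rarefied counts), so the taxa remain dependent on each other after either transformation. Rarefying also discards reads from the deeper sample.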

Paper outline:

1. Introduction

2. Materials and methods

2.1 The BDMMA model

2.2 The effect of the abundance of associated microbial taxa

3. Results

3.1 Data description

3.2 Simulation studies

      3.2.1 Comparison with existing methods

      3.2.2 Sensitivity to hyperparameters

      3.2.3 Sensitivity to the abundance of associated microbial taxa

      3.2.4 Sensitivity to over-dispersion

3.3 Real data analysis

4. Discussion


Reposted from blog.csdn.net/wxw060709/article/details/104179524