QIIME 2 User Documentation. 16: Identifying and filtering chimeric sequences with q2-vsearch (2018.11)

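Copyright notice: This article is original content from the "宏基因组" (Metagenome) WeChat public account and may not be reproduced without the author's permission. https://blog.csdn.net/woodcorpse/article/details/86676899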

Identifying and filtering chimeric feature sequences with q2-vsearch

https://docs.qiime2.org/2018.11/tutorials/chimera/

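Note: This series is best read in order. To start directly with this chapter, first complete at least "1. Introduction and installation" from this series.

Chimera checking in QIIME 2 operates on FeatureTable[Frequency] and FeatureData[Sequence] artifacts. QIIME 2 wraps the vsearch implementation of UCHIME for both de novo and reference-based chimera removal. For details of the method, see the UCHIME paper and the vsearch documentation. (Translator's tip: the USEARCH website has a fairly detailed tutorial; the vsearch help is less convenient to read.)

This section uses the feature table from "6. Atacama soil analysis" in this series.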

Obtain the data

mkdir qiime2-chimera-filtering-tutorial
cd qiime2-chimera-filtering-tutorial

wget \
  -O "atacama-table.qza" \
  "https://data.qiime2.org/2018.11/tutorials/chimera/atacama-table.qza"

wget \
  -O "atacama-rep-seqs.qza" \
  "https://data.qiime2.org/2018.11/tutorials/chimera/atacama-rep-seqs.qza"

Run de novo chimera checking

qiime vsearch uchime-denovo \
  --i-table atacama-table.qza \
  --i-sequences atacama-rep-seqs.qza \
  --output-dir uchime-dn-out

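Input artifacts:

  • atacama-rep-seqs.qza: representative sequences
  • atacama-table.qza: feature table

Output artifacts:

  • uchime-dn-out/nonchimeras.qza: non-chimeric sequences
  • uchime-dn-out/chimeras.qza: chimeric sequences
  • uchime-dn-out/stats.qza: chimera-checking statistics

Note: For reference-based chimera identification, see qiime vsearch uchime-ref.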
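As a rough sketch of the reference-based variant, the call below mirrors the de novo command and adds a reference database. The reference artifact name `reference-seqs.qza`, the output directory `uchime-ref-out`, and the wrapper function `run_uchime_ref` are assumptions for illustration. The tutorial does not supply a reference file, so a FeatureData[Sequence] artifact would have to be imported separately.

```shell
# Hypothetical wrapper around `qiime vsearch uchime-ref`.
# $1: reference sequences artifact (assumed name, e.g. reference-seqs.qza)
run_uchime_ref() {
  ref="$1"
  # Stop early if the (assumed) reference artifact is missing.
  if [ ! -f "$ref" ]; then
    echo "reference artifact not found: $ref" >&2
    return 1
  fi
  # Stop early if the QIIME 2 environment is not active.
  if ! command -v qiime >/dev/null 2>&1; then
    echo "qiime not found; activate the QIIME 2 environment first" >&2
    return 1
  fi
  qiime vsearch uchime-ref \
    --i-table atacama-table.qza \
    --i-sequences atacama-rep-seqs.qza \
    --i-reference-sequences "$ref" \
    --output-dir uchime-ref-out
}

# Example call (commented out because the reference file is an assumption):
# run_uchime_ref reference-seqs.qza
```

The function only guards against a missing reference file or a missing `qiime` executable; the choice of reference database is left open here.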

Visualize summary stats

qiime metadata tabulate \
  --m-input-file uchime-dn-out/stats.qza \
  --o-visualization uchime-dn-out/stats.qzv

Output visualization:

  • uchime-dn-out/stats.qzv
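If a plain-text copy of the statistics is preferred over the interactive visualization, the artifact can most likely be exported with `qiime tools export`. This is only a sketch: the helper name `export_stats` and the output directory `uchime-dn-out/stats-export` are assumptions, and the file name inside the export directory may differ between releases.

```shell
# Hypothetical helper: export the chimera stats artifact to a plain directory.
# $1: stats artifact (defaults to the tutorial's uchime-dn-out/stats.qza)
# $2: output directory (assumed name)
export_stats() {
  stats="${1:-uchime-dn-out/stats.qza}"
  out="${2:-uchime-dn-out/stats-export}"
  if [ ! -f "$stats" ]; then
    echo "stats artifact not found: $stats" >&2
    return 1
  fi
  if ! command -v qiime >/dev/null 2>&1; then
    echo "qiime not found; activate the QIIME 2 environment first" >&2
    return 1
  fi
  # Export, then list whatever files the export produced.
  qiime tools export --input-path "$stats" --output-path "$out" && ls "$out"
}

# Example call, after running the de novo step above:
# export_stats
```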

Filter input tables and sequences

Exclude chimeras and "borderline chimeras"

qiime feature-table filter-features \
  --i-table atacama-table.qza \
  --m-metadata-file uchime-dn-out/nonchimeras.qza \
  --o-filtered-table uchime-dn-out/table-nonchimeric-wo-borderline.qza
qiime feature-table filter-seqs \
  --i-data atacama-rep-seqs.qza \
  --m-metadata-file uchime-dn-out/nonchimeras.qza \
  --o-filtered-data uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza
qiime feature-table summarize \
  --i-table uchime-dn-out/table-nonchimeric-wo-borderline.qza \
  --o-visualization uchime-dn-out/table-nonchimeric-wo-borderline.qzv

Output artifacts:

  • uchime-dn-out/rep-seqs-nonchimeric-wo-borderline.qza
  • uchime-dn-out/table-nonchimeric-wo-borderline.qza

Output visualization:

  • uchime-dn-out/table-nonchimeric-wo-borderline.qzv

Exclude chimeras but retain "borderline chimeras"

qiime feature-table filter-features \
  --i-table atacama-table.qza \
  --m-metadata-file uchime-dn-out/chimeras.qza \
  --p-exclude-ids \
  --o-filtered-table uchime-dn-out/table-nonchimeric-w-borderline.qza
qiime feature-table filter-seqs \
  --i-data atacama-rep-seqs.qza \
  --m-metadata-file uchime-dn-out/chimeras.qza \
  --p-exclude-ids \
  --o-filtered-data uchime-dn-out/rep-seqs-nonchimeric-w-borderline.qza
qiime feature-table summarize \
  --i-table uchime-dn-out/table-nonchimeric-w-borderline.qza \
  --o-visualization uchime-dn-out/table-nonchimeric-w-borderline.qzv

Output artifacts:

  • uchime-dn-out/table-nonchimeric-w-borderline.qza
  • uchime-dn-out/rep-seqs-nonchimeric-w-borderline.qza

Output visualization:

  • uchime-dn-out/table-nonchimeric-w-borderline.qzv
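The two filtering modes differ only in which ID list is used and whether it is treated as a keep list or an exclude list. The sketch below reproduces that set logic on plain text files with `grep`. The feature IDs `f1` to `f5` and the file names are toy values invented for illustration (real IDs come from the artifacts), and it assumes that borderline features are those listed in neither chimeras.qza nor nonchimeras.qza.

```shell
# Toy ID lists (hypothetical; not taken from the Atacama data).
workdir=$(mktemp -d)
printf '%s\n' f1 f2 f3 f4 f5 > "$workdir/all.txt"          # every feature in the table
printf '%s\n' f1 f2 f3       > "$workdir/nonchimeras.txt"  # flagged non-chimeric
printf '%s\n' f5             > "$workdir/chimeras.txt"     # flagged chimeric

# Without borderline: keep only IDs present in the non-chimera list.
keep_wo_borderline=$(grep -xFf "$workdir/nonchimeras.txt" "$workdir/all.txt")

# With borderline: drop only IDs present in the chimera list
# (the analogue of passing chimeras.qza with --p-exclude-ids).
keep_w_borderline=$(grep -vxFf "$workdir/chimeras.txt" "$workdir/all.txt")

# Borderline IDs are kept in the second mode but not in the first.
borderline=$(printf '%s\n' "$keep_w_borderline" | grep -vxFf "$workdir/nonchimeras.txt")

echo "borderline: $borderline"   # prints: borderline: f4
rm -rf "$workdir"
```

In this toy case the first mode keeps three features and the second keeps four, so comparing the feature counts in the two summarize visualizations should indicate how many borderline features a real dataset contains.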

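About the translator

刘永鑫 (Liu Yongxin), PhD. Graduated from Northeast Agricultural University in 2008 with a degree in microbiology. Received a PhD in bioinformatics from the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, in 2014, completed postdoctoral work there in 2016, and stayed on as an engineer in the metagenomics laboratory. Current research focuses on metagenomic data analysis and the plant microbiome. A participant in the QIIME 2 project, with 12 published papers, 9 of them SCI-indexed. Founded the "宏基因组" WeChat public account in July 2017, which has since shared 500+ original articles on metagenomics and amplicon analysis, including the series "扩增子图表解读、分析流程和统计绘图三部曲" (a three-part guide to interpreting amplicon figures, analysis pipelines, and statistical plotting), and has 35,000+ followers and 5 million+ cumulative reads.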
