Co-localization data and environment preparation

1. Data preparation
For eQTL-GWAS co-localization, follow the drug target tutorial and put the eQTL data in the smr directory.
For pure GWAS-GWAS co-localization that involves local data, organize the data into the template SNP format; the chr, pos, and samplesize columns are required.

Online GWAS data is processed with the ieugwasr and gwasglue packages.

If the local data has no samplesize column after downloading, you can add one with the following code.

Add samplesize to local data

add_samplesize("filename.txt", 10000)
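For readers who want to see what such a step amounts to, here is a minimal base-R sketch. It is not MendelR's implementation: the helper name add_samplesize_sketch, the tab-separated layout, and the toy column names are all assumptions made for illustration.

```r
# Hypothetical stand-in for illustration only; NOT MendelR's add_samplesize().
# Assumes a tab-separated text file with a header row.
add_samplesize_sketch <- function(path, n, sep = "\t") {
  gwas <- read.table(path, header = TRUE, sep = sep, stringsAsFactors = FALSE)
  gwas$samplesize <- n  # the same sample size is applied to every SNP
  write.table(gwas, path, sep = sep, quote = FALSE, row.names = FALSE)
  invisible(gwas)
}

# Toy local GWAS file (column names are assumed, not the official template)
tmp <- tempfile(fileext = ".txt")
toy <- data.frame(SNP = c("rs1", "rs2"), chr = c(5, 5), pos = c(100, 200),
                  beta = c(0.1, -0.2), se = c(0.05, 0.04),
                  pval = c(0.04, 1e-6), stringsAsFactors = FALSE)
write.table(toy, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

updated <- add_samplesize_sketch(tmp, 10000)
print(names(updated))
```

The sketch overwrites the input file in place, so keep a copy of the original download if you need it.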



2. R package environment preparation
After upgrading MendelR to 6.0 or above, you can use the prepare_colocalization() function to prepare the related R packages, mainly:
snpStats (used to analyze SNP data)
coloc (main R package for co-localization, using a Bayesian method)
locuscomparer (visualization package)
ieugwasr (interface for extracting data from the MR-Base database)
gwasglue (processing of online IEU data)

prepare_colocalization()  # prepare the co-localization packages


The MendelR package integrates the data preprocessing for both of the above approaches (eQTL-GWAS and GWAS-GWAS), along with checks and prompts for various edge cases. If you are using local data, place it in the working directory; a single line of code then runs the analysis and produces the results.

For specific usage, see:

eQTL-GWAS one-click analysis

mr_coloc_eqtl2gwas("HMGCR", "ieu-a-300")

        

1. Data preparation

   1. eQTL data, stored in the smr directory

   2. GWAS data, either online or local:

      Online data needs no processing; the code already handles it.

      Local data must be organized according to the template SNP format and must include chr, pos, and samplesize.

2. Process

   1. Extract the SNPs for the target gene from the eQTL data

   2. Based on these SNPs, obtain the corresponding SNPs in the same range from the online or local GWAS data

   3. Organize the data into coloc format

   4. Perform the coloc analysis and visualization
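Steps 2 and 3 can be sketched in base R as follows. This is only an illustration of the idea, not what MendelR does internally: the toy data, the column names, and the sample sizes are assumptions. The list fields (snp, position, beta, varbeta, type, N, s, MAF) follow the dataset layout that coloc::coloc.abf() accepts.

```r
# Toy summary statistics; all values and column names are illustrative.
eqtl <- data.frame(SNP = c("rs1", "rs2", "rs3"),
                   pos = c(101, 205, 330),
                   beta = c(0.30, 0.12, -0.05),
                   se = c(0.05, 0.06, 0.04),
                   maf = c(0.20, 0.35, 0.10),
                   stringsAsFactors = FALSE)
gwas <- data.frame(SNP = c("rs2", "rs3", "rs4"),
                   pos = c(205, 330, 410),
                   beta = c(0.08, -0.02, 0.01),
                   se = c(0.02, 0.03, 0.02),
                   stringsAsFactors = FALSE)

# Step 2: keep only SNPs present in both datasets, in the same order
shared <- intersect(eqtl$SNP, gwas$SNP)
eqtl_s <- eqtl[match(shared, eqtl$SNP), ]
gwas_s <- gwas[match(shared, gwas$SNP), ]

# Step 3: build coloc-style lists (varbeta is the squared standard error).
# A "quant" dataset needs sdY, or MAF plus N so that sdY can be estimated.
eqtl_ds <- list(snp = eqtl_s$SNP, position = eqtl_s$pos,
                beta = eqtl_s$beta, varbeta = eqtl_s$se^2,
                MAF = eqtl_s$maf, type = "quant", N = 31864)
gwas_ds <- list(snp = gwas_s$SNP, position = gwas_s$pos,
                beta = gwas_s$beta, varbeta = gwas_s$se^2,
                type = "cc", N = 200000, s = 0.005)

# Step 4 (not run here; requires the coloc package):
# res <- coloc::coloc.abf(dataset1 = eqtl_ds, dataset2 = gwas_ds)
print(shared)
```

Matching both tables on the shared SNP list keeps the two datasets aligned, which matters because coloc compares the evidence SNP by SNP.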

3. Parameter explanation

?mr_coloc_eqtl2gwas

Usage: similar to the smr one-click analysis code.

Pay special attention to:

gwas_type: the type of GWAS phenotype; cc is a categorical (case-control) variable and quant is a continuous variable

gwas_s: the ratio of cases to samplesize; for example, with 1000 cases and a samplesize of 200000, s = 1000/200000 = 0.005

eqtl_samplesize: sample size of the eQTL data; the default is 31864 (eQTLGen)
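As a quick check of the gwas_s arithmetic, the value can be computed directly in R. The variable names below are illustrative, and the commented call only suggests how the value might be passed; consult ?mr_coloc_eqtl2gwas for the actual signature.

```r
# Case proportion for a case-control GWAS (numbers from the example above)
n_case <- 1000
samplesize <- 200000
gwas_s <- n_case / samplesize
print(gwas_s)

# Possible use, assuming the parameters can be passed by name:
# mr_coloc_eqtl2gwas("HMGCR", "ieu-a-300", gwas_type = "cc", gwas_s = gwas_s)
```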



3. Co-localization citations
locuscomparer (plotting):
https://github.com/boxiangliu/locuscomparer
If you use locuscomparer, please cite the following paper:
Boxiang Liu, Michael J. Gloudemans, Abhiram S. Rao, Erik Ingelsson & Stephen B. Montgomery (2019). Abundant associations with gene expression complicate GWAS follow-up. Nature Genetics.

coloc
https://github.com/chr1swallace/coloc

4. Figure legend

[Figure: image.png]


The panel on the left shows the distribution of SNPs by their -log10(p) in the GWAS and QTL data; the smaller the p-value, the higher the SNP sits along the Y-axis. The two separate panels on the right show the QTL and GWAS signals on their own: the X-axis is the SNP position (pos), and the Y-axis is the -log10(p) value of the SNP in the GWAS or QTL data. The higher the value, the smaller the p-value, so the lead SNP is at the top.

R2 is the degree of linkage between a given SNP and the lead SNP in the corresponding population. It mainly shows the linkage disequilibrium among the SNPs in the data.

The labeled SNP is the one with the largest PPH4 across the two datasets. Each SNP has its own values for H1~H4, so you can look up the labeled rsid in the co-localization results. The labeled rsid is also the SNP with the smallest sum of p-values across the two datasets, that is, the lead SNP.
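The two quantities in this legend, the -log10(p) axis values and the lead SNP chosen by the smallest sum of p-values, can be reproduced on toy data. The data frame and its column names below are made up for illustration and are not the plotting package's internal format.

```r
# Toy merged table: one row per SNP with a p-value from each dataset
merged <- data.frame(rsid = c("rs1", "rs2", "rs3"),
                     pval_gwas = c(1e-3, 1e-8, 0.2),
                     pval_qtl = c(1e-2, 1e-12, 0.5),
                     stringsAsFactors = FALSE)

# Axis values: a smaller p-value gives a larger -log10(p)
merged$logp_gwas <- -log10(merged$pval_gwas)
merged$logp_qtl <- -log10(merged$pval_qtl)

# Lead SNP: the SNP with the smallest sum of the two p-values
lead_snp <- merged$rsid[which.min(merged$pval_gwas + merged$pval_qtl)]
print(lead_snp)
```

Here rs2 has the smallest p-value in both datasets, so it has the smallest sum and would be the labeled point at the top of each panel.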

 


Origin blog.csdn.net/weixin_46587777/article/details/132368982