R data analysis: principles and practice of Mendelian randomization mediation analysis, part II

Delta method

Once the steps above have been run, a mediation analysis needs to report the point estimate and confidence interval of the indirect effect, together with the point estimate and confidence interval of the proportion mediated, similar to the following:

In practice, however, running Mendelian randomization alone does not give us these quantities (for example, the standard error of the indirect effect or of the proportion mediated). One method for obtaining them is the delta method.

As individual level data is not available in summary data MR, bootstrapping cannot be used to estimate the confidence intervals for the indirect effect or proportion mediated, but the delta method can be used to approximate these confidence intervals if samples are independent

The delta method gives the standard error of the product ab, from which the confidence interval of the mediation (indirect) effect can be calculated.

In other words, once we know the point estimates and standard errors of paths a and b, the formula above gives the confidence interval of the indirect effect. This is also easy to do in R with the RMediation package, whose author has also developed a Shiny application: enter the estimates and standard errors of a and b to get the estimate, standard error and confidence interval of the indirect effect (circled in the figure below):
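The calculation can also be sketched by hand in base R. The numbers below are hypothetical, not taken from the article, and the sketch assumes the first-order delta-method standard error sqrt(a^2 * se_b^2 + b^2 * se_a^2) with independent estimates of a and b. The interval reported by RMediation may differ slightly depending on which method it is asked to use.

```r
# Hypothetical path estimates (assumed for illustration only):
# a = exposure -> mediator, b = mediator -> outcome, each with its standard error.
a <- 0.30; se_a <- 0.05
b <- 0.20; se_b <- 0.04

# Indirect effect and its first-order delta-method standard error,
# assuming the two estimates are independent.
indirect <- a * b
se_indirect <- sqrt(a^2 * se_b^2 + b^2 * se_a^2)

# Normal-approximation 95% confidence interval.
ci_indirect <- indirect + c(-1, 1) * qnorm(0.975) * se_indirect

print(c(estimate = indirect, se = se_indirect,
        lower = ci_indirect[1], upper = ci_indirect[2]))
```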

Bootstrap method

The bootstrap method can also be used to calculate confidence intervals for the mediation effect and the proportion mediated:

Bootstrapping is a technique used in inferential statistics that works by building random samples from a single dataset again and again. Bootstrapping allows calculating measures such as the mean, median, mode and confidence intervals of the sample.

The basic idea of the bootstrap is to sample the original analysis data at random with replacement to form a new sample data set. Bootstrapping 1000 times yields 1000 such data sets, and the statistic of interest can be computed on each one. This gives the distribution of the statistic, and from it a confidence interval.

The code for calculating the mediation effect and the proportion mediated after the bootstrap object is generated is as follows:

Once the bootstrap has produced a distribution of 1000 statistics, the 0.025 and 0.975 quantiles give the 95% confidence interval.
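As a minimal illustration of the percentile idea, the sketch below bootstraps the mean of simulated data in base R. The data, the choice of statistic and the variable names are all assumptions made for the example; they are not the article's MR data.

```r
set.seed(1)

# Simulated data standing in for the original analysis data (assumption).
x <- rnorm(200, mean = 5, sd = 2)

# Resample with replacement 1000 times and compute the statistic each time.
n_boot <- 1000
boot_means <- replicate(n_boot, mean(sample(x, replace = TRUE)))

# The 0.025 and 0.975 quantiles of the bootstrap distribution
# form the 95% percentile confidence interval.
ci <- quantile(boot_means, probs = c(0.025, 0.975))
print(ci)
```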

Here is another example of the bootstrap process. Bootstrapping in R uses the boot function from the boot package, which has three main parameters (data, statistic and R):

The most important of these is statistic, the quantity we want to compute. It is supplied as a function that takes at least two arguments: the data and the sampling indices.

For comparison, here are the results of running Mendelian randomization first:

The result contains the usual b and its se for each method. As a demonstration, I also compute the 5 standard errors by bootstrap for comparison, with the following code:
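The shape of such a statistic function can be sketched on a toy regression. The data frame toy and the function slope_fun are hypothetical names used only for this illustration.

```r
library(boot)
set.seed(2)

# Toy data (assumption): y depends linearly on x.
toy <- data.frame(x = rnorm(100))
toy$y <- 0.5 * toy$x + rnorm(100)

# The statistic takes the data and the resampled row indices,
# and returns the quantity to bootstrap (here, the regression slope).
slope_fun <- function(data, indices) {
  d <- data[indices, ]
  coef(lm(y ~ x, data = d))[["x"]]
}

# R is the number of bootstrap replicates.
reps <- boot(data = toy, statistic = slope_fun, R = 500)
print(reps)  # shows the original estimate, bias and bootstrap standard error
```

The original-sample estimate is stored in reps$t0 and the bootstrap replicates in reps$t.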

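library(boot)        # boot()
library(dplyr)       # %>% and pull()
library(TwoSampleMR) # mr() (assuming the TwoSampleMR package)

# Statistic for boot(): rerun MR on the resampled rows and return the b estimates.
mr_function <- function(data, indices) {
  d <- data[indices, ]       # use the function argument, not the global dat
  jieguo <- mr(d)            # jieguo = "result"
  return(jieguo %>% pull(b))
}
reps_mr <- boot(data = dat, statistic = mr_function, R = 1000)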

In the code above, mr_function is what I pass to the statistic parameter of boot. Its return value declares the statistic being bootstrapped, here the b column of the MR result, so the run yields bootstrap standard errors for all 5 b estimates.

Because the computation takes a long time, only 1000 bootstrap data sets were used above, so the resulting distribution is not very dense. Here is the outcome:

All five bootstrap standard errors of the coefficients are produced, but t3, the standard error for the IVW method, is the closest to the original value. This may be one reason why mediation results are usually reported using the IVW coefficient.

The demonstration above only illustrates the bootstrap process. In practice, the return value of mr_function should be changed to the statistics we actually need, namely the mediation effect and the proportion mediated.
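To show what returning those two statistics might look like, here is a sketch that swaps MR for ordinary regression on simulated individual-level data, so it is an illustration of the mechanics rather than the article's analysis. The data, the function mediation_fun and the definition of the proportion mediated as indirect effect divided by total effect are all assumptions.

```r
library(boot)
set.seed(3)

# Simulated exposure x, mediator m and outcome y (assumption, not MR data).
n <- 500
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)
y <- 0.4 * m + 0.3 * x + rnorm(n)
sim <- data.frame(x, m, y)

# Return a vector: boot() then bootstraps every element at once.
mediation_fun <- function(data, indices) {
  d <- data[indices, ]
  a <- coef(lm(m ~ x, data = d))[["x"]]       # exposure -> mediator
  b <- coef(lm(y ~ m + x, data = d))[["m"]]   # mediator -> outcome, adjusted for exposure
  total <- coef(lm(y ~ x, data = d))[["x"]]   # total effect
  c(indirect = a * b, proportion = a * b / total)
}

reps_med <- boot(data = sim, statistic = mediation_fun, R = 1000)

# Percentile 95% confidence intervals, one column of reps_med$t per statistic.
ci_indirect   <- quantile(reps_med$t[, 1], c(0.025, 0.975))
ci_proportion <- quantile(reps_med$t[, 2], c(0.025, 0.975))
print(rbind(indirect = ci_indirect, proportion = ci_proportion))
```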

Propagation of error

Propagation of error can also be used to calculate the confidence intervals of the mediation effect and the proportion mediated, as in the following paper:

This method is easier to understand, so I will also work through an example here.

Propagation of error refers to the methods used to determine how the uncertainty in a calculated result is related to the uncertainties in the individual measurements.

With propagation of error, the standard deviation of a product is calculated as follows:
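A sketch of that calculation, written out under the assumption that a and b are estimated independently with small errors \delta a and \delta b whose standard deviations are \sigma_a and \sigma_b (the symbols are chosen here for illustration and may differ from the original figure):

```latex
\[
\begin{aligned}
(a + \delta a)(b + \delta b) &= ab + a\,\delta b + b\,\delta a + \delta a\,\delta b \\
                             &\approx ab + a\,\delta b + b\,\delta a
                               \qquad (\text{drop the small term } \delta a\,\delta b) \\
\sigma_{ab}^{2} &\approx a^{2}\sigma_b^{2} + b^{2}\sigma_a^{2} \\
\frac{\sigma_{ab}}{|ab|} &\approx \sqrt{\left(\frac{\sigma_a}{a}\right)^{2} + \left(\frac{\sigma_b}{b}\right)^{2}}
\end{aligned}
\]
```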

This is just junior high school polynomial multiplication, with the smaller terms dropped. It should be easy to follow, so I will not explain it further here. The standard deviation of a quotient is calculated as follows:
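A sketch of the quotient case, again assuming independent estimates a and b with small errors \delta a and \delta b and standard deviations \sigma_a and \sigma_b (illustrative symbols, not necessarily those of the original figure). The first-order approximation 1/(1 + \epsilon) \approx 1 - \epsilon for small \epsilon is used:

```latex
\[
\begin{aligned}
\frac{a + \delta a}{b + \delta b}
  &= \frac{a}{b}\left(1 + \frac{\delta a}{a}\right)\left(1 + \frac{\delta b}{b}\right)^{-1}
   \approx \frac{a}{b}\left(1 + \frac{\delta a}{a} - \frac{\delta b}{b}\right) \\
\sigma_{a/b}
  &\approx \left|\frac{a}{b}\right|
     \sqrt{\left(\frac{\sigma_a}{a}\right)^{2} + \left(\frac{\sigma_b}{b}\right)^{2}}
\end{aligned}
\]
```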

The intermediate steps use high school limits and are not difficult overall. This is the propagation of error method. Having the standard deviation formulas for a product and a quotient, we can use two-step MR to obtain a and b, get the confidence interval of the mediation effect ab from the product formula, and get the confidence interval of the proportion mediated from the quotient formula.
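Put together, the two steps might look like the sketch below. All numbers are hypothetical, the proportion mediated is assumed to be the indirect effect divided by the total effect, and the indirect and total effects are treated as independent, which is an approximation.

```r
# Hypothetical two-step MR estimates (assumed for illustration only).
a <- 0.30; se_a <- 0.05           # exposure -> mediator
b <- 0.20; se_b <- 0.04           # mediator -> outcome
total <- 0.15; se_total <- 0.03   # total effect of exposure on outcome

# Product rule: relative errors add in quadrature.
indirect <- a * b
se_indirect <- abs(indirect) * sqrt((se_a / a)^2 + (se_b / b)^2)

# Quotient rule: same form, applied to indirect / total.
prop <- indirect / total
se_prop <- abs(prop) * sqrt((se_indirect / indirect)^2 + (se_total / total)^2)

# Normal-approximation 95% confidence intervals.
z <- qnorm(0.975)
ci_indirect <- indirect + c(-1, 1) * z * se_indirect
ci_prop <- prop + c(-1, 1) * z * se_prop

print(rbind(indirect = c(indirect, se_indirect, ci_indirect),
            proportion = c(prop, se_prop, ci_prop)))
```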

The methods above give us a standard error, which, under a normal approximation, can be compared against the null value to obtain the corresponding p value. For example, once the distribution of the mediation effect is drawn and compared with the null value 0, the p value is the tail area under the distribution curve beyond 0, doubled for a two-sided test.

That completes this walkthrough of Mendelian randomization mediation analysis.
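In R this normal-approximation p value takes only a couple of lines. The estimate and standard error below are hypothetical values used for illustration.

```r
# Hypothetical indirect effect and its standard error (assumption).
est <- 0.06
se  <- 0.0156

# z statistic against the null value 0, and the two-sided p value
# as twice the tail area of the standard normal beyond |z|.
z <- est / se
p_value <- 2 * pnorm(-abs(z))
print(p_value)
```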

Origin blog.csdn.net/tm_ggplot2/article/details/128960105