Correlation analysis and robust linear regression analysis in R

Original link: http://tecdat.cn/?p=9484

Table of Contents

How to do the test

Power Analysis


Introduction

The following example uses species diversity data to show how to carry out correlation analysis and linear regression analysis in R.

 

How to do the test

Correlation and linear regression examples

 


# Input is a text string holding the data table (its contents are not shown here)
Data = read.table(textConnection(Input),header=TRUE)
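
For readers who have not met this idiom, here is a minimal sketch of how read.table and textConnection turn an in-line block of text into a data frame. The values below are invented purely for illustration and are not the species diversity data analyzed in this article; the names ExampleInput and Example are hypothetical.

```r
# Hypothetical in-line data: these numbers are made up for illustration only
ExampleInput = ("
Site  Latitude  Species
A     38.1      120
B     38.9      112
C     39.4      101
D     39.9       95
")

# textConnection lets read.table treat the string as if it were a file;
# header = TRUE takes the column names from the first non-blank line
Example = read.table(textConnection(ExampleInput), header = TRUE)

str(Example)     # one row per site, one column per variable
```

Keeping the table in the script this way makes a small analysis self-contained; for larger data sets, reading from a file is usually more practical.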

 

Simple plot of the data

                                                                      

plot(Species ~ Latitude, 
     data=Data, 
     pch=16,
     xlab = "Latitude", 
     ylab = "Species")

 

 

 

Correlation

Correlation can be tested with the cor.test function, which can perform Pearson, Kendall, and Spearman correlation.

 

Pearson correlation

Pearson correlation is the most common form of correlation. It assumes that the data are linearly related and that the residuals are normally distributed.

 

cor.test( ~ Species + Latitude, 
         data=Data,
         method = "pearson",
         conf.level = 0.95)



Pearson's product-moment correlation



t = -2.0225, df = 15, p-value = 0.06134



       cor

-0.4628844
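
The object returned by cor.test can also be stored and queried directly, which is convenient when the estimate or its confidence interval is needed later. The sketch below assumes R's built-in cars dataset (columns speed and dist) as a stand-in, since the article's data table is not shown; the name result is arbitrary.

```r
# Stand-in data: built-in 'cars' (speed, dist), not the article's data
result = cor.test( ~ dist + speed,
                  data = cars,
                  method = "pearson",
                  conf.level = 0.95)

result$estimate    # sample correlation coefficient r
result$conf.int    # 95% confidence interval for r
result$p.value     # p-value for the test that the true correlation is 0
result$parameter   # degrees of freedom (n - 2)
```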

 

 

Kendall correlation

Kendall rank correlation is a nonparametric test. It does not assume a particular distribution of the data or that the data are linearly related. It ranks the data to determine the degree of correlation.

 

 

cor.test( ~ Species + Latitude, 
         data=Data,
         method = "kendall",
         continuity = FALSE,
         conf.level = 0.95)

 

Kendall's rank correlation tau

 

z = -1.3234, p-value = 0.1857

 

       tau

-0.2388326

 

 

 

Spearman correlation

Spearman rank correlation is a nonparametric test. It does not assume a particular distribution of the data or that the data are linearly related. It ranks the data to determine the degree of correlation, and is appropriate for ordinal measurements.
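
The call follows the same pattern as the Pearson and Kendall tests above, with method = "spearman". A minimal sketch, assuming R's built-in mtcars dataset (columns mpg and wt) as a stand-in for the article's data; the name spearman is arbitrary.

```r
# Stand-in data: built-in 'mtcars' (mpg, wt), not the article's data
spearman = cor.test( ~ mpg + wt,
                    data = mtcars,
                    method = "spearman",
                    exact = FALSE)   # the data contain ties, so skip the exact test

spearman$estimate   # Spearman's rho
spearman$p.value    # p-value for the test that rho is 0
```

Setting exact = FALSE requests the approximate p-value; with tied values R cannot compute the exact one and would otherwise issue a warning.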

 

 

 

 

 

Linear Regression

Linear regression can be performed with the lm function. Robust regression can be performed with the lmrob function from the robustbase package.

 



model = lm(Species ~ Latitude, data = Data)   # fit the linear model
summary(model)                    # shows parameter estimates,
                                  # p-value for model, r-square

 

            Estimate Std. Error t value Pr(>|t|) 

(Intercept)  585.145    230.024   2.544   0.0225 *

Latitude     -12.039      5.953  -2.022   0.0613 .

 

Multiple R-squared:  0.2143,  Adjusted R-squared:  0.1619

F-statistic:  4.09 on 1 and 15 DF,  p-value: 0.06134

 

 



library(car)
Anova(model, type="II")           # shows p-value for effects in model

Response: Species

          Sum Sq Df F value  Pr(>F) 

Latitude  1096.6  1  4.0903 0.06134 .

Residuals 4021.4 15
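
The regression p-value above (0.06134) is the same as the p-value of the Pearson test, and the Multiple R-squared (0.2143) is the square of the Pearson coefficient (-0.4629). For a linear regression with a single predictor this always holds. A minimal sketch that checks both identities, assuming the built-in cars dataset as a stand-in for the article's data; all object names are arbitrary.

```r
# Stand-in data: built-in 'cars' (speed, dist), not the article's data
fit = lm(dist ~ speed, data = cars)
ct  = cor.test( ~ dist + speed, data = cars, method = "pearson")

r2_from_cor = unname(ct$estimate)^2                     # squared Pearson r
r2_from_lm  = summary(fit)$r.squared                    # Multiple R-squared
p_from_lm   = coef(summary(fit))["speed", "Pr(>|t|)"]   # slope p-value

all.equal(r2_from_cor, r2_from_lm)   # compares r^2 from both routes
all.equal(ct$p.value,  p_from_lm)    # compares the two p-values
```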

 

 

 

Plot the linear regression

 


plot(Species ~ Latitude,
     data = Data,
     pch=16,
     xlab = "Latitude", 
     ylab = "Species")

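int   = coef(model)["(Intercept)"]   # intercept of the fitted line
slope = coef(model)["Latitude"]      # slope of the fitted line
abline(int, slope,
       lty=1, lwd=2, col="blue")     #  style and color of line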

 

 

 

Check the model assumptions

 

 

 

Histogram of the residuals from the linear model. These residuals should be approximately normally distributed.

 

 

 

 

 

Plot of residuals versus predicted values. The residuals should be unbiased and evenly spread (homoscedastic).
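
Both diagnostic plots can be produced from any fitted lm object. The following is a minimal sketch, assuming the built-in cars dataset as a stand-in for the article's data; the names fit, res and pred are arbitrary.

```r
# Stand-in data: built-in 'cars' (speed, dist), not the article's data
fit = lm(dist ~ speed, data = cars)

res  = residuals(fit)   # observed minus fitted values
pred = fitted(fit)      # predicted values

# Histogram of residuals: look for a roughly symmetric, bell-shaped pattern
hist(res,
     col  = "darkgray",
     main = "Residuals",
     xlab = "Residual")

# Residuals vs. predicted values: look for an even band around zero
plot(pred, res,
     pch  = 16,
     xlab = "Predicted value",
     ylab = "Residual")
abline(h = 0, lty = 2)  # reference line at zero
```

A funnel shape or a curved trend in the second plot would suggest that the equal-variance or linearity assumption is questionable.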

 

 

 

Robust Regression

Robust regression, fit here with the lmrob function, produces a linear regression that is not sensitive to outliers in the response variable.
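
To see why outliers matter, it can help to compare an ordinary least-squares fit with a robust fit on data containing one corrupted response value. The sketch below uses made-up data (a straight line y = 2 + 3x with small alternating noise and one planted outlier), not the article's data. It assumes the robustbase package is installed; the names Toy, fit_ols and fit_rob are hypothetical.

```r
# Made-up data for illustration: y = 2 + 3x with small alternating noise
x = 1:20
y = 2 + 3 * x + rep(c(-0.5, 0.5), 10)
y[20] = 200                        # one deliberately corrupted response value
Toy = data.frame(x = x, y = y)

fit_ols = lm(y ~ x, data = Toy)    # ordinary least squares
coef(fit_ols)                      # slope is pulled away from 3 by the outlier

# lmrob comes from the robustbase package, which must be installed
if (requireNamespace("robustbase", quietly = TRUE)) {
  set.seed(1)                      # lmrob uses random subsampling internally
  fit_rob = robustbase::lmrob(y ~ x, data = Toy)
  print(coef(fit_rob))             # compare with the least-squares slope
}
```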

 

 



model = robustbase::lmrob(Species ~ Latitude, data = Data)   # robust fit
summary(model)                    # shows parameter estimates, r-square

 

            Estimate Std. Error t value Pr(>|t|) 

(Intercept)  568.830    230.203   2.471   0.0259 *

Latitude     -11.619      5.912  -1.966   0.0681 .

 

Multiple R-squared:  0.1846,  Adjusted R-squared:  0.1302

 

 
                   
model.null = robustbase::lmrob(Species ~ 1, data = Data)   # intercept-only model
anova(model, model.null)         # shows p-value for model

 

  pseudoDf Test.Stat Df Pr(>chisq) 

1       15                         

2       16    3.8634  1    0.04935 *

 

 

 

Plot the model

 

 

 

 

Linear regression example

 

 



summary(model)                    # shows parameter estimates, 
                                  # p-value for model, r-square

 

Coefficients:

            Estimate Std. Error t value Pr(>|t|)  

(Intercept)  12.6890     4.2009   3.021   0.0056 **

Weight        1.6017     0.6176   2.593   0.0154 *

 

Multiple R-squared:  0.2055,  Adjusted R-squared:  0.175

F-statistic: 6.726 on 1 and 26 DF,  p-value: 0.0154

 

###  Neither the r-squared nor the p-value agrees with what is reported

###    in the Handbook.

 

 

library(car)

Anova(model, type="II")           # shows p-value for effects in model

 

          Sum Sq Df F value Pr(>F) 

Weight     93.89  1  6.7258 0.0154 *

Residuals 362.96 26  

 

#     #     #

 

 

Power Analysis

Correlation power analysis

 

### --------------------------------------------------------------
### Power analysis, correlation
### --------------------------------------------------------------

pwr.r.test()

 

     approximate correlation power calculation (arctangh transformation)

 

              n = 28.87376 
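
The arguments of the pwr.r.test call are not shown above. The following is a minimal sketch of how such a call is typically written, with inputs chosen purely for illustration (an expected correlation of 0.5, a significance level of 0.05, and a target power of 0.80). These values are assumptions, not taken from the article. It also assumes the pwr package is installed.

```r
# pwr.r.test comes from the pwr package, which must be installed
if (requireNamespace("pwr", quietly = TRUE)) {
  # Assumed inputs, chosen for illustration only
  result = pwr::pwr.r.test(n = NULL,          # left NULL so that n is solved for
                           r = 0.5,           # expected correlation
                           sig.level = 0.05,  # significance level
                           power = 0.80,      # target power
                           alternative = "two.sided")
  print(result)
  print(ceiling(result$n))   # round up to a whole number of observations
}
```

Exactly one of n, r, sig.level and power is left as NULL, and the function solves for that quantity from the others.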

 


Origin www.cnblogs.com/tecdat/p/12048870.html