R language: mathematical functions, statistical functions, and probability functions

First, the mathematical functions
Mathematical functions are most often applied to scalars (individual values).
When they are applied to numeric vectors, matrices, or data frames, they operate on each individual value.
Common mathematical functions
Function                        Description
abs(x)                          Absolute value
sqrt(x)                         Square root
ceiling(x)                      Smallest integer not less than x (rounds up)
floor(x)                        Largest integer not greater than x (rounds down)
trunc(x)                        Integer formed by truncating x toward 0
round(x, digits = n)            Rounds x to the specified number of decimal places
signif(x, digits = n)           Rounds x to the specified number of significant digits; signif(3.475, digits = 2) returns 3.5
cos(x), sin(x), tan(x)          Cosine, sine, tangent
acos(x), asin(x), atan(x)       Arc-cosine, arc-sine, arc-tangent
cosh(x), sinh(x), tanh(x)       Hyperbolic cosine, hyperbolic sine, hyperbolic tangent
acosh(x), asinh(x), atanh(x)    Inverse hyperbolic cosine, inverse hyperbolic sine, inverse hyperbolic tangent
log(x, base = n)                Logarithm of x to base n
exp(x)                          Exponential function
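
As a small sketch of how these functions act element by element, the following applies a few of them to a numeric vector. The vector x and the result names are made up for illustration.

```r
# Hypothetical input vector, chosen only to show the rounding behaviour
x <- c(-2.7, -0.5, 1.4, 3.475)

abs_x   <- abs(x)      # 2.7 0.5 1.4 3.475
ceil_x  <- ceiling(x)  # -2 0 2 4
floor_x <- floor(x)    # -3 -1 1 3
trunc_x <- trunc(x)    # -2 0 1 3 (truncates toward 0)

r2 <- round(3.14159, digits = 2)   # 3.14
s2 <- signif(3.475, digits = 2)    # 3.5
l10 <- log(100, base = 10)         # 2
roots <- sqrt(c(4, 9, 16))         # 2 3 4
```

Note how floor() and trunc() differ only for negative values: floor(-2.7) is -3, while trunc(-2.7) is -2.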

Second, the statistical functions
Many statistical functions have optional parameters that affect the result.
For example, mean(x) has a trim parameter that discards the specified fraction of the smallest and largest values before averaging, and an na.rm parameter that controls whether missing values are removed.
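
A minimal sketch of how trim and na.rm change the result of mean(). The vectors below are invented for illustration.

```r
# Hypothetical data with one extreme value
x <- c(1, 2, 3, 4, 100)
m_plain <- mean(x)               # 22
m_trim  <- mean(x, trim = 0.2)   # drops the lowest and highest 20%, leaving 2, 3, 4: result 3

# Hypothetical data with a missing value
y <- c(1, NA, 3)
m_na   <- mean(y)                # NA, because the missing value propagates
m_narm <- mean(y, na.rm = TRUE)  # 2
```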
Statistical functions
Function                                  Description
mean(x)                                   Mean
median(x)                                 Median
sd(x)                                     Standard deviation
var(x)                                    Variance
mad(x)                                    Median absolute deviation
quantile(x, probs)                        Quantiles, where x is the numeric vector whose quantiles are wanted and probs is a numeric vector of probabilities in [0, 1]
range(x)                                  Range; returns a vector holding the minimum and maximum, e.g. for x <- c(1, 2, 3, 4), range(x) returns c(1, 4)
sum(x)                                    Sum
diff(x, lag = n)                          Lagged differences, with lag giving the lag to use; the default is lag = 1
min(x)                                    Minimum
max(x)                                    Maximum
scale(x, center = TRUE, scale = TRUE)     Centers the columns of the data object x (center = TRUE) or centers and standardizes them (center = TRUE, scale = TRUE)
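
To make the less obvious entries in the table concrete, here is a short sketch of quantile(), range(), and diff(). The input vectors are made up, and the quantile values assume R's default quantile algorithm (type 7).

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8)

# Quartiles of x under the default (type 7) algorithm
q <- quantile(x, probs = c(0.25, 0.5, 0.75))   # 2.75 4.50 6.25

# Minimum and maximum returned together
r <- range(c(1, 2, 3, 4))                      # 1 4

# Differences between neighbouring values, and between values two apart
d1 <- diff(c(1, 5, 23, 29))                    # 4 18 6
d2 <- diff(c(1, 5, 23, 29), lag = 2)           # 22 24
```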
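Examples

x <- c(1,2,3,4,5,6,7,8)
# Mean and standard deviation using the built-in statistical functions
mean(x)
sd(x)
# Mean and corrected (n - 1) standard deviation without the built-in functions
n <- length(x)
n
meanx <- sum(x)/n

css <- sum((x-meanx)^2)   # corrected sum of squares
sdx <- sqrt(css/(n-1))
# Standardizing data
# In the lines below, mydata stands for an existing numeric matrix or data frame,
# myvar for one of its columns, and M and SD for the target mean and standard deviation
# By default, scale() standardizes the columns of a matrix or data frame to mean 0 and standard deviation 1
newdata <- scale(mydata)
# Standardizing to an arbitrary mean and standard deviation
# Standardize every column to mean M and standard deviation SD
newdata <- scale(mydata)*SD + M
# Standardizing a single column
# Standardize the column myvar to mean M and standard deviation SD
newdata <- transform(mydata,myvar=scale(myvar)*SD +M)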
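
The standardization lines above rely on objects that are not defined in the example. A self-contained sketch follows, assuming a small invented data frame and arbitrary targets M = 50 and SD = 10. The column names a and b are hypothetical, and as.numeric() is an addition here so that the transformed column is a plain vector rather than a one-column matrix.

```r
# Hypothetical data frame, invented for illustration only
mydata <- data.frame(a = c(1, 2, 3, 4, 5),
                     b = c(10, 20, 30, 40, 50))
M  <- 50   # assumed target mean
SD <- 10   # assumed target standard deviation

# Every column standardized to mean 0 and standard deviation 1
z <- scale(mydata)

# Every column rescaled to mean M and standard deviation SD
rescaled <- scale(mydata) * SD + M

# Only column a rescaled; column b is left unchanged
newdata <- transform(mydata, a = as.numeric(scale(a)) * SD + M)

colMeans(rescaled)       # both columns have mean 50
apply(rescaled, 2, sd)   # both columns have standard deviation 10
```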

Third, the probability functions
Probability functions are commonly used to generate simulated data with known characteristics, and to calculate probability values inside user-written statistical functions.
In R, probability functions take the form:
[dpqr]distribution_abbreviation()
where the first letter indicates which aspect of the distribution is returned:
d = density function
p = distribution function
q = quantile function
r = random number generation (random deviates)
The commonly used probability distributions are:
Distribution                    Abbreviation
Beta                            beta
Binomial                        binom
Cauchy                          cauchy
Chi-squared (noncentral)        chisq
Exponential                     exp
F                               f
Gamma                           gamma
Geometric                       geom
Hypergeometric                  hyper
Lognormal                       lnorm
Logistic                        logis
Multinomial                     multinom
Negative binomial               nbinom
Normal                          norm
Poisson                         pois
Wilcoxon signed-rank            signrank
T                               t
Uniform                         unif
Weibull                         weibull
Wilcoxon rank-sum               wilcox
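
As an illustration of the naming scheme, the sketch below combines each of the four prefixes with the binom abbreviation, plus one call on the uniform distribution. The parameter values (5 trials, success probability 0.5) are arbitrary choices for the example.

```r
# d: probability of exactly 2 successes in 5 trials
d <- dbinom(2, size = 5, prob = 0.5)    # 10/32 = 0.3125

# p: probability of at most 2 successes
p <- pbinom(2, size = 5, prob = 0.5)    # 16/32 = 0.5

# q: smallest count whose cumulative probability reaches 0.4
q <- qbinom(0.4, size = 5, prob = 0.5)  # 2

# r: three random draws (values differ from run to run)
r <- rbinom(3, size = 5, prob = 0.5)

# The same pattern applies to other abbreviations, e.g. the uniform distribution
u <- punif(0.3)                         # 0.3 on the default interval [0, 1]
```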

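Example

# Functions related to the normal distribution
# Plot the standard normal curve on the interval [-3, 3]
# pretty() generates a vector of about 30 evenly spaced round values from -3 to 3
x <- pretty(c(-3,3),30)
y <- dnorm(x)
plot(x,y,type="l",
     xlab="Normal Deviate",
     ylab = "Density",
     yaxs="i")

# What is the area under the standard normal curve to the left of z = 1.96?
pnorm(1.96)
# What is the 0.9 quantile of a normal distribution with mean 500 and standard deviation 100?
qnorm(0.9,mean=500,sd=100)
# Generate 50 normal random numbers with mean 50 and standard deviation 10
# The first argument is the number of values to generate, the second is the mean, the third is the standard deviation
rnorm(50,50,10)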
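
A brief sketch of how pnorm() and qnorm() relate: they are inverses of each other, and a quantile of any normal distribution can be obtained by rescaling the standard normal quantile. The variable names are illustrative.

```r
# Area to the left of z = 1.96 under the standard normal curve (about 0.975)
p <- pnorm(1.96)

# qnorm() undoes pnorm(): this recovers 1.96
z <- qnorm(p)

# 0.9 quantile of a normal distribution with mean 500 and sd 100 (about 628.16)
q <- qnorm(0.9, mean = 500, sd = 100)

# The same value obtained by rescaling the standard normal quantile
q_manual <- 500 + 100 * qnorm(0.9)

# pnorm() with the same parameters maps the quantile back to 0.9
back <- pnorm(q, mean = 500, sd = 100)
```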

3.2, setting the random seed
Even when the parameters are identical, each call to a random number function produces different numbers.
This is because a different random seed is used each time the numbers are generated.
To generate the same random numbers again with the same parameters, set the same random seed explicitly with set.seed() before generating them.
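Examples:

# runif() generates pseudo-random numbers uniformly distributed on the interval 0 to 1
runif(5)
# The same function with the same arguments gives different numbers
runif(5)
# Set the random seed to make the numbers reproducible
set.seed(1234)
runif(5)
# Note: to get the same numbers again, the seed must be set again before generating them
# set.seed() fixes the generator state only at that point; later calls continue from there, so reset the same seed each time
set.seed(1234)
runif(5)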
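
The behaviour described in the comments can be checked directly. This sketch stores the draws and compares them; the names a, b, and c3 are illustrative.

```r
set.seed(1234)
a <- runif(5)

# Resetting the same seed reproduces the same five numbers
set.seed(1234)
b <- runif(5)

# Without resetting, the generator continues and gives new numbers
c3 <- runif(5)

same_after_reset <- identical(a, b)    # TRUE
same_without     <- identical(b, c3)   # FALSE
```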

3.3, generating multivariate normal data with a given mean vector and covariance matrix
Use the mvrnorm() function from the MASS package to generate the data:
mvrnorm(n, mean, sigma)
where n is the desired sample size, mean is the vector of means, and sigma is the variance-covariance (or correlation) matrix.

Example

library(MASS)
# options() changes session settings; options(digits=3) makes R print numbers with 3 significant digits
options(digits=3)
set.seed(1234)
mean <- c(230.7,146.7,3.6)
sigma <- matrix(c(15360.8,6721.2,-47.1,
                  6721.2,4700.9,-16.5,
                  -47.1,-16.5,0.3),nrow=3,ncol=3)
mydata <- mvrnorm(50,mean,sigma)
mydata <- as.data.frame(mydata)

head(mydata,n=10)
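
One way to sanity-check the simulated data is to compare its sample mean and covariance with the targets. The sketch below reuses the mean vector and covariance matrix from the example under the names mu and sigma (mu is used here to avoid masking the mean() function). It also uses the empirical = TRUE argument of mvrnorm(), which is not part of the original example; it makes the sample moments match the targets exactly rather than approximately.

```r
library(MASS)

mu <- c(230.7, 146.7, 3.6)
sigma <- matrix(c(15360.8, 6721.2, -47.1,
                  6721.2,  4700.9, -16.5,
                  -47.1,   -16.5,   0.3), nrow = 3, ncol = 3)

# Ordinary draw: sample moments are close to, but not equal to, the targets
set.seed(1234)
sim <- mvrnorm(50, mu, sigma)
colMeans(sim)
cov(sim)

# With empirical = TRUE the sample mean and covariance equal mu and sigma
sim_exact <- mvrnorm(50, mu, sigma, empirical = TRUE)
colMeans(sim_exact)
cov(sim_exact)
```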

Origin blog.csdn.net/weixin_42712867/article/details/95575158