Function basic structure - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
input data type
vector:
sum, mean, sd, range, median, sort, order
Matrix or data frame:
cbind, rbind
Numeric matrix:
heatmap
e.g. regression analysis with lm()
state <- as.data.frame(state.x77[, c("Murder", "Population", "Illiteracy", "Income", "Frost")])
fit <- lm(Murder ~ Population + Illiteracy + Income + Frost, data = state)
option parameter
1. Input control part
2. Output control part
3. Adjustment part
Common options
file: the path of a file to read or write
data: the data set to work on, generally a data frame
x: a single input object, usually a vector, but it can also be a matrix or a list
x and y: the function requires two input variables
x, y, z: the function requires three input variables
formula: a model formula, such as y ~ x
na.rm: whether to remove missing values (NA) before computing
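A minimal sketch of a few of these options in use. The scores vector is made-up data, and mtcars is a data set that ships with R; neither comes from these notes.

```r
# Illustrative data; the values are made up for this sketch.
scores <- c(82, 91, NA, 77)

# x: a single input object (here a numeric vector).
# na.rm = TRUE drops missing values before the mean is computed.
mean_with_na <- mean(scores)                    # NA, because one value is missing
mean_without_na <- mean(scores, na.rm = TRUE)   # mean of the three known values

# formula + data: the formula names columns that are looked up in `data`.
fit <- lm(mpg ~ wt, data = mtcars)

# x and y: two input vectors of equal length.
r <- cor(x = mtcars$wt, y = mtcars$mpg)
```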
Adjustment parameters
Common parameters
col (color) controls the color
select chooses which variables to keep
font sets the font
font.axis sets the font of the axis labels
lty sets the line type
lwd sets the line width
method selects the algorithm the function uses
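The sketch below shows how these parameters are typically passed, assuming base R graphics and the built-in mtcars data; the x and y values are invented for illustration.

```r
# Illustrative data for this sketch.
x <- 1:10
y <- x^2

# col sets the color, lty the line type, lwd the line width,
# and font.axis the font of the axis tick labels (2 = bold).
plot(x, y, type = "l", col = "blue", lty = 2, lwd = 2, font.axis = 2)

# select keeps only the named columns of a data frame.
small <- subset(mtcars, select = c(mpg, wt))

# method selects the algorithm; cor() accepts "pearson", "kendall" or "spearman".
rho <- cor(x, y, method = "spearman")
```

Because y increases whenever x does, the rank-based Spearman correlation here is 1.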
Custom functions - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
function name
A function's name should reflect what the function does. It can combine letters and digits, but must start with a letter
function declaration
myfun <- function(arg1, arg2) {   # option parameters go inside the parentheses
  # function body
}
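As a concrete instance of the template, here is a small sketch. The name my_range_width and its arguments are hypothetical and chosen only for illustration.

```r
# my_range_width is a hypothetical name chosen for this sketch.
# x is the input; na.rm is an option with a default value.
my_range_width <- function(x, na.rm = FALSE) {
  # Function body: the value of the last expression is returned.
  max(x, na.rm = na.rm) - min(x, na.rm = na.rm)
}

width <- my_range_width(c(3, 9, NA, 5), na.rm = TRUE)  # 9 - 3 = 6
```

Giving na.rm a default means callers only need to supply it when they want to change the behavior.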
Common functions - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Mathematical Statistics
probability function
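d probability density function
p distribution function
q quantile function (the inverse of the distribution function)
r generates random numbers that follow the distribution
Add the letter before the abbreviated distribution name, such as dnorm, pnorm, qnorm, rnorm for the normal distribution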
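A short sketch of the four prefixes, using the standard normal distribution as the example. The choice of distribution and of the evaluation points is an assumption made for illustration.

```r
# Standard normal distribution (mean 0, sd 1) as the example.
density_at_0 <- dnorm(0)         # d: density at x = 0, equal to 1 / sqrt(2 * pi)
prob_below_196 <- pnorm(1.96)    # p: P(X <= 1.96), roughly 0.975
quantile_975 <- qnorm(0.975)     # q: inverse of pnorm, roughly 1.96
set.seed(233)                    # fix the seed so the draws are repeatable
draws <- rnorm(5)                # r: five random draws
```

The same pattern applies to other distributions in the stats package, for example dbinom, pbinom, qbinom and rbinom for the binomial.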
Other probability distribution functions
other
set.seed(233) #Set random seed
runif(num) # Generate num random numbers uniformly between 0 and 1
runif(num, min = 1, max = 100) # Generate num random numbers uniformly between 1 and 100
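To see what the seed does, the sketch below draws twice from the same seed and compares the results. The value of num is arbitrary.

```r
num <- 5  # how many numbers to draw; the value is arbitrary

set.seed(233)
first <- runif(num, min = 1, max = 100)

set.seed(233)  # resetting the same seed replays the same sequence
second <- runif(num, min = 1, max = 100)

same <- identical(first, second)  # TRUE
```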
descriptive statistics
Descriptive statistics summarizes the characteristics of a data set through tabulation, classification, graphics, and summary measures. A descriptive analysis describes each variable in the data and mainly covers frequency analysis, central tendency, dispersion, the shape of the distribution, and some basic statistical plots.
summary() # Summarize a data set: minimum, maximum, quartiles, mean of numeric variables, etc.
fivenum() # Return the five-number summary (minimum, lower hinge, median, upper hinge, maximum)
Hmisc::describe()
pastecs::stat.desc()
psych::describe() # The trim argument drops extreme values when computing the trimmed mean
psych::describeBy() # The same statistics, calculated by group
aggregate() # Compute a statistic for the data within each specified group
doBy::summaryBy() # Calculate multiple statistics for multiple groups
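A minimal sketch using only base R functions from the list above, so no extra packages are needed. The mtcars data set is an assumption; it ships with R and is used here only as convenient sample data.

```r
# mtcars ships with R; it is used here only as convenient sample data.
overview <- summary(mtcars$mpg)   # minimum, quartiles, median, mean, maximum
five <- fivenum(mtcars$mpg)       # minimum, lower hinge, median, upper hinge, maximum

# Mean mpg for each number of cylinders.
by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
```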
frequency statistics
Frequency, also called a count, is the number of times a given value (or category) occurs in a variable. The frequencies listed group by group form a frequency distribution, which shows how much each group contributes to the whole.
split() # Split data into groups
cut() # Divide continuous data into intervals
table() # Count frequencies
prop.table() # Convert counts to proportions (relative frequencies)
xtabs() # Build a contingency table from a formula
margin.table() # Marginal frequencies, summed over rows or columns
addmargins() # Add the marginal sums to the frequency table
e.g.
library(vcd) # provides the Arthritis data set
with(Arthritis, table(Treatment, Improved))
xtabs(~ Treatment + Improved, data = Arthritis)
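The remaining frequency functions can be sketched the same way. The example below assumes the built-in mtcars data rather than Arthritis so that it runs without extra packages; the break points passed to cut() are arbitrary.

```r
# Two-way table of cylinders by gears.
tab <- table(cyl = mtcars$cyl, gear = mtcars$gear)

overall <- prop.table(tab)                 # proportions of the grand total
within_row <- prop.table(tab, margin = 1)  # proportions within each row
row_totals <- margin.table(tab, margin = 1)  # row totals
with_sums <- addmargins(tab)               # adds a "Sum" row and column

# cut() turns a continuous variable into intervals that table() can count.
mpg_group <- cut(mtcars$mpg, breaks = c(10, 20, 30, 40))
mpg_counts <- table(mpg_group)

# split() divides a vector into a list with one element per group.
mpg_by_cyl <- split(mtcars$mpg, mtcars$cyl)
```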