R language data analysis
Overview of R language and data analysis
R is a free, open-source scripting language that first appeared in 1993.
Data analysis process:
data import → data cleaning → data exploration → data modeling → visualization → reporting findings
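The process above can be sketched end to end in a few lines. This is only a minimal illustration, assuming the built-in `mtcars` data set stands in for imported data; the choice of columns, the linear model, and the output file name are assumptions made for the example, not part of the original notes.

```r
# Import: mtcars ships with R, so no external file is needed for this sketch.
cars <- mtcars

# Cleaning: keep only the rows that have no missing values.
cars <- cars[complete.cases(cars), ]

# Exploration: summary statistics for two columns.
print(summary(cars[, c("mpg", "wt")]))

# Modeling: linear regression of fuel economy (mpg) on weight (wt).
fit <- lm(mpg ~ wt, data = cars)
print(coef(fit))

# Visualization: save a scatter plot with the fitted line to a temporary PDF,
# which could later be included in a report.
plot_file <- file.path(tempdir(), "mpg_vs_wt.pdf")
pdf(plot_file)
plot(cars$wt, cars$mpg, xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(fit)
dev.off()
```

In a real project the import step would typically read a file instead, for example with `read.csv()`.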
Basic operation commands
Note: replace * with the package name.
Function | Description |
---|---|
getwd() | Show the current working directory |
setwd() | Change the current working directory |
ls() | List all objects in the current workspace |
str() | Display the structure of an object |
ls.str() | Display the structure of every object in the workspace |
exists() | Check whether an object with the given name exists |
rm() | Delete one or more objects |
q() | Quit R; R first asks whether to save the workspace |
.libPaths() | Show the library paths where packages are installed |
install.packages() | Install a package |
library() | List the installed packages |
search() | List the attached packages and environments |
library("*") | Load the package |
detach("package:*") | Detach the package |
remove.packages("*") | Uninstall the package |
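A short session using the workspace commands from the table might look like the sketch below. The object name `demo_scores` is hypothetical, and the temporary directory is used only so that the example does not depend on any particular folder layout.

```r
# Remember the current working directory so it can be restored afterwards.
old_dir <- getwd()

# Switch to R's per-session temporary directory, then switch back.
setwd(tempdir())
print(getwd())
setwd(old_dir)

# Create a hypothetical object and inspect it.
demo_scores <- c(90, 85, 77)
print(exists("demo_scores"))   # TRUE: the object is in the workspace
str(demo_scores)               # shows the type, length, and first values

# Delete the object; exists() no longer finds it.
rm(demo_scores)
print(exists("demo_scores"))   # FALSE
```

Note that `exists()` and `rm()` differ in how they take names: `exists()` needs the name as a string, while `rm()` also accepts the bare object name.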
Basic data type
Type | Description | Test function | Example in R |
---|---|---|---|
Logical | A logical value, either TRUE or FALSE. Evaluating a logical expression yields a logical value; for example, the comparison 2 > 1 evaluates to TRUE | is.logical() | TRUE, 2 <= 1 |
Double (floating point) | Real numbers written in decimal form, such as 1 or 1.1; the default numeric type used for calculation | is.double() | 3.14 |
Integer | Whole numbers such as 1, 2, 3. In R, an integer literal needs the suffix L; without it the number is stored as a double | is.integer() | 3L |
Character | A string | is.character() | "Hello", "3.14" |
Complex | Complex numbers, with the imaginary part marked by i, for example 2+3i | is.complex() | 1+1i |
Raw | Raw bytes, each shown as two hexadecimal digits, such as a3 | is.raw() | as.raw(0), printed as 00 |
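The six types in the table can be checked with `typeof()`. The sketch below uses one illustrative value per type; the variable names `examples` and `types` are chosen only for this demonstration.

```r
# One example value per basic type (values chosen only for illustration).
examples <- list(TRUE, 3.14, 3L, "Hello", 2+3i, as.raw(163))

# typeof() reports the internal storage type of each value.
types <- vapply(examples, typeof, character(1))
print(types)

# A numeric literal without the L suffix is a double, not an integer.
print(is.integer(3))     # FALSE
print(is.integer(3L))    # TRUE

# The imaginary unit must follow a number: 1i is valid,
# whereas a bare i would be read as a variable name.
print(is.complex(1+1i))  # TRUE

# as.raw(163) holds the single byte with hexadecimal value a3.
print(as.raw(163))
```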
Data type conversion
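Implicit conversion (coercion) follows the order logical → integer → double → character: when values of different types are combined, R converts all of them to the type furthest along this chain.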
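One practical consequence of this order is that logical values can be counted and averaged, because TRUE becomes 1 and FALSE becomes 0 in arithmetic. The following sketch uses a hypothetical `ages` vector to illustrate this.

```r
# Hypothetical data: ages of five people.
ages <- c(17, 18, 19, 21, 16)

# The comparison yields a logical vector.
adult <- ages >= 18

# In arithmetic, TRUE is coerced to 1 and FALSE to 0,
# so sum() counts the TRUE values and mean() gives their share.
n_adult <- sum(adult)
share_adult <- mean(adult)
print(n_adult)       # 3
print(share_adult)   # 0.6

# Combining a number with a string coerces the number to character.
mixed <- c(ages[1], "years")
print(typeof(mixed)) # "character"
```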
Operator
Data structure in R
➢ Vector ➢ Matrix ➢ Array ➢ List ➢ Data frame ➢ Factor
Vector operations:
✓ Create a vector
✓ Access elements
✓ Add elements
✓ Delete elements
✓ Get the vector length
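As a complement to index-based slicing, the same five operations can be sketched with the base function `append()` and with logical indexing. The vector `v` and its values are chosen only for illustration.

```r
# Create a vector.
v <- c(1, 5, 6, 8, 9)

# Access elements by position.
first_two <- v[1:2]

# Add an element: append() inserts the value after the given position.
v_added <- append(v, 10, after = 2)
print(v_added)              # 1 5 10 6 8 9

# Delete elements by value: keep everything that is not equal to 10.
v_removed <- v_added[v_added != 10]
print(v_removed)            # 1 5 6 8 9

# Get the vector length.
print(length(v_removed))    # 5
```

Deleting by condition avoids having to know the position of the element, which is convenient when the vector has been modified several times.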
Code examples
# Basic operation commands
getwd()
install.packages("stringr")
.libPaths()
library()
search()
library("stringr")
str_length("Hello R!")
detach("package:stringr")
str_length("Hello R!")  # error: stringr is no longer attached
remove.packages("stringr")
library("stringr")  # error: stringr is no longer installed
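# Basic data types
num <- 100; num
is.integer(num)  # FALSE: a number without the L suffix is a double
is.double(num)   # TRUE
typeof(num)      # "double"
num2 <- 100L
typeof(num2)     # "integer"
is.logical(TRUE)
is.logical(T)
is.logical(5)    # FALSE: 5 is numeric, not logical
is.logical(0)    # FALSE
is.character("R program")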
# Data type conversion
logi_vec <- T
typeof(logi_vec)
int_vec <- c(100L, 200L)
typeof(int_vec)
double_vec <- c(10, 20)
typeof(double_vec)
chr_vec <- c("the great", "Chinese people")
typeof(chr_vec)
typeof(c(logi_vec, int_vec))                       # "integer": logical + integer
typeof(c(int_vec, double_vec))                     # "double": integer + double
typeof(c(double_vec, chr_vec))                     # "character": double + character
typeof(c(logi_vec, int_vec, double_vec, chr_vec))  # "character": all four types
1 == '1'  # TRUE: 1 is coerced to "1" before the comparison
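# Data types are converted automatically during operations
2 * T       # 2
10 + FALSE  # 10
10 + TRUE   # 11
exp(F)      # 1
10 & 0      # FALSE: non-zero numbers count as TRUE, 0 as FALSE
10 | 0      # TRUE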
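# Use the as.*() functions to convert data types explicitly
as.numeric(F)            # 0
as.numeric("1000.01")    # 1000.01
as.numeric("hello")      # NA, with a warning: the string is not a number
as.logical(0)            # FALSE
as.logical(10)           # TRUE
as.logical(-10)          # TRUE
as.logical("T")          # TRUE
as.logical("F")          # FALSE
as.character(c(T, F, TRUE, FALSE))  # "TRUE" "FALSE" "TRUE" "FALSE"
as.character(10.99)      # "10.99"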
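# Special values
# NA
a <- 100
a[1]
a[2]  # NA: the index is beyond the end of the vector
num_vec <- c(1, 2)
length(num_vec) <- 4
num_vec  # 1 2 NA NA
# Inf (infinity) and NaN
10/0     # Inf
-10/0    # -Inf
0/0      # NaN
Inf-Inf  # NaN
num_vec1 <- c(1, 5, NaN)
length(num_vec1)  # 3: NaN counts as an element
num_vec2 <- c(1, 5, NULL)
length(num_vec2)  # 2: NULL is dropped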
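# Operators
# Logical operators: & vs &&
logi_vec1 <- c(T, F, T)
logi_vec2 <- c(F, T, T)
logi_vec1 & logi_vec2   # element-wise: FALSE FALSE TRUE
logi_vec1 && logi_vec2  # && expects length-one operands; an error since R 4.3.0 (older versions used only the first elements)
logi_vec1 <- c(T, F, T)  # vectors of different lengths: the shorter one is recycled
logi_vec2 <- c(T, T, T, F)
logi_vec1 & logi_vec2   # TRUE FALSE TRUE FALSE, with a warning about the lengths
logi_vec1 && logi_vec2  # same restriction as above
logi_vec1 <- c(T, F, T)
# Logical operators: | vs ||
logi_vec1 <- c(T, F, T)
logi_vec2 <- c(F, T, T)
logi_vec1 | logi_vec2   # element-wise: TRUE TRUE TRUE
logi_vec1 || logi_vec2  # || expects length-one operands, like &&
a <- T
b <- 10L
c <- 20
d <- "R"
typeof(c(a, b, c, d))  # "character"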
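# Vectors
vec <- c(1, 5, 6, 8, 9)
# Access elements
vec[1]
vec[0]  # numeric(0): R indices start at 1
vec[2:3]
vec[2:5]
vec[c(1, 3)]     # access non-contiguous elements
vec[c(1, 3, 2)]  # indices may come in any order and may also repeat
# Add an element
vec
vec <- c(vec[1:2], 10, vec[3:length(vec)])  # insert 10 after the second element
vec
# Delete an element
vec
vec <- vec[-3]  # drop the third element
vec
# Get the vector length
vec <- letters
vec
length(vec)  # 26
vec[-length(vec)]  # drops z; how would you drop x, y and z?
vec[-length(vec):-length(vec)+2]  # wrong: ":" binds tighter than "+", so this is vec[-24]
-length(vec):-length(vec)+2  # -24: parentheses are needed
-length(vec):(-length(vec)+2)  # -26 -25 -24
vec[-length(vec):(-length(vec)+2)]  # drops x, y and z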
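# Create vectors
# Q: what are the ways to create a vector?
1:5
1:-5  # 1 0 -1 -2 -3 -4 -5
c(1L, 2.0, "a")  # "1" "2" "a": everything is coerced to character
# 1, 3, 5, 7, 9: create an arithmetic sequence
?seq
example(seq)
seq(1, 9, by = 2)
# Create a vector by repeating another vector several times
vec <- 1:3
# 1 2 3 1 2 3 1 2 3
?rep
example(rep)
rep(vec, 3)
# 1 1 1 2 2 2 3 3 3
rep(vec, each = 3)
# Create a vector of length 0
new.vec <- c()
length(new.vec)  # 0
new.vec  # NULL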
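# Are all students in a class aged 18 or over?
stu <- sample(c(17, 18, 19), 10, replace = T)
stu
all(stu >= 18)
stu <- sample(c(18, 19), 10, replace = T)
stu
any(stu < 18)  # FALSE: every sampled age is 18 or 19
# In what scenarios could any() and all() be applied?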
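# Vector operations: arithmetic, relational, and logical
vec1 <- c(1, 2)
vec2 <- c(10, 20)
vec1 * vec2   # 10 40
vec1 == vec2  # FALSE FALSE
vec1 | vec2   # TRUE TRUE
# Vector operations: recycling
vec1 <- c(1, 2)
vec2 <- c(10, 20, 30)
vec1 + vec2  # 11 22 31, with a warning: 3 is not a multiple of 2
vec2 <- c(10, 20, 30, 40)
vec1 + vec2  # 11 22 31 42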