[Organization] R - Study Notes - Getting Started

Download and related preparation

Windows operating system

R download: https://www.r-project.org/

RStudio download: https://www.rstudio.com/products/rstudio/download/

Linux operating system (CentOS)

R is already installed. After downloading RStudio with wget, the rpm file cannot be installed without root privileges (TAT).

Study book recommendation: https://xccds1977.blogspot.sg/2013/02/r.html

Reference: http://staff.ustc.edu.cn/~zwp/teach/Stat-Comp/R4beg_cn_2.0.pdf

                Going to the library to borrow "R in Action"...

Objects

While R is running, all variables, data, functions and results are stored in the computer's active memory as objects, each of which has a name.

Intrinsic attributes of an object:

    Mode: mode() gives the basic type of the elements: numeric, character, complex, or logical

    Length: length() gives the number of elements in the object

> x <- 1
> mode(x)
[1] "numeric"
> length(x)
[1] 1
> x <- "helloworld"; y <- TRUE; Z <- i
Error: object 'i' not found
> x <- "helloworld"; y <- TRUE; Z <- 1i
> mode(x);mode(y);mode(Z)
[1] "character"
[1] "logical"
[1] "complex"

Object operations:

Assignment/Modification: <-

Removal: rm()

> rm(Z)
> ls()
[1] "x" "y"
> rm(list=ls())
> ls()
character(0)

Special values: Inf is +∞, -Inf is -∞, and NaN means Not a Number.
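
These special values turn up in ordinary arithmetic. A minimal sketch (the variable names are only illustrative):

```r
# Special numeric values produced by ordinary arithmetic
pos_inf <- 1 / 0       # Inf
neg_inf <- -1 / 0      # -Inf
not_num <- 0 / 0       # NaN

is.finite(pos_inf)     # FALSE: infinite values are not finite
is.nan(not_num)        # TRUE
pos_inf + neg_inf      # NaN, because Inf - Inf is undefined
```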

Classification of objects:

    Vector: has a length but no dim attribute (the dim attribute is what distinguishes arrays and matrices from plain vectors),

    Factor: a categorical variable built from numeric or character values,

    Array and matrix: a matrix is a two-dimensional array,

    Data frame: a table made up of one or more vectors and/or factors, all of the same length but possibly of different data types,

    Time series: contains additional attributes such as frequency and dates,

    List: can contain objects of any type.
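
As a rough illustration of the list above, the sketch below builds one object of each kind. All names and values are made up for the example.

```r
v    <- c(1.5, 2, 3.2)                               # vector: has a length, no dim
f    <- factor(c("low", "high", "low"))              # factor with two levels
m    <- matrix(1:6, nrow = 2)                        # matrix: 2 rows x 3 columns
a    <- array(1:24, dim = c(2, 3, 4))                # three-dimensional array
df   <- data.frame(id = 1:3, grp = f, val = v)       # same-length columns, different types
tser <- ts(1:8, start = c(2020, 1), frequency = 4)   # quarterly time series
lst  <- list(v, m, df)                               # list holding objects of any type

dim(v)            # NULL
dim(m)            # 2 3
class(df)         # "data.frame"
frequency(tser)   # 4
```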

Reading and writing files

Data R can read: data stored in text (ASCII) files

                        Files in other formats (Excel, SAS, SPSS) and access to SQL-type databases are advanced topics

Read:

read.table()

> mydata <- read.table("C:/data/test_data.txt")
#Create a data frame mydata
Warning message:
In read.table("C:/data/test_data.txt") :
  incomplete final line found by readTableHeader on 'C:/data/test_data.txt'
> View(mydata)
# Each variable in the data frame is named, the default is V1, V2...
> mydata$V1;mydata["V1"];mydata[,1] #Access variables individually
[1] 1 2    # vector
  V1
1  1
2  2       # data frame
[1] 1 2    # vector

Default arguments:

read.table(file, header = FALSE, sep = "", quote = "\"'", dec = ".", row.names, col.names, as.is = FALSE, na.strings = "NA", colClasses = NA, nrows = -1, skip = 0, check.names = TRUE, fill = !blank.lines.skip, strip.white = FALSE, blank.lines.skip = TRUE, comment.char = "#")

file              A filename (in quotes, or a character variable) or a URL (http://...) for remote access to the file
header            Logical: whether the first line of the file contains the variable names
sep               The field separator
quote             The characters used to quote character data
dec               The character used for the decimal point
row.names         A vector of row names, or the number or name of a variable in the file; by default rows are numbered 1, 2, 3, ...
col.names         A character vector of column names (default: V1, V2, V3, ...)
as.is             Controls whether character variables are converted to factors (FALSE) or kept as characters (TRUE); it can be a logical, numeric or character vector specifying which variables are kept as characters. The signature above is from an older R: since R 4.0.0 characters are kept by default
na.strings        The values representing missing data (converted to NA)
colClasses        A character vector giving the data type of each column
nrows             The maximum number of lines to read (negative values are ignored)
skip              The number of lines to skip before reading data
check.names       If TRUE, checks that the variable names are valid in R
fill              If TRUE and the rows do not all have the same number of variables, blanks are added
strip.white       Only when sep is specified: if TRUE, removes extra spaces before and after character variables
blank.lines.skip  If TRUE, ignores blank lines
comment.char      A character used to write comments in the data file; lines beginning with this character are ignored (to disable this, use comment.char = "")
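
To see a few of these arguments working together, here is a small sketch. It writes a hypothetical tab-separated file to a temporary path so that it runs anywhere. The file contents, the column names and the choice of "-" as the missing-value marker are all assumptions made for the example.

```r
# Create a small tab-separated file in a temporary location (made-up data)
path <- tempfile(fileext = ".txt")
writeLines(c("id\tname\tscore",
             "1\tann\t3.5",
             "2\tbob\t-",
             "3\tcat\t4.0"), path)

# header: first line holds the names; sep: tab-separated fields;
# na.strings: "-" is read as NA; colClasses: fixes the type of each column
dat <- read.table(path, header = TRUE, sep = "\t",
                  na.strings = "-",
                  colClasses = c("integer", "character", "numeric"))
str(dat)
```

Setting colClasses explicitly stops R from guessing the column types, which matters most when a column contains missing-value markers.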

scan()

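Can be used to create different kinds of objects: vectors, matrices, ...

> mydata <- scan("data.dat", what = list("", 0, 0))    # Reads three variables from data.dat: the first is character, the other two are numeric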

scan(file = "", what = double(0), nmax = -1, n = -1, sep = "", quote = if (sep=="\n") "" else "'\"", dec = ".", skip = 0, nlines = 0, na.strings = "NA", flush = FALSE, fill = FALSE, strip.white = FALSE, quiet = FALSE, blank.lines.skip = TRUE, multi.line = TRUE, comment.char = "")

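what              Specifies the type(s) of the data (numeric by default)
nmax              The maximum number of data values to read or, if what is a list, the number of lines to read (by default, scan reads everything up to the end of the file)
n                 The maximum number of data values to read (no limit by default)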
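
A minimal sketch of scan() with a list-valued what and with nmax. It uses a temporary file with made-up contents in place of data.dat.

```r
# A three-column text file: one character field, then two numeric fields
path <- tempfile(fileext = ".dat")
writeLines(c("a 1 10", "b 2 20", "c 3 30"), path)

# what = list("", 0, 0): the first field is character, the next two numeric
cols <- scan(path, what = list("", 0, 0), quiet = TRUE)
cols[[1]]   # "a" "b" "c"
cols[[2]]   # 1 2 3

# With a list-valued what, nmax limits the number of lines (records) read
first_two <- scan(path, what = list("", 0, 0), nmax = 2, quiet = TRUE)

# Without what, scan() returns a plain numeric vector
nums <- scan(text = "3 1 4 1 5", quiet = TRUE)
```

Unlike read.table(), scan() returns a list of vectors here, not a data frame.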

read.fwf(): reads data stored in fixed-width format from a file

> mydata <- read.fwf("C:/data/test_data.txt",widths=c(1,5,3,2))
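
The widths argument is easier to follow on a file whose layout is known. The sketch below uses a made-up temporary file in which each line packs three fields with no separators: 1 character, then 4, then 3.

```r
# Each line holds three unseparated fields of widths 1, 4 and 3
path <- tempfile(fileext = ".txt")
writeLines(c("A1.501.2", "A1.551.3", "B1.601.4"), path)

fw <- read.fwf(path, widths = c(1, 4, 3))
fw$V1   # A A B
fw$V2   # 1.50 1.55 1.60
fw$V3   # 1.2 1.3 1.4
```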

[Screenshots: the raw data and the result after processing]

Write:

write.table()

> d <- data.frame(obs=c(1,2,3),treat=c("A","B","C"),weight=c(2.3,NA,9))
> write.table(d,file="C:/data/test_data.txt")
> View(d)
> write.table(d,file="C:/data/test_data.txt",row.names=FALSE,quote=FALSE,sep="\t")

[Screenshots: the stored data and the table view]

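append            If TRUE, the data are added after anything already in the target file; existing contents are not erased
quote             A logical or numeric vector: if TRUE, character variables and factors are written within double quotes ""; if a numeric vector, its values are the indices of the columns to be written within "". (In both cases the variable names are written within ""; if quote = FALSE, the variable names are not quoted.)
row.names         A logical value deciding whether the row names are written to the file, or a character vector of row names to write
col.names         A logical value deciding whether the column names are written to the file, or a character vector of column names to write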
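
A short sketch of append in practice, writing two made-up data frames to the same temporary file. The second call sets col.names = FALSE so that the header is not written twice.

```r
path <- tempfile(fileext = ".txt")
d1 <- data.frame(obs = 1:2, treat = c("A", "B"))
d2 <- data.frame(obs = 3L, treat = "C")

# First write creates the file, including the header line
write.table(d1, file = path, row.names = FALSE, quote = FALSE, sep = "\t")

# Second write adds rows at the end and leaves the existing contents alone
write.table(d2, file = path, row.names = FALSE, col.names = FALSE,
            quote = FALSE, sep = "\t", append = TRUE)

readLines(path)   # "obs\ttreat" "1\tA" "2\tB" "3\tC"
```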

save(): saves objects in R's own file format

> save(d,file="C:/data/test_data.Rdata")
> setwd("C:/data")    # set the working directory
> load("test_data.Rdata")    # load into memory
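
The round trip can be checked without touching C:/data by using a temporary directory. This is only a sketch, and the data frame is made up.

```r
d <- data.frame(obs = c(1, 2, 3), weight = c(2.3, NA, 9))
path <- file.path(tempdir(), "test_data.RData")

save(d, file = path)   # write the object to R's own binary format
rm(d)                  # remove it from memory

loaded <- load(path)   # load() restores the object under its original name
loaded                 # "d": load() returns the names of the restored objects
exists("d")            # TRUE
```

Note that load() brings objects back under the names they were saved with, so it can silently overwrite an existing object of the same name.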
