R language data format conversion (Part One)

A matrix is converted into a data frame with as.data.frame(x). Converting a data frame into a matrix with as.matrix(x) is more troublesome: if the data frame holds both string and numeric columns, every value is converted to a string by default. A data frame cannot be converted directly into a vector or a factor.
Use methods(as) to list all the as.* functions.
The most basic data type in R is the vector. A vector can be converted into several other types of data; for example, adding a dim attribute to a vector turns it into a matrix or an array.
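The conversions described above can be sketched as follows. This is a minimal illustration using made-up objects (m, mixed and v are hypothetical names, not from the original post):

```r
# Matrix -> data frame
m <- matrix(1:6, nrow = 2)
df <- as.data.frame(m)        # columns are named V1, V2, V3 by default

# Data frame with mixed column types -> matrix
mixed <- data.frame(id = 1:2, name = c("a", "b"))
mm <- as.matrix(mixed)        # every value becomes a character string

# Vector -> matrix by adding a dim attribute
v <- 1:12
dim(v) <- c(3, 4)             # v is now a 3 x 4 matrix, filled column by column
```

Checking is.character(mm) confirms that the numeric id column was coerced to strings along with the rest.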
A vector can be converted into a factor, for instance with as.factor().
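A short sketch of the vector-to-factor conversion, assuming an illustrative character vector called sizes (a hypothetical name):

```r
sizes <- c("small", "large", "medium", "small")

f <- as.factor(sizes)         # levels are sorted alphabetically
levels(f)

# factor() lets you state the level order explicitly
f2 <- factor(sizes, levels = c("small", "medium", "large"))
as.integer(f2)                # the underlying integer codes
```

Setting the levels by hand matters when the categories have a natural order that differs from alphabetical order.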
A vector can also be converted into a list, for instance with as.list().
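The vector-to-list conversion might look like this; the named vector v is only an example:

```r
v <- c(a = 1, b = 2, c = 3)

l <- as.list(v)      # one list element per vector element, names are kept
back <- unlist(l)    # flatten the list back into a named vector
```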
Taking a subset of a data frame. Rows and columns can be extracted by index, either contiguously or non-contiguously.
Contiguous: x1 <- x[c(1:3), c(2:5)]
Non-contiguous: x2 <- x[c(1, 3, 4), c(5, 7, 9)]
Conditional extraction: x <- x[which(x$column_name == value), ]
Note: do not forget the comma after the row condition, and use two equals signs (==) to test equality.
The subset function can also take a subset, which is often more convenient.
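To make the indexing forms above concrete, here is a sketch that assumes the built-in mtcars data set stands in for x; the conditions on cyl and mpg are arbitrary examples:

```r
x <- mtcars                                # built-in data set, 32 rows x 11 columns

x1 <- x[1:3, 2:5]                          # contiguous rows and columns
x2 <- x[c(1, 3, 4), c(5, 7, 9)]            # non-contiguous rows and columns
x3 <- x[which(x$cyl == 4), ]               # rows where cyl equals 4

# subset() takes the condition and the columns to keep in one call
x4 <- subset(x, cyl == 4 & mpg > 30, select = c(mpg, cyl, hp))
```

Inside subset() the column names are used directly, without the x$ prefix, which is what makes it more convenient for longer conditions.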
Sampling from a larger data set is common in data mining and machine learning. For example, you may need two samples: one to build the model and another to validate it. In R, the sample function draws random samples; it makes it easy to set the sample size and to sample without replacement (each element can appear only once).
For example, sample(x, n) draws n elements from a vector x.
The same approach samples a data frame: sample the row indices of the data frame instead, and set replace = TRUE or replace = FALSE for sampling with or without replacement.
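A minimal sketch of both cases. The seed, the sample sizes and the train/valid split of mtcars are assumptions chosen for illustration:

```r
set.seed(42)                               # make the random draws reproducible

# Sampling from a vector
v <- 1:100
s  <- sample(v, 10)                        # without replacement (the default)
sr <- sample(v, 10, replace = TRUE)        # with replacement: repeats are possible

# Sampling rows of a data frame: draw row indices, then index with them
x <- mtcars
train_idx <- sample(nrow(x), 20)
train <- x[train_idx, ]                    # sample used to build the model
valid <- x[-train_idx, ]                   # remaining rows used for validation
```

Because the validation set is taken with a negative index, no row can end up in both samples.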

Next, how to delete specific rows or columns:
The simplest way is a negative index, e.g. x[-1:-3, ]. An index before the comma deletes the corresponding rows; an index after the comma deletes the corresponding columns.
The second method targets a column directly by name and assigns NULL to it, e.g. x$column_name <- NULL. Assigning NULL removes columns only; rows are removed by indexing.
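Both deletion methods can be sketched as follows, again assuming mtcars stands in for x; the column mpg and the row "Fiat 128" are arbitrary picks:

```r
x <- mtcars                                # 32 rows x 11 columns

no_rows <- x[-(1:3), ]                     # drop rows 1 to 3 (same as x[-1:-3, ])
no_cols <- x[, -(1:3)]                     # drop columns 1 to 3

x$mpg <- NULL                              # remove a column by name
x <- x[rownames(x) != "Fiat 128", ]        # drop a row by name via a logical index
```

Negative indexing returns a new object, so the result has to be assigned back if the deletion should persist.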

Adding and merging data frames:
For example, the built-in data set USArrests records arrest statistics for the 50 US states in four variables (Murder, Assault, UrbanPop and Rape).
A column can be added by building a new data frame from the new column and the existing data frame with data.frame().
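The original screenshots are not available, so the following is only a sketch of how a column could be added to USArrests. It assumes the new column holds the state names (taken from the built-in state.name vector) and is called state, which is an illustrative choice:

```r
# Recreate the data frame with the new column in front
x <- data.frame(state = state.name, USArrests)

# Alternative: assign the new column to a copy of the data frame
y <- USArrests
y$state <- rownames(y)                     # the row names are the state names
```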
You can also add a column with the cbind function, which merges directly, whereas the method above recreates the data frame.
Merging rows is more troublesome: the column names must match, or rbind will not run.
If there are duplicates after merging, the duplicated function reports which values in a data frame or vector are duplicates and returns logical values; index with it to keep the non-duplicated part, e.g. x[!duplicated(x), ]. The unique function extracts the non-duplicated part directly.
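As a sketch of the merging and de-duplication steps, assuming two overlapping slices of USArrests (top and bot are hypothetical names) play the role of the data frames being merged:

```r
# Add a column with cbind
x <- cbind(state = state.name, USArrests)

# Merge rows: both pieces must have the same column names
top <- USArrests[1:5, ]
bot <- USArrests[3:8, ]                    # overlaps with top in rows 3 to 5
both <- rbind(top, bot)

dup <- duplicated(both)                    # TRUE for each repeated row
dedup1 <- both[!dup, ]                     # keep the non-duplicated rows
dedup2 <- unique(both)                     # same result in a single call
```

duplicated() marks the second and later occurrences only, so the first copy of each repeated row is kept.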

Origin blog.csdn.net/m0_46445293/article/details/105467099