Use the cut() function in R to recode data into segments

In many SCI papers, continuous variables are compared by segment. For example, age is divided into young, middle-aged, and old, or an index is split into several equal parts (high, medium, and low) before analysis. As shown in the figure below, continuous gestational weeks are divided into early, middle, and late pregnancy.
[Figure: gestational weeks divided into early, middle, and late pregnancy]
To do this in R, we need to segment the continuous variable (also called binning) and then recode the data. This step is important because it prepares the data for the subsequent analysis. Today we use the cut() function that comes with R to demonstrate segmented recoding and data tidying.
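As a quick illustration of what binning with cut() produces, here is a minimal sketch on a made-up vector. The ages, cut points, and label names are assumptions chosen for the example, not values from the article's dataset.

```r
# Hypothetical ages, not taken from the article's dataset
ages <- c(23, 35, 47, 52, 61, 78)

# Bin into three groups with explicit cut points.
# With the defaults the intervals are (0,40], (40,60], (60,100].
age_group <- cut(ages, breaks = c(0, 40, 60, 100),
                 labels = c("young", "middle-aged", "old"))

# cut() returns a factor with one level per interval
print(data.frame(ages, age_group))

# Integer codes 1, 2, 3 can be recovered from the factor if needed
age_code <- as.integer(age_group)
print(age_code)
```

The result is a factor, so it can go straight into table(), grouped summaries, or a model formula as a categorical variable.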
Today we use the Breast cancer survival data that comes with the SPSS software as a demonstration. First, open RStudio, import the data, and delete the missing values.
library(foreign)  # load the foreign package, which provides read.spss()
bc <- read.spss("E:/r/Breast cancer survival agec.sav",
                use.value.labels = FALSE, to.data.frame = TRUE)
bc <- na.omit(bc)  # drop every row that contains a missing value
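To see what na.omit() does before applying it to real data, here is a small sketch on a made-up data frame. The column names and values are hypothetical and only serve the illustration.

```r
# Hypothetical data frame with missing values in two different columns
df <- data.frame(id    = 1:4,
                 age   = c(34, NA, 51, 67),
                 stage = c(1, 2, NA, 3))

# na.omit() keeps only the rows that are complete in every column
clean <- na.omit(df)

print(nrow(df))     # number of rows before
print(nrow(clean))  # number of rows after
print(clean)
```

Because a row is dropped if any column is missing, it is worth comparing the row counts before and after to see how much data is lost.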
Check the data
head(bc)
The second variable is age. We plan to divide age into three intervals: high, middle, and low.
age1 <- cut(bc$age, breaks = 3, labels = c(1, 2, 3))  # split the age range into 3 equal-width intervals, labelled 1, 2, 3
dc <- cbind(bc, age1)  # add the new variable to the table
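Note that breaks = 3 splits the range of the variable into intervals of equal width, not into groups of equal size. The sketch below uses made-up ages (an assumption for illustration) to show that the group counts can be quite uneven.

```r
# Hypothetical ages: most values are low, one is high
ages <- c(20, 25, 30, 35, 45, 80)

# Without labels, the factor levels show the interval boundaries cut() chose
print(levels(cut(ages, breaks = 3)))

# With labels, the same intervals are recoded as 1, 2, 3
grp <- cut(ages, breaks = 3, labels = c(1, 2, 3))

# Equal-width intervals, but unequal group sizes
counts <- as.vector(table(grp))
print(counts)
```

If roughly equal group sizes are wanted instead, percentile-based cut points are usually the better choice.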

In this way, age is grouped and recoded. We can also group by specific age ranges.
age2 <- cut(bc$age, breaks = c(0, 20, 60, 100), include.lowest = TRUE,
            labels = c(1, 2, 3))  # three age intervals: [0,20], (20,60], (60,100]
dd <- cbind(bc, age2)  # add the new variable to the table
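Values that fall exactly on a cut point are easy to get wrong, so here is a small sketch of how the include.lowest and right arguments of cut() treat them. The test values are made up for the illustration.

```r
# Hypothetical values sitting on and next to the cut points
x   <- c(0, 20, 21, 60, 61, 100)
brk <- c(0, 20, 60, 100)

# Default: intervals are open on the left, so 0 falls outside and becomes NA
default <- cut(x, breaks = brk)

# include.lowest = TRUE closes the first interval on the left: [0,20]
closed <- cut(x, breaks = brk, include.lowest = TRUE)

# right = FALSE makes intervals closed on the left instead: [0,20), [20,60), [60,100]
left <- cut(x, breaks = brk, right = FALSE, include.lowest = TRUE)

print(data.frame(x, default, closed, left))
```

Checking for unexpected NA values after cutting, for example with sum(is.na(...)), is a cheap way to catch values that fell outside the breaks.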

You can also segment age by percentiles
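q <- quantile(bc$age, c(0, .25, .50, .75, 1))  # quartile cut points of age
age3 <- cut(bc$age, breaks = q, include.lowest = TRUE,
            labels = c(1, 2, 3, 4))  # recode age into quartile groups 1 to 4
dc <- cbind(bc, age3)  # add the new variable to the table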
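quantile() on its own only returns the cut points, so they have to be passed to cut() as breaks to obtain one group per observation. The following is a minimal sketch on made-up ages; the data and the Q1 to Q4 labels are assumptions for illustration.

```r
# Hypothetical ages: 20 evenly spaced values from 21 to 59
ages <- seq(21, 59, by = 2)

# Quartile cut points: minimum, 25th, 50th, 75th percentile, maximum
q <- quantile(ages, probs = c(0, 0.25, 0.50, 0.75, 1))
print(q)

# include.lowest = TRUE keeps the minimum value in the first group
quartile <- cut(ages, breaks = q, include.lowest = TRUE,
                labels = c("Q1", "Q2", "Q3", "Q4"))

# Each quartile group holds the same number of observations here
print(table(quartile))
```

When many observations share the same value, two quantiles can coincide, and cut() then stops with an error because the breaks are not unique. In that case the cut points need to be chosen by hand or deduplicated first.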


Origin blog.csdn.net/dege857/article/details/108908997