scitb5 function version 1.4 (interaction effect function, P for interaction) released: generate an interaction effect table with one click

In SCI articles, an interaction effect table (usually Table 5) is a valuable addition: it makes the article more convincing, increases the credibility of the results, and can also be used for data mining.
I explained how to build an interaction effect table in a previous article, "Teaching You to Make an Interaction Effect Table in R Language", which you can read for details.
This time I am releasing the scitb5 function, which I wrote myself with reference to the forestmodel package and some other R packages. It generates an interaction effect table with one click, mainly to save you time and to support data mining.
Note that this release only supports the case where the target variable X is continuous. Please check this before purchasing; I am not responsible for mistaken purchases!
The version for a categorical X is still being written and should be released after a while.
The scitb5 function supports logistic regression, Cox regression and linear regression models. Let me demonstrate it below, starting with logistic regression.
First import our premature birth data (reply "premature birth data" to the official account to get the data).

bc<-read.csv("E:/r/test/zaochan.csv",sep=',',header=TRUE)
bc <- na.omit(bc)
names(bc)
dput(names(bc))

First organize the data. This step is necessary: the variables you want to stratify by must be converted into factors.

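# Recode race and smoke as numeric codes
bc$race<-ifelse(bc$race=="black",1,ifelse(bc$race=="white",2,3))
bc$smoke<-ifelse(bc$smoke=="nonsmoker",0,1)
# Convert the outcome and the categorical variables to factors
bc$low<-factor(bc$low)
bc$race<-factor(bc$race)
bc$smoke<-factor(bc$smoke)  # smoke is used as a stratification variable below
bc$ht<-factor(bc$ht)
bc$ui<-factor(bc$ui)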

Next, import the function. I saved it as a function file named 1.4final.R, so you can load it directly with source.

source("E:/r/test/1.4final.R")

After a successful import, three functions appear in the environment, which means the loading succeeded.
Next, we need to define the interaction variables, that is, the stratification variables. cov1 holds the covariates that appear in your model, and Interaction holds the variables you want to test for interaction. The function requires every interaction variable to also be included in the covariates; otherwise an error is reported.

cov1<-c("lwt","smoke","ptl","ui","ftv","race")	
Interaction<-c("race","smoke","ui")
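
Because an error is reported when an interaction variable is missing from the covariates, it can help to check this before calling the function. The following is a minimal sketch of such a check. The helper name check_interaction_vars is hypothetical and is not part of scitb5.

```r
# Hypothetical helper (not part of scitb5): stop early if any interaction
# variable is missing from the covariate vector.
check_interaction_vars <- function(interaction, covariates) {
  missing_vars <- setdiff(interaction, covariates)
  if (length(missing_vars) > 0) {
    stop("Interaction variables not in covariates: ",
         paste(missing_vars, collapse = ", "))
  }
  invisible(TRUE)
}

cov1 <- c("lwt", "smoke", "ptl", "ui", "ftv", "race")
Interaction <- c("race", "smoke", "ui")
check_interaction_vars(Interaction, cov1)  # passes silently
```

If a variable such as "ht" were listed in Interaction but not in cov1, the helper would stop with a message naming it.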

After the definitions are complete, we can call the scitb5 function. It depends on the survival and lmtest packages, which must be installed first; the function loads them itself when it runs. The arguments are as follows: data is your data and must be a data frame; x is your target variable and must be continuous; y is your outcome variable; family defines the model, and logistic regression is specified as "logit".

out<-scitb5(data=bc,x="age",y="low",Interaction=Interaction,cov = cov1,family="logit")
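
For readers who want to see what a P for interaction represents, here is a minimal sketch that computes one by hand with base R. It assumes a likelihood ratio test between a model with and without the product term; whether scitb5 uses exactly this test is not confirmed here. The data are simulated and the variable names are only illustrative.

```r
# Simulated data; variable names are illustrative only.
set.seed(1)
n <- 400
d <- data.frame(
  age   = rnorm(n, mean = 25, sd = 5),
  smoke = factor(rbinom(n, 1, 0.4)),
  lwt   = rnorm(n, mean = 130, sd = 30)
)
# Outcome generated with a different age effect in each smoke stratum.
lp <- -1 + 0.02 * (d$age - 25) + 0.10 * (d$age - 25) * (d$smoke == "1")
d$low <- rbinom(n, 1, plogis(lp))

# Models without and with the age-by-smoke product term.
fit0 <- glm(low ~ age + smoke + lwt, data = d, family = binomial)
fit1 <- glm(low ~ age * smoke + lwt, data = d, family = binomial)

# Likelihood ratio test computed from the deviances of the nested models.
lr_stat <- fit0$deviance - fit1$deviance
lr_df <- fit0$df.residual - fit1$df.residual
p_interaction <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Odds ratio per unit of age within each smoke stratum.
or_by_stratum <- sapply(levels(d$smoke), function(lv) {
  fit <- glm(low ~ age + lwt, data = d[d$smoke == lv, ], family = binomial)
  exp(coef(fit)[["age"]])
})
```

Here or_by_stratum corresponds to the per-stratum effect of x, and p_interaction tests whether that effect differs between strata.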

A single scitb5 call is enough to generate the interaction effect table, so it is quite simple. Next, do a Cox regression and import the breast cancer data (reply "breast cancer" to the official account to get the data).

library(foreign)   # provides read.spss
library(survival)
bc <- read.spss("E:/r/Breast cancer survival agec.sav",
                use.value.labels=FALSE, to.data.frame=TRUE)
bc <- na.omit(bc)

Convert categorical variables to factors

bc$er<-as.factor(bc$er)
bc$pr<-as.factor(bc$pr)
bc$ln_yesno<-as.factor(bc$ln_yesno)
bc$histgrad<-as.factor(bc$histgrad)
bc$pathscat<-as.factor(bc$pathscat)
dput(names(bc))

Define covariates and interaction variables

cov1<-c("pathsize", "lnpos", "er", "pr", "histgrad",
        "pathscat", "ln_yesno")
Interaction<-c("histgrad","er", "pr")

Use the function to generate the interaction effect table. For Cox regression, the time argument must be defined, otherwise an error is reported.

out<-scitb5(data=bc,x="age",y="status",Interaction=Interaction,cov = cov1,time="time",family="cox")
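
The same idea can be sketched for a Cox model. The example below is not the scitb5 implementation; it assumes a likelihood ratio comparison of two nested coxph models and uses the lung dataset shipped with the survival package instead of the breast cancer data.

```r
library(survival)

# Complete cases of a few columns from the lung data in the survival package.
lung2 <- na.omit(survival::lung[, c("time", "status", "age", "sex", "ph.ecog")])
lung2$sex <- factor(lung2$sex)

# Models without and with the age-by-sex product term.
fit0 <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung2)
fit1 <- coxph(Surv(time, status) ~ age * sex + ph.ecog, data = lung2)

# Likelihood ratio statistic from the final partial log-likelihoods.
lr_stat <- 2 * (fit1$loglik[2] - fit0$loglik[2])
lr_df <- length(coef(fit1)) - length(coef(fit0))
p_interaction <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Hazard ratio per year of age within each sex stratum.
hr_by_stratum <- sapply(levels(lung2$sex), function(lv) {
  fit <- coxph(Surv(time, status) ~ age + ph.ecog,
               data = lung2[lung2$sex == lv, ])
  exp(coef(fit)[["age"]])
})
```

As with scitb5, the time variable has to be supplied; here it enters through Surv(time, status).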

Next, do linear regression, using the mpg car data from the ggplot2 package.

bc<-as.data.frame(ggplot2::mpg)
dput(names(bc))

Convert categorical variables to factors

bc$cyl<-as.factor(bc$cyl)
bc$model<-as.factor(bc$model)
bc$drv<-as.factor(bc$drv)
bc$fl<-as.factor(bc$fl)
bc$class<-as.factor(bc$class)
bc$trans<-as.factor(bc$trans)

Define covariates and interaction variables

cov1<-c("model", "displ","cyl", "trans", "drv", 
        "cty", "hwy", "fl", "class")
Interaction<-c("drv","fl","class","model")

Generate the table. In linear regression, the coefficient is reported as β.

out<-scitb5(data=bc,x="displ",y="hwy",Interaction=Interaction,cov = cov1,family="linear")
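
To illustrate what the per-stratum β values mean, here is a small sketch using the built-in mtcars data rather than mpg. It is an assumption about how such a table can be assembled by hand, not a description of scitb5 internals: the slope of x is estimated within each level of the stratification variable, and the interaction is tested with an F test between nested models.

```r
d <- mtcars
d$am <- factor(d$am)

# Slope (beta) of wt on mpg within each am stratum, adjusted for hp,
# together with its 95% confidence interval.
beta_by_stratum <- t(sapply(levels(d$am), function(lv) {
  fit <- lm(mpg ~ wt + hp, data = d[d$am == lv, ])
  c(beta = coef(fit)[["wt"]], confint(fit)["wt", ])
}))

# P for interaction from an F test between nested models.
fit0 <- lm(mpg ~ wt + am + hp, data = d)
fit1 <- lm(mpg ~ wt * am + hp, data = d)
p_interaction <- anova(fit0, fit1)[["Pr(>F)"]][2]
```

Each row of beta_by_stratum corresponds to one level of the stratification variable.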

One question arises from the scitb5 call above: we defined four stratification variables, "drv", "fl", "class" and "model", but the table only contains "drv", and the messages report that "fl", "class" and "model" are not suitable for stratification.
[1] "fl is Not suitable for layering"
[1] "class is Not suitable for layering"
[1] "model is Not suitable for layering"
To explain this, we use a dataset from one of my followers. First import the data.

bc<-read.csv("E:/r/fensi/final1.csv",sep=',',header=TRUE)
dput(names(bc))

Convert categorical variables to factors

bc[,c("x2", "x3", "x4", "x5", "x6", "x7")] <- lapply(bc[,c("x2", "x3", "x4", "x5", "x6", "x7")], factor)
str(bc)

Define covariates and interaction variables

Interaction<-c("x2", "x3", "x4","x5")
cov<-c("x2", "x3", "x4", "x5", "x6", "x7")

Generate the table. Here only three variables are produced as well; x5 is not generated.

out<-scitb5(data=bc,x="x1",y="y",Interaction=Interaction,cov = cov,family="logit")

The message "x5 is Not suitable for layering" is displayed, which means that the variable x5 is not suitable for stratified interaction analysis. Why? Let's look at x5 first: it has many levels, four in total.

levels(factor(bc[,"x5"]))

Let's look at one subgroup. When x5==1, x6 and x7 are all 0 and take no other value. The model cannot be fitted in this subgroup, so an error occurs and the function automatically discards this variable.

be<-subset(bc,bc$x5==1)
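
This kind of problem can be spotted before modelling. The sketch below uses a hypothetical helper, constant_within_strata, which is not part of scitb5, together with a small made-up data frame. For each level of the stratification variable it lists the covariates that take only one value in that subgroup.

```r
# Hypothetical helper (not part of scitb5): for each level of a stratification
# variable, list covariates that take only one value in that subgroup.
constant_within_strata <- function(data, strat, covariates) {
  covariates <- setdiff(covariates, strat)
  res <- lapply(split(data, data[[strat]], drop = TRUE), function(sub) {
    is_const <- vapply(covariates,
                       function(v) length(unique(sub[[v]])) < 2,
                       logical(1))
    covariates[is_const]
  })
  Filter(length, res)  # keep only strata with at least one constant covariate
}

# Made-up data: in stratum x5 == 1, x6 and x7 are always 0.
toy <- data.frame(
  x5 = factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3)),
  x6 = factor(c(0, 0, 0, 0, 1, 1, 0, 1, 0)),
  x7 = factor(c(0, 0, 0, 1, 0, 1, 1, 1, 0))
)
problems <- constant_within_strata(toy, "x5", c("x5", "x6", "x7"))
```

In this toy example, problems contains a single entry for stratum "1", naming x6 and x7.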

Therefore, if you do not have much data and the stratification variable has too many levels, the model fitting fails and the stratified analysis cannot be performed. I also tried this with other packages, and the modelling failed there as well.

For scitb5 function code, please refer to this article:

The scitb5 function version 1.4 (interaction effect function P for interaction) is released - used to generate an interaction effect table with one click

This function is still being updated. If you find any errors or have any suggestions, please contact me.


Origin blog.csdn.net/dege857/article/details/130640120