Mode Imputation

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/G090909/article/details/54630005

library(Hmisc)

Define a custom mode function

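stat.mode <- function(x, rm.na = TRUE){
  # Drop missing values when requested; otherwise use x as given
  if (rm.na == TRUE){
    y = x[!is.na(x)]
  } else {
    y = x
  }
  # The most frequent value; names(table()) returns it as a character string
  freq = table(y)
  res = names(freq)[which.max(freq)]
  return(res)
}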
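As a quick check of how a mode function like this behaves, here is a minimal base R sketch. The name `mode.sketch` and the sample vector `years` are made up for illustration; the extra guard for an all-missing input is an assumption of this sketch, not part of the function above.

```r
# Hypothetical stand-in for stat.mode, written with base R only
mode.sketch <- function(x, rm.na = TRUE){
  y = if (rm.na) x[!is.na(x)] else x
  freq = table(y)                                 # counts per distinct value
  if (length(freq) == 0) return(NA_character_)    # assumption: no observed values gives NA
  names(freq)[which.max(freq)]                    # first value with the highest count
}

years <- c(2005, 2010, 2005, NA, 1998, 2005, NA)  # made-up sample
years.mode <- mode.sketch(years)
print(years.mode)   # "2005", a character string rather than a number
```

Note that the result comes back as a character string because it is taken from the names of the frequency table. Wrap it in `as.numeric()` if a numeric value is needed.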

Define a custom function that imputes missing values within each group

my.impute <- function(data, category.col = NULL,
                      miss.col = NULL, method = stat.mode){
  # Impute within each category and write the values back to the same rows,
  # so the result stays aligned even when data is not sorted by category
  groups = as.character(data[,category.col])
  for(i in unique(groups[!is.na(groups)])){
    idx = which(groups == i)
    data[idx, miss.col] = impute(data[idx, miss.col], method)
  }
  return(data)
}
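To see what group-wise mode imputation does on unsorted rows, here is a small self-contained sketch. It uses base R's `ave` rather than Hmisc's `impute`, so it is a different technique with the same goal. The data frame `toy`, its columns, and the helper `fill.with.mode` are all hypothetical.

```r
# Made-up data: rows are deliberately not sorted by group
toy <- data.frame(
  district = c("A", "B", "A", "B", "A", "B"),
  built    = c(2005, 1998, NA, 1998, 2005, NA)
)

# Replace the NAs in v with the most frequent observed value of v
fill.with.mode <- function(v){
  obs = v[!is.na(v)]
  if (length(obs) == 0) return(v)        # assumption: leave an all-NA group untouched
  uniq = unique(obs)
  counts = tabulate(match(obs, uniq))    # frequency of each distinct value
  v[is.na(v)] = uniq[which.max(counts)]  # ties go to the value seen first
  v
}

# ave() applies the function per group and keeps the original row order
toy$built.filled <- ave(toy$built, toy$district, FUN = fill.with.mode)
print(toy)
```

Each missing `built` value is filled with the mode of its own `district` (2005 for A, 1998 for B), and every row keeps its original position.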

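# Impute 建筑时间 within each 区域, then keep the columns of interest
final_house <- subset(my.impute(house, '区域', '建筑时间'),
                      select = c(区域,type.new,floow,面积,价格.W.,单价.平方米.,建筑时间))
# Building age as of 2016, from the first four characters (the year) of 建筑时间
final_house <- transform(final_house, builtdate2now = 2016-as.integer(substring(as.character(建筑时间),1,4)))
final_house <- subset(final_house, select = -建筑时间)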
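The building-age step can be tried on its own with a few made-up values. The vector `built` below is hypothetical and assumes each entry starts with a four-digit year, which is what the `substring(..., 1, 4)` call relies on.

```r
# Made-up construction dates; each is assumed to begin with a four-digit year
built <- c("2005", "1998-06", "2010/03")

# Take the first four characters as the year and subtract from 2016
age <- 2016 - as.integer(substring(built, 1, 4))
print(age)   # 11 18 6
```

If an entry does not start with a year, `as.integer` returns NA with a warning, so it is worth checking the column before relying on the result.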
