Displaying Feature Importance

Category: Other | 2018-12-13 00:12:03

Use a random forest (RF) to score feature importance, then plot the feature ranking.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

def read_data():
    # Load the pickled DataFrame; "KILLED" is the target column.
    df = pd.read_pickle("./output/killed_collision_normal2class.pkl")
    # Hold out 30% of the rows as a test set.
    X_train, X_test, y_train, y_test = train_test_split(
        df.drop(columns=["KILLED"]), df["KILLED"], test_size=0.3, random_state=0)
    return df, X_train, X_test, y_train, y_test

# --------- Load the dataset
pd_data, X_train, X_test, y_train, y_test = read_data()

def feature_importance(features_num=20):
    # Refuse to rank more features than the training data contains.
    if features_num > X_train.shape[1]:
        print("features_num exceeds the number of features in the training data")
        return

    # Cap max_features at the column count; a fixed 20 fails on narrower data.
    forest = RandomForestClassifier(n_estimators=500, random_state=0, n_jobs=-1,
                                    max_features=min(20, X_train.shape[1]))
    forest.fit(X_train, y_train)
    # Report precision, recall and F1 on the held-out test set.
    print(classification_report(y_test, forest.predict(X_test)))
    importance = forest.feature_importances_
    # Feature indices ordered from most to least important.
    indices = np.argsort(importance)[::-1]
    print("---- feature ranking: rank, name, importance score ----")
    features_names = []
    im_list = []
    for rank, i in enumerate(indices[:features_num], start=1):
        f_name = X_train.columns.values[i]
        print(rank, f_name, importance[i])
        features_names.append(f_name)
        im_list.append(importance[i])

    draw_importance(features_names, im_list)

def draw_importance(features, importances):
    # Sort ascending so the most important feature is drawn at the top.
    indices = np.argsort(importances)
    plt.title('Feature Importances')
    plt.barh(range(len(indices)), np.array(importances)[indices], color='b', align='center')
    plt.yticks(range(len(indices)), np.array(features)[indices])
    plt.xlabel('Relative Importance')
    plt.show()

if __name__=="__main__":
    feature_importance()
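
The script above depends on a pickle file that is not included with the post. To try the ranking step on its own, here is a minimal sketch on synthetic data. The `make_classification` dataset, the `f0`..`f7` column names, and the helper `rank_features` are stand-ins made up for illustration; they are not part of the original code, and the plotting step is left out.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def rank_features(X, y, feature_names, top_n=5, seed=0):
    """Fit a random forest and return the top_n (name, importance) pairs."""
    # Same 70/30 split as the script above; only the training part is used here.
    X_train, _, y_train, _ = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X_train, y_train)
    importances = forest.feature_importances_
    # Indices of the top_n features, most important first.
    order = np.argsort(importances)[::-1][:top_n]
    return [(feature_names[i], float(importances[i])) for i in order]


# Synthetic stand-in for the pickled DataFrame (an assumption, not the real data).
X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)
names = [f"f{i}" for i in range(X.shape[1])]

ranking = rank_features(X, y, names)
for rank, (name, score) in enumerate(ranking, start=1):
    print(rank, name, round(score, 4))
```

The returned list can be split into names and scores and passed to `draw_importance` to get the same horizontal bar chart.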


Reposted from blog.csdn.net/Dawei_01/article/details/80684035
