Machine Learning Practice (1)

Enterprise Development, 2018-09-09 18:08:49

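The dataset comes from the UCI repository. This example predicts the diagnosis (malignant or benign) for breast cancer samples.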

import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
import numpy as np
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
from sklearn.metrics import precision_score,recall_score,f1_score


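# Load the Wisconsin Diagnostic Breast Cancer data (the file has no header row)
df = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data", header=None)

# Column 0 is the sample ID, column 1 the diagnosis (M/B), columns 2 onward the 30 features
X = df.loc[:, 2:].values
y = df.loc[:, 1].values
le = LabelEncoder()
y = le.fit_transform(y)  # encode class labels as integers: B -> 0, M -> 1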

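# Split into training and test sets (80% / 20%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=1)
# Build a pipeline: standardize the features, then fit an SVM classifier
pipe_svc = Pipeline([("scl", StandardScaler()), ("clf", SVC(random_state=1))])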
pipe_svc.fit(X_train,y_train)
y_pred = pipe_svc.predict(X_test)

# Compute the confusion matrix and visualize it
confmat = confusion_matrix(y_true=y_test, y_pred=y_pred)  # rows: true labels, columns: predicted labels
print(confmat)
fig, ax = plt.subplots(figsize=(2.5, 2.5))
ax.matshow(confmat, cmap=plt.cm.Blues, alpha=0.3)
for i in range(confmat.shape[0]):
    for j in range(confmat.shape[1]):
        ax.text(x=j, y=i, s=confmat[i, j], va="center", ha="center")

plt.xlabel("predicted label")
plt.ylabel('true label')
plt.savefig('shuju8')
plt.show()
# Precision, recall, and F1
print("precision:%.3f" % precision_score(y_true=y_test, y_pred=y_pred))
print("recall:%.3f" % recall_score(y_true=y_test, y_pred=y_pred))
print("F1:%.3f" % f1_score(y_true=y_test, y_pred=y_pred))
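
The listing relies on how `LabelEncoder` numbers the classes, because that numbering decides which class the precision and recall figures describe. `LabelEncoder` assigns integer codes in sorted order of the distinct labels, so in this dataset `B` (benign) becomes 0 and `M` (malignant) becomes 1. A minimal pure-Python sketch of that rule follows; `encode_labels` is a hypothetical helper written for illustration, not part of scikit-learn, and the sample labels are made up.

```python
def encode_labels(labels):
    """Map each distinct label to an integer code, in sorted order.

    Illustrative stand-in for the mapping rule used by
    sklearn.preprocessing.LabelEncoder; not the library's implementation.
    """
    classes = sorted(set(labels))
    code_of = {label: code for code, label in enumerate(classes)}
    return classes, [code_of[label] for label in labels]


# Made-up sample of diagnosis labels in the style of wdbc.data.
sample = ["M", "B", "B", "M", "B"]
classes, encoded = encode_labels(sample)
print(classes)   # ['B', 'M']
print(encoded)   # [1, 0, 0, 1, 0]
```

Since `precision_score`, `recall_score`, and `f1_score` treat label 1 as the positive class by default, the scores printed by the listing describe the malignant class.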

The output is:
[[71  1]
 [ 2 40]]

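[Figure: confusion matrix plot, with "true label" on the vertical axis and "predicted label" on the horizontal axis]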

precision:0.976
recall:0.952
F1:0.964

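Rows are true labels and columns are predicted labels, with class 0 = benign (B) and class 1 = malignant (M). The left column holds the 73 test samples predicted as benign: 71 are correct and 2 are actually malignant. The right column holds the 41 samples predicted as malignant: 40 are correct and 1 is actually benign.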
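
The three scores can be recomputed by hand from the confusion matrix. The sketch below assumes the scikit-learn convention that rows are true labels and columns are predicted labels, with class 1 (malignant) as the positive class; the variable names are chosen for illustration.

```python
# Confusion matrix from the run above: rows = true label, columns = predicted label.
confmat = [[71, 1],
           [2, 40]]

tn, fp = confmat[0]  # true class 0: correctly predicted 0, wrongly predicted 1
fn, tp = confmat[1]  # true class 1: wrongly predicted 0, correctly predicted 1

precision = tp / (tp + fp)  # share of predicted positives that are truly positive
recall = tp / (tp + fn)     # share of true positives that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print("precision:%.3f" % precision)  # precision:0.976
print("recall:%.3f" % recall)        # recall:0.952
print("F1:%.3f" % f1)                # F1:0.964
```

These values match the scores printed by the script: precision is 40/41, recall is 40/42, and F1 is 80/83.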

Reposted from blog.csdn.net/huang_yong_peng/article/details/82530606
