Splitting a machine learning dataset into training and test sets with Python

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/Void_worker/article/details/82319542
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

import numpy as np
from sklearn.model_selection import train_test_split
import pandas as pd

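dataSetName = 'ionosphere'
# Note: read_csv treats the first row as a header by default;
# pass header=None if the CSV file has no header row.
dataSet = pd.read_csv(dataSetName + ".csv").values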

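# The loaded dataset has shape N*d (number of samples * number of columns)
# First split the dataset into input data and class labels
X = dataSet[:, :-1]  # input data: every column except the last
labels = dataSet[:, -1]  # class labels: the last column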

X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.3, random_state=42)
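# Training set : test set = 7 : 3
# The split is random (samples are shuffled; random_state=42 makes it reproducible).
# After this step the data is ready for training.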
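
The script above shuffles the samples at random, so the class ratio in the two subsets can drift from the ratio in the full dataset. A minimal sketch of one possible variation follows, using the `stratify` parameter of `train_test_split` to keep that ratio fixed. The synthetic array below is an assumption that stands in for `ionosphere.csv`, which is not available here, and stratification is an optional addition rather than part of the original script.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the CSV data (an assumption for illustration):
# 100 samples, 4 features, imbalanced binary labels (70 zeros, 30 ones).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
labels = np.array([0] * 70 + [1] * 30)

# stratify=labels asks for the same class proportions in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, random_state=42, stratify=labels
)

# Inspect the subset sizes and the per-class counts in each subset.
print(X_train.shape, X_test.shape)
print(np.bincount(y_train), np.bincount(y_test))
```

Printing the shapes and per-class counts is a quick way to confirm that the 7:3 split and the class balance came out as intended before training.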

Reference:
https://blog.csdn.net/u010801439/article/details/79555857
