Using data sets from the new UC Irvine ML Repository website with pd.read_csv, and using OpenML data sets

How to use data from the new UC Irvine Machine Learning Repository website

New website: https://archive.ics.uci.edu/

Reading website data with pd.read_csv()
The new version of the website no longer has the old version's Data Folder, which let you read a file directly by URL like this:

data = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data")

Links you already know still work, but other data sets can no longer be read this way (I searched the new website for a long time and could not find the Data Folder).
So the only option is to download the data set.
How to use it?
Download the data first; the .data file in the download is what you want, and pd.read_csv() can read it directly.

data = pd.read_csv('../data/breast_cancer_wisconsin_original/breast-cancer-wisconsin.data')
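The .data file has no header row, so by default pd.read_csv() would treat the first record as column names. Below is a minimal sketch of one way to handle that. It assumes the file has 11 comma-separated columns and marks missing values with "?". The column names are my own shorthand, the helper read_uci_data is hypothetical, and the two sample rows are only illustrative stand-ins for the downloaded file.

```python
import io

import pandas as pd

# Hypothetical short column names; the .data file itself carries no header.
COLUMNS = [
    "sample_id", "clump_thickness", "cell_size_uniformity",
    "cell_shape_uniformity", "marginal_adhesion", "epithelial_cell_size",
    "bare_nuclei", "bland_chromatin", "normal_nucleoli", "mitoses", "class",
]


def read_uci_data(path_or_buffer):
    """Read a headerless, comma-separated .data file into a DataFrame."""
    return pd.read_csv(
        path_or_buffer,
        header=None,    # the first line is data, not column names
        names=COLUMNS,
        na_values="?",  # assumption: "?" marks a missing value
    )


# Illustrative rows in the file's format; in practice pass the downloaded path,
# e.g. '../data/breast_cancer_wisconsin_original/breast-cancer-wisconsin.data'.
sample = io.StringIO(
    "1000025,5,1,1,1,2,1,3,1,1,2\n"
    "1057013,8,4,5,1,2,?,7,3,1,4\n"
)
data = read_uci_data(sample)
print(data.shape)
print(data["bare_nuclei"].isna().sum())
```

Converting "?" to NaN at read time keeps the column numeric, so the missing rows can later be dropped or imputed with the usual pandas tools.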

Some ways to obtain data for machine learning

pd.read_csv()

  • Read local csv file
facebook = pd.read_csv('../data/FBlocation/train.csv')
  • Some URLs from the old version of the UCI ML website (names is a list of column names defined beforehand)
data = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data", names=names)
  • Data sets from the new version of the website need to be downloaded first
data = pd.read_csv('../data/breast_cancer_wisconsin_original/breast-cancer-wisconsin.data')
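The three cases above can be folded into one small helper that prefers a local copy and only falls back to a URL when the file is missing. This is just a sketch: read_csv_local_first is a hypothetical name, and the temporary file stands in for a downloaded data set.

```python
import os
import tempfile

import pandas as pd


def read_csv_local_first(local_path, url=None, **read_csv_kwargs):
    """Read local_path if it exists; otherwise fall back to url (if given)."""
    if os.path.exists(local_path):
        return pd.read_csv(local_path, **read_csv_kwargs)
    if url is None:
        raise FileNotFoundError(f"{local_path} not found and no URL given")
    return pd.read_csv(url, **read_csv_kwargs)


# Demonstration with a temporary file standing in for a downloaded data set.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.data")
    with open(path, "w") as f:
        f.write("1,2,3\n4,5,6\n")
    demo = read_csv_local_first(path, header=None, names=["a", "b", "c"])

print(demo)
```

Any extra keyword arguments (such as names) are passed straight through to pd.read_csv(), so the helper behaves the same for local files and URLs.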

Use the data sets that come with sklearn.datasets

from sklearn.datasets import load_iris
iris = load_iris()
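load_iris() returns a dictionary-like Bunch object rather than a DataFrame. If you prefer to work with pandas, a short sketch like the following may help; it assumes a scikit-learn version that supports the as_frame parameter (0.23 or later).

```python
from sklearn.datasets import load_iris

# as_frame=True asks scikit-learn to return pandas objects
iris = load_iris(as_frame=True)

df = iris.frame  # the four feature columns plus a "target" column
print(df.shape)
print(list(iris.target_names))
print(df.head())
```

The features alone are available as iris.data and the labels as iris.target.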

OpenML data sets

Data set website: https://www.openml.org/search?type=data&status=active

from sklearn.datasets import fetch_openml
data = fetch_openml(data_id=853)
# data_id is the unique ID of each data set on the website
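fetch_openml() also returns a Bunch, so the features and the target still need to be pulled out of it. Here is a minimal sketch of a wrapper that does this. load_openml_xy is a hypothetical helper, and its fetcher parameter exists only so the function can be tried offline with a stand-in; by default it calls fetch_openml, which needs network access.

```python
from sklearn.datasets import fetch_openml


def load_openml_xy(data_id, fetcher=fetch_openml):
    """Return (features, target) for an OpenML data set as pandas objects.

    `fetcher` defaults to fetch_openml; it is a parameter only so the
    function can be exercised offline with a stand-in.
    """
    bunch = fetcher(data_id=data_id, as_frame=True)
    return bunch.data, bunch.target
```

With a network connection, X, y = load_openml_xy(853) downloads the data set from the example above. By default scikit-learn caches the download locally, so later calls do not fetch it again.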



Origin blog.csdn.net/weixin_46483785/article/details/132371212