1. Pandas库的DataFrame类型
index -> 行索引,colum -> 列索引
索引操作:
[ ] 索引colum,.loc[ ] 索引index
例:
dict = {'one': [1, 2, 3], 'two': [4, 5, 6]}
data = pd.DataFrame(dict, index=['a', 'b', 'c'])
one two
a 1 4
b 2 5
c 3 6
data['one']
a 1
b 2
c 3
Name: one, dtype: int64
data.loc['a']
one 1
two 4
Name: a, dtype: int64
2. 用pandas 读取csv,将其拆分为特征 X 和标签 Y,并将它们转变为NumPy数组
import pandas as pd
import numpy as np
data = pd.read_csv("data.csv")
# Separate the features and the labels into arrays called X and y
X = np.array(data[['x1', 'x2']])
Y = np.array(data['y'])