Python Basics: Machine Learning Libraries (NumPy, Pandas, Matplotlib)

NumPy Library

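NumPy is a library for high-performance scientific computing and data analysis, and it is the foundation that most higher-level data analysis libraries are built on.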

# 1. One-dimensional arrays
import numpy as np

arr1 = np.array([1, 2, 3])
print(arr1, arr1.dtype)

arr2 = np.array([1.2, 2.3, 3.4])
print(arr2, arr2.dtype)

arr3 = arr1 + arr2
print(arr3)

# [1 2 3] int32   (the default integer dtype is platform-dependent; often int64)
# [1.2 2.3 3.4] float64
# [2.2 4.3 6.4]
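
The sum above comes out as floats because NumPy promotes the integer array to the wider float type before adding. A minimal sketch of that promotion follows; the variable names are made up for illustration.

```python
import numpy as np

ints = np.array([1, 2, 3])          # integer dtype (int32 or int64, depending on platform)
floats = np.array([1.2, 2.3, 3.4])  # float64

total = ints + floats
print(total.dtype)  # float64

# np.result_type reports the dtype the two operands are promoted to
promoted = np.result_type(ints.dtype, floats.dtype)
print(promoted)  # float64
```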


# 2. Two-dimensional arrays
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
print(arr1, arr1.dtype)
# [[1 2 3]
#  [4 5 6]] int32
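
A two-dimensional array also carries its shape and can be indexed by row and column. The sketch below uses the same values as above under a hypothetical name, `matrix`.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(matrix.shape)  # (2, 3): 2 rows, 3 columns
print(matrix.ndim)   # 2
print(matrix[1, 2])  # 6: row 1, column 2 (both zero-based)
print(matrix[:, 0])  # [1 4]: the first column
```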


# 3. Create an all-zeros array
print(np.zeros(5))
# [0. 0. 0. 0. 0.]

print(np.zeros([3, 4]))
# [[0. 0. 0. 0.]
#  [0. 0. 0. 0.]
#  [0. 0. 0. 0.]]
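
The trailing dots in the output show that np.zeros returns floats by default. As a small sketch, the shape can also be passed as a tuple, and a different dtype can be requested explicitly; the names `grid` and `int_grid` are illustrative only.

```python
import numpy as np

grid = np.zeros((3, 4))                 # shape as a tuple: 3 rows, 4 columns
print(grid.shape, grid.dtype)           # (3, 4) float64

int_grid = np.zeros((2, 2), dtype=int)  # ask for an integer dtype instead
print(int_grid)
```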

Pandas Library

pandas provides two data structures, Series and DataFrame, for working with one-dimensional and two-dimensional data respectively.
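
A quick way to see the one-dimensional versus two-dimensional split is to check `ndim`, and to note that each column of a DataFrame is itself a Series. This is a rough sketch with made-up data.

```python
import pandas as pd

s = pd.Series([4, 5, 6])
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

print(s.ndim)    # 1
print(df.ndim)   # 2

column = df['a']  # selecting one column yields a Series
print(isinstance(column, pd.Series))  # True
```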

Basic Series Operations

# 1. Create a Series from a list; a default integer index is added
from pandas import Series

obj = Series([4, 5, 6, 8])
print(obj.index)
print(obj.values)
print(obj)
# RangeIndex(start=0, stop=4, step=1)
# [4 5 6 8]
# 0    4
# 1    5
# 2    6
# 3    8
# dtype: int64

# 2. Use a custom index
dic = Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])
print(dic)
print(dic['a'])  # access by label, like a dictionary
# a    1
# b    2
# c    3
# d    4
# dtype: int64
# 1

# 3. Convert a dictionary to a Series; the keys become the index
dic = {'name': 'jack', 'sex': 'male', 'age': 13}
obj = Series(dic)
print(obj)
# name    jack
# sex     male
# age       13
# dtype: object
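
The `object` dtype above appears because the dictionary mixes strings and an integer. The sketch below contrasts that with an all-integer dictionary and shows that membership tests look at the index; the names `mixed` and `numeric` are illustrative.

```python
import pandas as pd

mixed = pd.Series({'name': 'jack', 'sex': 'male', 'age': 13})
print(mixed.dtype)      # object, because the values have mixed types

numeric = pd.Series({'a': 1, 'b': 2})
print(numeric.dtype)    # an integer dtype, because every value is an int

print('age' in mixed)   # True: "in" checks the index, not the values
print(mixed['age'])     # 13
```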

Basic DataFrame Operations

# 1. Build a table with DataFrame
from pandas import DataFrame

data = {
    'city': ['beijing', 'shanghai', 'tianjin'],
    'year': [2015, 2016, 2014],
    'gdp': [1, 2, 3]
}
# index: the row labels to use
# columns: the order of the columns
frame = DataFrame(data, index=range(1, 4), columns=['year', 'city', 'gdp'])
print(frame)

#    year      city  gdp
# 1  2015   beijing    1
# 2  2016  shanghai    2
# 3  2014   tianjin    3

# 2. Select data from the table
print(frame[0:2])  # a slice selects rows by position
#    year      city  gdp
# 1  2015   beijing    1
# 2  2016  shanghai    2

print(frame['city'])  # select a column
# 1     beijing
# 2    shanghai
# 3     tianjin
# Name: city, dtype: object     
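
Because the index here starts at 1, it is easy to confuse position with label when slicing. As a hedged sketch, `iloc` selects by position and `loc` selects by label; the frame below is a cut-down copy of the one above.

```python
import pandas as pd

frame = pd.DataFrame(
    {'city': ['beijing', 'shanghai', 'tianjin'], 'year': [2015, 2016, 2014]},
    index=range(1, 4),
)

by_position = frame.iloc[0:2]  # rows at positions 0 and 1 (end excluded)
by_label = frame.loc[1:2]      # rows labelled 1 and 2 (end included)
print(by_position.equals(by_label))  # True

print(frame.loc[3, 'city'])    # tianjin: a single cell by row label and column name
```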

# 3. Add a new column
frame['pop'] = [100, 200, 300]
print(frame)

#    year      city  gdp  pop
# 1  2015   beijing    1  100
# 2  2016  shanghai    2  200
# 3  2014   tianjin    3  300

frame['capital'] = frame['city'] == 'beijing'  # derive a new column from an existing one
print(frame)
#    year      city  gdp  pop  capital
# 1  2015   beijing    1  100     True
# 2  2016  shanghai    2  200    False
# 3  2014   tianjin    3  300    False
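
The comparison used for the `capital` column produces a boolean Series, and the same kind of Series can be used to filter rows. A minimal sketch follows, with a smaller frame and illustrative variable names.

```python
import pandas as pd

frame = pd.DataFrame({'city': ['beijing', 'shanghai', 'tianjin'], 'gdp': [1, 2, 3]})

is_capital = frame['city'] == 'beijing'  # boolean Series: True, False, False
capitals = frame[is_capital]             # keeps only the rows marked True
print(capitals)

big = frame[frame['gdp'] >= 2]['city'].tolist()
print(big)  # ['shanghai', 'tianjin']
```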

# 4. Build a DataFrame from nested dictionaries
data2 = {
    'beijing': {2008: 1, 2009: 2, 2010: 3},
    'shanghai': {2008: 2, 2009: 3, 2010: 4}
}
frame2 = DataFrame(data2)
print(frame2)
#       beijing  shanghai
# 2008        1         2
# 2009        2         3
# 2010        3         4

print(frame2.T)  # transpose
#           2008  2009  2010
# beijing      1     2     3
# shanghai     2     3     4

Matplotlib Library

matplotlib is a 2D plotting library for Python.

# Plot a simple line
import matplotlib.pyplot as plt

plt.plot([1, 2, 3], [4, 9, 6])
plt.show()

Figure: a simple line plot
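
The same plot can also be built with matplotlib's object-oriented interface, which makes it easier to add axis labels and a title. The sketch below assumes a non-interactive run, so it selects the Agg backend and renders to an in-memory buffer instead of opening a window; the label text is made up.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
line, = ax.plot([1, 2, 3], [4, 9, 6], marker="o")  # same points as above
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Simple line plot")

buf = io.BytesIO()
fig.savefig(buf, format="png")  # render the figure as PNG bytes in memory
plt.close(fig)
```

In an interactive session, calling `plt.show()` instead of saving would display the figure.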

...


Reposted from www.cnblogs.com/july-3rd/p/10730683.html