Python common libraries: Master these libraries and easily improve your programming skills!

Overview

1. Importing Python libraries
2. Common Python libraries for data analysis:
    1. NumPy,
    2. SciPy,
    3. Matplotlib,
    4. pandas,
    5. StatsModels,
    6. scikit-learn,
    7. Keras

1. Importing Python libraries

Many software enthusiasts and developers are willing to share the results of their work. The Python libraries they contribute implement general-purpose functionality such as statistical calculations and graphics. As more people contribute their knowledge to improve these libraries, Python becomes more and more powerful. This group is known as the "open source community", and its members as "open source contributors".

To use this functionality, we import Python libraries developed by third parties; these are also called packages or modules.

1. Use the import statement to import a library

# The math library provides mathematical functions
import math
math.sin(0)
math.sin(math.pi/2)

2. Give the library a custom name (alias)

import math as m
m.sin(0)

3. Import specific functions from a library

from math import exp as e
e(2)

4. Import all functions from a library

from math import *
exp(2)
sin(pi/2)
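
The four import forms above all bind names to the same underlying module and function objects; only the name you type differs. A minimal sketch using just the standard math module (the printed checks are for illustration only):

```python
import math
import math as m
from math import exp as e
from math import sin, pi

# An alias is just another name for the same module object.
print(m is math)

# A function imported under a new name is the same function object.
print(e is math.exp)

# All spellings therefore compute the same value.
print(math.sin(math.pi / 2), m.sin(m.pi / 2), sin(pi / 2))
```

In larger programs, explicit names are usually safer than `from math import *`, because a wildcard import can silently overwrite names that already exist in your namespace.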

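5. Download and install a third-party library

Use one of the following installers, followed by the library name (easy_install is a legacy tool that has been deprecated in favor of pip):

pip install

easy_install

conda install

# Install the NumPy library
pip install numpy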
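
After installing, it can be handy to confirm that Python can actually find the library before importing it. The sketch below uses the standard library's importlib.util.find_spec; the helper name is_installed is hypothetical and chosen only for this example:

```python
import importlib.util

def is_installed(module_name):
    """Return True if the named top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None

# "math" ships with Python; "numpy" is only found after it has been installed.
for name in ["math", "numpy"]:
    if is_installed(name):
        print(name, "-> available")
    else:
        print(name, "-> missing, try: pip install", name)
```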

2. Commonly used python libraries for data analysis

1. NumPy

Python's built-in list is a general-purpose container rather than a numeric array type, so processing arrays and matrices with lists is very inefficient.

NumPy is dedicated to array and matrix operations. Its efficiency comes mainly from two factors:

(1) Its core is written in C, which runs close to the hardware and executes efficiently.

(2) NumPy stores an array's elements in a contiguous block of memory, which makes memory access efficient.

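pip install numpy # download and install NumPy
conda install numpy # same, using conda

import numpy as np               # import NumPy under the alias np
a = np.array([2,0,1,5])          # create an array
print(a)                         # print a
print(a[:3])                     # slice a and print the first three elements
print(a.min())                   # print the smallest element of a
a.sort()                         # sort the elements of a in place
b = np.array([[1,2,3],[4,5,6]])  # create a two-dimensional array
print(b*b)                       # square each element of the array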
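
To get a feel for the efficiency claim, the following sketch times the same element-wise squaring on a plain list and on a NumPy array. The array size and repeat count are arbitrary choices for illustration, and the actual timings will vary by machine:

```python
import timeit
import numpy as np

n = 100_000
values = list(range(n))               # plain Python list
arr = np.arange(n, dtype=np.int64)    # NumPy array with the same contents

def square_list():
    # Squares each element in a Python-level loop.
    return [v * v for v in values]

def square_array():
    # Squares every element in one vectorized operation.
    return arr * arr

list_time = timeit.timeit(square_list, number=20)
array_time = timeit.timeit(square_array, number=20)
print(f"list:  {list_time:.4f} s")
print(f"array: {array_time:.4f} s")

# A freshly created array is stored in one contiguous block of memory.
print(arr.flags['C_CONTIGUOUS'])
```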

2. SciPy

Arithmetic on NumPy arrays is not true matrix arithmetic: as the example above shows, multiplying two NumPy arrays with * simply multiplies corresponding elements. SciPy builds on NumPy and provides higher-level numerical routines, including matrix (linear algebra) operations, optimization, and integration.
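
To make the distinction concrete, here is a small sketch that contrasts the element-wise product with a true matrix product, written with NumPy's @ operator. It reuses the array b from the NumPy example:

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])

# Element-wise product: each entry is multiplied by itself.
elementwise = b * b
print(elementwise)

# Matrix product: a (2x3) matrix times its (3x2) transpose gives a (2x2) matrix.
matrix_product = b @ b.T
print(matrix_product)
```

The element-wise result keeps the 2x3 shape, while the matrix product is 2x2, which is a quick way to tell the two operations apart.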


# Solve a system of nonlinear equations
# 2x1 - x2^2 = 1
# x1^2 - x2 = 2

from scipy.optimize import fsolve
def f(x):
  x1 = x[0]
  x2 = x[1]
  return [2*x1 - x2**2 - 1, x1**2 - x2 - 2]

result = fsolve(f, [1,1])    # [1,1] is the initial guess
print(result)

# Definite integral
from scipy import integrate
def g(x):
  return (1-x**2)**0.5      # integrand (1-x^2)^0.5

pi_2, err = integrate.quad(g, -1, 1)   # pi_2 is the integral value (about pi/2), err is the error estimate
print(pi_2, pi_2*2)
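
Numerical solvers return approximations, so it is worth checking the results. The sketch below substitutes the root back into the equations and compares the integral with the exact area of a half unit circle, pi/2. The tolerances are illustrative choices, not values from the original article:

```python
import math
import numpy as np
from scipy import integrate
from scipy.optimize import fsolve

def f(x):
    x1, x2 = x
    return [2*x1 - x2**2 - 1, x1**2 - x2 - 2]

# Substitute the computed root back in: both residuals should be close to zero.
root = fsolve(f, [1, 1])
residual = f(root)
print(root, residual)
print(np.allclose(residual, 0.0, atol=1e-8))

# The integrand traces the upper half of the unit circle, so the area is pi/2.
half_circle_area, err = integrate.quad(lambda x: (1 - x**2)**0.5, -1, 1)
print(half_circle_area, math.pi / 2)
print(abs(half_circle_area - math.pi / 2) < 1e-6)
```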

3. Matplotlib

Matplotlib is a popular visualization library for Python. It is best known for 2D graphics.

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0,10,1000)
y = np.sin(x) + 1
z = np.cos(x**2) + 1

plt.figure(figsize = (8,4))
plt.plot(x, y, label = r'$\sin x+1$', color = 'red', linewidth = 2)   # raw string keeps the LaTeX backslash intact
plt.plot(x, z, 'b--', label = r'$\cos x^2+1$')
plt.xlabel('Time(s)')
plt.ylabel('volt')
plt.title('A Simple Example')
plt.ylim(0,2.2)
plt.legend()
plt.show()
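
When no display is available (for example on a server), plt.show() has nothing to draw on, and a common alternative is to write the figure to a file. The following is a minimal sketch, assuming the non-interactive Agg backend; the output file name is arbitrary:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")            # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 1000)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(x, np.sin(x) + 1, color="red", linewidth=2, label=r"$\sin x+1$")
ax.set_xlabel("Time(s)")
ax.set_ylabel("volt")
ax.legend()

# Write the figure to a PNG file in the system's temporary directory.
out_path = os.path.join(tempfile.gettempdir(), "simple_example.png")
fig.savefig(out_path, dpi=100)
plt.close(fig)
print(out_path)
```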

4. pandas

The name pandas combines "panel data" and "data analysis".

Development began at AQR Capital Management in 2008, and the library was open sourced in 2009.

pandas is a standard tool for data analysis, but nobody can remember every command. When a function is unclear, search online (for example, on Baidu) or consult a reference book.

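# Install pandas
pip install pandas
# Install a library for reading and writing Excel files
# (recent pandas versions no longer write the legacy .xls format, so use .xlsx with openpyxl)
pip install openpyxl

# Use pandas
import numpy as np
import pandas as pd
s = pd.Series([1,2,3], index = ['a','b','c']) # Series s
d = pd.DataFrame([[1,2,3],[4,5,6]], columns=['a','b','c']) # DataFrame (table) d
d2 = pd.DataFrame(s)  # convert s to a DataFrame and store it in d2

d.head()      # preview the first rows (up to 5 by default)
d.describe()  # summary statistics for each column

d.to_excel('data.xlsx')  # save as an Excel file
pd.read_excel('data.xlsx') # read the Excel file

d.to_csv('data.csv') # save as CSV
pd.read_csv('data.csv') # read the CSV file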
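
One detail of the CSV round trip is easy to miss: by default to_csv also writes the row index, which read_csv then loads as an extra unnamed column. The sketch below shows this using an in-memory buffer (io.StringIO) instead of a file on disk, purely to keep the example self-contained:

```python
import io
import pandas as pd

d = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=['a', 'b', 'c'])

# Default behavior: the index is written as an extra first column.
buf = io.StringIO()
d.to_csv(buf)
buf.seek(0)
with_index = pd.read_csv(buf)
print(list(with_index.columns))

# With index=False the table is read back unchanged.
buf = io.StringIO()
d.to_csv(buf, index=False)
buf.seek(0)
round_trip = pd.read_csv(buf)
print(round_trip.equals(d))
```

Alternatively, passing index_col=0 to read_csv tells pandas to treat that first column as the index again.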


5. StatsModels

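StatsModels is a library for statistical modeling and analysis.

# Run an ADF (Augmented Dickey-Fuller) stationarity test on a time series
from statsmodels.tsa.stattools import adfuller as ADF # import the ADF test function
import numpy as np
ADF(np.random.rand(100)) # generate an array of 100 random numbers and return the stationarity test result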

6. scikit-learn

Often abbreviated as sklearn, scikit-learn is a Python library for machine learning. Most common machine learning tasks can be done with it.

# Train an SVM model on the iris dataset
from sklearn import datasets # import the datasets bundled with sklearn
iris = datasets.load_iris()  # load the iris dataset with load_iris() and store it in iris
print(iris.data.shape)       # show the size of the dataset
print(iris['target_names'])
print(iris['feature_names'])

#(150, 4)
#['setosa' 'versicolor' 'virginica'] 
#['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

from sklearn import svm      # import the support vector machine (SVM) module

clf = svm.LinearSVC()        # create a linear SVM classifier
clf.fit(iris.data, iris.target)  # train the model
clf.predict([[5.0, 3.6, 1.3, 0.25]])  # classify one set of iris measurements

#array([0])
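
The example above trains and predicts on the same data, so it says little about how well the model generalizes. A common next step is to hold out part of the data for testing. The sketch below does this; the 70/30 split, the random seed, and the raised max_iter are illustrative choices, not settings from the original article:

```python
from sklearn import datasets, svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# Hold out 30% of the samples for testing, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

clf = svm.LinearSVC(max_iter=10000)   # a higher max_iter helps the solver converge
clf.fit(X_train, y_train)

# Score the model only on samples it has not seen during training.
predictions = clf.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print("test accuracy:", accuracy)
```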

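7. Keras

Keras is dedicated to artificial neural network computation. Other existing neural network libraries include PyTorch, TensorFlow, sklearn, and Theano.

Keras was once merged into TensorFlow. Because module calls in the TensorFlow library are rather confusing, Keras has gradually reduced its dependence on TensorFlow.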


Origin blog.csdn.net/weixin_45841831/article/details/130387220