Python data analysis: NumPy, Matplotlib, Pandas
- Python data analysis
- 0. Summary
- One, matplotlib
- 1. What is matplotlib
- 2. The basic points of matplotlib
- 1. Matplotlib basic drawing
- 2. Matplotlib basic drawing and adjustment of X-axis scale
- 3. Matplotlib plots the temperature from 10 o'clock to 12 o'clock
- 4. Displaying Chinese in matplotlib
- 5. Matplotlib sets graphics information
- 6. Introduction and summary of the difference between matplotlib drawing multiple graphs and different graphs
- 3. Scatter plots, bar charts, and histograms in matplotlib
- 4. More drawing tools
- Two, Numpy learning
- Three, Pandas learning
- 1. Understanding of Pandas series
- 2. Pandas reads external data
- 3. Pandas dataFrame creation
- 4. Dataframe description information of Pandas
- 3. Pandas dataframe index
- 3. Pandas bool index and processing of missing data
- 3. Movie number histogram
- 3. Common statistical methods of Pandas
- 3. The case of string discretization
- 3. Data consolidation
- 3. Data dispersion and aggregation
- 3. Data index learning
- 3. Data dispersion and aggregation exercises and summary
- 3. Pandas time series
- 3. Case
- 3. PM2.5 case
- 3. Douban TV case
Using Jupyter
Open Jupyter Notebook from the Anaconda PowerShell Prompt:
1. Enter: cd C:\Users\asus\Desktop\iPython
2. Enter: jupyter notebook
Python data analysis
Outline
- Basic concepts and environment
- matplotlib: plotting
- numpy: processing numeric arrays
- pandas: handling numeric arrays as well as strings, time series, lists, dictionaries, and other data types
0. Summary
1. Why study data analysis
- There is job-market demand for it
- It is the foundation of Python data science
- It is the foundation for machine learning courses
- It makes it easy to draw intuitive insights and conclusions from a pile of data, for your own use or for others
2. What is data analysis
Data analysis means applying appropriate methods to a large amount of collected data so that people can make sound judgments and take appropriate action.
Data analysis process:
ask questions → prepare data → analyze data → draw conclusions → present the results (visualization, text, or report)
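As a toy illustration of the process above, the sketch below walks through the steps on a small hand-made list of temperature readings. The question and the data values are assumptions chosen for illustration, and only the standard library is used.

```python
from statistics import mean

# 1. Ask a question (hypothetical): which reading is the warmest,
#    and what is the average temperature over the day?
# 2. Prepare data: hour of day -> temperature in ℃ (illustrative values)
hours = list(range(2, 26, 2))
temps = [15, 13, 14.5, 17, 20, 25, 26, 26, 24, 22, 18, 15]
readings = dict(zip(hours, temps))

# 3. Analyze data
average = mean(temps)
warmest_hour = max(readings, key=readings.get)  # first hour with the top value

# 4. Draw a conclusion and 5. report it as text
print(f"Average temperature: {average:.1f} ℃")
print(f"Warmest reading: {readings[warmest_hour]} ℃ at hour {warmest_hour}")
```

In a real analysis the "present the results" step would usually be a chart, which is what the matplotlib sections cover.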
3. Environment installation
Create an environment: conda create --name python3 python=3
Switch environment (Windows): activate python3 (recent conda versions use: conda activate python3)
Official website: www.anaconda.com/download/
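Once the environment is active, one quick way to confirm that the three libraries used in these notes are available is to ask Python whether it can locate them. This is a minimal sketch using only the standard library; the helper name `check_packages` is made up for illustration.

```python
import sys
from importlib.util import find_spec


def check_packages(names):
    """Return {name: True/False} depending on whether each package can be found."""
    return {name: find_spec(name) is not None for name in names}


print("Python", sys.version.split()[0])
for name, found in check_packages(["numpy", "matplotlib", "pandas"]).items():
    status = "installed" if found else "missing"
    print(f"{name}: {status}")
```

A package reported as missing can usually be added to the active environment with conda install followed by the package name.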
4. Getting to know Jupyter Notebook
One, matplotlib
Why learn matplotlib:
- It visualizes data and presents it more intuitively
- It makes the data more objective and persuasive
1. What is matplotlib
matplotlib: the most popular low-level plotting library for Python, used mainly for data visualization and charting. Its name comes from MATLAB, and its interface was modeled on MATLAB's.
2. The basic points of matplotlib
1. Matplotlib basic drawing
axis: refers to the x-axis or the y-axis
Basic points:
Each red point is a coordinate; connecting the coordinates of the 5 points with lines forms a line chart.
e.g.: Suppose the temperatures (℃) at two-hour intervals over one day (range(2,26,2)) are [15,13,14.5,17,20,25,26,26,24,22,18,15]
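# Import pyplot
from matplotlib import pyplot as plt
# Positions of the data on the x-axis (any iterable)
x = range(2, 26, 2)
# Positions of the data on the y-axis (any iterable)
y = [15, 13, 14.5, 17, 20, 25, 26, 26, 24, 22, 18, 15]
# The x and y values pair up into the coordinates to plot:
# (2, 15), (4, 13), (6, 14.5), ...
# Pass x and y to plot() to draw the line chart
plt.plot(x, y)
# Display the figure when the program runs
plt.show()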
(This works both from the plain Python interpreter and in PyCharm.)
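To see how the two sequences pair up into coordinates, they can be zipped together without drawing anything. A small sketch using the same x and y values as the example above:

```python
x = range(2, 26, 2)
y = [15, 13, 14.5, 17, 20, 25, 26, 26, 24, 22, 18, 15]

# plot(x, y) needs exactly one y value for every x value
assert len(x) == len(y)

# Pair each x position with its y value
points = list(zip(x, y))
print(points[:3])   # [(2, 15), (4, 13), (6, 14.5)]
print(len(points))  # 12
```

If the two lengths differ, plt.plot raises an error, so a check like this is a cheap way to catch a mistyped list.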
Remaining issues:
- Setting the image size (I want a large, high-resolution image)
- Saving the image locally
- Adding descriptive information, such as what the x-axis and y-axis represent and what the chart shows
- Adjusting the spacing of the x or y ticks
- Line style (color, transparency, etc.)
- Marking special points (for example, showing where the highest and lowest points are)
- Adding a watermark to the image (to prevent counterfeiting and theft)
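Three of the items in this list (line style, marking special points, and a watermark) can be sketched with standard pyplot calls. This is one possible approach rather than the only one; the colors, the annotation offsets, and the watermark text "sample" are placeholders chosen for illustration.

```python
from matplotlib import pyplot as plt

x = list(range(2, 26, 2))
y = [15, 13, 14.5, 17, 20, 25, 26, 26, 24, 22, 18, 15]

fig = plt.figure(figsize=(10, 4), dpi=80)

# Line style: color, dash pattern, width, and transparency
plt.plot(x, y, color="orange", linestyle="--", linewidth=2, alpha=0.8)

# Mark the highest and lowest points (first occurrence of each)
hi = y.index(max(y))
lo = y.index(min(y))
plt.scatter([x[hi], x[lo]], [y[hi], y[lo]], color="red", zorder=3)
plt.annotate("highest", xy=(x[hi], y[hi]), xytext=(x[hi] - 4, y[hi] + 0.5),
             arrowprops={"arrowstyle": "->"})
plt.annotate("lowest", xy=(x[lo], y[lo]), xytext=(x[lo] + 2, y[lo] - 0.5),
             arrowprops={"arrowstyle": "->"})

# Watermark: large, faint text placed in figure coordinates (0 to 1)
fig.text(0.5, 0.5, "sample", fontsize=60, color="gray", alpha=0.2,
         ha="center", va="center", rotation=30)

# Call plt.savefig(...) and/or plt.show() here to write or display the result
```

Image size, saving, and tick spacing are handled in the next example.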
2. Matplotlib basic drawing and adjustment of X-axis scale
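# Import pyplot
from matplotlib import pyplot as plt
# Positions of the data on the x-axis (any iterable)
x = range(2, 26, 2)
# Positions of the data on the y-axis (any iterable)
y = [15, 13, 14.5, 17, 20, 25, 26, 26, 24, 22, 18, 15]
# The x and y values pair up into the coordinates to plot:
# (2, 15), (4, 13), (6, 14.5), ...
# Set the figure size (call figure() before plot())
plt.figure(figsize=(20, 8), dpi=80)
# Pass x and y to plot() to draw the line chart
plt.plot(x, y)
# Set the x-axis ticks; alternatives tried along the way:
# plt.xticks(x)  # step of 2
# plt.xticks(range(2, 25))  # step of 1
# plt.xticks(_xtick_labels[::3])  # every third half-hour position
# Half-hour tick positions: 1.0, 1.5, ..., 24.0
_xtick_labels = [i / 2 for i in range(2, 49)]
plt.xticks(_xtick_labels)
# Set the y-axis ticks
# plt.yticks(y)
# min(y) and max(y) are the integers 13 and 26 here, so range() accepts them
plt.yticks(range(min(y), max(y) + 1))
# Save the figure (before show(), otherwise the saved image may be blank)
plt.savefig("./t1.png")
# Display the figure when the program runs
plt.show()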
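The tick lists tried in the commented-out lines are ordinary Python sequences, so their spacing can be inspected before they are handed to plt.xticks. A small sketch of how the half-hour positions are built and then thinned by slicing (the variable names here are illustrative):

```python
# Half-hour positions from 1.0 to 24.0: i / 2 for i = 2..48
half_hours = [i / 2 for i in range(2, 49)]
print(len(half_hours), half_hours[:4])   # 47 [1.0, 1.5, 2.0, 2.5]

# Keeping every third position widens the spacing to 1.5 hours
sparse = half_hours[::3]
print(len(sparse), sparse[:4])           # 16 [1.0, 2.5, 4.0, 5.5]
```

Fewer ticks are easier to read on a small figure; denser ticks suit a wide one such as figsize=(20,8).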
3. Matplotlib plots the temperature from 10 o'clock to 12 o'clock
Case
[1] If the list a holds the temperature for every minute from 10 o'clock to 12 o'clock, how can we draw a line chart to observe how the temperature changes minute by minute?
a = [random.randint(20, 35) for i in range(120)]
from matplotlib import pyplot as plt
import random
# One x position per minute: 0 to 119
x = range(0, 120)
y = [random.randint(20, 35) for i in range(120)]
# Set the figure size
plt.figure(figsize=(20, 8), dpi=80)
plt.plot(x, y)
plt.show()
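Because the temperatures are random, every run draws a different chart. If a repeatable chart is wanted, for example to compare tick settings, the generator can be seeded first. A minimal sketch; the seed value 42 is arbitrary:

```python
import random

random.seed(42)  # any fixed integer makes the sequence repeatable
y = [random.randint(20, 35) for _ in range(120)]

random.seed(42)  # re-seeding replays the same sequence
y_again = [random.randint(20, 35) for _ in range(120)]

print(y == y_again)                          # True
print(len(y), min(y) >= 20, max(y) <= 35)    # 120 True True
```

Note that randint includes both endpoints, so 20 and 35 can both appear.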
4. Displaying Chinese in matplotlib
from matplotlib import pyplot as plt
import random
x = range(0, 120)
y = [random.randint(20, 35) for i in range(120)]
# Set the figure size
plt.figure(figsize=(20, 8), dpi=80)
plt.plot(x, y)
# Adjust the x-axis ticks
# _x = x
# _xtick_labels = ["hello,{}".format(i) for i in _x]
# plt.xticks(x, _xtick_labels)
# Chinese labels: "10点5分" means 10:05
_xtick_labels = ["10点{}分".format(i) for i in range(60)]
_xtick_labels += ["11点{}分".format(i) for i in range(60)]
# Take every third tick; positions and labels must correspond one-to-one,
# so both sequences need the same length
plt.xticks(list(x)[::3], _xtick_labels[::3])
plt.show()
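plt.xticks(positions, labels) expects exactly one label per tick position, and that pairing is easy to break when slicing. It can help to build both lists together and check their lengths before plotting. A sketch with English-formatted labels; the `HH:MM` format is a substitution for illustration, not the Chinese labels used above:

```python
positions = list(range(120))  # minute offsets from 10:00

# Label for minute offset m: hour is 10 + m // 60, minute is m % 60
labels = ["{}:{:02d}".format(10 + m // 60, m % 60) for m in positions]

step = 3
tick_positions = positions[::step]
tick_labels = labels[::step]

# Applying the same slice to both lists keeps them aligned
assert len(tick_positions) == len(tick_labels)
print(tick_positions[:3], tick_labels[:3])   # [0, 3, 6] ['10:00', '10:03', '10:06']
print(tick_positions[-1], tick_labels[-1])   # 117 11:57

# Then: plt.xticks(tick_positions, tick_labels, rotation=45)
```

The rotation argument tilts the label text, which helps when many labels sit side by side.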