Wu Yuxiong - Natural-Born Python Study Notes: Scraping China's GDP Data from 1990 to 2017 and Plotting It

The data behind a chart does not always come from one fixed source: sometimes it has to be scraped from the web,
and sometimes it has to be read from a file or a database.
Here we use web scraping to collect China's GDP data from 1990 to 2016
and then plot it with Matplotlib.
A search shows that the page http://value500.com/M2GDP.html has the data we need.

Move the mouse over any row of the table under the "Year" column, right-click,
and choose "Inspect" from the context menu.


Chrome opens its developer tools and automatically shows the HTML code for the element under the mouse pointer.
Based on that HTML, we can pull the data out of the table: in each row, the year is in the 1st td tag
and the GDP value is in the 3rd td tag.
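
To make the cell positions concrete, here is a minimal sketch that picks the 1st and 3rd td of each row from a toy table. The HTML string, its values, and the helper names (TableRowParser, first_and_third_cells) are made up for illustration and are not taken from the real page. It also uses Python's built-in html.parser instead of BeautifulSoup, purely so that it runs without extra packages.

```python
from html.parser import HTMLParser

# Toy table with made-up placeholder values; this is NOT the real page content.
SAMPLE_HTML = """
<table>
  <tr><td>Year</td><td>M2</td><td>GDP</td></tr>
  <tr><td>2001</td><td>10</td><td>100.5</td></tr>
  <tr><td>2002</td><td>20</td><td>200.5</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def first_and_third_cells(html):
    """Return (1st td, 3rd td) for every row after the header row."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    # Skip the header row, then keep cell 1 (index 0) and cell 3 (index 2).
    return [(row[0], row[2]) for row in parser.rows[1:] if len(row) >= 3]


pairs = first_and_third_cells(SAMPLE_HTML)
print(pairs)
```

Note that the positions are counted from 1 in the text but from 0 in the code, so the 1st and 3rd cells are indexes 0 and 2.
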
The program below scrapes the GDP data for 1990 to 2016 and plots it.
import requests     # HTTP library used to fetch the web page
import matplotlib.pyplot as plt    # plotting module, imported under the alias plt

from bs4 import BeautifulSoup as bs  # HTML parsing module, imported under the alias bs
from pylab import rcParams   # matplotlib settings, imported via pylab

rcParams['font.sans-serif'] = ['SimHei']  # font that lets matplotlib render Simplified Chinese
year = []    # x-axis values
gdp = []   # y-axis values
url = "http://value500.com/M2GDP.html"   # page to scrape
content = requests.get(url)   # fetch the page
content.encoding='utf-8'    # decode the response as UTF-8
content1=content.text  # text of the response
parse = bs(content1,"html.parser") # parse the HTML
data1 = parse.find_all("table")  # all table elements on the page
rows = data1[19].find_all("tr") # rows of the table holding the data (the 20th table on the page)
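i=0                             # control variable used to skip the header row
for row in rows:
    cols = row.find_all("td")  # td cells of the current row
    if len(cols) < 3:           # skip rows without enough cells (avoids an IndexError)
        continue
    if i==0:                    # the first row with cells is the header: skip it once
        i+=1
    else:                       # every later row goes into the plotting lists
        year.append(cols[0].text[:-2])  # year; the last two characters are not data and are dropped
        gdp.append(float(cols[2].text.strip().replace(",", "")))  # GDP as a number so the y-axis is numeric; assumes the cell holds a plain number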
        
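plt.plot(year, gdp, linewidth=2.0)   # draw the line with a width of 2
plt.title("1990~2017年度我国GDP")   # chart title: "China's GDP, 1990~2017"
plt.xlabel("年度")    # x-axis label: "Year"
plt.ylabel("GDP(亿元)")  # y-axis label: "GDP (100 million yuan)"
plt.show()    # display the chart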

print(year)
print(gdp)
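
Because the data behind a chart may also come from a file, one option is to save the scraped lists once and reload them later instead of requesting the page again. The sketch below is only an illustration under assumptions: the helper names save_series and load_series, the file name gdp_demo.csv, and the sample values are all hypothetical, and it assumes the GDP values are already numbers.

```python
import csv
import os
import tempfile


def save_series(path, years, values):
    """Write paired year/value lists to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["year", "gdp"])
        writer.writerows(zip(years, values))


def load_series(path):
    """Read the file back as a list of year strings and a list of floats."""
    years, values = [], []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            years.append(row["year"])
            values.append(float(row["gdp"]))
    return years, values


# Round trip with made-up placeholder values in a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "gdp_demo.csv")  # hypothetical file name
    save_series(demo_path, ["2001", "2002"], [100.5, 200.5])
    loaded_years, loaded_gdp = load_series(demo_path)

print(loaded_years)
print(loaded_gdp)
```

The reloaded lists have the same shape as year and gdp in the program above, so they could be passed to plt.plot in the same way.
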


Origin www.cnblogs.com/tszr/p/12028451.html