Python Learning Road 16 - Using Web APIs

This series is a compilation of notes on the introductory book "Python Programming: From Getting Started to Practice", so it covers beginner-level material; the numbering follows the book's chapter order.
This is the third article on data processing in Python. It uses a web API to automatically request specific information from a website and then visualize it.

1. Introduction

This chapter uses the requests module to request data from websites (an installation note follows the list below). The main contents are as follows:

  • Request project data from GitHub, sorted by stars;
  • Use pygal to visualize the above data;
  • Call the Hacker News API.
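
Both requests and pygal are third-party packages; if they are not installed yet, pip can fetch them in one go:

pip install requests pygal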

2. GitHub repositories

Fetch descriptive information about Python repositories on GitHub, sorted by number of stars:

# Code:
import requests

# Make an API call and store the response; be careful not to mistype the URL!
url = "https://api.github.com/search/repositories?q=language:python&sort=stars"
r = requests.get(url)
print("Status code:", r.status_code)

# Store the API response in a variable
response_dict = r.json()
print("Total repositories:", response_dict["total_count"])

# Explore information about the repositories
repo_dicts = response_dict["items"]
print("Repositories returned:", len(repo_dicts))

# Examine the first repository
repo_dict = repo_dicts[0]
print("\nKeys:", len(repo_dict))
for key in sorted(repo_dict.keys()):
    print(key)

# Output:
Status code: 200
Total repositories: 2563652
Repositories returned: 30

Keys: 72
archive_url
archived
assignees_url
blobs_url
-- snip --
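
With the key names listed, pulling a few human-readable fields out of the first repository is straightforward; all of these keys appear in the search response shown above:

# Continuing the script above: repo_dict is the first returned repository
print("Name:", repo_dict["name"])
print("Owner:", repo_dict["owner"]["login"])
print("Stars:", repo_dict["stargazers_count"])
print("Repository:", repo_dict["html_url"])
print("Description:", repo_dict["description"])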

Some requests may fail without authentication and require a personal access token:

headers = {"Authorization":"your perosonal token"}
url = "https://api.github.com/search/repositories?q=language:python&sort=stars"
r = requests.get(url, headers=headers)
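
Hard-coding the token risks committing it to version control; a common alternative (a sketch of mine, assuming a GITHUB_TOKEN environment variable has been set beforehand) is to read it from the environment:

import os
import requests

# Assumes GITHUB_TOKEN was exported beforehand, e.g. in the shell profile
token = os.environ.get("GITHUB_TOKEN")
headers = {"Authorization": "token " + token} if token else {}
url = "https://api.github.com/search/repositories?q=language:python&sort=stars"
r = requests.get(url, headers=headers)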

Most APIs are rate-limited, i.e. they cap the number of requests allowed in a given window. For GitHub's limits, visit https://api.github.com/rate_limit ; for the search API the window is one minute.
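
That endpoint can itself be queried with requests. A minimal sketch (the field names match the response format at the time of writing, which GitHub may change):

import requests

r = requests.get("https://api.github.com/rate_limit")
limits = r.json()
# The search API has its own quota, separate from the core API
search = limits["resources"]["search"]
print("Search limit:", search["limit"], "- remaining:", search["remaining"])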

3. Visualize the repositories using Pygal

Use pygal's configuration class to define the chart's parameters, customize the tooltip description of each bar, and attach a URL link to each bar.

import requests
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

-- snip --
repo_dicts = response_dict["items"]

names, plot_dicts = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict["name"])

    plot_dict = {
        # The value of each data point
        "value": repo_dict["stargazers_count"],
        # Custom description for each data point
        # The book does not convert this to str, which raised an error here;
        # perhaps the data type changed (description can be None)?
        "label": str(repo_dict["description"]),
        # Attach a URL link to each bar
        "xlink": repo_dict["html_url"],
    }
    plot_dicts.append(plot_dict)

# Visualization
my_style = LS("#333366", base_style=LCS)
# Chart configuration class
my_config = pygal.Config()
# Rotate the x-axis labels 45 degrees clockwise
my_config.x_label_rotation = 45
# Hide the legend
my_config.show_legend = False
my_config.title_font_size = 24
my_config.label_font_size = 14
# Size of the major labels (y-axis)
my_config.major_label_font_size = 18
# Truncate x-axis labels to 15 characters
my_config.truncate_label = 15
# Hide the horizontal guide lines
my_config.show_y_guides = False
my_config.width = 1000

chart = pygal.Bar(my_config, style=my_style)
chart.title = "Most-Starred Python Projects on GitHub"
chart.x_labels = names

chart.add("", plot_dicts)
chart.render_to_file("python_repos.svg")

This produces the following chart:
[Figure: bar chart "Most-Starred Python Projects on GitHub"]

Now each bar has its own description, and clicking a bar jumps to that project's page. Note the tick labels on the left y-axis: they are dense in the book's figure but sparse here with the same code, so the effect of the my_config.show_y_guides = False line is hard to see in this chart.
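
If ticks as dense as the book's are wanted, pygal also accepts an explicit list of y-axis labels; a small sketch (the spacing below is my own guess, not from the book):

# Hypothetical tick spacing; adjust to the current star counts
chart.y_labels = range(0, 70001, 5000)
chart.render_to_file("python_repos.svg")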

4. Hacker News API

Hacker News' API gives access to information about every article and comment on the site without registering for a key. Let's fetch the IDs of the currently popular articles with one API call, then look at the first 30 of them (the site may not be reachable from every network; the reason, and the workaround, are left to the reader):

import requests
from operator import itemgetter

# Make an API call and store the response
url = "https://hacker-news.firebaseio.com/v0/topstories.json"
r = requests.get(url)
print("Status code:", r.status_code)

# Process information about each submission
submission_ids = r.json()
submission_dicts = []
for submission_id in submission_ids[:30]:
    # For each submission, make a separate API call
    url = ("https://hacker-news.firebaseio.com/v0/item/" + str(submission_id) + ".json")
    submission_r = requests.get(url)
    print(submission_r.status_code)
    response_dict = submission_r.json()

    submission_dict = {
        "title": response_dict["title"],
        "link": "http://news.ycombinator.com/item?id=" + str(submission_id),
        "comments": response_dict.get("descendants", 0)
    }
    submission_dicts.append(submission_dict)

submission_dicts = sorted(submission_dicts, key=itemgetter("comments"), reverse=True)

for submission_dict in submission_dicts:
    print("\nTitle:", submission_dict["title"])
    print("Discussion link:", submission_dict["link"])
    print("Comments:", submission_dict["comments"])

Here is the output:

Status code: 200
200
200
-- snip --

Title: Wells Fargo Hit with $1B in Fines
Discussion link: http://news.ycombinator.com/item?id=16886328
Comments: 358

Title: Want airline food? Take Amtrak
Discussion link: http://news.ycombinator.com/item?id=16882231
Comments: 160

-- snip --
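
One fragility worth noting (my addition, not from the book): each of the 30 follow-up calls is a separate request, and missing or deleted items may come back as JSON null, which would make response_dict["title"] raise an error. A defensive version of the loop skips anything that does not parse cleanly:

for submission_id in submission_ids[:30]:
    url = ("https://hacker-news.firebaseio.com/v0/item/" + str(submission_id) + ".json")
    submission_r = requests.get(url)
    if submission_r.status_code != 200:
        continue  # skip submissions that could not be fetched
    response_dict = submission_r.json()
    # Missing items return null (None after parsing); dead ones may lack a title
    if not response_dict or "title" not in response_dict:
        continue
    submission_dicts.append({
        "title": response_dict["title"],
        "link": "http://news.ycombinator.com/item?id=" + str(submission_id),
        "comments": response_dict.get("descendants", 0),
    })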

5. Summary

Two projects are now complete, leaving only the book's final Django project. Starting with the next article, I will likewise use three articles to get a first look at Django and build a simple web application.
