Using Python to obtain data and visualize it (hot pot restaurants as an example)


I wanted to eat hot pot but didn't know which restaurant to choose, so I spent a whole night with Python, grabbing hot pot restaurants across the country for a TOP10 analysis.

Foreword

My partner is from Sichuan, loves hot pot, and agonizes over which trendy restaurant to pick every time. So I spent a night using a Python crawler to grab all the restaurants in the city for visual analysis.

Requirement: collect the number of hot pot restaurants in each city and visualize the data, so that the distribution of hot pot restaurants across the cities of a province can be browsed more intuitively.

The data in this article comes from a map service, and is obtained and visualized with Python.

Note: The content of this article is only for learning and discussing programming techniques. The code and data must not be used for commercial purposes; any consequences of doing so are your own responsibility.

1. Tracing the data source

1.1 Open the map search and you can see that a lot of store data is displayed on the map. So where does the data come from?


1.2 Debugging with the network assistant

Open the network debugging assistant (the browser's network panel) and you can see the data for the corresponding stores. All of the data is transferred through this API, so the crawler can request the same interface to obtain what it needs.


2. Write a crawler program

2.1 Import related libraries

import requests
import openpyxl
from numpy import mean
from pyecharts import options as opts
from pyecharts.charts import Map

2.2 Request data

Let's start writing the request code (remember to send headers with the request).

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36",
    "Referer": "https://map.baidu.com/@12949550.923158279,3712445.9716704674,6.28z",
    "Cookie": "your cookie",  # paste the cookie copied from your browser
}

url = "https://ditu.baidu.com/?newmap=1&reqflag=pcmap&biz=1&from=webmap&da_par=direct&pcevaname=pc4.1&qt=s&da_src=searchBox.button&wd=%E7%81%AB%E9%94%85%E5%BA%97&c=158&src=0&wd2=&pn=0&sug=0&l=13&b=(12553849.45,3237935.24;12570777.45,3265551.24)&from=webmap&biz_forward={%22scaler%22:1,%22styles%22:%22pl%22}&sug_forward=&auth=P65Ox7I43B3Ta0COBJTb5D4NVW9RBQ9TuxLETRBxBLLty9iRyki%3DxXwvYgP1PcGCgYvjPuVtvYgPMGvgWv%40uVtvYgPPxRYuVtvYgP%40vYZcvWPCuVtvYgP%40ZPcPPuVtvYgPhPPyheuVtvhgMuxVVtcvY1SGpuTtGKD%3DCCGYuxtE20w5V198P8J9v7u1cv3uxt2dd9dv7uPWv3Guxt58Jv7uPYIUvhgMZSguxzBEHLNRTVtcEWe1aDYyuVt%40ZPuzteL1wWveuxtf0wd0vyMFUSCy7OAupt66FKEu%3D%3D8xX&seckey=vHBTJ4tdi68MW8qWw%2BjU2KFSTFNFo3ItXO6ack3ti8w%3D%2CAp6F2yrR-L11fgqtb_BCcR__vsbaezgdq3dBSEVigT5dYmDiJD8CMaToeS_RfR0pFYByyqzM_Fym7UZvX8dmUA_npbBsJiTpMFwIgVQ5pFQ4nDgupLc5wRg_xqikNzFJMAI55erqBKkbkNQqXfrs9hl6futZVDWgi_jFWBfUDhiNyCGARzZeP0UzmuY9sAJX&device_ratio=1&tn=B_NORMAL_MAP&nn=0&u_loc=12568222,3256533&ie=utf-8&t=1649831407880&newfrom=zhuzhan_webmap"

response = requests.get(url,headers=headers).json()

The cookie used here can be copied from the request shown in the browser's network panel.
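The long URL above bundles many query parameters. As a rough sketch of how several result pages might be requested, the snippet below rebuilds only a few of them with `urllib.parse.urlencode`. It assumes that `wd` is the search keyword, `c` the city code, and `pn` a zero-based page index. These meanings are inferred from the URL itself and are not confirmed by any documentation, and `build_page_urls` is a hypothetical helper. The real interface also expects `auth`, `seckey`, and the other parameters shown above, which are omitted here.

```python
from urllib.parse import urlencode

BASE_URL = "https://ditu.baidu.com/"


def build_page_urls(keyword, city_code, pages):
    """Return one search URL per page index (assumed meaning of `pn`)."""
    urls = []
    for page in range(pages):
        params = {
            "newmap": 1,
            "qt": "s",
            "wd": keyword,    # assumed: search keyword
            "c": city_code,   # assumed: city code
            "pn": page,       # assumed: zero-based page index
            "ie": "utf-8",
        }
        urls.append(BASE_URL + "?" + urlencode(params))
    return urls


urls = build_page_urls("火锅店", 158, 3)
for u in urls:
    print(u)
```

Each URL could then be passed to requests.get together with the headers defined earlier.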


The returned JSON shows that the target data sits under the content key, a list in which each entry describes one store (overall_rating is the rating, phone is the store phone number, price is the average price, and name is the store name).

2.3 Extracting part of the store data

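# Request the interface and print the first 10 stores
res = requests.get(url, headers=headers)
if res.status_code == 200:
    items = res.json()
    for i in items.get('content', [])[0:10]:
        # Store details are nested under ext -> detail_info
        ext = (i.get('ext') or {}).get('detail_info') or {}
        overall_rating = ext.get('overall_rating')  # rating
        phone = ext.get('phone')                    # store phone number
        price = ext.get('price')                    # average price
        name = ext.get('name')                      # store name
        print(overall_rating, phone, price, name)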
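For the later steps it is handier to collect the fields than to print them. A minimal sketch, assuming the response has the shape described above (a content list whose items carry ext and then detail_info). The helper name `extract_rows` and the sample payload are made up for illustration and are not part of the original program.

```python
def extract_rows(payload, limit=10):
    """Collect (rating, phone, price, name) tuples from a response dict."""
    rows = []
    for item in (payload.get("content") or [])[:limit]:
        detail = (item.get("ext") or {}).get("detail_info") or {}
        rows.append((
            detail.get("overall_rating"),
            detail.get("phone"),
            detail.get("price"),
            detail.get("name"),
        ))
    return rows


# Made-up payload that mimics the structure; not real data
sample = {
    "content": [
        {"ext": {"detail_info": {"overall_rating": "4.5", "phone": "000-0000",
                                 "price": "80", "name": "Shop A"}}},
        {"ext": None},  # items without details yield a row of None values
    ]
}
rows = extract_rows(sample)
print(rows)
```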


3. Storing the data in a spreadsheet

work = openpyxl.Workbook()
# Sheet title: "province data"
ws = work.create_sheet(title='省数据', index=0)
# Header row: rating, contact number, price, store name
ws.append(['评分', '联系方式', '价格', '店名'])
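The snippet above only creates the sheet and its header row. A possible way to finish the step, sketched under the assumption that each store is a (rating, phone, price, name) tuple: `build_sheet_rows`, `save_to_xlsx`, and the file name are hypothetical, while `Workbook`, `create_sheet`, `append`, and `save` are the usual openpyxl calls.

```python
HEADER = ["评分", "联系方式", "价格", "店名"]  # rating, contact number, price, store name


def build_sheet_rows(rows):
    """Prepend the header and replace missing values with empty strings."""
    table = [HEADER]
    for row in rows:
        table.append(["" if value is None else value for value in row])
    return table


def save_to_xlsx(table, path):
    """Write the table to an .xlsx file with openpyxl."""
    import openpyxl  # imported here so the pure helper above has no dependency

    work = openpyxl.Workbook()
    ws = work.create_sheet(title="省数据", index=0)
    for row in table:
        ws.append(row)
    work.save(path)


table = build_sheet_rows([("4.5", None, "80", "Shop A")])
print(table)
# save_to_xlsx(table, "hotpot.xlsx")  # hypothetical file name
```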


4. Data analysis

Rank the stores by rating and list the TOP10.
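The article does not show the ranking code, so the following is only one way it could be done. It assumes ratings arrive as strings such as "4.5" and that each store is a (rating, phone, price, name) tuple. `to_float` and `top_n` are illustrative names, and the sample rows are invented.

```python
def to_float(value):
    """Convert a rating to float; missing or malformed ratings become None."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def top_n(rows, n=10):
    """Return the n rows with the highest rating, skipping unrated stores."""
    rated = [(to_float(r[0]), r) for r in rows]
    rated = [(score, r) for score, r in rated if score is not None]
    rated.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in rated[:n]]


# Invented sample rows; not real data
sample = [
    ("4.2", "", "75", "Shop A"),
    ("4.8", "", "90", "Shop B"),
    (None, "", "60", "Shop C"),
    ("4.5", "", "80", "Shop D"),
]
top = top_n(sample, n=2)
print(top)
```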


5. Distribution of hot pot restaurants in Hunan

To draw a city-level distribution map, Hunan Province is used as the example (drawing every city in the country on one map would make it dense and hard to read).

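    # Excerpt from the body of a function in the original program.
    # Assumed from context: wd is the search keyword and bb is a list of
    # [city, count] pairs; c1 is created earlier in that function (not shown).
    c2 = (
        Map()
        # Series name: "Hunan {wd} store counts by city"
        .add(f"湖南{wd}店数量各市统计", bb, "湖南")
        .set_global_opts(
            # Chart title: "Distribution of {wd} stores in Hunan"
            title_opts=opts.TitleOpts(title=f"湖南{wd}店数量分布"),
            visualmap_opts=opts.VisualMapOpts(),
        )
        .render(f"湖南{wd}店数量分布.html")
    )
    return c1, c2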
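The `bb` argument passed to `Map().add` above is not defined in the excerpt. A minimal sketch of how such a list could be built, assuming one city name has been recorded per scraped store. `count_by_city` is a hypothetical helper, the sample list is invented, and the city names must match whatever names the Hunan map uses.

```python
from collections import Counter


def count_by_city(cities):
    """Turn a list of city names into [city, count] pairs, largest first."""
    counts = Counter(cities)
    return [[city, n] for city, n in counts.most_common()]


# Invented sample: one entry per scraped store
cities = ["长沙市", "长沙市", "株洲市", "湘潭市", "长沙市"]
bb = count_by_city(cities)
print(bb)
```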


6. Distribution of hot pot restaurants across the country


from pyecharts import options as opts
from pyecharts.charts import Map

# data is assumed to be a pandas DataFrame with the columns
# '省份' (province) and '数量' (count); it is not defined in this excerpt.
attr = data['省份'].tolist()
value = data['数量'].tolist()

# Strip the "省" (province) suffix from each name
name = [i.replace("省", "") for i in attr]

c = (
    Map()
    # Series name: "count"
    .add("数量", [list(z) for z in zip(name, value)], "china")
    # Title: "Distribution of hot pot restaurants nationwide"
    .set_global_opts(title_opts=opts.TitleOpts(title="全国火锅店数量分布情况"))
    .render("全国火锅店数量分布情况.html")
)
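The code above only strips the "省" suffix, so municipalities and autonomous regions keep their full names. If the map expects short names throughout (an assumption; which form matches may depend on the pyecharts version), the normalization could be extended along these lines. `short_name` and the suffix list are illustrative and not part of the original program.

```python
# Longer suffixes come first so that e.g. "维吾尔自治区" wins over "自治区"
SUFFIXES = ("维吾尔自治区", "壮族自治区", "回族自治区",
            "自治区", "特别行政区", "省", "市")


def short_name(region):
    """Strip a trailing administrative suffix from a region name."""
    for suffix in SUFFIXES:
        if region.endswith(suffix):
            return region[: -len(suffix)]
    return region


names = [short_name(r) for r in ["湖南省", "北京市", "内蒙古自治区", "新疆维吾尔自治区"]]
print(names)
```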


Summary

Inspiration comes from life. If you don't have a partner yet, go find one: it will give you motivation, broaden your thinking, and push you to tackle more technical problems.


Origin blog.csdn.net/AI19970205/article/details/124171891