Python:通过百度地图API快速获取路对应的行政区域

前言:

最近在做地址标准化的工作,其中一项子任务就是要做地址库,根据内部数据结合前期从网上获取的地址库,计算获得一堆新的路名(未纳入地址库),接下来的工作就是要判断这个新的路名是否跨区域,如果不跨的话属于哪个区?
一开始是通过百度地图一个个搜索的,但是架不住量太大(超过100个),而且后期有可能还会不断有新增。所以干脆用百度API的交叉路搜索,获取该路和其它所有路的交叉信息。以“浦东南路”为例,交叉路口就有900条记录,例如“洪山路与浦东南路交叉口”,把这些记录全部获取后筛选其中的“address”字段,也就是行政区域(如浦东新区),再计算频次,把频次/总量>=95%的认为是没有跨区域,也就是可以把路名和行政区域进行匹配了。

代码

import requests
import json
import math
import collections
import openpyxl

def road_find_district(keywords):
    url = 'https://restapi.amap.com/v3/place/text?keywords=' + keywords + '&types=190302&city=shanghai&output=json&offset=20&page=1&key=d84f3f6b210c6e71c4a1529a2f9b2b53&extensions=all'
    output = json.loads(requests.get(url).text)
    output_count = output['count'] #总共的输出数量
    page_num = math.ceil(int(output_count) / 20) #计算获得页数,向上取整
    address_list = [] #行政区域(比如虹口区)的列表
    # business_area_list = [] #街道/镇的列表
    for i in range(1, int(page_num)+1):
        url = 'https://restapi.amap.com/v3/place/text?keywords=' + keywords + '&types=190302&city=shanghai&output=json&offset=20&page='+ str(i)+'&key=d84f3f6b210c6e71c4a1529a2f9b2b53&extensions=all'
        output = json.loads(requests.get(url).text)
        output_pois = output['pois'] #输出值的列表
        for j in range(len(output_pois)):
            address_list.append(output_pois[j]['address'])
            # business_area_list.append(output_pois[j]['business_area'])
    address_count = collections.Counter(address_list)
    predict_value = address_count.most_common(1)[0][0]
    predict_count = address_count.most_common(1)[0][1]
    predict_percent = predict_count / int(output_count)
    # print(address_count)
    # print(predict_percent)
    if predict_percent >=0.95:
        road_type = 'single_district'
    else:
        road_type = 'mix_district'
    return road_type, predict_value, predict_percent

wb = openpyxl.load_workbook('区镇村路.xlsx')
ws = wb['需要判断']
maxrow = ws.max_row + 1
for i in range(2, maxrow):
    road_name = ws['A'+str(i)].value #路名
    print(road_name)
    try:
        road_type, predict_value, predict_percent = road_find_district(road_name)
        print(road_type, predict_value, predict_percent)
        if road_type == 'single_district':
            ws['B' + str(i)].value = '单'
            ws['C' + str(i)].value = predict_value
        elif road_type == 'mix_district':
            ws['B' + str(i)].value = '混'
    except:
        continue

wb.save('test.xlsx')
wb.close()

成果

在这里插入图片描述

猜你喜欢

转载自blog.csdn.net/weixin_42029733/article/details/106190855