Python Web Scraping Study Notes (21): Scraping and Parsing JD.com Product JSON Data

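1. Requirement: given a JD.com product JSON link obtained by packet capture, parse the JSON content and extract the price p of the product with a specific id. The JSON content is as follows: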

jQuery923933([{"op":"7599.00","m":"9999.00","id":"J_5089253","p":"7099.00"},
{"op":"48.00","m":"96.00","id":"J_16463451903","p":"38.00"},
{"op":"59.00","m":"229.00","id":"J_33440061157","p":"59.00"},
{"op":"79.00","m":"80.00","id":"J_6027746","p":"79.00"},
{"op":"32.90","m":"59.00","id":"J_33183063203","p":"32.90"},
{"op":"169.00","m":"699.00","id":"J_33341525798","p":"169.00"},
{"op":"228.00","m":"399.00","id":"J_30639439257","p":"228.00"},
{"op":"188.00","m":"199.00","id":"J_25539002541","tpp":"130.00","up":"tpp","p":"138.00"},
{"op":"55.00","m":"99.00","id":"J_3136674","p":"39.90"},
{"op":"25.90","m":"55.90","id":"J_5338456","p":"22.50"},
{"op":"50.00","m":"50.00","id":"J_11170365589","p":"50.00"}]);

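Note that the JSON content is an array, enclosed in square brackets [ ], not an object enclosed in curly braces { }. The array is also wrapped in a callback call, jQuery923933( ... );, which has to be stripped before the content can be parsed as JSON.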
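The array-versus-object point can be checked without any network access. The following is a minimal sketch that assumes a shortened, hard-coded copy of the response above; the helper name strip_jsonp is hypothetical and not part of the original notes.

```python
import json

# Shortened copy of the response shown above, hard-coded for illustration.
sample = ('jQuery923933([{"op":"7599.00","m":"9999.00","id":"J_5089253","p":"7099.00"},'
          '{"op":"48.00","m":"96.00","id":"J_16463451903","p":"38.00"}]);')

def strip_jsonp(text):
    """Return the text between the first '(' and the last ')' (hypothetical helper)."""
    return text[text.find("(") + 1:text.rfind(")")]

parsed = json.loads(strip_jsonp(sample))

# The top-level value is a list, and each element is a dict.
print(type(parsed).__name__)     # list
print(type(parsed[0]).__name__)  # dict
print(parsed[0]["p"])            # 7099.00
```

Because the top-level value is a list, a product has to be located by iterating over the elements (or by building an index), not by a direct key lookup on the parsed result.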

2. Writing the code:

import urllib.request
import json

# Fetch the JSON data
data=urllib.request.urlopen("https://p.3.cn/prices/mgets?callback=jQuery923933&type=1&area=1&pdtk=&pduid=15374502312291140901533&pdpin=&pin=null&pdbp=0&skuIds=J_5089253%2CJ_16463451903%2CJ_33440061157%2CJ_6027746%2CJ_33183063203%2CJ_33341525798%2CJ_30639439257%2CJ_25539002541%2CJ_3136674%2CJ_5338456%2CJ_11170365589&ext=11100000&source=item-pc").read()
# Decode the response bytes into a string (str(data) would return the "b'...'" repr instead)
str1=data.decode("utf-8")
# Strip the unneeded parts: here, everything up to the first "(" and from the last ")" onward
str1 = str1[(str1.find('(')+1):str1.rfind(')')]
# Convert the JSON data into Python data; here jdata is a list
jdata=json.loads(str1)
# Iterate over the data and print the p value of the item with the given id
for jdataObj in jdata:
    if jdataObj["id"]=="J_5089253":
        print(jdataObj["p"])
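When several ids need to be looked up, one option is to index the parsed list by id once instead of scanning it each time. The following is only a sketch under stated assumptions: it works on a hard-coded sample string taken from the data above rather than a live request, the helper name prices_by_id is hypothetical, and converting p to Decimal is an illustrative choice, since the response delivers prices as strings.

```python
import json
from decimal import Decimal

def prices_by_id(jsonp_text):
    """Map each item's id to its price p as a Decimal (hypothetical helper)."""
    # Keep only the text between the first '(' and the last ')'.
    body = jsonp_text[jsonp_text.find("(") + 1:jsonp_text.rfind(")")]
    return {item["id"]: Decimal(item["p"]) for item in json.loads(body)}

# Two entries copied from the response above, hard-coded for illustration.
sample = ('jQuery923933([{"op":"55.00","m":"99.00","id":"J_3136674","p":"39.90"},'
          '{"op":"25.90","m":"55.90","id":"J_5338456","p":"22.50"}]);')

prices = prices_by_id(sample)
print(prices["J_3136674"])      # 39.90
print(prices.get("J_0000000"))  # None, because this id is not in the sample
```

Using dict.get for the lookup avoids a KeyError when an id is missing from the response.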

3. Supplement:

Reposted from blog.csdn.net/Smart3S/article/details/82950586