[Jingdong] Product details page and product list data collection

JD.com is one of the largest e-commerce platforms in China, and its data can be collected along several dimensions.

 

Some people need to collect product information, including categories, brands, product names, prices, and sales figures, to understand how products are selling, identify popular product attributes, and support market expansion and other important decisions.

Others need to collect product reviews to identify product strengths and weaknesses, gauge market intent, and guide new product research and optimization.

Beyond these, many other application scenarios remain to be explored. The following is a detailed introduction to JD.com data collection methods.

JD.com data collection methods

If you need to collect data from JD.com, how do you go about it? Do you open each JD.com page and copy and paste the data into an Excel table one item at a time? Or do you find a crawler engineer to write a crawler program for the job?

For ordinary users, both approaches are costly and inefficient. The first consumes a lot of manpower and is prone to mistakes. The second is expensive, and learning to write a crawler yourself takes a long time and is hard to finish in a short period. Is there a way for ordinary users to collect JD.com data easily?

Below are several JD.com data collection tutorials that we have compiled. You can follow the illustrated steps, and add or remove extracted fields according to your actual needs.

1. Collection of product information on JD.com 

Collection content: the product list information shown after searching for a keyword on JD.com

Collection fields: product title, product link, product price, product image link, number of product reviews, store name, store link

Open a JD.com product details page (example URL: https://item.jd.com/100016944073.html ) and collect the data shown after selecting different parameters (color, version, etc.). The product number, price, main image link, and other fields vary with the selected parameters.
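
When working with many details pages, it can help to pull the product number out of each URL. The following is a minimal sketch in Python 3, assuming that the numeric part of an item.jd.com URL corresponds to the product number (the source does not state this explicitly). The function name product_number_from_url is hypothetical.

```python
import re
from urllib.parse import urlparse

# Assumption: details-page paths look like "/<digits>.html", as in the example URL.
_ITEM_PATH = re.compile(r"^/(\d+)\.html$")


def product_number_from_url(url):
    """Return the numeric ID in a JD.com details-page URL, or None if it does not match."""
    parsed = urlparse(url.strip())
    if parsed.netloc != "item.jd.com":
        return None
    match = _ITEM_PATH.match(parsed.path)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(product_number_from_url("https://item.jd.com/100016944073.html"))
```

URLs from other hosts, or with a different path shape, return None instead of raising, so a list of mixed links can be filtered in one pass.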

Collection fields

Product title, color, version, price, product name, product number, image URL, etc.

Collection result

The collection results can be exported to Excel, CSV, HTML, a database, and other formats.
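
For the CSV format, a minimal sketch of the export step using only the Python standard library might look like the following. The column names in FIELDS are hypothetical English keys chosen to mirror the collection fields listed above, and the sample record is made up for illustration.

```python
import csv
import io

# Hypothetical column names mirroring the collection fields listed above.
FIELDS = ["title", "color", "version", "price",
          "product_name", "product_number", "image_url"]


def records_to_csv(records, fields=FIELDS):
    """Serialize a list of dicts to CSV text; missing fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for record in records:
        writer.writerow({name: record.get(name, "") for name in fields})
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    sample = [{"title": "Phone", "color": "black", "price": "1999.00"}]
    print(records_to_csv(sample))
```

Keys that are not in the field list are ignored, so fields can be added or removed by editing FIELDS alone.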

 

JD.com product details data interface (JD.item_get_app): code example

1. Request method: HTTP GET/POST

2. Public request parameters:

name | type | required | description
key | String | yes | Call key (in GET mode it must be appended to the URL; request link: http://c0b.cc/R4rbK2 )
secret | String | yes | Call secret (copy v: Taobaoapi2014)
api_name | String | yes | API interface name, included in the request address [item_search, item_get, item_search_shop, etc.]
cache | String | no | [yes, no] Defaults to yes, in which case cached data is returned and the response is faster
result_type | String | no | [json, jsonu, xml, serialize, var_export] Format of the returned data, defaults to json; jsonu output leaves Chinese text directly readable
lang | String | no | [cn, en, ru] Translation language, defaults to cn (Simplified Chinese)
version | String | no | API version
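
To make the role of these parameters concrete, here is a minimal sketch in Python 3 that assembles a GET request URL from them. The base address and the path layout (api_name as a path segment, everything else in the query string) are inferred from the request example in this article rather than from official documentation, and the function name build_request_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: base address inferred from the request example in this article.
BASE = "https://api-gw.19970108018.cn/jd"

# Allowed values for the optional public parameters, as listed in the table.
OPTIONAL_CHOICES = {
    "cache": {"yes", "no"},
    "result_type": {"json", "jsonu", "xml", "serialize", "var_export"},
    "lang": {"cn", "en", "ru"},
}


def build_request_url(api_name, key, secret, **params):
    """Build a GET URL: api_name goes in the path, all other parameters in the query."""
    if not key or not secret:
        raise ValueError("key and secret are required")
    for name, allowed in OPTIONAL_CHOICES.items():
        if name in params and params[name] not in allowed:
            raise ValueError("invalid value for %s: %r" % (name, params[name]))
    query = {"key": key, "secret": secret}
    query.update(params)
    return "%s/%s/?%s" % (BASE, api_name, urlencode(query))


if __name__ == "__main__":
    print(build_request_url("item_get", "<your apiKey>", "<your apiSecret>",
                            num_iid="10335871600", lang="en"))
```

urlencode takes care of escaping, so values containing spaces or non-ASCII characters do not need to be encoded by hand.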

3. Request code example; high-concurrency requests are supported (cURL, PHP, PHP SDK, Java, C#, Python...)

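# coding:utf-8
"""
Compatible with Python 2.x and Python 3.x
Requirement: pip install requests
"""
from __future__ import print_function
import requests

# Example request URL; the default request parameters are already URL-encoded.
# Replace <your apiKey> and <your apiSecret> with your own credentials.
url = "https://api-gw.19970108018.cn/jd/item_get/?key=<your apiKey>&secret=<your apiSecret>&num_iid=10335871600"
headers = {
    "Accept-Encoding": "gzip",
    "Connection": "close"
}
if __name__ == "__main__":
    # A timeout keeps the script from hanging if the server does not respond.
    r = requests.get(url, headers=headers, timeout=10)
    # Decode the JSON response body and print it.
    json_obj = r.json()
    print(json_obj)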
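
Since the interface is described as supporting concurrent requests, one way to fetch several products at once is a small thread pool. This is only a sketch in Python 3 under stated assumptions: it uses the standard library instead of requests, the helper names fetch_item and fetch_many are hypothetical, the endpoint is copied from the example above, and the concurrency limit the service actually tolerates is not stated in the source.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the request example above.
BASE_URL = "https://api-gw.19970108018.cn/jd/item_get/"


def fetch_item(num_iid, key, secret, timeout=10):
    """Fetch one product's details and return the decoded JSON response."""
    query = urlencode({"key": key, "secret": secret, "num_iid": num_iid})
    with urlopen(BASE_URL + "?" + query, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fetch_many(num_iids, fetch, max_workers=4):
    """Call fetch(num_iid) concurrently; map each ID to its result or its exception."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {num_iid: pool.submit(fetch, num_iid) for num_iid in num_iids}
        for num_iid, future in futures.items():
            try:
                results[num_iid] = future.result()
            except Exception as exc:  # keep going if a single request fails
                results[num_iid] = exc
    return results
```

A call could look like fetch_many(ids, lambda i: fetch_item(i, key, secret)). Passing the fetch function in as an argument keeps the pooling logic independent of the HTTP client, so it can be tried out with a stub before any real request is sent.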

4. Error code description


Origin blog.csdn.net/tbprice/article/details/130321559