Word cloud analysis of product reviews with Python

Environment

  • Python 3.8
  • PyCharm

Modules

  • requests (sending HTTP requests)
  • jieba (Chinese word segmentation)
  • wordcloud (rendering the word cloud)

Data source analysis

Clarify the requirements

  • What data is being collected, and which URL returns it?

  • Packet capture analysis with the browser's built-in developer tools:

    I. Press F12, or right-click and choose Inspect, open the Network tab, then click through to the second page of reviews.
    II. Copy a piece of review text and search for it in the developer tools to find the request that returns the review data.

https://club.jd.com/comment/productPageComments.action?callback=fetchJSON_comment98&productId=100029079354&score=0&sortType=5&page=1&pageSize=10&isShadowSku=0&rid=0&fold=1
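
The query string of the captured URL maps directly onto the request parameters used later. As a minimal sketch, the standard library's urllib.parse can split the URL into its endpoint and a dictionary of parameters; the variable names here are illustrative only:

```python
from urllib.parse import urlsplit, parse_qsl

captured = (
    "https://club.jd.com/comment/productPageComments.action"
    "?callback=fetchJSON_comment98&productId=100029079354&score=0"
    "&sortType=5&page=1&pageSize=10&isShadowSku=0&rid=0&fold=1"
)

parts = urlsplit(captured)
# Everything before the "?" is the endpoint that receives the request
endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
# Everything after the "?" becomes a dictionary of key/value pairs
params = dict(parse_qsl(parts.query))

print(endpoint)
for key, value in params.items():
    print(f"{key} = {value}")
```

Seen this way, page and pageSize are the two parameters that control which slice of the reviews comes back.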

Implementing the data collection code

Send the request

import requests

page = 1  # page number to request; change it to fetch other pages
url = 'https://***.com'  # address masked in the original post
# Request parameters: a dictionary of key-value pairs
data = {
    # 'callback': 'fetchJSON_comment98',
    'productId': '100029079354',
    'score': '0',
    'sortType': '5',
    'page': page,
    'pageSize': '10',
    'isShadowSku': '0',
    'rid': '0',
    'fold': '1',
}
# Request headers, so the request looks like it comes from a browser
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.0.0 Safari/537.36'
}
# Send a GET request with the requests module.
# Left of each equals sign: keyword parameters of requests.get (url/params/headers).
# Right of each equals sign: the variables passed in (url/data/headers).
response = requests.get(url=url, params=data, headers=headers)
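
The page parameter selects which page of reviews is returned, so collecting more reviews means repeating the request with different page numbers. The sketch below is one way to do that, not the author's code: build_params and fetch_pages are hypothetical helpers, the one-second delay is an arbitrary choice, and `get` stands for any callable with the same keyword arguments as requests.get.

```python
import time


def build_params(page):
    """Return the query parameters for one page of reviews."""
    return {
        'productId': '100029079354',
        'score': '0',
        'sortType': '5',
        'page': page,
        'pageSize': '10',
        'isShadowSku': '0',
        'rid': '0',
        'fold': '1',
    }


def fetch_pages(get, url, headers, pages, delay=1.0):
    """Yield one response per page; `get` is a callable such as requests.get."""
    for index, page in enumerate(pages):
        if index:
            time.sleep(delay)  # pause between consecutive requests
        yield get(url=url, params=build_params(page), headers=headers)
```

With requests imported, a call such as fetch_pages(requests.get, url, headers, range(1, 11)) would walk through the first ten pages.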

Retrieve the data

The server returns the response data:

  • response: the response object
  • response.text: the response body as text
  • response.json(): the response body parsed from JSON into a dictionary
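
The request parameters leave callback commented out. When a callback is supplied, endpoints of this kind commonly wrap the JSON in a function call (JSONP), and response.json() cannot parse that text directly. The sketch below assumes a body of the form fetchJSON_comment98({...}); which is an assumption about the response format rather than something shown in this article, and strip_jsonp is a hypothetical helper:

```python
import json
import re


def strip_jsonp(text):
    """Parse JSON that may be wrapped as callbackName(...); (assumed format)."""
    match = re.fullmatch(r'\s*[\w$.]+\s*\((.*)\)\s*;?\s*', text, flags=re.DOTALL)
    # Fall back to the raw text when there is no wrapper
    payload = match.group(1) if match else text
    return json.loads(payload)


sample = 'fetchJSON_comment98({"comments": [{"content": "nice colour"}]});'
print(strip_jsonp(sample)['comments'][0]['content'])
```

With callback left out, as in the code above, response.json() should be enough on its own.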

Parse the data

The parsed response is a dictionary, so values are extracted by key: given the content to the left of the colon (the key), you get the content to the right of the colon (the value).

# Loop over the list and take out its elements one at a time
for i in response.json()['comments']:
    content = i['content']
    print(content)
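
To try the key lookup without sending a request, the same extraction can be run on a hand-written dictionary. In this sketch the payload is a made-up stand-in for response.json() (the real response carries many more keys), and extract_contents is a hypothetical helper that also skips entries with no review text:

```python
# Hand-written stand-in for response.json(); not real review data
payload = {
    'comments': [
        {'content': 'Lovely shade'},
        {'content': 'Fast delivery'},
        {'score': 5},  # hypothetical entry without text, to show the fallback
    ]
}


def extract_contents(data):
    """Return the non-empty 'content' strings from a comments payload."""
    contents = []
    for item in data.get('comments', []):
        text = item.get('content', '').strip()
        if text:
            contents.append(text)
    return contents


print(extract_contents(payload))
```

Using .get with a default avoids a KeyError if a key is missing, which direct indexing with ['content'] would raise.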

Save the data

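# Intended to run inside the for loop, once per comment; mode='a' appends to the file
with open('口红评论.txt', mode='a', encoding='utf-8') as f:
    # Write the comment text, followed by a newline
    f.write(content)
    f.write('\n')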
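
The snippet above writes a single content value, so it has to run once per comment. An alternative is to collect the comments first and write them in one pass. In the sketch below, save_comments is a hypothetical helper, and replacing line breaks inside a comment is an added assumption so that each comment stays on one line:

```python
def save_comments(contents, path):
    """Append each comment to `path`, one per line, encoded as UTF-8."""
    with open(path, mode='a', encoding='utf-8') as f:
        for content in contents:
            # Flatten line breaks inside a comment so it occupies one line
            f.write(content.replace('\r', ' ').replace('\n', ' '))
            f.write('\n')
```

Because the file is opened in append mode, calling save_comments once per page keeps adding to the same file.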

Word cloud code
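
The article does not include the word cloud code itself, so the following is only a sketch of how the saved reviews could be turned into an image with jieba and wordcloud. It is not the author's implementation: count_words and make_word_cloud are hypothetical helpers, and the image size, background colour, and minimum word length are arbitrary choices.

```python
from collections import Counter


def count_words(tokens, stopwords=frozenset(), min_length=2):
    """Count tokens, skipping stopwords and tokens shorter than min_length."""
    return Counter(
        token for token in (t.strip() for t in tokens)
        if len(token) >= min_length and token not in stopwords
    )


def make_word_cloud(text_path, image_path, font_path):
    """Segment the saved reviews with jieba and render a word cloud image."""
    # Third-party imports are kept local so count_words works without them
    import jieba
    from wordcloud import WordCloud

    with open(text_path, encoding='utf-8') as f:
        text = f.read()
    # jieba.lcut splits the Chinese text into a list of words
    frequencies = count_words(jieba.lcut(text))
    cloud = WordCloud(
        font_path=font_path,  # must be a font that contains Chinese glyphs
        width=800,
        height=600,
        background_color='white',
    )
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(image_path)
```

A call might look like make_word_cloud('口红评论.txt', 'wordcloud.png', font_path='msyh.ttc'), where the output name and the font file are assumptions; any font with Chinese glyphs should do.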

That's all for today. If you have any questions about the article, leave a comment or send a private message.

Origin blog.csdn.net/Dangerous_li/article/details/127752490