A small Python project: looking up product information by barcode



1 Product barcodes

In daily life we buy all kinds of goods: beverages, food, medicine, daily necessities and so on. Almost all of them carry a barcode on the packaging.
Commodity barcodes cover the codes and barcode symbols for retail items, non-retail items, logistics units, and locations. China has adopted the internationally used commodity coding and barcode identification system, promotes the use of commodity barcodes, and has built its national commodity identification system on it.
Retail goods are goods settled by scanning at a POS terminal at the point of sale. Their barcode identification consists of a Global Trade Item Number (GTIN) and the corresponding barcode symbol, which is usually an EAN/UPC barcode. For example, a can of beer, or a combination pack of one bottle of shampoo and one bottle of conditioner, can each be sold to the end consumer as a single retail item.
Generally speaking, every commodity circulating on the market has its own barcode.
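
Since the retail barcodes described above are usually EAN codes, it can help to sanity-check a barcode before sending it to any lookup service. The last digit of an EAN-13 code is a check digit computed from the first twelve. The following is a minimal sketch of that check; the function names are hypothetical and are not part of the project code.

```python
def ean13_check_digit(first12: str) -> int:
    """Return the EAN-13 check digit for the first 12 digits of a code."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Counting from the left, odd positions weigh 1 and even positions weigh 3.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_ean13(code: str) -> bool:
    """Check that a 13-digit code ends with the correct check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    return ean13_check_digit(code[:12]) == int(code[-1])


print(is_valid_ean13("6901028001915"))  # True
print(is_valid_ean13("6901028001916"))  # False
```

A check like this only catches typing or scanning errors. It says nothing about whether the barcode is actually registered to a product.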

2 Why query a product barcode

From a technical point of view, looking up product information by barcode in Python is a good way to practice crawler techniques.
From a practical point of view, the project lets you look up the goods you have bought and check where they come from and what they contain.

3 Implementation steps and code

3.1 The website being crawled

The site used for the lookup is the barcode query website at http://tiaoma.cnaidc.com.
To look up a product on the site, you enter its barcode, type in the verification code (CAPTCHA) shown on the page, and click search. The rest of this article uses the barcode "6901028001915" as the example.

3.2 Python code implementation

3.2.1 Log module

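To keep a record of each run, the project includes a log module. The query module imports it with "from msg_logger import logger", so save it as msg_logger.py. The code is as follows: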

import logging
import logging.handlers
# Log module: writes each record to both the console and a rotating log file
LOG_FILENAME = 'msg_seckill.log'
logger = logging.getLogger()


def set_logger():
    logger.setLevel(logging.INFO)
    formatter = logging.Formatter('%(asctime)s - %(process)d-%(threadName)s - '
                                  '%(pathname)s[line:%(lineno)d] - %(levelname)s: %(message)s')
    console_handler = logging.StreamHandler()
    console_handler.setFormatter(formatter)
    logger.addHandler(console_handler)
    file_handler = logging.handlers.RotatingFileHandler(
        LOG_FILENAME, maxBytes=10485760, backupCount=5, encoding="utf-8")
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)

set_logger()
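
One thing to watch with a setup like the one above: it configures the root logger, so calling set_logger() a second time would attach both handlers again and every record would be written twice. The following is a minimal sketch of a variant that guards against this. The logger name "barcode_demo", the build_logger function, and the temporary log path are assumptions made for illustration, not part of the project.

```python
import logging
import logging.handlers
import os
import tempfile


def build_logger(log_path: str) -> logging.Logger:
    """Create a named logger that writes to the console and a rotating file."""
    demo_logger = logging.getLogger("barcode_demo")  # hypothetical logger name
    demo_logger.setLevel(logging.INFO)
    demo_logger.propagate = False  # keep records out of the root logger
    if demo_logger.handlers:
        # Already configured: return as-is instead of adding duplicate handlers.
        return demo_logger
    formatter = logging.Formatter("%(asctime)s - %(levelname)s: %(message)s")
    console_handler = logging.StreamHandler()
    file_handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=10 * 1024 * 1024, backupCount=5, encoding="utf-8")
    for handler in (console_handler, file_handler):
        handler.setFormatter(formatter)
        demo_logger.addHandler(handler)
    return demo_logger


# Write one record to a throwaway file and read it back.
log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
demo_logger = build_logger(log_path)
demo_logger.info("querying barcode %s", "6901028001915")
for handler in demo_logger.handlers:
    handler.flush()
with open(log_path, encoding="utf-8") as f:
    log_text = f.read()
print(log_text)
```

Calling build_logger again returns the same logger with the same two handlers, so importing the module from several places does not multiply the output.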

3.2.2 Query module

The site requires a numeric verification code (CAPTCHA) for every query, so the ddddocr package is used to recognize it. Start by importing the required packages:

import ddddocr
import requests
import json
import os
import time
import sys
from msg_logger import logger

Next is the main code of the project. The comments explain each step of the logic:

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36',
}
path = os.path.abspath(os.path.dirname(sys.argv[0]))

# Extract the JSON object from the response text and parse it
def parse_json(s):
    begin = s.find('{')
    end = s.rfind('}') + 1
    return json.loads(s[begin:end])

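# Create a directory
def mkdir(path):
    # Strip leading and trailing whitespace
    path = path.strip()
    # Strip trailing backslashes
    path = path.rstrip("\\")
    # Check whether the path already exists
    isExists = os.path.exists(path)
    if not isExists:
        os.makedirs(path)
        logger.info(path + ' created')
        return True
    else:
        # The directory already exists, so do not create it; just log that fact
        logger.info(path + ' already exists')
        return False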

# Crawl "tiaoma.cnaidc.com" to look up product information
def requestT1(shop_id):
    url = 'http://tiaoma.cnaidc.com'
    s = requests.session()

    # Fetch the verification code (CAPTCHA) image
    img_data = s.get(url + '/index/verify.html?time=', headers=headers).content
    with open('verification_code.png', 'wb') as v:
        v.write(img_data)

    # Recognize the verification code with OCR
    ocr = ddddocr.DdddOcr()
    with open('verification_code.png', 'rb') as f:
        img_bytes = f.read()
    code = ocr.classification(img_bytes)
    logger.info('Current verification code: ' + code)
    # Request parameters for the search endpoint
    data = {"code": shop_id, "verify": code}
    resp = s.post(url + '/index/search.html', headers=headers, data=data)
    resp_json = parse_json(resp.text)
    logger.info(resp_json)

    # The site answers with Chinese messages:
    # '查询成功' = "query succeeded", '验证码错误' = "wrong verification code"
    if resp_json['msg'] == '查询成功' and resp_json['json'].get('code_img'):
        # Build the product image URL (prefix the site URL if it is relative)
        img_url = resp_json['json']['code_img']
        if img_url.find('http') == -1:
            img_url = url + img_url

        try:
            shop_img_data = s.get(img_url, headers=headers, timeout=10).content
            # Create a directory named after the barcode
            img_dir = os.path.join(path, shop_id)
            mkdir(img_dir)
            localtime = time.strftime("%Y%m%d%H%M%S", time.localtime())
            # Save the image under a timestamped file name
            img_path = os.path.join(img_dir, localtime + '.png')
            with open(img_path, 'wb') as v:
                v.write(shop_img_data)
            logger.info(img_path)
        except requests.exceptions.ConnectionError:
            logger.info('Error while accessing the image URL!')

    # OCR misread the verification code: retry and return the retry's result
    if resp_json['msg'] == '验证码错误':
        return requestT1(shop_id)

    return resp_json
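
The function above retries by calling itself whenever the site reports a wrong verification code, with no upper limit on the number of attempts. If the OCR keeps misreading the image, a bounded loop may be a safer pattern. The following is a minimal sketch, assuming a query function that makes exactly one attempt and returns the parsed response. The names query_with_retry and fake_query and the canned responses are hypothetical stand-ins, so the sketch runs without touching the network.

```python
from typing import Callable, Dict


def query_with_retry(query: Callable[[str], Dict], barcode: str,
                     max_attempts: int = 5) -> Dict:
    """Call query(barcode) until the CAPTCHA is accepted or attempts run out."""
    for _ in range(max_attempts):
        result = query(barcode)
        # '验证码错误' ("wrong verification code") is the message requestT1 checks.
        if result.get("msg") != "验证码错误":
            return result
    raise RuntimeError(f"CAPTCHA still rejected after {max_attempts} attempts")


# Stand-in for the real query: fails twice, then succeeds (made-up data).
responses = iter([
    {"msg": "验证码错误"},
    {"msg": "验证码错误"},
    {"msg": "查询成功", "json": {"code_sn": "6901028001915"}},
])


def fake_query(barcode: str) -> Dict:
    """Return the next canned response, ignoring the barcode."""
    return next(responses)


print(query_with_retry(fake_query, "6901028001915"))
```

Raising an exception after the last attempt makes the failure visible to the caller instead of returning a response that still carries the error message.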

3.2.3 Running results

if __name__ == "__main__":
    try:
        # Query the example barcode and print the returned product fields
        dict_info = requestT1('6901028001915')['json']
        print(dict_info['code_sn'])
        print(dict_info['code_name'])
        print(dict_info['code_company'])
        print(dict_info['code_address'])
        print(dict_info['code_price'])
    except Exception:
        print('Unable to query this product!')

Running the code with "6901028001915" as the example prints the product information, which shows that the query succeeds.
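
The main block above indexes each field directly, so a response that lacks any one key ends in the "unable to query" message even when the other fields are present. The following is a minimal sketch of a more tolerant formatter. The key names come from the code above, while the English labels, the format_product function, and the sample values are assumptions made for illustration.

```python
# Keys are taken from the main block; the labels are guesses at their meaning.
FIELDS = [
    ("code_sn", "Barcode"),
    ("code_name", "Name"),
    ("code_company", "Company"),
    ("code_address", "Address"),
    ("code_price", "Price"),
]


def format_product(info: dict) -> str:
    """Render the known fields one per line, marking any that are missing."""
    lines = []
    for key, label in FIELDS:
        value = info.get(key) or "(not provided)"
        lines.append(f"{label}: {value}")
    return "\n".join(lines)


# Made-up sample data, not a real response from the website.
sample = {"code_sn": "6901028001915", "code_name": "Example product"}
print(format_product(sample))
```

With this approach a partially filled response still produces readable output, and the missing fields are easy to spot.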

4 Epilogue

This article walked through a small Python project that looks up product information from a barcode. The program uses the requests library to crawl a barcode query website, the ddddocr package to recognize the site's verification code, and a log module to record each run. Querying "6901028001915" returned the product information, which shows that the approach works. The project is a handy exercise for practicing crawler techniques, and it also gives shoppers a quick way to check the goods they buy.


Origin blog.csdn.net/weixin_46043195/article/details/125794653