Using seleniumwire in Python to get HTTP request header information

Why use seleniumwire instead of native selenium

Recently I needed to scrape the data of a page, but the page loads its data through an asynchronous POST request that is protected against crawlers: three parameters in the request header are generated by JavaScript. Reverse-engineering that JavaScript to reproduce the parameters is difficult. Native selenium does not expose the headers of the requests a page sends, whereas seleniumwire does, so I use seleniumwire to capture the three JavaScript-generated parameters and let the browser it drives run the JavaScript for us.

seleniumwire installation

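Please refer to "python seleniumwire installation". The package is published on PyPI as selenium-wire, so it is typically installed with pip install selenium-wire.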

Get request headers using seleniumwire

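# Import webdriver from seleniumwire, not from selenium
from seleniumwire import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait

# Configure the driver (raw string so the backslashes in the Windows path are kept literally).
# Note: executable_path is the Selenium 3 style; Selenium 4 expects the path via a Service object.
driver = webdriver.Firefox(executable_path=r'E:\Python\learnning\meituan\webdriver\geckodriver.exe')
# Send the request
driver.get('http://www.baowugroup.com/media_center/news?page=1')
# Wait until the data has finished loading
WebDriverWait(driver, 20).until(lambda x: x.find_element(By.XPATH, '//h1[@class="page-top-img__title--c"]'))
# Iterate over the captured requests and print them
for request in driver.requests:
    print(request, request.headers, request.response)

    # Pick out the request you need
    if request.response and "x-signature" in request.headers:
        print(request.headers["x-signature"])

# Close the browser and end the session
driver.quit()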
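
Since the goal is to collect three generated parameters rather than one, the lookup in the loop above can be wrapped in a small helper. The following is only a minimal sketch: the function name collect_headers and the header name x-timestamp are hypothetical (the original only mentions x-signature), and the SimpleNamespace objects are stand-ins for captured requests so that the sketch runs without a browser.

```python
from types import SimpleNamespace


def collect_headers(requests, wanted):
    """Return the wanted headers from the first answered request that has all of them.

    Hypothetical helper, not part of seleniumwire. Returns None if no request matches.
    """
    wanted_lower = [name.lower() for name in wanted]
    for request in requests:
        # Skip requests that never received a response.
        if request.response is None:
            continue
        # Normalise header names so the lookup is case-insensitive.
        headers = {key.lower(): value for key, value in request.headers.items()}
        if all(name in headers for name in wanted_lower):
            return {name: headers[name] for name in wanted_lower}
    return None


# Stand-in objects that mimic the .headers and .response attributes of a captured request.
captured = [
    SimpleNamespace(headers={"Accept": "*/*"}, response=object()),
    SimpleNamespace(headers={"X-Signature": "abc123", "X-Timestamp": "1655000000"}, response=None),
    SimpleNamespace(headers={"X-Signature": "abc123", "X-Timestamp": "1655000000"}, response=object()),
]

found = collect_headers(captured, ["x-signature", "x-timestamp"])
print(found)
```

With a real session, the same helper would presumably be called as collect_headers(driver.requests, [...]) after the wait has finished, passing the actual header names observed in the browser's network panel.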

Original address: Deta Blog 

Origin blog.csdn.net/qq_42200107/article/details/125256481