Getting started with Python crawlers 5: Simulating a browser to visit a website

☞ Go to the LaoYuanPython blog: https://blog.csdn.net/LaoYuanPython

1. Introduction

In the previous two sections, we introduced how to use Google Chrome and IE to capture the HTTP message information exchanged when visiting a website. This section shows how to use that information to construct HTTP request headers in a Python application, so that the application simulates a browser visiting the website. The headers used in this section were captured with Google Chrome, so the application is in effect imitating Chrome's access to the site. The principle for IE is the same, and you can work through it yourself.
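The idea can be sketched with the standard library's urllib.request. This is only a minimal illustration, not necessarily the approach used later in this series: the header values in BROWSER_HEADERS are hypothetical placeholders, and the helper names build_browser_request and fetch are made up for this example. Replace the values with the ones copied from your own browser.

```python
import urllib.request

# Hypothetical header values for illustration only;
# copy the real ones from your own browser's developer tools.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
    ),
    "Accept": "text/html",
    "Accept-Language": "zh-CN,zh;q=0.9",
}


def build_browser_request(url, headers=BROWSER_HEADERS):
    """Return a Request object that carries browser-like headers."""
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch(url, timeout=10):
    """Send the request and return the decoded body (needs network access)."""
    req = build_browser_request(url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# Building the request does not touch the network, so it is safe to inspect.
req = build_browser_request("https://blog.csdn.net/LaoYuanPython")
print(req.get_method(), req.full_url)
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Calling fetch("https://blog.csdn.net/LaoYuanPython") would actually send the request. Whether the site accepts it depends on the headers it checks, so treat the header set above as a starting point.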

2. Obtain the HTTP request header information from the browser

Use the method introduced in "Python crawler 3: Use Google browser to obtain http information for website visits" (https://blog.csdn.net/LaoYuanPython/article/details/113055084) to copy the HTTP request header information of the visited website. Taking a visit to https://blog.csdn.net/LaoYuanPython as the example, the content of the request headers obtained is as follows (only part of the cookie information is shown; the rest is replaced by an ellipsis):

:authority: blog.csdn.net
:method: GET
:path: /LaoYuanPython
:scheme: https
accept: text/html,application/xht
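Header text copied from the browser like this has to be turned into a Python dictionary before it can be used in a request. The sketch below shows one possible way to do that. The function name parse_copied_headers is hypothetical, and the accept value in RAW_HEADERS is deliberately shortened to a placeholder, not the full value sent by the browser. Names that begin with a colon (:authority, :method, :path, :scheme) are HTTP/2 pseudo-headers. They describe the request line rather than ordinary header fields, so the sketch keeps them separate and uses them only to rebuild the URL.

```python
# Sample text in the shape copied from the browser.
# The accept value is a shortened placeholder, not the real full value.
RAW_HEADERS = """\
:authority: blog.csdn.net
:method: GET
:path: /LaoYuanPython
:scheme: https
accept: text/html
"""


def parse_copied_headers(raw):
    """Split copied header text into (pseudo_headers, regular_headers)."""
    pseudo, headers = {}, {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        is_pseudo = line.startswith(":")
        # Split on the first colon after any leading pseudo-header colon,
        # so values that contain colons (such as URLs) stay intact.
        name, sep, value = line.lstrip(":").partition(":")
        if not sep:
            continue  # not a "name: value" line
        target = pseudo if is_pseudo else headers
        target[name.strip().lower()] = value.strip()
    return pseudo, headers


pseudo, headers = parse_copied_headers(RAW_HEADERS)
url = f"{pseudo['scheme']}://{pseudo['authority']}{pseudo['path']}"
print(url)
print(headers)
```

The resulting headers dictionary contains only the regular header fields, which is the form that libraries such as urllib.request expect for a headers argument.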

Origin blog.csdn.net/LaoYuanPython/article/details/113063101