[Python3 crawler] 15_Fiddler packet capture analysis

Sometimes we need to crawl information that does not appear in the page's source code, such as the comments on a Taobao product, because the page loads it through separate requests.

We can use the packet capture tool Fiddler to find those requests.

Software download address: https://pan.baidu.com/s/1nPKPwrdfXM62LlTZsoiDsg Password: wche

Installation is not covered in detail here; just keep clicking Next.

After the installation is complete, run the program as follows:

image

Set the proxy:

image
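A Python crawler can be sent through the same proxy so that its requests show up in Fiddler too. The following is a minimal sketch using only the standard library. It assumes Fiddler is listening on 127.0.0.1:8888, which is its usual default; the helper names `fiddler_proxies` and `build_fiddler_opener` are made up for this illustration.

```python
import urllib.request

# Assumption: Fiddler listens on its usual default address and port.
# Check the Connections tab in Fiddler's options if yours differs.
FIDDLER_HOST = "127.0.0.1"
FIDDLER_PORT = 8888


def fiddler_proxies(host=FIDDLER_HOST, port=FIDDLER_PORT):
    """Return a proxy mapping that sends HTTP and HTTPS traffic to Fiddler."""
    address = f"http://{host}:{port}"
    return {"http": address, "https": address}


def build_fiddler_opener(host=FIDDLER_HOST, port=FIDDLER_PORT):
    """Build a urllib opener whose requests pass through Fiddler."""
    handler = urllib.request.ProxyHandler(fiddler_proxies(host, port))
    return urllib.request.build_opener(handler)


# Usage (requires Fiddler to be running):
#   opener = build_fiddler_opener()
#   with opener.open("http://example.com/", timeout=10) as response:
#       print(response.status)
```

With this in place, plain HTTP requests from the script should appear in Fiddler's session list. HTTPS requests will fail certificate verification until Python is told to trust Fiddler's root certificate.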

By default, Fiddler captures only HTTP traffic and cannot decrypt HTTPS traffic, but the pages we want to crawl are often served over HTTPS.

[Tools]---[Options]---[HTTPS], set as follows:

image

While changing these settings you may see the following prompt; just click Yes

image

 

Then click [Actions]---[Export Root Certificate to Desktop]

image

image

After clicking OK, the certificate file appears on the desktop

image
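The exported certificate is also what a Python script needs in order to trust Fiddler's HTTPS interception. The sketch below converts it to PEM text, which Python's `ssl` module can load. It assumes the exported file is DER-encoded and is named FiddlerRoot.cer; both the file names and the helper names are assumptions for illustration.

```python
import ssl
from pathlib import Path


def der_bytes_to_pem(der_bytes):
    """Convert DER-encoded certificate bytes to PEM text."""
    return ssl.DER_cert_to_PEM_cert(der_bytes)


def convert_certificate_file(der_path, pem_path):
    """Read a DER certificate file and write it back out as PEM."""
    pem_text = der_bytes_to_pem(Path(der_path).read_bytes())
    Path(pem_path).write_text(pem_text, encoding="ascii")
    return pem_text


# Usage (paths are hypothetical; adjust them to your desktop location):
#   convert_certificate_file("FiddlerRoot.cer", "FiddlerRoot.pem")
#   context = ssl.create_default_context(cafile="FiddlerRoot.pem")
#   # Pass `context` to urllib.request.HTTPSHandler(context=context)
#   # when building an opener that goes through Fiddler.
```

Trusting this certificate only makes sense while debugging through Fiddler; a crawler that connects directly does not need it.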

Next, we import the certificate into the browser. Open Firefox:

image

[Options]---[Privacy and Security]---[Certificate]---[View Certificate]---[Import]

image

image

image

 

With these settings in place, we refresh the URL: https://www.taobao.com/

image

Looking at Fiddler, nothing has been captured

image

So what's the problem?

Press Win+R, enter certmgr.msc, and press Enter to open the certificate manager

image

[Action]---[Find Certificates]

image

The search returns many certificates. Select all of the Fiddler certificates that were found, right-click, and delete them

After the deletion is complete, the window looks like this:

image

Next, delete the Fiddler certificates in Firefox as well

[Options]---[Privacy and Security]---[Certificate]---[View Certificate]

image

Find the Fiddler certificates whose names start with DO_NOT and delete them

Check the [Personal], [Server], and [Other] tabs in turn and delete the Fiddler certificates there too

After the certificates are deleted, download the file from the link below:

https://files.cnblogs.com/files/OliverQin/fiddlercertmaker.zip

After downloading, unzip it, run the extracted file directly, and ignore any error it reports.

image

Then restart Fiddler. After it restarts, open the comments of any product

image

First clear the sessions already listed in Fiddler, then refresh the comments

The page I refreshed looks like this:

image

After refreshing, check Fiddler again

image

We can see that Fiddler now captures the request. If the settings above still do not work, open the directory where Fiddler is installed (for example, D:\soft\fiddler)

Go to that directory in cmd and run the following command:

makecert.exe -r -ss my -n "CN=DO_NOT_TRUST_FiddlerRoot, O=DO_NOT_TRUST, OU=Created by http://www.fiddler2.com" -sky signature -eku 1.3.6.1.5.5.7.3.1 -h 1 -cy authority -a sha1 -m 120 -b 09/05/2012

The result looks like this:

image

After the command finishes, export the CA certificate and import it into the browser again.
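Once the comment request shows up in Fiddler, its URL and response can be copied into a Python script. Comment interfaces of this kind often wrap their JSON in a JSONP callback, so the sketch below strips such a wrapper before parsing. The callback name and the field names in the sample are invented for illustration; read the real response layout from Fiddler's inspectors.

```python
import json
import re

# Matches "callbackName( ... )" with an optional trailing semicolon.
_JSONP_PATTERN = re.compile(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)


def parse_jsonp(text):
    """Parse a JSONP response body, or plain JSON if there is no wrapper."""
    match = _JSONP_PATTERN.match(text)
    payload = match.group(1) if match else text
    return json.loads(payload)


# Hypothetical response body, shaped like one copied out of Fiddler.
sample_body = 'jsonp_callback({"comments": [{"user": "a***b", "content": "Good quality"}]})'
data = parse_jsonp(sample_body)
for comment in data["comments"]:
    print(comment["user"], comment["content"])
```

Keeping the parsing separate from the download step makes it easy to test against a response body saved from Fiddler before sending any live requests.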
