Python crawler selections, episode 08 (the HTTP proxy tool Fiddler)

Fiddler, the HTTP proxy tool

One. Fiddler definition

Fiddler is a powerful web debugging tool that records all HTTP requests exchanged between the client and the server. When Fiddler starts, it sets the IE proxy to 127.0.0.1:8888 by default; other browsers need to be configured manually.

Two. Working principle

Fiddler works as a web proxy server, using the address 127.0.0.1 and port 8888. Requests from the client pass through Fiddler on their way to the server, and responses return along the same path.
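Since this series is about Python crawlers, here is a minimal sketch of how a script could send its own traffic through that address using only the standard library. The helper names `fiddler_proxies` and `build_fiddler_opener` are made up for this illustration; the host and port are the defaults mentioned above.

```python
import urllib.request

# Defaults taken from the text: Fiddler listens on 127.0.0.1:8888.
FIDDLER_HOST = "127.0.0.1"
FIDDLER_PORT = 8888


def fiddler_proxies(host=FIDDLER_HOST, port=FIDDLER_PORT):
    """Return a proxy mapping that sends HTTP and HTTPS traffic to Fiddler."""
    address = f"http://{host}:{port}"
    return {"http": address, "https": address}


def build_fiddler_opener(host=FIDDLER_HOST, port=FIDDLER_PORT):
    """Build a urllib opener whose requests go through the proxy (hypothetical helper)."""
    handler = urllib.request.ProxyHandler(fiddler_proxies(host, port))
    return urllib.request.build_opener(handler)
```

With Fiddler running, calling `build_fiddler_opener().open(url)` on a plain HTTP URL should make the request appear in the session list. HTTPS URLs additionally need the certificate setup covered in the next section.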

Three. Settings for capturing HTTPS with Fiddler

  1. Start Fiddler and choose Tools > Telerik Fiddler Options in the menu bar to open the "Fiddler Options" dialog box.

  2. Set up Fiddler:

    Open Tools -> Fiddler Options -> HTTPS.
    Select Capture HTTPS CONNECTs (capture HTTPS connections).
    Select Decrypt HTTPS traffic (decrypt HTTPS traffic).
    Because we want Fiddler to capture the HTTPS requests of every process on the machine, select …from all processes in the drop-down menu in the middle.
    Select Ignore server certificate errors below it.

  3. Configure Windows to trust Fiddler's root certificate, which resolves the security warning: Trust Root Certificate.

  4. In the Fiddler main menu, open Tools -> Fiddler Options… -> Connections.

    Select Allow remote computers to connect.
    Select Act as system proxy on startup.

  5. Restart Fiddler so that the configuration takes effect (this step is important and must not be skipped).
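After the restart, a quick way to confirm that the proxy is actually accepting connections is to try opening a TCP connection to it. The sketch below assumes the default address 127.0.0.1:8888 from the earlier sections. The function name `is_proxy_listening` is invented for this example, and it only checks that something is listening on the port, not that it is Fiddler.

```python
import socket


def is_proxy_listening(host="127.0.0.1", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `is_proxy_listening()` returns False right after the restart, the Connections settings from step 4 are worth rechecking.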

Four. How Fiddler captures the Chrome session

  1. Install SwitchyOmega, a proxy management plug-in for the Chrome browser.

  2. Set the proxy server to 127.0.0.1:8888.

  3. Switch to the proxy you just set up through the browser plug-in.
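A Python script has no plug-in to click, but a rough equivalent of switching proxy profiles is to set the `http_proxy` and `https_proxy` environment variables, which `urllib` (and many other HTTP clients) read by default. The context manager below is a sketch under that assumption; `use_proxy` is a hypothetical name, and the address is the one configured above.

```python
import contextlib
import os
import urllib.request


@contextlib.contextmanager
def use_proxy(address="http://127.0.0.1:8888"):
    """Temporarily point the http_proxy/https_proxy variables at `address`."""
    names = ("http_proxy", "https_proxy")
    saved = {name: os.environ.get(name) for name in names}
    os.environ.update({name: address for name in names})
    try:
        yield
    finally:
        # Restore whatever was set before entering the block.
        for name, value in saved.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
```

Inside a `with use_proxy():` block, `urllib.request.getproxies_environment()` reports the Fiddler address for both schemes; leaving the block restores the previous settings, much like switching back to a direct connection in the plug-in.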

Five. Fiddler interface

  • Once this is set up, local HTTP traffic passes through the 127.0.0.1:8888 proxy and is captured by Fiddler.

Six. Detailed explanation of the request part

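Headers - shows the headers of the HTTP request sent from the client to the server as a hierarchical view, including web client information, cookies, transfer status, and so on.
TextView - shows the body of a POST request as text.
WebForms - shows the GET parameters and the POST body content of the request.
HexView - shows the request as hexadecimal data.
Auth - shows the Proxy-Authorization (proxy authentication) and Authorization (authorization) information in the request headers.
Raw - shows the entire request as plain text.
JSON - shows JSON-formatted content.
XML - if the body of the request is in XML format, shows it as a hierarchical XML tree.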
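To make the tabs above more concrete, here is a small sketch that splits a raw request (what the Raw tab shows) into roughly the pieces the other tabs display. It is a simplification for illustration, not how Fiddler itself is implemented; the function `split_request` and the sample request are invented for this example.

```python
from urllib.parse import parse_qs, urlsplit

# Invented sample: a form POST with one GET parameter.
SAMPLE_REQUEST = (
    b"POST /login?next=%2Fhome HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"\r\n"
    b"user=alice&pw=123"
)


def split_request(raw: bytes):
    """Split a raw HTTP/1.x request into the parts the inspectors display."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    form = {}
    if headers.get("content-type", "").startswith("application/x-www-form-urlencoded"):
        form = parse_qs(body.decode("utf-8"))
    return {
        "method": method,
        "headers": headers,                         # Headers
        "query": parse_qs(urlsplit(target).query),  # WebForms: GET parameters
        "form": form,                               # WebForms: POST body
        "text": body.decode("utf-8", "replace"),    # TextView
        "hex": body.hex(" "),                       # HexView (separator needs Python 3.8+)
    }
```

For the sample above, `split_request(SAMPLE_REQUEST)["query"]` is `{"next": ["/home"]}` and the `"form"` entry holds the `user` and `pw` fields.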

Seven. Detailed explanation of the response (Response) part

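Transformer - shows the encoding information of the response (for example, compression and chunked transfer).
Headers - shows the response headers as a hierarchical view.
TextView - shows the response body as text.
ImageView - if the response is an image resource, shows the image.
HexView - shows the response as hexadecimal data.
WebView - previews how the response would look in a web browser.
Auth - shows the authentication information in the response headers: Proxy-Authenticate (proxy authentication) and WWW-Authenticate, the counterparts of the Proxy-Authorization and Authorization request headers.
Caching - shows the caching information for this response.
Privacy - shows the privacy (P3P) information for this response.
Raw - shows the entire response as plain text.
JSON - shows JSON-formatted content.
XML - if the body of the response is in XML format, shows it as a hierarchical XML tree.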
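The response tabs can be sketched in the same spirit. The example below takes a raw response, undoes gzip compression when the Content-Encoding header asks for it (roughly what the Transformer tab reports), and parses a JSON body (roughly what the JSON tab shows). The function `decode_response` and the sample data are assumptions made for this illustration, and only gzip is handled.

```python
import gzip
import json

# Invented sample: a gzip-compressed JSON response.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Encoding: gzip\r\n"
    b"\r\n"
    + gzip.compress(json.dumps({"ok": True}).encode("utf-8"))
)


def decode_response(raw: bytes):
    """Split a raw HTTP/1.x response and decode a gzip and/or JSON body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status = int(lines[0].split(" ", 2)[1])  # e.g. "HTTP/1.1 200 OK" -> 200
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    if headers.get("content-encoding") == "gzip":
        body = gzip.decompress(body)         # Transformer: remove compression
    text = body.decode("utf-8", "replace")   # TextView
    data = None
    if headers.get("content-type", "").startswith("application/json"):
        data = json.loads(text)              # JSON
    return {"status": status, "headers": headers, "text": text, "json": data}
```

Running `decode_response(SAMPLE_RESPONSE)` gives a status of 200 and a `"json"` entry equal to `{"ok": True}`.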

Origin blog.csdn.net/weixin_38640052/article/details/108115629