requests module: using the proxies parameter

Understand proxies and the proxies parameter

What is a proxy?

  • A proxy is an intermediary server that acts as a bridge between the client and the target server.
  • The proxy server helps forward the client's request and returns the target server's response to the client.

What is a proxy IP?

  • A proxy IP is the IP address at which a proxy server can be reached.
  • When the proxies parameter is set to a proxy IP, the client's request is sent to the proxy server instead of going directly to the target server.

The role of the proxies parameter

  • The proxies parameter tells the client which proxy server to use for its network requests.
  • Because the target server then sees the proxy's address rather than ours, our real IP address and location are hidden, which improves privacy.

Proxy usage scenarios

  • Data collection and web crawlers: rotating requests across several proxy IPs helps prevent a single IP from being blocked.
  • Access to restricted websites: proxies let us bypass some regional or network restrictions and reach blocked content.
  • Improved security: a proxy can filter malicious content before it reaches the client.

Summary

Proxy IPs and the proxies parameter let us add an intermediate layer to network requests, which hides the client's real identity and improves privacy and security. Typical applications are data collection, access to restricted websites, and network security.


The difference between forward proxy and reverse proxy

  1. Difference in perspective:

    • A forward proxy is a proxy server that forwards requests for the browser or client (the party that sent the request).
    • A reverse proxy is a proxy server that forwards requests for the server that ultimately handles them.
  2. Forward proxy:

    • A forward proxy forwards requests on behalf of the browser or client. The client knows the real address of the server that ultimately handles the request. A VPN is a typical example.
    Analogy: a forward proxy is like hiring a middleman (agent) to shop for you. You tell the middleman what you want to buy, and the middleman buys it and hands it over to you. You know exactly what is being bought and where; the middleman just does the buying for you.
  3. Reverse proxy:

    • A reverse proxy forwards requests not on behalf of the browser or client, but on behalf of the server that ultimately handles the request.

    • The client does not know the real address of that server. It sends the request to the reverse proxy, which forwards it to the server that actually processes it. nginx is a typical example.

      Analogy: a reverse proxy is like a clerk at a store's front desk. When you shop there, you deal only with the clerk, who fetches the goods from a back room you never see and hands them to you. The clerk acts as a middleman, hiding where the goods actually come from.

With a forward proxy, the proxy server sits between the client and the target server and hides the real identity of the client: the client sends its request through the proxy, and the proxy forwards it to the target server. With a reverse proxy, the proxy server sits in front of the target server and hides the real identity of the server: the client sends its request to the reverse proxy, which forwards it to the target server. These different roles are why forward and reverse proxies are used in different scenarios.

Classification of proxy IPs (proxy servers)

  1. By degree of anonymity, proxy IPs fall into three categories:

    • Transparent proxy: the target server sees the proxy's IP address as the source of the connection, but the proxy also passes your real IP along, so the server can still find out who you are. The request headers received by the target server are as follows:

      # How a server can work out the client's real IP address when a proxy may be in the path

      # Address the connection came from: the proxy's IP
      REMOTE_ADDR = Proxy IP

      # HTTP_VIA header, commonly used to indicate that the request passed through a proxy
      HTTP_VIA = Proxy IP

      # HTTP_X_FORWARDED_FOR header, commonly used to carry the client's real IP address (it can be forged)
      HTTP_X_FORWARDED_FOR = Your IP
      
      
    • Anonymous proxy: others can tell that you are using a proxy, but cannot tell who you are. The request headers received by the target server are as follows:

      # Example: headers seen by the target server when an anonymous proxy is used.
      # The client's real IP address is replaced by the proxy server's IP address.

      # Address the connection came from: the proxy's IP
      REMOTE_ADDR = Proxy IP

      # HTTP_VIA is set, so the server knows the request passed through a proxy
      HTTP_VIA = Proxy IP

      # HTTP_X_FORWARDED_FOR normally carries the client's real IP, but an anonymous proxy puts its own IP here
      HTTP_X_FORWARDED_FOR = Proxy IP
      
      
    • High-anonymity proxy (elite proxy): the target server cannot tell that you are using a proxy at all, which makes this the best choice when anonymity matters. The request headers received by the target server are as follows:

      # Example: headers seen by the target server when a high-anonymity proxy is used.
      # Both the client's real IP address and the fact that a proxy is in use are hidden.

      # Address the connection came from: the proxy's IP
      REMOTE_ADDR = Proxy IP

      # HTTP_VIA is not set, so the request does not appear to have passed through a proxy
      HTTP_VIA = not determined

      # HTTP_X_FORWARDED_FOR is not set, so the client's real IP cannot be determined
      HTTP_X_FORWARDED_FOR = not determined
      
      
  2. By protocol: choose a proxy service that matches the protocol used by the website. The proxy services for the different protocols and their characteristics are:

    • HTTP proxy: suitable for accessing websites over HTTP.
    • HTTPS proxy: suitable for accessing websites over HTTPS, which offers higher security.
    • SOCKS tunnel proxy (for example, a SOCKS5 proxy):
      1. A SOCKS proxy only relays data packets and does not care about the application protocol (FTP, HTTP, HTTPS, and so on).
      2. Compared with HTTP and HTTPS proxies, SOCKS proxies usually add less overhead per request.
      3. A SOCKS proxy can forward both HTTP and HTTPS requests, which makes it more flexible.

    Which proxy service is right depends on the protocol and the performance requirements of the websites being visited.
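
    The three anonymity levels in item 1 can be told apart from the headers alone. The following is a minimal sketch, assuming you are testing your own proxy against a server you control (so you know your real IP) and that the headers arrive as a dictionary with the CGI-style names used in the listings. The function name classify_proxy and the IP addresses in the usage lines are hypothetical.

```python
def classify_proxy(headers, client_ip):
    """Guess a proxy's anonymity level from the headers the target server sees.

    `headers` maps CGI-style names (HTTP_VIA, HTTP_X_FORWARDED_FOR) to values.
    A missing key plays the role of "not determined" in the listings above.
    `client_ip` is your own real IP, known because you run the test yourself.
    """
    via = headers.get("HTTP_VIA")
    forwarded_for = headers.get("HTTP_X_FORWARDED_FOR")

    if via is None and forwarded_for is None:
        # No proxy-related headers at all: high-anonymity proxy, or no proxy
        return "high anonymity (or no proxy)"

    # X-Forwarded-For may hold a comma-separated chain of addresses
    forwarded_ips = [ip.strip() for ip in (forwarded_for or "").split(",")]
    if client_ip in forwarded_ips:
        # The real client IP leaked through: transparent proxy
        return "transparent"

    # A proxy is visible, but the real client IP is not
    return "anonymous"


# Usage with placeholder addresses (hypothetical values)
seen = {
    "REMOTE_ADDR": "12.34.56.79",
    "HTTP_VIA": "12.34.56.79",
    "HTTP_X_FORWARDED_FOR": "203.0.113.5",
}
print(classify_proxy(seen, client_ip="203.0.113.5"))
```

    Because these headers can be forged or stripped, the result is only a hint about how the proxy behaves, not proof.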

Using the proxies parameter

To keep the server from recognizing that all requests come from the same client, and to avoid being blocked for requesting one domain too frequently, we need proxy IPs. Next we look at how the requests module uses them.

  • usage:

    response = requests.get(url, proxies=proxies)
    
  • The form of proxies: dictionary

  • For example:

    proxies = {
        "http": "http://12.34.56.79:9527",
        "https": "https://12.34.56.79:9527",
    }
    
  • Note: if the proxies dictionary contains several key-value pairs, the proxy whose key matches the protocol (scheme) of the requested URL is used when the request is sent.
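
The scheme-based selection in the note, and the idea of spreading requests over several proxy IPs, can be sketched as follows. This is an illustration under assumptions, not the requests implementation: proxy_for_url only mimics the lookup by URL scheme, PROXY_POOL and fetch_with_rotation are hypothetical names, and the addresses are placeholders in the style of the example above.

```python
import random
from urllib.parse import urlparse

# Hypothetical pool of proxies dictionaries (placeholder addresses)
PROXY_POOL = [
    {"http": "http://12.34.56.79:9527", "https": "https://12.34.56.79:9527"},
    {"http": "http://12.34.56.80:9527", "https": "https://12.34.56.80:9527"},
]


def proxy_for_url(url, proxies):
    """Return the entry whose key matches the URL's scheme, or None."""
    return proxies.get(urlparse(url).scheme)


def fetch_with_rotation(url, pool, attempts=3, timeout=5, get=None):
    """Try up to `attempts` randomly chosen proxies and return the first response."""
    if get is None:
        # Third-party dependency, imported lazily so the helper above
        # can be used without requests installed
        import requests
        get = requests.get

    last_error = None
    for _ in range(attempts):
        proxies = random.choice(pool)
        try:
            return get(url, proxies=proxies, timeout=timeout)
        except OSError as exc:
            # requests' exceptions (ProxyError, timeouts, ...) derive from OSError
            last_error = exc
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


print(proxy_for_url("https://example.com/page", PROXY_POOL[0]))
```

Passing a timeout matters here because free proxies are often slow or dead; without one, a single bad proxy can stall the whole crawl.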


Origin blog.csdn.net/m0_67268191/article/details/132144420