Common Python operations and maintenance modules (7): the Web detection module pycurl

1. Module introduction

pycurl (http://pycurl.sourceforge.net) is a Python interface to libcurl, which is written in C. It is very powerful and supports protocols such as FTP, HTTP, HTTPS and TELNET. It can be thought of as a Python wrapper around the functionality of the Linux curl command, and it is easy to use. This section calls the methods pycurl provides to probe the quality of a Web service: the HTTP status code of the response, request latency, HTTP header information, download speed and so on. With this information you can locate the specific stage where a service responds slowly. The details are explained below.

2. Common module methods

pycurl.Curl() creates a Curl handle object that wraps libcurl; it takes no parameters. For more about libcurl, see http://curl.haxx.se/libcurl/c/libcurltutorial.html.

Several commonly used methods of the Curl object are described below.

· close() method: corresponds to curl_easy_cleanup in libcurl. It takes no parameters and closes and releases the Curl object.
· perform() method: corresponds to curl_easy_perform in libcurl. It takes no parameters and submits the request configured on the Curl object.
· setopt(option, value) method: corresponds to curl_easy_setopt in libcurl. The option parameter is a libcurl constant, and the type of value depends on the option: it can be a string, an integer, a long integer, a file object, a list, a function and so on. Some commonly used constants are listed below:

c = pycurl.Curl()    # create a Curl object
c.setopt(pycurl.CONNECTTIMEOUT, 5)    # connection timeout in seconds (0 means use libcurl's default)
c.setopt(pycurl.TIMEOUT, 5)    # timeout for the whole request, in seconds
c.setopt(pycurl.NOPROGRESS, 0)    # whether to suppress the download progress meter; 0 shows it, non-zero suppresses it
c.setopt(pycurl.MAXREDIRS, 5)    # maximum number of HTTP redirects to follow
c.setopt(pycurl.FORBID_REUSE, 1)    # force the connection closed after the transfer instead of reusing it
c.setopt(pycurl.FRESH_CONNECT, 1)    # force a new connection instead of reusing a cached one
c.setopt(pycurl.DNS_CACHE_TIMEOUT, 60)    # how long to cache DNS results, in seconds (libcurl's documented default is 60)
c.setopt(pycurl.URL, "http://www.baidu.com")    # URL to request
c.setopt(pycurl.USERAGENT, "Mozilla/5.2 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50324)")    # User-Agent request header
c.setopt(pycurl.HEADERFUNCTION, getHeader)    # pass the returned HTTP headers to the callback function getHeader
c.setopt(pycurl.WRITEFUNCTION, getBody)    # pass the returned content to the callback function getBody
c.setopt(pycurl.WRITEHEADER, fileobj)    # write the returned HTTP headers to the file object fileobj
c.setopt(pycurl.WRITEDATA, fileobj)    # write the returned HTML content to the file object fileobj

· getinfo(option) method: corresponds to curl_easy_getinfo in libcurl. The option parameter is a libcurl constant. Some commonly used constants are listed below:

c = pycurl.Curl()    # create a Curl object
c.getinfo(pycurl.HTTP_CODE)    # HTTP status code of the response
c.getinfo(pycurl.TOTAL_TIME)    # total time until the transfer finished
c.getinfo(pycurl.NAMELOOKUP_TIME)    # time until DNS resolution completed
c.getinfo(pycurl.CONNECT_TIME)    # time until the connection was established
c.getinfo(pycurl.PRETRANSFER_TIME)    # time until the transfer was ready to begin
c.getinfo(pycurl.STARTTRANSFER_TIME)    # time until the first byte of the response was received
c.getinfo(pycurl.REDIRECT_TIME)    # time spent on redirects
c.getinfo(pycurl.SIZE_UPLOAD)    # size of the uploaded data, in bytes
c.getinfo(pycurl.SIZE_DOWNLOAD)    # size of the downloaded data, in bytes
c.getinfo(pycurl.SPEED_DOWNLOAD)    # average download speed
c.getinfo(pycurl.SPEED_UPLOAD)    # average upload speed
c.getinfo(pycurl.HEADER_SIZE)    # size of the HTTP headers, in bytes
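The setopt listing refers to the callbacks getHeader and getBody without showing them. The following is a minimal sketch of what such callbacks could look like. The ResponseCollector class and the fetch helper are hypothetical names introduced only for illustration, and the sketch assumes a single response without redirects (with redirects, headers from every hop would be merged).

```python
from io import BytesIO


class ResponseCollector(object):
    """Hypothetical helper: stores what libcurl passes to the callbacks."""

    def __init__(self):
        self.status_line = None
        self.headers = {}
        self.body = BytesIO()

    def get_header(self, line):
        # libcurl calls this once per raw header line (bytes).
        text = line.decode("iso-8859-1").strip()
        if not text:
            return  # blank line that ends the header block
        if text.startswith("HTTP/"):
            self.status_line = text
            return
        if ":" in text:
            name, value = text.split(":", 1)
            self.headers[name.strip().lower()] = value.strip()

    def get_body(self, chunk):
        # libcurl calls this for each chunk of the response body.
        self.body.write(chunk)


def fetch(url):
    # Imported here so the collector can be exercised without pycurl installed.
    import pycurl

    collector = ResponseCollector()
    c = pycurl.Curl()
    c.setopt(pycurl.URL, url)
    c.setopt(pycurl.HEADERFUNCTION, collector.get_header)
    c.setopt(pycurl.WRITEFUNCTION, collector.get_body)
    try:
        c.perform()
    finally:
        c.close()
    return collector
```

Calling fetch("http://www.baidu.com") would return a collector whose headers dictionary and body buffer hold the response. Callbacks suit in-memory processing, while WRITEHEADER and WRITEDATA are simpler when the output only needs to go to a file.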

We can use the constants that libcurl provides to measure the quality of a Web service.
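libcurl's timers are cumulative: each one is measured from the start of the request, so the time spent in a single stage is the difference between two adjacent timers. The sketch below shows one way to split them up and find the slowest stage. The function names, phase labels and sample values are made up for illustration, and it assumes a single request without redirects.

```python
def phase_durations(namelookup, connect, pretransfer, starttransfer, total):
    """Split libcurl's cumulative timers (in seconds) into per-stage durations."""
    return {
        "dns": namelookup,                      # name resolution
        "connect": connect - namelookup,        # TCP connection setup
        "pretransfer": pretransfer - connect,   # e.g. TLS handshake
        "wait": starttransfer - pretransfer,    # waiting for the first byte
        "download": total - starttransfer,      # receiving the response
    }


def slowest_phase(phases):
    """Return the name of the stage that took the longest."""
    return max(phases, key=phases.get)


# Sample values in seconds, as getinfo() might return them.
phases = phase_durations(0.02, 0.05, 0.05, 0.30, 0.35)
print(slowest_phase(phases))
```

With these sample values most of the time is spent waiting for the first byte, which would point at server-side processing, not DNS or the network.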

3. Practice: implementing Web service quality detection

HTTP is one of the most popular Internet applications, and the quality of a service directly affects user experience and reflects the operational level of a site. Two criteria are most commonly used. The first is availability: whether the service is in a normal serving state, without 404 page-not-found or 500 error pages. The second is response speed: for example, static files should download within milliseconds, and dynamic CGI should respond within seconds. This example uses pycurl's setopt and getinfo methods to probe HTTP service quality. It obtains the HTTP status code returned by the monitored URL through the pycurl.HTTP_CODE constant, and the time taken by each stage from the HTTP request to the completed download through pycurl.NAMELOOKUP_TIME, pycurl.CONNECT_TIME, pycurl.PRETRANSFER_TIME, pycurl.R and other constants. It also uses pycurl.WRITEHEADER and pycurl.WRITEDATA to capture the HTTP response headers and page content of the target URL. The source code is as follows:
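The two criteria above, availability and response speed, can be turned into a simple pass/fail check once the status code and total time are known. This is only a sketch: the check_quality function and the thresholds passed to it are assumptions for illustration, not part of pycurl.

```python
def check_quality(http_code, total_time, max_seconds):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    if http_code == 0:
        # libcurl reports 0 when no HTTP response code was received.
        problems.append("no HTTP response received")
    elif http_code >= 400:
        problems.append("HTTP status %d" % http_code)
    if total_time > max_seconds:
        problems.append(
            "response took %.2f s (limit %.2f s)" % (total_time, max_seconds)
        )
    return problems


# Hypothetical limits: 0.5 s for a static file, 3 s for a dynamic page.
print(check_quality(200, 0.12, 0.5))
print(check_quality(500, 4.2, 3.0))
```

The values passed in would come from c.getinfo(pycurl.HTTP_CODE) and c.getinfo(pycurl.TOTAL_TIME) after perform() has returned.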

【/home/test/pycurl/simple1.py】

 

#!/usr/bin/python
# -*- coding: utf-8 -*-
#****************************************************************#
# ScriptName: simple01.py
# Author: BenjaminYang
# Create Date: 2019-06-02 01:37
# Modify Author: BenjaminYang
# Modify Date: 2019-06-02 01:37
# Function: 
#***************************************************************#

import os
import sys
import pycurl

URL = "http://www.google.com.hk"  # target URL to probe
c = pycurl.Curl()  # create a Curl object
c.setopt(pycurl.URL, URL)  # URL to request
c.setopt(pycurl.CONNECTTIMEOUT, 5)  # connection timeout, in seconds
c.setopt(pycurl.TIMEOUT, 5)  # timeout for the whole request, in seconds
c.setopt(pycurl.NOPROGRESS, 1)  # suppress the download progress meter
c.setopt(pycurl.FORBID_REUSE, 1)  # force the connection closed after the transfer, no reuse
c.setopt(pycurl.MAXREDIRS, 1)  # follow at most 1 HTTP redirect
c.setopt(pycurl.DNS_CACHE_TIMEOUT, 30)  # cache DNS results for 30 seconds
# Open a file in "wb" mode to store the returned HTTP headers and page content
indexfile = open(os.path.dirname(os.path.realpath(__file__)) + "/content.txt", "wb")
c.setopt(pycurl.WRITEHEADER, indexfile)  # write the returned HTTP headers to indexfile
c.setopt(pycurl.WRITEDATA, indexfile)  # write the returned HTML content to indexfile
try:
    c.perform()  # submit the request
except Exception as e:
    print("Connection error: " + str(e))
    indexfile.close()
    c.close()
    sys.exit()
NAMELOOKUP_TIME = c.getinfo(c.NAMELOOKUP_TIME)  # DNS resolution time
CONNECT_TIME = c.getinfo(c.CONNECT_TIME)  # time until the connection was established
PRETRANSFER_TIME = c.getinfo(c.PRETRANSFER_TIME)  # time until the transfer was ready to begin
STARTTRANSFER_TIME = c.getinfo(c.STARTTRANSFER_TIME)  # time until the first byte was received
TOTAL_TIME = c.getinfo(c.TOTAL_TIME)  # total transfer time
HTTP_CODE = c.getinfo(c.HTTP_CODE)  # HTTP status code
SIZE_DOWNLOAD = c.getinfo(c.SIZE_DOWNLOAD)  # size of the downloaded data
header_size = c.getinfo(c.HEADER_SIZE)  # size of the HTTP headers
SPEED_DOWNLOAD = c.getinfo(c.SPEED_DOWNLOAD)  # average download speed
# Print the results
print("HTTP status code: %s" % (HTTP_CODE))
print("DNS resolution time: %.2f ms" % (NAMELOOKUP_TIME * 1000))
print("Connection time: %.2f ms" % (CONNECT_TIME * 1000))
print("Pre-transfer time: %.2f ms" % (PRETRANSFER_TIME * 1000))
print("Start-transfer time: %.2f ms" % (STARTTRANSFER_TIME * 1000))
print("Total transfer time: %.2f ms" % (TOTAL_TIME * 1000))
print("Downloaded data size: %d bytes" % (SIZE_DOWNLOAD))
print("HTTP header size: %d bytes" % (header_size))
print("Average download speed: %d bytes/s" % (SPEED_DOWNLOAD))
# Close the file and the Curl object
indexfile.close()
c.close()

 

Origin www.cnblogs.com/benjamin77/p/10961773.html