Basic use of the third-party Python library Requests

Original link: http://www.cnblogs.com/Estate-47/p/9799332.html

Requests is an Apache2-licensed open-source HTTP library written in Python on top of urllib. It is more convenient than urllib and saves a lot of work, fully meeting the needs of HTTP testing. Requests is developed with the idioms of PEP 20 at its center, so for a Pythoner it is more natural than urllib.

Install it with pip:

pip install requests 
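A quick way to confirm the install succeeded (my own addition, not from the original post) is to import the library and print its version:

import requests

# if the import succeeds, the library is installed; __version__ shows which release
print(requests.__version__)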

First, the most basic GET request

import requests

req = requests.get('https://www.cnblogs.com/')  # an ordinary GET request
print(req.text)                      # decoded page text; the page declares <meta charset="utf-8" /> in its head
print(req.content)                   # raw bytes; Chinese characters may look garbled and need decoding
print(req.content.decode('utf-8'))   # decode the bytes explicitly with decode()
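The decoding step matters because requests guesses the page encoding from the response headers. A minimal sketch (my addition, not from the original post) of checking and overriding that guess before reading req.text:

import requests

req = requests.get('https://www.cnblogs.com/')
print(req.encoding)           # the encoding requests picked from the response headers
print(req.apparent_encoding)  # the encoding guessed from the body by charset detection
req.encoding = 'utf-8'        # override if the guess is wrong, then req.text decodes correctly
print(req.text[:200])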
The other HTTP methods use the same call style:

requests.get('https://github.com/timeline.json')   # GET request
requests.post('http://httpbin.org/post')            # POST request
requests.put('http://httpbin.org/put')              # PUT request
requests.delete('http://httpbin.org/delete')        # DELETE request
requests.head('http://httpbin.org/get')             # HEAD request
requests.options('http://httpbin.org/get')          # OPTIONS request

Not only is the GET method simple to use; the other methods all share the same unified interface style.
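As a small sketch of that unified style (my own example, not from the original post), every one of these calls returns the same Response object, so the attributes you read are identical regardless of the HTTP method:

import requests

res = requests.options('http://httpbin.org/get')
print(res.status_code)   # HTTP status code, e.g. 200
print(res.headers)       # response headers, a dict-like object
print(res.url)           # the final URL after any redirects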

Second, using POST for a page that requires a user name and password

 

import requests

postdata = {
    'name': 'Estate',
    'pass': '123456'
}  # must be a dictionary
req = requests.post('http://www.iqianyue.com/mypost', data=postdata)
print(req.text)  # the page returned after logging in

yonghu = req.content       # result of the user login, as bytes
f = open('1.html', 'wb')   # write the result to 1.html
f.write(yonghu)
f.close()

The page http://www.iqianyue.com/mypost requires a login, so we define a dictionary with the user name and password.

If the script runs without errors, the result is written to an HTML file:

<html>
<head>
<title>Post Test Page</title>
</head>

<body>
<form action="" method="post">
name:<input name="name" type="text" /><br>
passwd:<input name="pass" type="text" /><br>
<input name="" type="submit" value="submit" />
<br />
you input name is:estate<br>you input passwd is:123456</body>
</html>
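To see exactly what a POST sends, here is a small sketch (my addition, not from the original post) against httpbin.org, which simply echoes the submitted form fields back as JSON:

import requests

postdata = {'name': 'Estate', 'pass': '123456'}
req = requests.post('http://httpbin.org/post', data=postdata)
print(req.json()['form'])   # httpbin echoes the form fields: {'name': 'Estate', 'pass': '123456'}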

 

Third, using headers to get past anti-scraping

 

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'
}  # request headers to send with the request; more header fields can simply be added to the dictionary
req = requests.get('http://maoyan.com/board', headers=headers)  # pass the headers as an argument
print(req.text)

'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'

Some pages return 403 Forbidden when requested directly. We need to find the User-Agent in the browser's request headers, add it to the dictionary, and pass that dictionary as an argument when reading the page. Here it lets us crawl and run against the Maoyan movie board page.
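A minimal sketch of the difference (my addition, assuming the site blocks the default python-requests User-Agent), comparing the status codes with and without the header:

import requests

url = 'http://maoyan.com/board'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                         '(KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36'}

print(requests.get(url).status_code)                   # likely 403 without a browser User-Agent
print(requests.get(url, headers=headers).status_code)  # should be 200 with the header set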

 

Fourth, using cookies to skip login

 

import requests

f = open('cookies.txt', 'r')
# initialize the cookies dictionary
cookies = {}
# split the file contents on the ';' character and iterate over the resulting list
for line in f.read().split(';'):
    # strip() drops the space after each ';'; with the split count set to 1, each entry is cut into exactly two parts
    name, value = line.strip().split('=', 1)
    # add the entry to the cookies dictionary
    cookies[name] = value
url = 'https://www.cnblogs.com/'
res = requests.get(url, cookies=cookies)
data = res.content
f1 = open('bokeyuan.html', 'wb')
f1.write(data)
f1.close()
f.close()

First log in to the page with a user name and password to obtain the cookies, then copy and paste them into a new text file. Create an empty cookies dictionary and use a for loop to cut the cookie string: split it on the ';' character, then cut each field into a name and a value.

After running, the result is written to an HTML file named bokeyuan.html; opening that file shows the page as it appears after logging in.
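If you prefer not to copy cookies by hand, here is a sketch (my own alternative, not from the original post) using requests.Session, which stores the cookies set by a login POST and reuses them on later requests; the login URL and form fields below are placeholders:

import requests

s = requests.Session()
# hypothetical login endpoint and form fields; replace with the real ones for your site
s.post('https://example.com/login', data={'name': 'Estate', 'pass': '123456'})
# the session now carries any cookies set by the login response
res = s.get('https://example.com/profile')
print(res.status_code)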

 

Fifth, proxy IPs

import requests

proxies = {
    'http': '183.129.244.17:10080'
}
req = requests.get('https://www.taobao.com/', proxies=proxies)
print(req.text)

To avoid getting our IP blocked while scraping, we often use a proxy. Requests has a corresponding proxies parameter. We can find proxy IPs on the web and put the proxy's IP address and port into the dictionary; more IPs can simply be added to the dictionary. If the proxy requires an account and password, write it like this:

proxies = {
    "http": "http://user:[email protected]:3128/",
}
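To check whether the proxy is actually being used, a small sketch (my addition, not from the original post) against httpbin.org/ip, which reports the IP address the request appears to come from; the proxy IP from earlier may no longer be alive:

import requests

proxies = {'http': 'http://183.129.244.17:10080'}  # example proxy; replace with a working one
try:
    res = requests.get('http://httpbin.org/ip', proxies=proxies, timeout=5)
    print(res.json())   # should show the proxy's IP, not your own
except requests.exceptions.RequestException as e:
    print('proxy failed:', e)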

 

Sixth, timeout settings

import requests

req = requests.get('https://www.taobao.com/', timeout=1)
print(req.text)
timeout only applies to the connection process; it does not limit the download of the response body.
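Because of that, a common pattern (my sketch, not from the original post) is to pass a (connect, read) timeout tuple and catch the Timeout exception:

import requests

try:
    # the first value limits connecting, the second limits waiting for each chunk of the response
    req = requests.get('https://www.taobao.com/', timeout=(3, 10))
    print(req.status_code)
except requests.exceptions.Timeout:
    print('request timed out')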

The above covers the basic operations of the Requests library; further additions will follow......


 
