A basic understanding of urllib.request

Method 1:

import urllib.request

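response = urllib.request.urlopen('https://www.baidu.com/')  # sends the request and returns a response object
html = response.read()  # reads the response body

print(response)        # the response object's repr: its type and memory address
print(type(response))  # its type
print(html)            # the body, returned as a bytes object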

Output:

<http.client.HTTPResponse object at 0x0000000002DD19B0>
<class 'http.client.HTTPResponse'>
b'<html>\r\n<head>\r\n\t<script>\r\n\t\tlocation.replace(location.href.replace("https://","http://"));\r\n\t</script>\r\n</head>\r\n<body>\r\n\t<noscript><meta http-equiv="refresh" content="0;url=http://www.baidu.com/"></noscript>\r\n</body>\r\n</html>'
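The printed repr of the response says little about what came back. As a minimal sketch, the status code and headers can be read from the same object, and a with-statement closes the connection afterwards. So that it runs without Internet access, the sketch starts a throwaway local server with http.server; the names HelloHandler, fetch and demo, and the page body, are assumptions made for illustration and are not part of the original post.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves one tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the per-request log line


def fetch(url):
    """Open url and return (status, content_type, raw_bytes)."""
    # The with-statement closes the response when the block exits.
    with urllib.request.urlopen(url) as response:
        return response.status, response.headers.get("Content-Type"), response.read()


def demo():
    """Start the local server, fetch its page once, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: let the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("http://127.0.0.1:%d/" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, content_type, html = demo()
    print(status)
    print(content_type)
    print(html)
```

Against a real site, only fetch() is needed; the local server is there just to make the sketch self-contained.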

Method 2:

import urllib.request

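req = urllib.request.Request('https://www.baidu.com')  # nothing is sent yet; this only builds the request object
response = urllib.request.urlopen(req)  # sends the request and receives the response
b_page = response.read()  # reads the response body as bytes

page = b_page.decode('utf-8')  # decodes the bytes into a str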

print(type(req))
print(type(response))
print(type(b_page))
print(b_page)
print('\n--------------------------------\n\n')
print(page)
print('\n--------------------------------\n\n')
print(req)
print(response)


Output:

<class 'urllib.request.Request'>
<class 'http.client.HTTPResponse'>
<class 'bytes'>
b'<html>\r\n<head>\r\n\t<script>\r\n\t\tlocation.replace(location.href.replace("https://","http://"));\r\n\t</script>\r\n</head>\r\n<body>\r\n\t<noscript><meta http-equiv="refresh" content="0;url=http://www.baidu.com/"></noscript>\r\n</body>\r\n</html>'

--------------------------------


<html>
<head>
    <script>
        location.replace(location.href.replace("https://","http://"));
    </script>
</head>
<body>
    <noscript><meta http-equiv="refresh" content="0;url=http://www.baidu.com/"></noscript>
</body>
</html>

--------------------------------


<urllib.request.Request object at 0x00000000027A80F0>
<http.client.HTTPResponse object at 0x0000000002DD39B0>
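That a Request object is inert until it is handed to urlopen can be checked by inspecting it without any network access. A small sketch, using only attributes that urllib.request.Request provides:

```python
import urllib.request

# Building a Request does not touch the network; it only stores the pieces.
req = urllib.request.Request('https://www.baidu.com')

print(req.full_url)        # the URL exactly as given
print(req.host)            # host part parsed out of the URL
print(req.get_method())    # GET, because no data was supplied
print(req.data)            # None
print(req.header_items())  # no headers have been added to the request yet
```

Only the call to urllib.request.urlopen(req) opens a connection.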

Method 3:

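import urllib.parse
import urllib.request

url = ''    # the target URL is missing from the source; set it before running
values = {
    'act': 'login',
    'login[email]': '',
    'login[password]': '',
}
data = urllib.parse.urlencode(values).encode('utf-8')    # URL-encode the form fields, then convert to bytes (Python 3 requires a bytes body)
req = urllib.request.Request(url, data)    # build the request; supplying data makes it a POST
req.add_header('Referer', 'http://www.python.org/')    # add a header to the request
response = urllib.request.urlopen(req)    # send the request and receive the response
the_page = response.read()    # read the body, which is bytes
print(the_page.decode('utf-8'))    # decode the bytes into text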
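In Python 3, urlopen raises a TypeError when the POST body is a str, so the urlencoded string has to be converted to bytes before sending. The sketch below shows what the encoded form data looks like and how supplying data changes the request method. No request is sent, and the URL http://example.com/login is a placeholder assumption, since the post does not give the real one.

```python
import urllib.parse
import urllib.request

url = 'http://example.com/login'  # placeholder; the URL in the original post is not available
values = {
    'act': 'login',
    'login[email]': '',
    'login[password]': '',
}

query = urllib.parse.urlencode(values)  # str; [ and ] are percent-escaped
data = query.encode('utf-8')            # bytes, the type urlopen expects for a POST body

req = urllib.request.Request(url, data)
req.add_header('Referer', 'http://www.python.org/')

print(query)                      # act=login&login%5Bemail%5D=&login%5Bpassword%5D=
print(type(data))                 # <class 'bytes'>
print(req.get_method())           # POST, because data is not None
print(req.get_header('Referer'))  # http://www.python.org/
```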

Reposted from blog.csdn.net/abcdasdff/article/details/82057407