Learning Python sockets: running into 400 Bad Request

My first blog post

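I have wanted to try blogging for a long time, but for one reason or another I kept putting it off.
Today I hit a small problem while studying, and it made me realize that I need some way to record the pitfalls I fall into and the things I learn along the way.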

So let's start blogging!

Here is the problem: while learning Python socket programming, the following code gave me trouble.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import socket
import ssl
# s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s = ssl.wrap_socket(socket.socket())
s.connect(('www.sina.com.cn', 443))

s.send(b'GET/HTTP/1.1\r\nHost:www.sina.com.cn\r\nConnection:close\r\n\r\n')
# s.send()
'''
s.send('Connection: keep-alive\r\n'.encode())
s.send('Cache-Control: no-cache\r\n'.encode())
s.send('Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8\r\n'.encode())
s.send('Upgrade-Insecure-Requests: 1\r\n'.encode())
s.send('User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36\r\n'.encode())
s.send('Accept-Encoding: gzip, deflate, br\r\n'.encode())
s.send('Cookie: lianjia_uuid=ce61c41c-25b0-46d6-a0a0-d57a75ee8706; UM_distinctid=1631f588055f9-0286722badd3ec-b34356b-1fa400-1631f58805657f; _ga=GA1.2.43397143.1525239286; _smt_uid=5ae94e02.558be516; _jzqx=1.1525248800.1525335927.1.jzqsr=zh%2Elianjia%2Ecom|jzqct=/ershoufang/xiangzhouqu/.-; _jzqc=1; _jzqy=1.1525239284.1525594526.2.jzqsr=baidu.jzqsr=baidu|jzqct=%E9%93%BE%E5%AE%B6; _jzqckmp=1; _gid=GA1.2.1028411676.1525594529; Hm_lvt_9152f8221cb6243a53c83b956842be8a=1525594526,1525594536,1525594804,1525595210; select_city=440400; all-lj=c60bf575348a3bc08fb27ee73be8c666; _qzjc=1; lianjia_ssid=99306d63-8ee5-a53c-a740-2d3021f3db2f; CNZZDATA1255604082=964175865-1525237915-https%253A%252F%252Fwww.lianjia.com%252F%7C1525602095; _jzqa=1.3750161754444366000.1525239284.1525594526.1525603274.8; CNZZDATA1254525948=963210960-1525238218-https%253A%252F%252Fwww.lianjia.com%252F%7C1525603556; CNZZDATA1255633284=1054798284-1525238580-https%253A%252F%252Fwww.lianjia.com%252F%7C1525603557; Hm_lpvt_9152f8221cb6243a53c83b956842be8a=1525606057; _jzqb=1.9.10.1525603274.1; _qzja=1.1070225156.1525239298260.1525597069547.1525603274282.1525605398368.1525606071025.0.0.0.86.8; _qzjb=1.1525603274282.9.0.0.0; _qzjto=23.2.0\r\n\r\n'.encode())

'''

buffer = []
d = s.recv(1024)
while d:
    buffer.append(d)
    d = s.recv(1024)
data = b''.join(buffer)

s.close()

header, html = data.split(b'\r\n\r\n', 1)

print(header.decode('utf-8'))
with open('sina.html', 'wb') as f:
    f.write(html)


The code above did not work. Environment: Windows 10, Python 3.7, run from PyCharm. This was the output:

D:\Desktop\untitled\venv\Scripts\python.exe D:/Desktop/untitled/text.py
HTTP/1.1 400 Bad Request
Server: edge-esnssl-1.17.3-14.3
Date: Fri, 10 Apr 2020 11:01:05 GMT
Content-Type: text/html
Content-Length: 168
Connection: close
X-Via-CDN: f=edge,s=cnc.qinhuangdao.edssl.12.nb.sinaedge.com,c=58.244.92.156;
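
As a side note, the script above writes whatever body comes back into sina.html, even when the response is an error page. A minimal sketch of checking the status line first could look like the following. The helper name parse_status is made up for illustration, and the sample bytes are just the first lines of the output shown above.

```python
def parse_status(header: bytes):
    """Return (status_code, reason) from the first line of a response header."""
    # The status line is everything before the first CRLF,
    # e.g. "HTTP/1.1 400 Bad Request".
    status_line = header.split(b'\r\n', 1)[0].decode('ascii', errors='replace')
    # Split into at most three parts: version, code, and the reason phrase
    # (the reason phrase may itself contain spaces, or be missing).
    version, code, *reason = status_line.split(' ', 2)
    return int(code), (reason[0] if reason else '')


# Sample header taken from the output above (hypothetical usage).
sample = (b'HTTP/1.1 400 Bad Request\r\n'
          b'Server: edge-esnssl-1.17.3-14.3\r\n'
          b'Connection: close')
code, reason = parse_status(sample)
print(code, reason)
```

With a check like this, the script could refuse to save the body unless the code is 200.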

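I had written crawlers before, so I assumed the 400 Bad Request meant the server considered the request unsafe because it did not look like it came from a browser. I searched around online and then tried the blob of browser headers that is commented out in the code. It made no difference.
So I pasted someone else's code in directly, ran it, and got this: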

D:\Desktop\untitled\venv\Scripts\python.exe D:/Desktop/untitled/text.py
HTTP/1.1 200 OK
Server: edge-esnssl-1.17.3-14.3
Date: Sat, 11 Apr 2020 01:17:17 GMT
Content-Type: text/html
Content-Length: 539883
Connection: close
Vary: Accept-Encoding
ETag: "5e911a18-7e9d9"V=CCD0B746
X-Powered-By: shci_v1.03
Expires: Sat, 11 Apr 2020 01:17:27 GMT
Cache-Control: max-age=60
X-Via-SSL: ssl.44.sinag1.yz.lb.sinanode.com
Age: 50
Via: https/1.1 cnc.yizhuang.union.92 (ApacheTrafficServer/6.2.1 [cRs f ]), https/1.1 cnc.qinhuangdao.union.53 (ApacheTrafficServer/6.2.1 [cRs f ])
X-Via-Edge: 1586567837080186611af1d041679547efa2d
X-Cache: HIT.53
X-Via-CDN: f=edge,s=cnc.qinhuangdao.edssl.11.nb.sinaedge.com,c=175.17.102.24;f=edge,s=cnc.qinhuangdao.union.57.nb.sinaedge.com,c=121.22.4.11;f=Edge,s=cnc.qinhuangdao.union.53,c=121.22.4.57
Process finished with exit code 0

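At that point I figured I must have made a silly syntax mistake of my own, so I compared the two versions line by line.
Sure enough, it was line 10 of the code above: the request line was missing the spaces around the "/".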

This is the correct version:

s.send(b'GET / HTTP/1.1\r\nHost:www.sina.com.cn\r\nConnection:close\r\n\r\n')

This is mine:

s.send(b'GET/HTTP/1.1\r\nHost:www.sina.com.cn\r\nConnection:close\r\n\r\n')
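
To see why the second version is rejected, here is a rough sketch of the kind of check a server might apply to the request line. It assumes the server simply expects three space-separated parts (method, path, version); the function parse_request_line is hypothetical and is not how any particular server is implemented.

```python
def parse_request_line(line: bytes):
    """Split an HTTP/1.1 request line into (method, target, version).

    Returns None when the line is not exactly three non-empty parts
    separated by single spaces, or when the last part is not an HTTP version.
    """
    parts = line.split(b' ')
    if len(parts) != 3 or not all(parts):
        return None
    method, target, version = parts
    if not version.startswith(b'HTTP/'):
        return None
    return method, target, version


good = parse_request_line(b'GET / HTTP/1.1')   # three parts
bad = parse_request_line(b'GET/HTTP/1.1')      # one part, no spaces at all
print(good)
print(bad)
```

Without the spaces, the whole line is a single token, so there is no way to tell where the method ends and the path begins, and the only sensible reply is 400 Bad Request.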

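Summary
This silly mistake is what finally pushed me to start blogging.
From now on I need to pay attention to syntax details at every step: the HTTP request line must be "GET / HTTP/1.1", with a single space between the method, the path, and the version.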
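
One way to avoid retyping the request by hand is to build it from its parts. The sketch below is only one possible approach, not the original code: build_get_request and fetch are names made up here, and the 10-second timeout and 4096-byte read size are arbitrary choices. It also uses ssl.create_default_context() in place of ssl.wrap_socket(), which was deprecated in Python 3.7 and removed in Python 3.12.

```python
import socket
import ssl


def build_get_request(host: str, path: str = '/') -> bytes:
    """Build a minimal HTTP/1.1 GET request with the required spaces and CRLFs."""
    lines = [
        f'GET {path} HTTP/1.1',   # method, path, version separated by spaces
        f'Host: {host}',
        'Connection: close',
        '',                       # the two empty strings produce the final
        '',                       # blank line that ends the header section
    ]
    return '\r\n'.join(lines).encode('ascii')


def fetch(host: str, path: str = '/', port: int = 443) -> bytes:
    """Send the request over TLS and return the raw response bytes."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as s:
            s.sendall(build_get_request(host, path))
            chunks = []
            while True:
                chunk = s.recv(4096)
                if not chunk:     # empty bytes means the server closed the connection
                    break
                chunks.append(chunk)
    return b''.join(chunks)
```

Calling fetch('www.sina.com.cn') and splitting the result on b'\r\n\r\n' should then give the header and body in the same way as the original script.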

Advice and corrections are welcome in the comments.


Reposted from blog.csdn.net/u010486308/article/details/105447256