Understanding the OAuth 2.0 Protocol

1. Introduction

If you drive to a hotel for dinner, you may struggle to find a parking space and lose a lot of time. Is there a good way to avoid this problem? Some luxury car owners do not have to worry about it. Luxury cars often come with two kinds of keys: a master key and a valet key. When you arrive at the hotel, you hand the valet key to the parking attendant, who parks the car for you. Compared with the master key, the valet key is limited: it can start the engine, drive the car a limited distance, and lock the car, but it cannot open the trunk or operate other equipment in the car. This is a simple form of "open authorization": with a valet key, the owner delegates some of the car's functions (such as starting the engine and driving a limited distance) to the attendant.

Authorization is an old concept, and any multi-user system must support it. For example, Alice and Bob are both Google users, so Alice should be able to grant Bob access to her photos. Note, however, that this is closed authorization: it only supports authorization between users inside the same system and cannot extend to external systems or users. Suppose Alice wants to use the "NetEase Printing Service" to print some of her photos. How can she do it?

Some people will say that Alice can simply tell NetEase Printing Service her Google username and password, and the problem is solved. It is, but only someone unconcerned with security and privacy would resort to this trick. What is wrong with it? (1) NetEase Printing Service may store Alice's username and password, possibly unencrypted. If the service is compromised, Alice becomes a victim through no fault of her own. (2) NetEase Printing Service can access all of Alice's resources on Google, and Alice cannot enforce least-privilege control, such as allowing access to one particular photo for one hour only. (3) Alice cannot revoke an individual authorization except by changing her password.

In the era of cloud computing built around Web services, needs like Alice's have become increasingly common and pressing. "Open authorization" was born to meet them: it lets Alice authorize a third-party application to access her resources, supports fine-grained permission control, and does not reveal Alice's password or other authentication credentials.

Depending on the application scenario, there are currently two ways to implement open authorization: one is the OAuth protocol [1]; the other is an IAM service [2]. The OAuth protocol mainly suits open authorization of resources owned by individual users, such as Google user Alice. OAuth is characterized by "on-site authorization" or "online authorization": the Client mainly accesses resources through the browser, Alice's identity as resource owner must be authenticated at authorization time, and Alice must approve the request on the spot. OAuth is widely used in social networking services (SNS), such as Weibo. An IAM service is different. It is characterized by "pre-authorization" or "offline authorization": the Client mainly accesses resources through a REST API, the resource owner can know in advance which resources the third-party application will request, and these requests rarely change. IAM services are generally used in cloud computing services, such as AWS and Alibaba Cloud.

This article focuses on OAuth open authorization; I will cover IAM-based open authorization in another blog post. Below I introduce the OAuth 2.0 protocol, walk through a concrete example of it, and analyze its security.



2. OAuth 2.0 Protocol

OAuth 2.0 is the prevailing approach today, and it was first adopted by Google, Yahoo, Microsoft, Facebook, and others. It is labeled 2.0 because there was an earlier 1.0 protocol, but 1.0 was too complicated and hard to use, so it never became widespread. Version 2.0 is a new design: the protocol is simple and clear, but it is not compatible with 1.0 and has essentially nothing in common with it. I will therefore only introduce 2.0.

2.1 Participants of the protocol

From the description in the introduction, we can see that there are at least the following three participating entities in OAuth:
- RO (resource owner): the entity that owns the resource and can authorize access to it. In the example above, this is user Alice.
- RS (resource server): the server that stores resources and handles requests to access them, such as Google's resource server, which holds Alice's photos.
- Client: the third-party application, which can access the RO's resources once the RO has authorized it, such as NetEase Printing Service.

In addition, in order to support the open authorization function and better describe the open authorization protocol, OAuth introduces a fourth participating entity:
- AS (authorization server): the server that authenticates the RO, provides the process through which the RO approves authorization, and finally issues an Access Token. Note that the AS and the RS are distinguished only logically here, for convenience in describing the protocol; physically, the same server can provide the functions of both.


2.2 Authorization type

In open authorization, the third-party application (Client) may be a Web site, a piece of JavaScript code running in a browser, or a locally installed application. These kinds of Client have different security properties. A Web site runs separately from the RO's browser, so it can store the protocol's sensitive data (such as keys) itself without exposing it to the RO. JavaScript code and locally installed applications run in the RO's browser or on the RO's device, so the RO can access the Client's sensitive protocol data.

To support these different kinds of third-party applications, OAuth defines several authorization types: Authorization Code Grant, Implicit Grant, Resource Owner Password Credentials Grant, and Client Credentials Grant. Since this article aims to help readers understand the OAuth protocol, I will first introduce the basic idea shared by these authorization types and then pick the one that is most central, hardest to understand, and most widely used, the "authorization code" type, for an in-depth introduction.


2.3 OAuth Protocol - Basic Ideas


[Figure 1: Abstract Protocol Flow]

As shown in Figure 1, the basic flow of the protocol is as follows:

(1) The Client requests authorization from the RO. The request generally includes the path of the resource to be accessed, the operation type, the Client's identity, and other information.

(2) The RO approves the authorization and sends "authorization evidence" to the Client. How the RO approves is outside the scope of the protocol. Typically, the AS provides an approval interface on which the RO explicitly approves the request; see the worked example in Section 3.

(3) The Client requests an "Access Token" from the AS. To do so, the Client must present to the AS both the RO's "authorization evidence" and credentials proving the Client's own identity.

(4) Once the AS has verified the request, it returns the "Access Token" to the Client. Access tokens come in several types. With a bearer token, whoever holds the token can access the resource.

(5) The Client presents the "Access Token" to access resources on the RS. While the token remains valid, the Client can use it repeatedly to access resources.

(6) The RS verifies the token: whether it is forged, whether the request exceeds the authority it grants, and whether it has expired. If verification succeeds, the RS serves the request.
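
Steps (5) and (6) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the in-memory ISSUED_TOKENS store, the TokenRecord structure, and the scope name "photos.read" are all hypothetical and not part of the protocol. The only detail taken from the bearer token convention is the "Authorization: Bearer ..." request header.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenRecord:
    """What the AS recorded when it issued a token (hypothetical structure)."""
    owner: str           # the RO who approved the grant
    scopes: frozenset    # the operations the RO approved
    expires_at: float    # Unix timestamp after which the token is invalid


# Hypothetical token store that the RS can consult.
ISSUED_TOKENS = {
    "token-abc": TokenRecord("alice", frozenset({"photos.read"}), time.time() + 3600),
}


def bearer_header(access_token):
    """Step (5): the Client attaches the token to each request to the RS."""
    return {"Authorization": f"Bearer {access_token}"}


def check_request(headers, required_scope, now=None):
    """Step (6): return the resource owner's name, or raise PermissionError."""
    now = time.time() if now is None else now
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not token:
        raise PermissionError("missing bearer token")
    record = ISSUED_TOKENS.get(token)
    if record is None:
        raise PermissionError("unknown or forged token")
    if now >= record.expires_at:
        raise PermissionError("token expired")
    if required_scope not in record.scopes:
        raise PermissionError("token does not cover this operation")
    return record.owner
```

Because the token is a bearer token, check_request never asks who is presenting it; possession alone is enough, which is why the token must be kept secret.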


2.4 Open authorization of authorization code type



[Figure 2: Authorization Code Flow]

As shown in Figure 2, the authorization code flow runs as follows:
(1) The Client starts the protocol by redirecting the RO's user agent to the AS with an HTTP 302 response. The Client should include the following parameters in the authorization request URI: response_type (set to code for this authorization type), client_id, scope (describes the resources being accessed), redirect_uri (the Client's URI that the AS redirects back to), and state (used to resist CSRF attacks). The request can also include the access_type and approval_prompt parameters, which are Google-specific extensions rather than part of the OAuth 2.0 specification. When approval_prompt=force, the AS presents an interactive page that requires the RO to explicitly approve (or reject) the Client's request. If approval_prompt is omitted, the approval page may be skipped and the request approved by default (for example, when the RO has already approved this Client before). When access_type=offline, the AS issues a refresh_token together with the access_token. Because the access_token is short-lived (for example, 3600 seconds), offline mode streamlines the protocol by letting the Client exchange the refresh_token directly for a new access_token.

(2) The AS authenticates the RO and presents a page on which the RO approves or rejects the Client's request (when approval_prompt=force).

(3) If the request is approved, the AS redirects the RO's user agent back to the Client at the redirect_uri that the Client supplied in step (1). The redirect must carry the authorization_code and the state that the Client supplied in step (1). If the request is rejected, the AS returns the corresponding error information through the redirect_uri.

(4) The Client sends the authorization_code to the AS in exchange for the access_token. The request should include the data needed to authenticate the Client, as well as the redirect_uri used in step (1) to request the authorization_code.

(5) On receiving the authorization_code, the AS verifies the Client's identity and checks that the received redirect_uri matches the one used in step (1) when the authorization_code was requested. If verification succeeds, the AS returns the access_token, plus a refresh_token if access_type=offline.
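
The requests the Client builds in steps (1) and (4), and the later refresh_token exchange, can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the endpoint URL, CLIENT_ID, and REDIRECT_URI are hypothetical placeholders, and the Client credentials are sent as form fields, which is only one of the ways a Client can authenticate. The parameter names (response_type, grant_type, code, and so on) are those defined by OAuth 2.0. Nothing is sent over the network here; the functions only build the URL and the form bodies.

```python
from urllib.parse import urlencode

# Hypothetical values; a real Client obtains these when registering with the AS.
AUTHORIZE_ENDPOINT = "https://as.example.com/authorize"
CLIENT_ID = "client-123"
REDIRECT_URI = "https://client.example.com/callback"


def build_authorization_url(scope, state):
    """Step (1): the URL placed in the Location header of the 302 redirect."""
    params = {
        "response_type": "code",      # ask for an authorization_code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,               # echoed back by the AS in step (3)
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def build_token_request(code, client_secret):
    """Step (4): form body POSTed to the AS token endpoint over HTTPS."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,  # must match the value sent in step (1)
        "client_id": CLIENT_ID,
        "client_secret": client_secret,
    })


def build_refresh_request(refresh_token, client_secret):
    """Later: exchange a refresh_token for a new access_token, without the RO."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": CLIENT_ID,
        "client_secret": client_secret,
    })
```

Note that client_secret appears only in the two direct Client-to-AS requests and never in the URL that passes through the RO's browser.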

If any detail of this flow is unclear, read the worked example in Section 3 first and then come back to this part.


3. OAuth 2.0 Protocol: A Worked Example

Below I use a concrete example to help readers understand how the authorization code flow runs. Assumptions:
(1) Alice has a valid Google account.
(2) Facebook.com has registered as a Client with the Google authorization server and has obtained (client_id, client_secret). Note that client_secret is a secret key shared between the Client and the AS.
(3) Alice wants to authorize Facebook.com to view her contact list (https://www.google.com/m8/feeds).

Figure 3 shows the protocol operation process between Alice, Facebook.com, Google resource server, and Google OAuth authorization server.


[Figure 3: An Instance of Authorization Code Flow] 

The protocol details are already shown in Figure 3, so I will not walk through them here. If you understand this figure, you understand OAuth 2.0.

Note that in step (4), when the Client exchanges the authorization code for the access token, the Client must prove its identity to the AS; that is, it must prove that it is the grantee whose request Alice approved in step (2). There are two main ways to do this (Figure 3 uses the first):
(1) Send client_secret to the AS directly over HTTPS. Because client_secret is shared by the Client and the AS, this is sufficient as long as the channel that carries client_secret is secure.
(2) Authenticate the Client with a message authentication code; the typical algorithm is HMAC-SHA1. With this approach the Client does not transmit client_secret, only the signature of the request message. Since client_secret itself is never sent, this method does not depend on HTTPS to protect the secret. Note, however, that the authorization_code and the returned access_token still travel over this channel, and the OAuth 2.0 specification requires TLS for requests to the token endpoint.
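
Method (2) can be illustrated with Python's standard hmac module. This is a rough sketch under stated assumptions: the way the message is assembled (method, path, and body joined by newlines) is a hypothetical format chosen for the example, not one defined by OAuth 2.0, and a real scheme would also need a timestamp or nonce in the message to prevent replay, which is omitted here.

```python
import hashlib
import hmac


def sign_request(client_secret, method, path, body):
    """Client side: compute an HMAC-SHA1 signature over the request.

    The message layout below is an assumption made for this example.
    """
    message = "\n".join([method, path, body]).encode("utf-8")
    return hmac.new(client_secret.encode("utf-8"), message, hashlib.sha1).hexdigest()


def verify_request(client_secret, method, path, body, signature):
    """AS side: recompute the signature with the shared secret and compare."""
    expected = sign_request(client_secret, method, path, body)
    # compare_digest runs in constant time, so timing does not reveal the signature.
    return hmac.compare_digest(expected, signature)
```

The AS can check the signature because it holds the same client_secret; anyone who alters the request body without knowing the secret cannot produce a matching signature.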

In addition, in step (2) the Google authorization server must authenticate Alice as the RO and present an authorization page on which Alice approves the request. The pages Google currently provides are shown in Figure 4 and Figure 5; they are included only to help readers understand what OAuth's "on-site authorization" or "online authorization" means.



[Figure 4: RO's Identity Authentication]


[Figure 5: RO's Authorization Decision]


4. Security considerations in OAuth2.0 design

4.1 Why is authorization_code introduced?

Why does the protocol exchange an authorization_code for the access_token? This question occurs to readers quite naturally. In other words, in step (3) of the protocol, why not return the access_token to the Client directly through the redirect?

For example:
HTTP/1.1 302
Location: https://www.facebook.com/?access_token=ya29.AHES6ZSXVKYTW2VAGZtnMjD&token_type=Bearer&expires_in=3600


If the access_token were returned directly, the protocol would be simpler, with one fewer round trip between the Client and the AS and better performance. So why was it not designed that way? The protocol document [1] does not give the reason, but it is not hard to work out:

(1) The redirect through the RO's browser is an insecure channel and is not suitable for passing sensitive data such as the access_token. The URI may leak to malicious sites through the HTTP Referer header, or it may be stored in the browser cache or in log files, which gives attackers many opportunities to steal the access_token. Moreover, the protocol should not assume that the RO's user agent is trustworthy: an attacker may already have injected a cross-site script into the RO's browser to capture the access_token. Passing the access_token to the Client through the RO's user agent therefore significantly increases the risk of leaking it. The authorization_code, by contrast, can be passed through the redirect because it is less sensitive than the access_token. Even if the authorization_code leaks, the attacker cannot directly obtain the access_token, because exchanging the authorization_code for the access_token requires verifying the Client's real identity. In other words, the authorization_code is useless to anyone except the Client. In addition, the access_token should be issued only to the Client, and no other party (including the RO) should obtain it. The protocol design should ensure that the Client is the only party able to obtain the access_token, and introducing the authorization_code guarantees that the Client is its only holder. Of course, the Client is then also solely responsible for protecting the access_token from leaking.

(2) Introducing the authorization_code brings a further benefit. The protocol needs to verify the Client's identity; without the authorization_code, the Client's authentication data could only be passed through the redirect in step (1). Because the redirect is an insecure channel, the Client would then have to authenticate with digital signatures rather than with a simple password or shared key. With the authorization_code, the AS can authenticate the Client directly (see steps (4) and (5)) and can support any Client authentication method (for example, simply sending the client key directly to the AS).
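
The two points above can be made concrete with a sketch of the AS-side check in step (5). It is a minimal illustration under stated assumptions: the in-memory CLIENT_SECRETS and PENDING_CODES tables and all values in them are hypothetical, and the single-use and expiry checks on the authorization_code are additions that go beyond what the text above describes.

```python
import hmac
import secrets
import time
from dataclasses import dataclass


@dataclass
class PendingCode:
    """What the AS remembers about an authorization_code it issued in step (3)."""
    client_id: str
    redirect_uri: str
    expires_at: float
    used: bool = False


# Hypothetical AS-side state.
CLIENT_SECRETS = {"client-123": "s3cret"}
PENDING_CODES = {
    "code-xyz": PendingCode("client-123", "https://client.example.com/callback",
                            time.time() + 600),
}


def redeem_code(code, client_id, client_secret, redirect_uri, now=None):
    """Step (5): authenticate the Client, then exchange the code for a token."""
    now = time.time() if now is None else now
    expected = CLIENT_SECRETS.get(client_id)
    if expected is None or not hmac.compare_digest(
            expected.encode("utf-8"), client_secret.encode("utf-8")):
        raise PermissionError("client authentication failed")
    pending = PENDING_CODES.get(code)
    if pending is None or pending.used or now >= pending.expires_at:
        raise PermissionError("authorization_code is invalid, used, or expired")
    if pending.client_id != client_id or pending.redirect_uri != redirect_uri:
        raise PermissionError("authorization_code was issued for a different request")
    pending.used = True  # assumption: each code can be redeemed only once
    return {
        "access_token": secrets.token_urlsafe(32),
        "token_type": "Bearer",
        "expires_in": 3600,
    }
```

An attacker who captures "code-xyz" from the browser redirect but does not know client_secret fails at the first check, which is the sense in which a leaked authorization_code is useless to anyone but the Client.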


The code in step (3) is generated after the user manually approves the request, and the access_token in step (5) is obtained from the AS by the Client's machine after it receives the code. It is not safe to use the code directly.


Having understood these security considerations, readers may feel a sense of enlightenment and appreciate the elegance of introducing the authorization_code. But is the authorization_code the only way to solve these security problems? Of course not. In another blog post I will present an extended authorization type that returns the access_token directly, which makes the protocol simpler and reduces the number of interactions while keeping the same security guarantees.


4.2 Considerations Based on Web Security

OAuth protocol design differs from ordinary network security protocol design because OAuth must account for a range of Web attacks, such as CSRF (Cross-Site Request Forgery), XSS (Cross-Site Scripting), and clickjacking. To understand how these attacks work, readers need a basic grasp of browser security (for example, the Same Origin Policy). For instance, the state parameter in the redirect_uri was introduced with browser security in mind; it allows the Client to resist CSRF attacks. Without it, an attacker can inject an authorization_code or access_token of the attacker's own into the redirect_uri, which may cause the Client to access the wrong resource (for example, transferring money to the wrong account).
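
How the state parameter resists this kind of injection can be sketched as follows. This is an illustration under stated assumptions: the session dictionary stands in for whatever per-user session storage the Client actually uses, and the function names and the "oauth_state" key are hypothetical.

```python
import hmac
import secrets
from urllib.parse import parse_qs, urlsplit


def new_state(session):
    """Before redirecting to the AS: create an unguessable state and remember it."""
    state = secrets.token_urlsafe(16)
    session["oauth_state"] = state
    return state


def handle_callback(session, callback_url):
    """When the AS redirects back: accept the code only if the state matches."""
    query = parse_qs(urlsplit(callback_url).query)
    returned = query.get("state", [""])[0]
    expected = session.pop("oauth_state", "")  # each state is valid only once
    if not expected or not hmac.compare_digest(
            expected.encode("utf-8"), returned.encode("utf-8")):
        raise PermissionError("state mismatch: possible CSRF")
    if "error" in query:
        raise PermissionError(f"authorization denied: {query['error'][0]}")
    codes = query.get("code")
    if not codes:
        raise ValueError("callback carries no authorization_code")
    return codes[0]
```

An attacker can craft a callback URL that contains the attacker's own authorization_code, but cannot know the state stored in the victim's session, so handle_callback rejects the forged callback.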

The OAuth protocol document already covers Web security considerations comprehensively, so I will not expand on them here. Interested readers can refer to [1].


5. Conclusion

This article gives a basic introduction to the OAuth 2.0 open authorization protocol and the security considerations behind its design, in the hope of helping those who design and develop security protocols.



References:

[1] Hammer-Lahav, E., Recordon, D., and D. Hardt, "The OAuth 2.0 Authorization Framework", draft-ietf-oauth-v2-31 (work in progress), June 2012.
[2] http://aws.amazon.com/iam/









