It's 2023, why is OAuth still a headache?


[Editor's note] This article examines the problems that come up when working with OAuth in practice: the OAuth standard is too large and complex, every API's OAuth is slightly different, many APIs add non-standard extensions to OAuth, debugging OAuth is difficult, building applications on top of APIs requires cumbersome approvals, and OAuth has security issues. Nango, an open source service built by the author, aims to address these problems by simplifying the OAuth process, improving security, and working with a wide variety of APIs.

Link: https://www.nango.dev/blog/why-is-oauth-still-hard

Do not redistribute without permission!

Author | Robin Guldener Translator |

Editor in charge | Xia Meng

Listing | CSDN (ID: CSDNnews)

We've implemented OAuth for 50 of the most popular APIs. The result: a real mess.

Image source: nango.dev

OAuth is a standard protocol, and client libraries for OAuth 2.0 exist for almost every programming language you can imagine. You might conclude that, armed with such a client library, you should be able to implement OAuth for any API in about 10 minutes, or at most an hour. Reality, however, falls far short of that ideal. It is hard to imagine anyone implementing OAuth for an arbitrary API in 10 minutes, or even in an hour.



OAuth in practice

We implemented OAuth for 50 of the most popular APIs, including Google (Gmail, Calendar, Sheets, etc.), HubSpot, Shopify, Salesforce, Stripe, Jira, Slack, Microsoft (Azure, Outlook, OneDrive), LinkedIn, Facebook, and other APIs that use OAuth.

Our conclusion: the OAuth experience today is about as good as browser JavaScript APIs were in 2008. There is general agreement on how things should be done, but virtually every API has its own interpretation of the standard, its own implementation quirks, and its own non-standard behavior and extensions. The result: every detail is a potential source of problems and errors.

So where exactly do the problems lie? Let's dig in!



Problem 1: The OAuth standard is too large and complex

"This API also uses OAuth 2.0, we did it a few weeks ago. I can get it done tomorrow." - Intern's last words

OAuth is a huge standard. The official OAuth website lists 17 RFCs (the documents that define a standard) that together describe how OAuth 2.0 works. They cover everything from the OAuth framework and Bearer tokens to threat models and private key JWTs.

You might ask: "Are all these RFCs really relevant for simple token-based third-party access to an API?" Fair point. Let's focus only on the things that are likely to matter for a typical third-party API access use case:

  • OAuth Standard: OAuth 2.0 is now the default, but some APIs still use OAuth 1.0a (and 2.1 is coming). Once you know which one your API is using, move on to the next step:

  • Grant type: Do you need authorization_code, client_credentials, or device_code? What do they do, and when should you use each? When in doubt, try authorization_code.

  • By the way: refresh tokens are also a grant type, but a special one. Rather than obtaining an access token from scratch, a refresh token is used to get a new access token once the old one has expired. How they work is standardized, but how you request one in the first place is not. More on that later.

  • Now that your request is ready, let's look at the official OAuth parameters. There are 72 of them, each with a clearly defined meaning and behavior; the official list is maintained in the IANA OAuth Parameters registry. Common examples are prompt, scope, audience, resource, assertion, and login_hint. However, in our experience, most API providers seem to be as unaware of this list as you probably were until now, so don't worry too much.
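To make the first step of the authorization_code grant concrete, here is a minimal sketch in Python that builds an authorization URL. The endpoint, client ID, redirect URI, and scope names are hypothetical placeholders, and the helper `build_authorization_url` is our own illustration rather than part of any library. Provider-specific parameters such as prompt or audience go into `extra_params`.

```python
import secrets
from urllib.parse import urlencode


def build_authorization_url(authorize_endpoint, client_id, redirect_uri,
                            scopes, extra_params=None):
    """Return (url, state) for the first step of the authorization_code grant."""
    # Random value that should be checked again when the callback arrives.
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",      # selects the authorization_code grant
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),    # the standard separates scopes with spaces
        "state": state,
    }
    # Provider-specific additions (prompt, audience, ...) are merged last.
    params.update(extra_params or {})
    return f"{authorize_endpoint}?{urlencode(params)}", state


# Hypothetical values, for illustration only.
url, state = build_authorization_url(
    "https://auth.example.com/oauth/authorize",
    client_id="my-client-id",
    redirect_uri="https://app.example.com/callback",
    scopes=["read", "write"],
    extra_params={"prompt": "consent"},
)
print(url)
```

Keeping the provider-specific parameters in a separate dictionary makes it easier to see at a glance where a given API deviates from the common core.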

If you think this is still too complicated and too much to learn, we tend to agree with you. Most teams building public APIs seem to agree as well. Instead of implementing the full OAuth 2.0 standard, they implement only the subset of OAuth they think they need for their API's use case. This leads to long pages in their documentation outlining how OAuth works for that particular API. It's hard to blame them: they just want to provide a good developer experience (DX) and make their API easy to use and understand, rather than follow a complex standard to the letter. But however good the intentions, the result is confusion and inconsistency.


The authorization_code OAuth flow for Salesforce. What's not to like about this simple 10-step process?

The problem is that everyone has a slightly different idea of what OAuth is, so you end up with a lot of different (sub)implementations.



Problem 2: OAuth is slightly different for everyone


Each API implements a different subset of OAuth, forcing you to peruse their lengthy OAuth documentation:

1. What parameters do they expect in the authorization call?

  • For Jira, you must set the audience parameter to specify the URL of the Jira instance you want to access. Google prefers to handle this through different scopes, but cares a lot about the prompt parameter. Meanwhile, someone at Microsoft discovered the response_mode parameter and wants you to always set it to query.

  • The Notion API takes a radical approach by doing away with the ubiquitous scope parameter. In fact, you won't even find the word "scope" in their API docs. Notion calls them capabilities, and you set them when you register your application. It took us 30 minutes of confusion to understand what was going on. Why are they reinventing the wheel?

  • The situation is even worse with offline_access: most APIs these days let access tokens expire after a certain period of time. To get a refresh token, you need to request "offline_access". Depending on the API, this is done through a parameter, a scope, or a setting you choose when registering the OAuth application. Ask your API or OAuth doctor for details.

2. What do they expect to see in the token request call?

  • Some APIs, like Fitbit, insist on receiving data in the request headers. Most want it in the body, encoded as application/x-www-form-urlencoded, except for a few, like Notion, that prefer to receive it as JSON.

  • Some APIs expect you to authenticate this request with Basic auth. Many don't care. But be careful, they might change their mind tomorrow.

3. Where should I redirect my users to authorize?

  • Shopify and Zendesk have a model where each user gets their own subdomain, such as {subdomain}.myshopify.com. Yes, this includes the OAuth authorization page too, so you had better support dynamic URLs in your data model and front-end code.

  • Zoho Books uses different data centers for customers in different regions. Let's hope your customers remember where their data lives: to authorize your app, US customers should go to https://accounts.zoho.com, European customers to https://accounts.zoho.eu, and Indian customers to https://accounts.zoho.in. The list goes on.

4. But at least I can choose my callback URL, right?

  • If you enter http://localhost:3003/callback as the callback URL for the Slack API, they will kindly remind you to "use https for security reasons". Yes, that applies to localhost too. Luckily, there are workarounds for OAuth redirects on localhost.
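One way to cope with these differences is to capture them as per-provider configuration instead of scattering special cases through the code. The sketch below is only an illustration under our own assumptions: the `ProviderConfig` fields, the helper names, and the two example providers are hypothetical, and real values always have to come from each API's documentation.

```python
import base64
import json
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class ProviderConfig:
    # All fields are illustrative; real values come from each provider's docs.
    authorize_url: str          # may contain a {subdomain} placeholder
    token_url: str
    body_format: str = "form"   # "form" or "json"
    basic_auth: bool = False    # send client credentials via Basic auth


def authorize_url_for(config, subdomain=None):
    """Fill in the per-customer subdomain when the provider uses one."""
    if "{subdomain}" in config.authorize_url and subdomain is None:
        raise ValueError("this provider needs a subdomain")
    return config.authorize_url.format(subdomain=subdomain)


def build_token_request(config, code, client_id, client_secret, redirect_uri):
    """Return (headers, body_bytes) for exchanging an authorization code."""
    fields = {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
    }
    headers = {}
    if config.basic_auth:
        raw = f"{client_id}:{client_secret}".encode()
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode()
    else:
        # Otherwise the credentials travel in the request body.
        fields["client_id"] = client_id
        fields["client_secret"] = client_secret
    if config.body_format == "json":
        headers["Content-Type"] = "application/json"
        body = json.dumps(fields).encode()
    else:
        headers["Content-Type"] = "application/x-www-form-urlencoded"
        body = urlencode(fields).encode()
    return headers, body


# Two hypothetical providers with different expectations.
tenant_provider = ProviderConfig(
    authorize_url="https://{subdomain}.example.com/oauth/authorize",
    token_url="https://{subdomain}.example.com/oauth/token",
)
json_provider = ProviderConfig(
    authorize_url="https://api.example.org/oauth/authorize",
    token_url="https://api.example.org/oauth/token",
    body_format="json",
    basic_auth=True,
)

print(authorize_url_for(tenant_provider, subdomain="acme"))
headers, body = build_token_request(
    json_provider, "abc123", "id", "secret", "https://app.example.com/callback")
print(headers["Content-Type"], body.decode())
```

The point of the design is that adding the next API should mean adding a configuration entry, not another branch in the flow logic.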

We could go on for a long time, but we think you get the point by now.


OAuth is too complicated; let's make a simpler version of OAuth that has everything we need! © XKCD



Problem 3: Many APIs add non-standard extensions to OAuth

Although the OAuth standard is comprehensive, many APIs still find that it has some functionality missing. A common problem we run into is that you need some data in addition to the access_token to interact with the API. It would be more convenient if this extra data could be returned to you along with the access_token during the OAuth flow.

But it does mean more non-standard behavior that you need to implement specifically for each API.

Here is a list of some of the non-standard extensions we saw:

  • QuickBooks uses a realmID, which you need to pass with every API request. The only time they tell you this realmID is as an extra parameter in the OAuth callback. Better store it somewhere safe!

  • Braintree does the same, with a companyID.

  • Salesforce uses a different API base URL for each customer; they call it instance_url. Thank goodness they return the user's instance_url along with the access token in the token response, but you really need to parse it out from there and store it.

  • Unfortunately, Salesforce also did something even more annoying: access tokens expire after a preset period of time, which can be customized by the user. So far so good, but for some reason they don't tell you in the token response when the access token you just received will expire (everyone else does). Instead, you need to query an additional token details endpoint to get the token's (current) expiration date. Why, Salesforce, why?

  • Slack has two different types of scopes: scopes you hold as a Slack bot, and scopes that let you act on behalf of the user who authorized your app. Clever, but instead of simply defining different scopes for each type, they introduced a separate user_scopes parameter that you need to pass in the authorization call. You had better be aware of this, and good luck finding an OAuth library that supports it.
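Since these extras arrive either in the token response or as callback parameters, one defensive approach is to separate the standard token fields from everything else and store the rest alongside the token. This is a minimal sketch of that idea; the function name, the sample values, and the exact key names in the examples are assumptions for illustration.

```python
# Fields defined for the token response by the core OAuth 2.0 specification.
STANDARD_TOKEN_FIELDS = {
    "access_token", "token_type", "expires_in", "refresh_token", "scope",
}


def split_token_response(token_response, callback_params=None):
    """Separate standard token fields from provider-specific extras."""
    standard = {k: v for k, v in token_response.items()
                if k in STANDARD_TOKEN_FIELDS}
    extras = {k: v for k, v in token_response.items()
              if k not in STANDARD_TOKEN_FIELDS}
    # Some providers pass extra values as callback query parameters instead.
    for key, value in (callback_params or {}).items():
        if key not in ("code", "state"):
            extras[key] = value
    return standard, extras


# Made-up example data: an extra field in the token response and an
# extra parameter in the callback.
standard, extras = split_token_response(
    {"access_token": "abc", "token_type": "Bearer",
     "instance_url": "https://customer.example.com"},
    callback_params={"code": "xyz", "state": "s1", "realmID": "12345"},
)
print(standard)
print(extras)
```

Persisting `extras` together with the token means the values are still there later, when an API request turns out to need them.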

For the sake of brevity and simplicity, we've skipped many of the less-than-standard OAuth flows we've come across.



Problem 4: "invalid_request" - debugging OAuth is hard

Debugging a distributed system is hard enough, and it's even harder when services return broad, generic error messages.

OAuth 2.0 has standardized error messages, but they tell you about as much about what is going on as the example in the heading above (which, by the way, is one of the error messages recommended by the OAuth standard).

You might think that since OAuth is a standard and every API is documented, there is nothing to debug. There is a lot. I can't tell you how often the docs are wrong. Or missing a detail. Or not updated for the latest change. Or you simply missed something the first time you read them. About 80% of the OAuth flows we implement have some issue on the first attempt and need debugging.


Randall seems to know exactly how I feel when debugging OAuth flows. © XKCD

Some flows also break for seemingly random reasons: for example, LinkedIn OAuth breaks when you pass in PKCE (Proof Key for Code Exchange) parameters. What error do you get? "client error - invalid OAuth request." That is... informative? It took us an hour to figure out that passing the (optional, and usually ignored) PKCE parameters was what broke the flow. Another common mistake is sending a scope that doesn't match the ones pre-registered for your application. (Pre-registering scopes? Yes, many APIs now require this.) This usually results in a generic error message about a problem with the scope. It's frustrating.
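Generic errors are easier to track down when every detail the provider does send is captured. The sketch below parses a callback URL and surfaces the standard error, error_description, and error_uri parameters; the exception class and function name are our own, and the URLs are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


class OAuthCallbackError(Exception):
    """Raised when the provider redirects back with an error."""

    def __init__(self, error, description=None, uri=None):
        super().__init__(f"{error}: {description or 'no description provided'}")
        self.error = error
        self.description = description
        self.uri = uri


def parse_callback(callback_url, expected_state):
    """Return the authorization code, or raise with all available details."""
    query = parse_qs(urlparse(callback_url).query)

    def first(name):
        return query.get(name, [None])[0]

    if first("error"):
        raise OAuthCallbackError(
            first("error"), first("error_description"), first("error_uri"))
    if first("state") != expected_state:
        raise ValueError("state mismatch: possible CSRF or a stale callback")
    code = first("code")
    if code is None:
        raise ValueError("callback contains neither a code nor an error")
    return code


# Hypothetical callback, for illustration only.
try:
    parse_callback(
        "https://app.example.com/callback?error=invalid_request"
        "&error_description=Unknown+parameter&state=s1",
        expected_state="s1",
    )
except OAuthCallbackError as exc:
    print(exc.error, "|", exc.description)
```

Logging the full authorization URL you sent next to the parsed error makes it much quicker to spot which parameter the provider objected to.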



Problem 5: Building apps on top of APIs requires cumbersome approvals

The truth is, if you build applications on top of another platform's API, you are probably in the weaker position. Your customers ask for the integration because they already use the other system. Now you need to keep them happy.

To be fair, many APIs are liberal and offer a convenient self-service signup flow that lets developers register their application and start using OAuth. But some of the most popular APIs require a review before your app goes public and can be used by any of their users. Again to be fair, most review processes are reasonable and can be completed within a few days. They are probably a net gain for end users in terms of security and quality.


But some notorious examples can take months to complete, and some even require you to enter into a revenue share agreement:

  • If you want to access scopes that contain more sensitive user data, such as email content, Google requires a "security audit". We've heard that these reviews can take days or weeks to pass and require a fair amount of work on your part.

  • Want to integrate with Rippling? Get ready to answer their 30+ questions and go through a pre-production security screening. We've heard it takes months to gain access (if you are approved at all).

  • HubSpot, Notion, Atlassian, Shopify, and pretty much everyone else with an integrations marketplace or app store requires a review before you can be listed there. Some reviews are lightweight, while others ask for demo logins, video walkthroughs, blog posts (yes!), and more. Listing in a marketplace or store is usually optional, though.

  • Ramp, Brex, Twitter, and quite a few other services don't offer developers a self-service signup flow, and instead require you to fill out a form to be granted access manually. Many process requests quickly, but for some we are still waiting for a response after several weeks.

  • A particularly extreme example is Xero, which is a paid API: if you want to exceed the limit of 25 connected accounts, you must become a Xero partner and list your app in their app store. They take (as of this writing) a 15% revenue share of every lead generated from that store.



Problem 6: OAuth has security issues

With the discovery of OAuth security vulnerabilities and the advancement of network technology, the OAuth standard is constantly updated and improved. If you want to implement current security best practices, the OAuth working group has a detailed guide for you to refer to. If you're working with an API that still uses OAuth 1.0a, you'll realize that backwards compatibility is an ongoing challenge.

Fortunately, security is getting better with each iteration, but usually at the cost of more work for developers. The upcoming OAuth 2.1 standard will make some current best practices mandatory, including PKCE (which only a few APIs require today) and additional restrictions on refresh tokens.
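For readers who have not used PKCE before, here is a minimal sketch of how the code verifier and its S256 code challenge are derived, following RFC 7636. The function names are our own. The client sends `code_challenge` (with `code_challenge_method=S256`) in the authorization call and the original `code_verifier` in the token request.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes=32):
    """Create a random, URL-safe code verifier (43 characters for 32 bytes)."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def code_challenge_s256(verifier):
    """Derive the S256 code challenge: base64url(SHA-256(verifier)), unpadded."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


verifier = make_code_verifier()
challenge = code_challenge_s256(verifier)
print(verifier, challenge)
```

Because the challenge is a one-way hash of the verifier, an attacker who intercepts the authorization code still cannot redeem it without the verifier that never left the client.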


At least OAuth has implemented a two-factor authentication model. © XKCD

Probably the biggest change came with the introduction of expiring access tokens and refresh tokens. In theory, the process looks simple: whenever an access token expires, refresh it with the refresh token, and store the new access token and refresh token. In practice, when implementing this you have to consider:

  • Race Condition: How do we ensure that no other requests are running while we refresh the current access token?

  • Some APIs also expire the refresh token if you don't use it for a certain number of days (or if the user has revoked access). Expect some refreshes to fail.

  • Some APIs will give you a new refresh token on every refresh request...

  • ...but some silently assume you'll keep the old refresh token and keep using it.

  • Some APIs will tell you when an access token expires in absolute terms. Others just express it in relative "seconds from now". Still others, like Salesforce, don't disclose this kind of information lightly.
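The considerations above can be sketched as a small token store. This is an illustration under stated assumptions, not a production implementation: `TokenStore`, `expiry_from_response`, the `expires_at` field name, and the one-hour fallback lifetime are all hypothetical, and `refresh_fn` stands in for whatever call performs the actual refresh request. The lock only protects against concurrent refreshes within a single process.

```python
import threading
import time


def expiry_from_response(response, now=None, default_lifetime=3600):
    """Normalize the expiry information to an absolute Unix timestamp."""
    now = time.time() if now is None else now
    if "expires_at" in response:    # absolute time (assumed field name)
        return float(response["expires_at"])
    if "expires_in" in response:    # relative "seconds from now"
        return now + float(response["expires_in"])
    return now + default_lifetime   # provider did not say; assumed fallback


class TokenStore:
    def __init__(self, access_token, refresh_token, expires_at, refresh_fn,
                 leeway=60):
        self.access_token = access_token
        self.refresh_token = refresh_token
        self.expires_at = expires_at
        self.refresh_fn = refresh_fn  # callable: refresh_token -> response dict
        self.leeway = leeway          # refresh slightly before actual expiry
        self._lock = threading.Lock()

    def get_access_token(self):
        # Only one thread refreshes; the others wait and reuse the result.
        with self._lock:
            if time.time() < self.expires_at - self.leeway:
                return self.access_token
            # May raise if the refresh token expired or access was revoked.
            response = self.refresh_fn(self.refresh_token)
            self.access_token = response["access_token"]
            # Keep the old refresh token unless the provider rotated it.
            self.refresh_token = response.get(
                "refresh_token", self.refresh_token)
            self.expires_at = expiry_from_response(response)
            return self.access_token


# Stand-in for a real refresh request, for illustration only.
def demo_refresh(refresh_token):
    return {"access_token": "fresh-token", "expires_in": 3600}


demo_store = TokenStore("stale-token", "refresh-1", expires_at=0,
                        refresh_fn=demo_refresh)
print(demo_store.get_access_token())
```

With several processes or servers sharing the same tokens, the in-process lock would have to be replaced by something shared, such as a database row lock, and a failed refresh should lead to asking the user to re-authorize.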



Finally: some things we haven't talked about yet

Unfortunately, we've only scratched the surface of your OAuth implementation. Now that your OAuth flow is working and you've got an access token, it's time to think about the following:

  • How to securely store these access tokens and refresh tokens. They are like passwords to your users' accounts. But one-way hashing is not an option, because you need the original token back to call the API; you need secure, reversible encryption.

  • Check that the granted scope matches the requested scope (some APIs allow users to change their granted scope during the authorization flow).

  • Avoid situations where multiple requests modify the same token at the same time (also known as a race condition) when tokens are refreshed.

  • Detect access tokens that the user has revoked on the provider side.

  • Notify users that their access token has expired and direct them to re-authorize your app.

  • How to revoke access tokens that you no longer need (or that users have asked you to delete under GDPR).

  • You also need to deal with changes in the available OAuth scopes, provider bugs, missing documentation, etc.
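Of the items above, the scope check is the easiest to show in a few lines. The sketch below compares the scopes you requested with the scope string returned in the token response; the function name and scope names are hypothetical, and the `separator` parameter is there in case a provider deviates from the space-separated format.

```python
def missing_scopes(requested, granted_scope_string, separator=" "):
    """Return the requested scopes that were not granted, sorted by name."""
    granted = set(filter(None, granted_scope_string.split(separator)))
    return sorted(set(requested) - granted)


# Hypothetical scope names, for illustration only.
missing = missing_scopes(["contacts.read", "contacts.write"], "contacts.read")
if missing:
    print("Not granted:", ", ".join(missing))
```

Running this check right after the token exchange lets you tell the user immediately which features will not work, instead of failing later on an API call.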



Is there a better way?

If you're reading this, you're probably thinking, "There must be a better way!"

We think there is, and that's why we're building Nango: an open source, self-contained service that offers pre-built OAuth flows, secure token storage, and automatic token refresh for 90+ OAuth APIs.

If you give it a try, we'd love to hear your feedback. If you'd like to share your worst OAuth horror stories with us, we'd love to hear them in our Slack community too.

Beyond the problems covered in this article, what other problems have you run into when working with OAuth?


Origin blog.csdn.net/FL63Zv9Zou86950w/article/details/131671656