Understanding the oAuth Authorization Code Flow
Who is the User and how trivial is this question?
A service needs to know who the requesting user is, what the user is allowed to do, and whether that information is still valid. The most common way to achieve this is by using static credentials (like username and password), assuming that if the credentials are known, the requesting user must be the user to whom the credentials belong. Such static credentials come with a lot of problems [1]:
- Anyone with the credentials can impersonate the user
- Static credentials may be brute-forced or leaked
- Humans are humans and will write them down, use them multiple times, or share them with others
- It’s hard to revoke credentials without changing them or the system
- Credentials don’t include any information about the user or their permissions
- Credentials are often used for a long time because changing them is hard (and humans don’t like it)
Instead of trusting that only the two parties know the static credentials, a service can trust a third party to verify the identity of the user. One very simple implementation involving a third party uses the mail provider as the authentication authority: the user logs in with their email address and a random string that was sent to that address.
However, this flow is neither very secure nor very user-friendly. The user has to switch between their email client and the application, and mail offers many attack vectors, the simplest being that the mail is intercepted and used for authentication before the user can use it.
Why oAuth and why is it so complicated?
If you log in to a service by authenticating against a different service, you might use oAuth. The process of clicking on “login using xyz” on a website that is not xyz, entering your credentials for xyz, and then being redirected back is the oAuth flow.
Authentication is a requirement that so many services have in common that a whole range of attacks on user identity has been developed. The need for a user-friendly authentication that also anticipates attack vectors that are not yet known is why oAuth was defined: it is a protocol that defines how a service may authenticate a user in diverse use cases. It contains flows that might appear complicated at first, but each step exists to cover specific attack vectors and use cases.
Terminology
Actors:
- Resource Owner - The human that can authenticate
- Resource Server - The server that holds the resources; for basic authentication this might be only the user identifier
- Client - The service that wants to know who the user is
- Authorization Server - The server that authenticates the user and issues tokens
As a precondition, the Client must register itself with the Authorization Server to get its own `client_id`, which it uses to identify itself towards this Authorization Server.
Each token request should contain the `grant_type` to specify the flow that is used.
Tokens:
- Access Token - credentials used to access protected resources. The access token provides an abstraction layer, replacing different authorization constructs (e.g., username and password) with a single token understood by the resource server.
- Refresh Token - credentials used to obtain a new access token when the current access token becomes invalid or expires. The Access Token should be short-lived to minimize the damage if it is ever stolen. If the human decides to revoke the access, an already issued Access Token may remain valid until it expires, but the Refresh Token will no longer work.
JSON Web Tokens (JWT) are a common format for tokens. The RFC that defines oAuth does not specify the token format, but JWT is very common because it can be easily signed and validated and can contain all necessary information.
OpenID Connect uses JWT for its ID token, but oAuth itself does not require JWT.
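To make the "easily signed and validated" point more tangible, here is a minimal sketch of how an HS256-signed JWT could be built and checked with nothing but the Python standard library. The helper names (`sign_jwt`, `verify_jwt`), the shared secret and the claim values are assumptions for illustration only; a real service would normally use a maintained JWT library and often asymmetric signatures.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(data: bytes) -> str:
    # JWTs use URL-safe base64 without the trailing "=" padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped during encoding
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature for an HS256 token (illustrative helper)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url_encode(signature)}"


def verify_jwt(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Check signature and expiry, then return the claims (illustrative helper)."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison of the recomputed and the presented signature
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    claims = json.loads(_b64url_decode(payload_b64))
    # A token without an "exp" claim is treated as expired in this sketch
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims
```

The short lifetime of an access token shows up here as the `exp` claim: once it has passed, the token is rejected even though the signature is still valid.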
oAuth Authentication Flows
The flows are defined in RFC 6749 [3], while I found the best overviews at curity.io [4] and Auth0 [5]. Swagger provides some OpenAPI specifications for oAuth flows [6].
Browser Flow - A Service wants to know who the user is
- The Client initiates the flow by directing the human to the Authorization Server. The Client includes its `client_id`, `scope`, `state`, `code_challenge` and `redirect_uri`.
{
  "type": "object",
  "description": "The request with which the Client starts the oAuth flow to receive a token.",
  "properties": {
    "client_id": {
      "type": "string",
      "description": "The ID of the requesting client obtained when registering the client with the Authorization Server."
    },
    "response_type": {
      "type": "string",
      "description": "Defines the flow type; this post only covers the latest `code` flow."
    },
    "scope": {
      "type": "string",
      "description": "Additional (space separated) resources of the human. The Client may request certain scopes (e.g. access to the user's photos); the Authorization Server may ignore them based on its policy or the human's instructions (e.g. the human may have disabled the sharing of their photos during the consent step)."
    },
    "state": {
      "type": "string",
      "description": "Client generated string to maintain state between the request and callback (like an authentication-attempt-id for the Client) to prevent cross-site request forgery."
    },
    "redirect_uri": {
      "type": "string",
      "description": "The URI to which the human will be redirected after the Authorization Server has processed the request and the human has granted access."
    },
    "code_challenge": {
      "type": "string",
      "description": "A hash of a random string (the `code_verifier`) that the Client will use to authenticate itself later."
    }
  },
  "required": [
    "client_id",
    "response_type"
  ]
}
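The request described above can be sketched as a small helper that generates `state` and the PKCE pair and assembles the redirect URL. This is a sketch under assumptions: the endpoint URL, the client values and the function names are made up for illustration, and the `code_challenge_method` parameter comes from the PKCE specification (RFC 7636) rather than from the schema above.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    # The verifier is a high-entropy random string that only the Client knows
    code_verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    # The challenge is the URL-safe base64 of the hash, without "=" padding
    code_challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return code_verifier, code_challenge


def build_authorization_url(authorize_endpoint: str, client_id: str,
                            redirect_uri: str, scope: str) -> tuple[str, str, str]:
    """Return (url, state, code_verifier); illustrative helper, not a library API."""
    state = secrets.token_urlsafe(32)  # ties the later callback to this attempt
    code_verifier, code_challenge = make_pkce_pair()
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    })
    return f"{authorize_endpoint}?{query}", state, code_verifier


# Hypothetical values; the Client stores state and code_verifier in the user's session
url, state, code_verifier = build_authorization_url(
    "https://auth.example.com/authorize",
    "my-client",
    "https://client.example.com/callback",
    "openid profile",
)
```

Only the URL leaves the Client. `state` and `code_verifier` stay on the Client's side and are needed again when the human comes back.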
- The Authorization Server checks whether the `redirect_uri` is registered for the `client_id` to prevent redirect URI manipulation (which would authenticate the human for one Client and then redirect the human and the access information to a malicious site). Then the Authorization Server authenticates the resource owner and asks them to grant the requested access.
- The Authorization Server redirects the human back to the Client at the `redirect_uri`. The redirection URI includes an authorization `code` and the `state` provided by the Client earlier.
{
  "type": "object",
  "description": "The 'response' (technically an HTTP request to the `redirect_uri`, triggered by the Authorization Server's redirect after the human has granted access).",
  "properties": {
    "state": {
      "type": "string",
      "description": "The state that was provided by the Client to identify the started flow attempt."
    },
    "code": {
      "type": "string",
      "description": "A code with which the Client can request an access token from the Authorization Server. The Authorization Server does not add the access token directly to the redirect URI, so that the token is not visible in the human's browser history, where it could be misused."
    }
  },
  "required": [
    "state",
    "code"
  ]
}
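On the Authorization Server side, the `redirect_uri` check and the redirect back to the Client might look roughly like the following. The in-memory registry, the client values and the function names are hypothetical. The sketch also assumes the registered URI has no query string of its own and leaves out authenticating the human and the consent step.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical registry: client_id -> redirect URIs registered by that Client
REGISTERED_REDIRECT_URIS = {
    "my-client": {"https://client.example.com/callback"},
}


def is_registered_redirect_uri(client_id: str, redirect_uri: str) -> bool:
    # Exact string comparison: prefix or wildcard matching would reopen
    # the redirect URI manipulation attack
    return redirect_uri in REGISTERED_REDIRECT_URIS.get(client_id, set())


def build_callback_redirect(client_id: str, redirect_uri: str, state: str) -> tuple[str, str]:
    """Return (redirect_url, code) once the human has granted access."""
    if not is_registered_redirect_uri(client_id, redirect_uri):
        raise ValueError("redirect_uri is not registered for this client_id")
    # The authorization code should be short-lived and usable only once
    code = secrets.token_urlsafe(32)
    # The state is passed back unchanged so the Client can match the attempt
    return f"{redirect_uri}?{urlencode({'code': code, 'state': state})}", code
```

The server would store the issued `code` together with the `client_id`, `redirect_uri` and `code_challenge` so it can validate the token request that follows.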
- The Client checks whether the given `state` matches the stored state (so this oAuth attempt is the same one that was started on this Client earlier).
- The Client requests an access token from the Authorization Server by providing the `code`, the `redirect_uri`, and the original `code_verifier` to authenticate itself.
{
  "type": "object",
  "description": "The request of the Client to exchange the code for an access token.",
  "properties": {
    "client_id": {
      "type": "string",
      "description": "The ID of the requesting client obtained when registering the client with the Authorization Server."
    },
    "grant_type": {
      "type": "string",
      "description": "Value must be set to `authorization_code`."
    },
    "code": {
      "type": "string",
      "description": "The authorization code received from the Authorization Server."
    },
    "redirect_uri": {
      "type": "string",
      "description": "The same redirect_uri that was used in the first request."
    },
    "code_verifier": {
      "type": "string",
      "description": "The original code_verifier that was hashed to create the code_challenge."
    }
  },
  "required": [
    "client_id",
    "grant_type",
    "code",
    "redirect_uri"
  ]
}
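The Client's side of these two steps could be sketched as follows: parse the callback, compare the returned `state` with the stored one, and build the form body for the token request. The function name and example values are assumptions; the body would be sent as an `application/x-www-form-urlencoded` POST to the Authorization Server's token endpoint, which is not shown here.

```python
import hmac
from urllib.parse import parse_qs, urlencode, urlparse


def build_token_request(callback_url: str, stored_state: str, code_verifier: str,
                        client_id: str, redirect_uri: str) -> str:
    """Validate the callback and return the form-encoded token request body."""
    params = parse_qs(urlparse(callback_url).query)
    returned_state = params.get("state", [""])[0]
    # Constant-time comparison; a mismatch means this callback does not belong
    # to an attempt started by this Client for this session
    if not hmac.compare_digest(returned_state.encode("utf-8"),
                               stored_state.encode("utf-8")):
        raise ValueError("state mismatch")
    code = params.get("code", [""])[0]
    if not code:
        raise ValueError("missing authorization code")
    return urlencode({
        "grant_type": "authorization_code",
        "client_id": client_id,
        "code": code,
        "redirect_uri": redirect_uri,
        "code_verifier": code_verifier,
    })
```

Rejecting the callback before touching the `code` keeps a forged redirect from ever reaching the token endpoint.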
- The Authorization Server validates the authorization `code`, hashes the given `code_verifier` to check that it matches the previous `code_challenge` (ensuring that the service requesting the access token is actually the service that started the process), and ensures that the `redirect_uri` matches the URI used to redirect the human back to the Client earlier, so that the intended URI is still the same. Finally, the Authorization Server responds with an access token and, optionally, a refresh token.
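The `code_verifier` check from this last step is small enough to show in full. This sketch assumes the S256 method from the PKCE specification (RFC 7636) and that the server stored the `code_challenge` alongside the authorization code; `verify_pkce` is an illustrative name.

```python
import base64
import hashlib
import hmac


def verify_pkce(code_verifier: str, stored_code_challenge: str) -> bool:
    """Recompute the challenge from the verifier and compare it to the stored one."""
    digest = hashlib.sha256(code_verifier.encode("utf-8")).digest()
    computed = base64.urlsafe_b64encode(digest).rstrip(b"=")
    # Constant-time comparison so the check does not leak how many characters matched
    return hmac.compare_digest(computed, stored_code_challenge.encode("utf-8"))
```

Because only the hash travelled through the browser in the first request, someone who intercepts the authorization code still cannot redeem it without the original verifier.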
Conclusion
The oAuth web flow is a complex protocol, and that complexity is needed to cover a wide range of attack vectors and use cases, but it enables a secure and convenient way of authentication and authorisation. The complexity of the protocol is hidden behind libraries, but a developer who is aware of the flows can make better decisions and understand the security implications of their choices.
Because the complexity is hidden, developers should not hesitate to use oAuth for their services. It is a secure, user-friendly way to authenticate users and should be preferred over static credentials.
Happy Coding :)