Using the Authorization Code Flow and PKCE extension to secure web and mobile applications
You may have come across the term OAuth (Open Authorization) at some point in your career as a developer. I know that your life is busy and there is probably little time left to understand every technical term that comes up in the daily searches we carry out to solve specific problems. Whenever you develop web or mobile applications, you will almost always have to deal with authentication and authorization, because it is necessary to protect data and ensure that each user accesses only what they are authorized to access. There are several ways to provide this protection, some of which are specified by the OAuth protocol and its versions.
OAuth
The first version of the protocol (OAuth 1.0) emerged in December 2007, after Blaine Cook, then a lead software engineer at Twitter, realized together with other developers that the software industry needed a standard for authentication and authorization. Until then, each company developed its own solution, and these did not always follow the good practices required for something as critical as security and data protection. The protocol has been improved over time and is currently at version 2.0.
The OAuth2 protocol allows users to authenticate to third-party applications without exposing their credentials, while still controlling what data those applications can access. You have probably seen a button like “Login with Google” or “Login with Facebook”. When you click it and complete the process, in addition to authenticating yourself in the application, you give it permission to access some of your data stored on Google or Facebook, such as your name, email or contact list.
The protocol specifies several flows, and you can use the one most suitable for your type of application. In this article I will detail and implement the Authorization Code Flow, but first we need to understand the roles (entities) defined by the protocol:
Resource Owner: The owner of the resource, that is, the entity that grants access to protected resources. Using Google authentication as an example, this entity is the owner of the account.
Authorization Server: The entity responsible for authenticating the user and issuing the access token. User data such as name and e-mail is protected by this entity. In addition, the user can only access the APIs exposed by the Resource Server with a token issued by the Authorization Server. Example: the Google authentication server.
Resource Server: The server responsible for the API that we want to access. Example: in a virtual store, when users want to view their orders, the endpoint https://api.exemplo.com.br/me/requests is requested. The Resource Server is the server on which this endpoint is hosted.
Client: The application that requests access to the protected data of the Resource Owner. Example: a mobile application or a SPA (Single Page Application).
Authentication flows
Authentication flows are sets of steps that a Client takes to obtain authorization to access resources. Figure 1 shows the standard OAuth2 flow and will help us better understand the responsibility of each role. If you don’t understand the sequence of steps, don’t worry, I’ll explain it shortly.
- In step 1, the Client requests authorization from the Resource Owner to access its data.
- Following the happy path, the Resource Owner authorizes access and in step 2 the Client receives the authorization grant (code), which represents the authorization given by the user.
- Steps 3 and 4 are among the most critical in the flow. Here the Client requests and, in the best case, receives the access token (also a code). As the name suggests, this code enables the Client to access the protected resource. If it is obtained by a malicious application, that application could impersonate the real Client.
- On each request to the Resource Server, the Client must also send the access token, as shown in step 5. The Resource Server then checks whether the token is valid. If it is, the resource is released and sent to the Client in step 6.
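As a minimal sketch of step 5, the access token is commonly sent in the Authorization header using the Bearer scheme (RFC 6750). The token value and the function name below are placeholders for illustration, and the URL is the example endpoint from the Resource Server description. The request is only built here, not sent.

```python
import urllib.request


def build_protected_request(url: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a request that carries the access token."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


request = build_protected_request(
    "https://api.exemplo.com.br/me/requests",  # example endpoint from the text
    "example-access-token",                    # placeholder, not a real token
)
print(request.get_header("Authorization"))
```

The Resource Server reads this header on every call, which is why a leaked token is enough for another application to act as the Client.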
Now that you know how the general authentication flow works, note that RFC 6749 details four flows that can be used in different scenarios to obtain the access token:
- Authorization Code Flow: Used by web applications that are processed on the server side. In the next topic we will learn in detail how this flow works.
- Implicit Flow: Used in web and mobile applications that run on the client side. This flow is not recommended, because the access token is returned in the URL (in its fragment part). Since the URL can end up in the browser history and can be read by malicious extensions, the token can be extracted and used to gain access to your data and applications.
- Resource Owner Password Credentials: This flow differs the most from the others, since the Client application itself handles the username and password, so it is only acceptable for trusted applications.
- Client Credentials: Used by confidential clients, that is, applications able to keep a secret. The Client authenticates with a code called the client secret, which must be stored in a safe place.
The Security Best Current Practice document says that Resource Owner Password Credentials Flow is deprecated. It also recommends not using Implicit Flow as it is considered vulnerable.
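To see why the Implicit Flow exposes the token, here is a small sketch that parses a redirect URL of the kind that flow produces. The URL, the token value, and the function name are made up for illustration; the point is only that the access token is part of the address the browser sees.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def token_from_redirect(redirect_url: str) -> Optional[str]:
    """Extract the access token from the URL fragment, as any script or
    extension that can read the page address could do."""
    fragment = urlsplit(redirect_url).fragment
    values = parse_qs(fragment).get("access_token")
    return values[0] if values else None


# Hypothetical redirect produced after an Implicit Flow login.
redirect_url = (
    "https://app.example.com/callback"
    "#access_token=example-token&token_type=bearer&expires_in=3600"
)
print(token_from_redirect(redirect_url))
```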
Authorization Code Flow
Applications that run on the server side do not expose their source code, so they can use this flow. The reason is that each application holds a code called the client secret, which must remain private. As shown in Figure 2, this code is used to obtain the access token.
Let’s understand each step of the flow:
- The user clicks the Login button on the web or mobile application.
- The page redirects the user to the Authorization Server.
- The user has access to the login page.
- The user enters their credentials and chooses which data the application may access.
- If the data is valid, the Authorization Server returns the authorization code.
- In this step, the application already has all the necessary data to request the access token. Then the authorization code, the client secret and the client id (unique code pre-registered with the Authorization Server that identifies the web or mobile application) are sent to the Authorization Server.
- The Authorization Server validates the information.
- If the information is valid, the Authorization Server finally returns the access token.
- The application can now use the access token to access the endpoints protected by the backend application (API).
- If the access token is valid, the API returns the protected data.
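The two requests the application makes in this flow can be sketched as follows. The parameter names come from RFC 6749; the endpoint, client id, client secret, and redirect URI are placeholders, and a real Authorization Server documents its own URLs. The state parameter is not mentioned in the steps above; it is an optional value the RFC recommends for detecting forged redirects. Nothing is sent over the network here, the sketch only builds the redirect URL and the token request fields.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical values; a real application receives these when it registers
# with the Authorization Server.
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"  # must stay on the server side
REDIRECT_URI = "https://app.example.com/callback"


def build_authorization_url(state: str, scope: str = "profile email") -> str:
    """URL the user is redirected to in order to log in and grant access."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,  # random value echoed back by the server
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def build_token_request(authorization_code: str) -> dict:
    """Form fields POSTed to the token endpoint to obtain the access token."""
    return {
        "grant_type": "authorization_code",
        "code": authorization_code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    }


state = secrets.token_urlsafe(16)
print(build_authorization_url(state))
print(build_token_request("code-returned-by-the-server"))
```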
The PKCE extension
For web applications that run on the client side and for mobile applications, the plain Authorization Code Flow should not be used, as it is not possible to guarantee that the client secret stays safe. The Implicit Flow, in turn, is not recommended by the OAuth 2.0 Security Best Current Practice document. This is where the PKCE (Proof Key for Code Exchange) extension comes in: it changes some steps of the Authorization Code Flow so that the flow can be used in client-side applications. Figure 3 shows the flow with these changes.
Let’s understand each step of the flow:
- When the user accesses the application for the first time without being authenticated, two codes are generated: (a) a random code (the code verifier) and (b) the code challenge, which is the code verifier transformed with the code challenge method (typically a SHA-256 hash, which is a one-way transformation rather than encryption).
- This data is stored in the web browser or in the mobile application.
- The page is redirected to the Authorization Server with the data: code challenge and code challenge method.
- The Authorization Server stores this data and returns the login page.
- The user enters their credentials and chooses which data the application may access.
- If the data is valid, the Authorization Server returns an authorization code.
- In this step, the application already has all the necessary data to request the access token. Then the authorization code and the code verifier are sent to the Authorization Server.
- If the information is valid, the Authorization Server finally returns the access token. This is an important moment. The validation works as follows: the Authorization Server generates a code challenge from the code verifier that just arrived, using the code challenge method sent in the first request, and compares it with the code challenge received previously.
- The application can now use the access token to access the endpoints protected by the backend application (API).
- If the access token is valid, the API returns the protected data.
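The generation of the two codes and the check performed by the Authorization Server can be sketched as below, assuming the S256 code challenge method defined in RFC 7636 (SHA-256 of the code verifier, then base64url encoding without padding). The function names are illustrative and not part of any library.

```python
import base64
import hashlib
import hmac
import secrets


def generate_code_verifier() -> str:
    """Random URL-safe string; 32 random bytes give 43 characters."""
    return secrets.token_urlsafe(32)


def derive_code_challenge(code_verifier: str) -> str:
    """S256: base64url(SHA-256(code_verifier)) without '=' padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_accepts(code_verifier: str, stored_challenge: str) -> bool:
    """What the Authorization Server does at the token request: recompute
    the challenge from the received verifier and compare it with the one
    stored during the first request."""
    recomputed = derive_code_challenge(code_verifier)
    return hmac.compare_digest(recomputed, stored_challenge)


# Client side: created before the redirect; the verifier is kept locally.
verifier = generate_code_verifier()
challenge = derive_code_challenge(verifier)

# Server side: the matching verifier passes, any other value fails.
print(server_accepts(verifier, challenge))
print(server_accepts(generate_code_verifier(), challenge))
```

Because only the challenge travels in the first request, someone who intercepts the authorization code still cannot exchange it without the verifier stored in the application.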
Did you notice the difference from the regular Authorization Code Flow? The application no longer sends the client secret. Validation relies instead on the random code generated at the start of the flow.
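To make the difference concrete, here is a sketch of the two requests a PKCE client builds. The parameter names (code_challenge, code_challenge_method, code_verifier) come from RFC 7636; the endpoint, client id, redirect URI, and sample values are placeholders chosen for illustration.

```python
from urllib.parse import urlencode

AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"  # hypothetical
CLIENT_ID = "my-public-client-id"                          # hypothetical
REDIRECT_URI = "https://app.example.com/callback"          # hypothetical


def build_pkce_authorization_url(code_challenge: str) -> str:
    """First request: carries the code challenge and its method."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def build_pkce_token_request(authorization_code: str, code_verifier: str) -> dict:
    """Token request: carries the code verifier and no client_secret."""
    return {
        "grant_type": "authorization_code",
        "code": authorization_code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "code_verifier": code_verifier,
    }


# Placeholder values standing in for a real challenge, code, and verifier.
print(build_pkce_authorization_url("example-code-challenge"))
print(build_pkce_token_request("example-code", "example-code-verifier"))
```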
References
- https://tools.ietf.org/html/rfc6749
- https://oauth.net
- https://tools.ietf.org/html/rfc7636
- https://auth0.com/docs/flows
- https://tools.ietf.org/html/draft-ietf-oauth-security-topics-09
- https://developer.okta.com/blog
Did you like this article and want to ask any questions or make suggestions? Contact me!
Linkedin: linkedin.com/in/michaeldfti/
E-mail: michael@michaelsilva.io
Site: michaelsilva.io