[New Feature]: Airflow Cognito integration #126

Open
LucaCinquini opened this issue Jun 27, 2024 · 4 comments
@LucaCinquini
Collaborator

No description provided.

@LucaCinquini
Collaborator Author

LucaCinquini commented Jun 27, 2024

Dependency: CS sets up a Cognito user pool in each target shared services venue and provides the connection information and any instructions needed for integration.

Risks: This may not be achievable with the current Airflow version. We may need to wait for the next Airflow version and its AuthManager support (Airflow 3.0.x?).

Tests:
o Successful login into Airflow UI with Cognito credentials
o Successful use of Airflow API to submit a job with Cognito credentials
o Successful use of OGC API to submit a job with Cognito credentials

@nikki-t
Collaborator

nikki-t commented Jul 30, 2024

Cognito/Airflow Information

The Airflow web UI uses Flask App Builder (FAB).

Authentication for the API is handled separately from authentication for the web UI.

An Amazon Cognito user pool is an OpenID Connect (OIDC) identity provider (IdP). Documentation on implementation options:

  • This documentation goes into a lot of detail around implementing OIDC in Airflow via FAB.
  • This AWS blog post details how to set up an ALB with Amazon Cognito to authenticate users to a Kubernetes web app. We may need this to facilitate Cognito access to Airflow via the ALB, as it shows how the access token, subject, and user claims (in JWT format) are passed to a web application.
  • In addition to modifying the existing FAB webserver_config.py file, we can create our own auth manager by subclassing BaseAuthManager.

Proposed architecture

  • Users are defined with groups that map to Airflow roles in the Cognito user pool.
    • Airflow roles: Admin, User, Op, Viewer, and Public
  • Modify webserver_config.py to create a subclass of FabAirflowSecurityManagerOverride and override the get_oauth_user_info method to authenticate with the Cognito user pool users and groups. Map Cognito user pool groups to Airflow roles and return the username and role keys (groups).
  • Include the new webserver_config.py file in the helm chart.
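
The group-to-role mapping in the proposal above could look roughly like the sketch below. It is plain Python rather than a working webserver_config.py: the group names (`airflow-admin`, etc.) and the helper `user_info_from_claims` are hypothetical, and the sketch assumes the decoded Cognito ID token carries the `cognito:username`, `email`, and `cognito:groups` claims. `AUTH_ROLES_MAPPING` is the FAB setting that maps provider-side role keys to roles; the dict returned here is the kind of value an overridden `get_oauth_user_info` would be expected to return.

```python
# Assumption: these Cognito group names are placeholders; use whatever
# groups are actually defined in the user pool.
AUTH_ROLES_MAPPING = {
    "airflow-admin": ["Admin"],
    "airflow-op": ["Op"],
    "airflow-user": ["User"],
    "airflow-viewer": ["Viewer"],
}


def user_info_from_claims(claims: dict) -> dict:
    """Build a user-info dict from decoded Cognito ID token claims.

    Hypothetical helper: an overridden get_oauth_user_info could decode
    the ID token and return this dict, leaving the role lookup to
    AUTH_ROLES_MAPPING.
    """
    groups = list(claims.get("cognito:groups", []))
    return {
        # Fall back to the subject claim if no username claim is present.
        "username": claims.get("cognito:username", claims.get("sub")),
        "email": claims.get("email"),
        # Role keys are the raw Cognito groups, as described in the proposal.
        "role_keys": groups,
    }


if __name__ == "__main__":
    example = {
        "cognito:username": "alice",
        "email": "alice@example.com",
        "cognito:groups": ["airflow-admin"],
    }
    print(user_info_from_claims(example))
```

Keeping the mapping in `AUTH_ROLES_MAPPING` rather than in the method means role changes only touch configuration, which fits shipping the file through the helm chart.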

Info needed from Cognito

@nikki-t
Collaborator

nikki-t commented Aug 12, 2024

Here are the general steps required for OAuth 2.0 authentication with a Cognito user pool. From: https://aws.amazon.com/blogs/security/how-to-use-oauth-2-0-in-amazon-cognito-learn-about-the-different-oauth-2-0-grants/

  1. HTTP GET request to https://AUTH_DOMAIN/oauth2/authorize where AUTH_DOMAIN=user pool's configured domain.
    a. response_type=code
    b. client_id
    c. redirect_uri=The URL that a user is directed to after successful authentication
    d. state=Random value that is used to prevent CSRF
    e. scope=Space-separated list of scopes to request for the generated tokens
    f. nonce=A random value that you can add to the request which is included in the ID token that Cognito issues.
  2. A CSRF token is returned in a cookie. The user is redirected to https://AUTH_DOMAIN/login (which hosts the auto-generated UI) with the same query parameters set from step 1.
  3. The user authenticates with the auto-generated UI.
  4. Cognito verifies the user pool credentials and redirects the user to the URL that was specified in the original redirect_uri query parameter. The redirect also sets a code query parameter that holds the authorization code vended to the user by Cognito.
  5. The application (Airflow webserver) extracts the authorization code from the query parameters and exchanges it for user pool tokens. The exchange is a POST request to https://AUTH_DOMAIN/oauth2/token with application/x-www-form-urlencoded parameters: grant_type, code, client_id, redirect_uri.
  6. JSON response returned includes: access_token, refresh_token, id_token, expires_in, token_type
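
Steps 1 and 5 above can be sketched with the standard library alone. This is a minimal illustration, not what FAB does internally: the function names, the default scope, and the example domain are assumptions, and nothing is sent over the network. If the app client is configured with a client secret, the token request would also need client authentication, which is omitted here.

```python
import secrets
from urllib.parse import urlencode


def build_authorize_url(auth_domain, client_id, redirect_uri, scope="openid email"):
    """Step 1: build the GET URL for the /oauth2/authorize endpoint."""
    state = secrets.token_urlsafe(16)  # random value used to prevent CSRF
    nonce = secrets.token_urlsafe(16)  # echoed back inside the ID token
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
        "scope": scope,
        "nonce": nonce,
    }
    url = f"https://{auth_domain}/oauth2/authorize?{urlencode(params)}"
    # The caller keeps state and nonce to verify them later.
    return url, state, nonce


def build_token_request(auth_domain, client_id, redirect_uri, code):
    """Step 5: build the POST to /oauth2/token (URL, form body, headers)."""
    url = f"https://{auth_domain}/oauth2/token"
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        # Must match the redirect_uri sent in step 1.
        "redirect_uri": redirect_uri,
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return url, body, headers


if __name__ == "__main__":
    # Placeholder values, not a real user pool or deployment.
    url, state, nonce = build_authorize_url(
        "auth.example.com", "client123",
        "https://airflow.example.com/oauth-authorized/cognito",
    )
    print(url)
```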

So far it looks like the traffic is passing steps 1 through 3, but the redirect may not be working in step 4. I can't quite isolate whether the issue arises in the Airflow webserver_config.py or in the Cognito authentication flow.
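
One way to narrow down a step 4 failure is to capture the URL the browser is finally redirected to and check what it carries. The sketch below is a hypothetical diagnostic helper, not part of Airflow or FAB. It relies only on the standard OAuth 2.0 convention that a successful callback carries `code` and `state`, and a failed one carries `error` and optionally `error_description`. The callback path in the example is an assumption about the deployment.

```python
from urllib.parse import parse_qs, urlparse


def inspect_callback(callback_url: str, expected_state: str) -> str:
    """Classify the redirect that the identity provider sent back."""
    query = parse_qs(urlparse(callback_url).query)
    if "error" in query:
        description = query.get("error_description", [""])[0]
        return f"error: {query['error'][0]} ({description})"
    if "code" not in query:
        return "no authorization code in callback"
    if query.get("state", [None])[0] != expected_state:
        return "state mismatch"
    return "ok"


if __name__ == "__main__":
    # Placeholder URL copied from the browser address bar after login.
    print(inspect_callback(
        "https://airflow.example.com/oauth-authorized/cognito?code=abc&state=xyz",
        "xyz",
    ))
```

If the browser never reaches the application's callback URL at all, the redirect URI sent in step 1 is worth comparing character by character with the callback URLs registered on the Cognito app client.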

@nikki-t
Collaborator

nikki-t commented Sep 17, 2024

Solutions tried:

  1. FAB documentation for OAuth: gets hung up on the redirect URI and does not seem to reach the get_oauth_user_info function.
  2. Stack Overflow, which matches the GitHub OAuth configuration: gets stuck on the redirect URI.
  3. Airflow documentation
    a. Set up GitHub authentication; also got stuck at the redirect URI.
  4. OIDC provider: does not work with the current Airflow version. It creates a new OIDCView from the existing OIDView, extending its functionality.
  5. CognitoAuthManager extending BaseAuthManager class to define our own authentication operations.
    a. Requires quite a bit of work to build out the class and provide authentication and authorization.
    b. Needs scoping.

Documentation on OAuth 2.0 grants in Cognito: https://aws.amazon.com/blogs/security/how-to-use-oauth-2-0-in-amazon-cognito-learn-about-the-different-oauth-2-0-grants/

It looks like Airflow may be moving away from FAB in the future, so it may make the most sense to implement our own auth manager following the AWS auth manager architecture. (Note: the AWS auth manager does not use Cognito for authentication and authorization.)

@LucaCinquini LucaCinquini added U-SPS and removed U-SPS labels Sep 25, 2024
@LucaCinquini LucaCinquini changed the title Airflow Cognito integration [New Feature]: Airflow Cognito integration Sep 25, 2024
@LucaCinquini LucaCinquini self-assigned this Sep 26, 2024
Status: In Progress

2 participants