Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GCP BigQuery service account json not supported in v0.7 #350

Closed
jbandoro opened this issue Jun 28, 2023 · 16 comments · Fixed by #352
Closed

GCP BigQuery service account json not supported in v0.7 #350

jbandoro opened this issue Jun 28, 2023 · 16 comments · Fixed by #352
Assignees
Labels
bug Something isn't working good first issue Good for newcomers team/astro-devex
Milestone

Comments

@jbandoro
Copy link
Collaborator

We ran into this problem upgrading from v0.6.X -> v0.7.X using the default google_cloud_platform airflow connection that uses a keyfile json and not the keyfile path that is currently supported in astronomer cosmos.

All of the profile var mapping of the json service account can be found in an older tag before the refactor of profiles in 0.7 here: https://github.com/astronomer/astronomer-cosmos/blob/astronomer-cosmos-v0.6.8/cosmos/providers/dbt/core/profiles/bigquery.py as a place to start from.

When I have time I'd be happy to take a stab at this, but opening up this issue in case others also run into this during an upgrade.

@JoeSham
Copy link
Contributor

JoeSham commented Jun 28, 2023

Hi, I was just going to open a similar issue :) We were not using the older version, but I guess we have the same problem: we store our Service Account JSON file as a secret in Google Secret Manager, so when we retrieve it, it is not a path to a file, but the actual file/JSON. I started testing with Cosmos 0.7.3 and so far did not manage getting it to somehow digest the SA keyfile as a json...

@JoeSham
Copy link
Contributor

JoeSham commented Jun 28, 2023

@tatiana Hi, I see you did the refactor probably, could you plz check this? Is there really no way to use SA keyfile_dict currently, or is there some other way? Thank you

@tatiana
Copy link
Collaborator

tatiana commented Jun 28, 2023

@JoeSham @jbandoro I believe this may have been an issue introduced by the PR where we refactored profiles: #271

To confirm if I understood the issue, Cosmos works as expected if you create a Google Cloud Platform connection specifying the key_path property but not if you specify keyfile_dict - is that the case?

@tatiana tatiana added this to the 1.0.0 milestone Jun 28, 2023
@tatiana tatiana added the good first issue Good for newcomers label Jun 28, 2023
@JoeSham
Copy link
Contributor

JoeSham commented Jun 28, 2023

@tatiana Hi, yes, correct. The GoogleCloudServiceAccountFileProfileMapping seems to only accept key_path as a path to SA keyfile, but there is no keyfile_dict or similar option. I also tested passing the SA keyfile_dict into the key_path param, in which case it passed in the ProfileMapping but then failed later on the fact that the keyfile (as given) does not exist

@tatiana tatiana added the bug Something isn't working label Jun 28, 2023
@tatiana
Copy link
Collaborator

tatiana commented Jun 28, 2023

@JoeSham, the fix should be straightforward. The issue is that this:
https://github.com/astronomer/astronomer-cosmos/blob/astronomer-cosmos-v0.6.8/cosmos/providers/dbt/core/profiles/bigquery.py

Was oversimplified as part of #271:
https://github.com/astronomer/astronomer-cosmos/blob/main/cosmos/profiles/bigquery/service_account_file.py

We can add test cases to make sure we don't miss this feature in the future: https://github.com/astronomer/astronomer-cosmos/blob/main/tests/profiles/bigquery/test_bq_service_account_file.py

Would either of you have the capacity to work on the fix? Otherwise, we can take it

JoeSham added a commit to JoeSham/astronomer-cosmos that referenced this issue Jun 28, 2023
JoeSham added a commit to JoeSham/astronomer-cosmos that referenced this issue Jun 28, 2023
@jlaneve
Copy link
Collaborator

jlaneve commented Jun 28, 2023

@JoeSham looked at the commits that are linked here, I think we should create a new profile mapping for the keyfile_dict/json. What do you think about that?

JoeSham added a commit to JoeSham/astronomer-cosmos that referenced this issue Jun 28, 2023
JoeSham added a commit to JoeSham/astronomer-cosmos that referenced this issue Jun 28, 2023
JoeSham added a commit to JoeSham/astronomer-cosmos that referenced this issue Jun 28, 2023
@JoeSham
Copy link
Contributor

JoeSham commented Jun 28, 2023

@jlaneve yea you are right, it was just a quick first draft. I now made another quick 2nd draft, the way you recommended. Without tests so far. Not sure if I will have more time for it in the upcoming days though...

@jlaneve
Copy link
Collaborator

jlaneve commented Jun 28, 2023

@jlaneve yea you are right, it was just a quick first draft. I now made another quick 2nd draft, the way you recommended. Without tests so far. Not sure if I will have more time for it in the upcoming days though...

That's fine! If you want to open a draft PR with what you have, I'm more than happy to jump in and add tests 🙂

@JoeSham
Copy link
Contributor

JoeSham commented Jun 28, 2023

Ok great @jlaneve :) Draft PR opened

JoeSham added a commit to JoeSham/astronomer-cosmos that referenced this issue Jun 28, 2023
@tatiana
Copy link
Collaborator

tatiana commented Jun 28, 2023

That's a great start, @JoeSham ! Thank you very much. Please, do let us know once you'd like reviews - or if you need any support, e.g for writing the tests

@JoeSham
Copy link
Contributor

JoeSham commented Jun 30, 2023

Hey @tatiana @jlaneve, so could you please maybe take a look + possibly add tests etc.? (note: I am leaving for holidays next week so won't be able to work on this after today for a while)

@tatiana
Copy link
Collaborator

tatiana commented Jun 30, 2023

Hi @JoeSham , that's totally fine, thank you very much for start working on this.
Would you like to add us as contributors of your fork/branch, so we can follow up on the same PR, or do you prefer that we open a new PR from a different repo?
Enjoy your holidays!

@JoeSham
Copy link
Contributor

JoeSham commented Jun 30, 2023

Thx @tatiana! Added you and @jlaneve as contributors

@JoeSham
Copy link
Contributor

JoeSham commented Jul 14, 2023

Hey @tatiana @jlaneve 👋 May I ask what is the current status, please?

tatiana pushed a commit that referenced this issue Jul 18, 2023
@tatiana
Copy link
Collaborator

tatiana commented Jul 18, 2023

@JoeSham, I apologize for the delay; I worked on other tasks in the past few days.
Unfortunately, the invitation to contribute to your repo has expired.

I wrote the tests:
https://github.com/astronomer/astronomer-cosmos/tree/joe/bq_keyfile_dict

Please, feel free to cherry-pick 615a256

@JoeSham
Copy link
Contributor

JoeSham commented Jul 20, 2023

Hey @tatiana thx, I invited you and @jlaneve as contributors again

tatiana pushed a commit to JoeSham/astronomer-cosmos that referenced this issue Jul 23, 2023
tatiana pushed a commit to JoeSham/astronomer-cosmos that referenced this issue Jul 23, 2023
tatiana added a commit that referenced this issue Jul 24, 2023
…`keyfile` (#352)

Add support to Google Cloud Platform connections that define
`keyfile_dict` (actual value) instead of `keyfile` (path).

A design decision for this implementation was to not add `keyfile_json`
to `secret_fields`. This was not done because this property is
originally a JSON. While storing it as an environment variable is
simple, we'd need more significant changes to our profile parsing to
correctly render this to the `profile.yml` generated by Cosmos.

This used to work in Cosmos 0.6.x and stopped working in 0.7.x as part
of a previous profile refactors #271.

Closes: #350

Co-authored-by:  Tatiana Al-Chueyr <[email protected]>

**How to validate this change**

1. Have a GCP BQ service account with the `BigQuery Data Editor` role
2. Create a namespace that can be accessed by the service account (1)
3. Create an Airflow GCP connection that uses `keyfile` (path to the
service account credentials saved locally). Example (replace
`some-namespace` (2) and `key_path`(1)):
```
export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='{"conn_type": "google_cloud_platform", "extra": {"key_path": "/home/some-user/key.json", "scope": "https://www.googleapis.com/auth/cloud-platform", "project": "astronomer-dag-authoring", "dataset": "some-namespace" , "num_retries": 5}}'
```
4. Change the `basic_cosmos_dag.py` with the following lines, making
sure it references the Airflow connection created in (3) and the dataset
created in (2):
 ```
    conn_id="google_cloud_default",
    profile_args={
        "dataset": "some-namespace",
    },
```

5. Run the DAG, for instance:
```
PYTHONPATH=`pwd` AIRFLOW_HOME=`pwd`
AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=20000
AIRFLOW__CORE__DAG_FILE_PROCESSOR_TIMEOUT=20000 airflow dags test
basic_cosmos_dag `date -Iseconds`
```

6.  Change the Airflow GCP connection to use `keyfile_dict` (hard-code the `keyfile` content in the Airflow connection, replacing `<your keyfile content here>`)
```
export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='{"conn_type":
"google_cloud_platform", "extra": {"keyfile_dict": <your keyfile content
here>, "scope": "https://www.googleapis.com/auth/cloud-platform",
"project": "astronomer-dag-authoring", "dataset": "cosmos" ,
"num_retries": 5}}'
```

7. Run the previously created Cosmos-powered DAG (5) that confirms (6) works

---------

Co-authored-by: Tatiana Al-Chueyr <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers team/astro-devex
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants