GCP BigQuery service account json not supported in v0.7 #350
Hi, I was just about to open a similar issue :) We were not using the older version, but I guess we have the same problem: we store our Service Account JSON file as a secret in Google Secret Manager, so when we retrieve it, it is not a path to a file but the actual JSON content. I started testing with Cosmos 0.7.3 and so far have not managed to get it to accept the SA keyfile as JSON.
@tatiana Hi, I see you probably did the refactor. Could you please check this? Is there really no way to use the SA keyfile_dict currently, or is there some other way? Thank you
@JoeSham @jbandoro I believe this may have been an issue introduced by the PR where we refactored profiles: #271. To confirm I understood the issue: Cosmos works as expected if you create a Google Cloud Platform connection specifying the […]
@tatiana Hi, yes, correct. The GoogleCloudServiceAccountFileProfileMapping seems to only accept key_path as a path to the SA keyfile; there is no keyfile_dict or similar option. I also tested passing the SA keyfile_dict into the key_path param: it passed the ProfileMapping checks but then failed later because the keyfile (as given) does not exist on disk.
@JoeSham, the fix should be straightforward. The issue is that this logic was oversimplified as part of #271. We can add test cases to make sure we don't lose this feature in the future: https://github.com/astronomer/astronomer-cosmos/blob/main/tests/profiles/bigquery/test_bq_service_account_file.py Would either of you have the capacity to work on the fix? Otherwise, we can take it.
Related issue: astronomer#350
@JoeSham I looked at the commits linked here; I think we should create a new profile mapping for the keyfile_dict/JSON case. What do you think about that?
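As a rough illustration of what such a mapping could do (the function and variable names below are hypothetical, not Cosmos's actual API), a `keyfile_dict`-style mapping would translate the connection's inline JSON credentials into dbt-bigquery's `service-account-json` authentication method, whose `keyfile_json` field takes the credentials directly rather than a file path:

```python
# Hypothetical sketch: build a dbt-bigquery profile target from inline
# service-account credentials ("keyfile_dict") instead of a keyfile path.
# dbt-bigquery's "service-account-json" method accepts credentials via
# "keyfile_json", while "service-account" expects a "keyfile" path.

def bigquery_profile_from_keyfile_dict(keyfile_dict: dict, project: str, dataset: str) -> dict:
    """Return a dbt BigQuery profile target with credentials embedded inline."""
    return {
        "type": "bigquery",
        "method": "service-account-json",
        "project": project,
        "dataset": dataset,
        "threads": 1,
        "keyfile_json": keyfile_dict,  # credentials embedded, no file on disk needed
    }


if __name__ == "__main__":
    # Placeholder credentials for demonstration only.
    fake_creds = {"type": "service_account", "project_id": "demo-project"}
    target = bigquery_profile_from_keyfile_dict(fake_creds, "demo-project", "some_dataset")
    print(target["method"])  # service-account-json
```

The design question raised above is whether this belongs in a separate profile mapping class or as an alternative field on the existing one.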
@jlaneve yeah, you are right, it was just a quick first draft. I have now made a second quick draft, the way you recommended, without tests so far. Not sure if I will have more time for it in the upcoming days though...
That's fine! If you want to open a draft PR with what you have, I'm more than happy to jump in and add tests 🙂
That's a great start, @JoeSham! Thank you very much. Please do let us know once you'd like reviews, or if you need any support, e.g. for writing the tests.
Hi @JoeSham, that's totally fine, thank you very much for starting work on this.
@JoeSham, I apologize for the delay; I worked on other tasks over the past few days. I wrote the tests; please feel free to […]
…`keyfile` (#352)

Add support for Google Cloud Platform connections that define `keyfile_dict` (the actual value) instead of `keyfile` (a path).

A design decision for this implementation was not to add `keyfile_json` to `secret_fields`. This property is originally JSON: while storing it as an environment variable is simple, we would need more significant changes to our profile parsing to correctly render it into the `profile.yml` generated by Cosmos.

This used to work in Cosmos 0.6.x and stopped working in 0.7.x as part of a previous profile refactor (#271).

Closes: #350

**How to validate this change**

1. Have a GCP BQ service account with the `BigQuery Data Editor` role
2. Create a namespace that can be accessed by the service account (1)
3. Create an Airflow GCP connection that uses `keyfile` (path to the service account credentials saved locally). Example (replace `some-namespace` (2) and `key_path` (1)):

```
export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='{"conn_type": "google_cloud_platform", "extra": {"key_path": "/home/some-user/key.json", "scope": "https://www.googleapis.com/auth/cloud-platform", "project": "astronomer-dag-authoring", "dataset": "some-namespace", "num_retries": 5}}'
```

4. Change `basic_cosmos_dag.py` with the following lines, making sure it references the Airflow connection created in (3) and the dataset created in (2):

```
conn_id="google_cloud_default",
profile_args={
    "dataset": "some-namespace",
},
```

5. Run the DAG, for instance:

```
PYTHONPATH=`pwd` AIRFLOW_HOME=`pwd` AIRFLOW__CORE__DAGBAG_IMPORT_TIMEOUT=20000 AIRFLOW__CORE__DAG_FILE_PROCESSOR_TIMEOUT=20000 airflow dags test basic_cosmos_dag `date -Iseconds`
```

6. Change the Airflow GCP connection to use `keyfile_dict` (hard-code the `keyfile` content in the Airflow connection, replacing `<your keyfile content here>`):

```
export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='{"conn_type": "google_cloud_platform", "extra": {"keyfile_dict": <your keyfile content here>, "scope": "https://www.googleapis.com/auth/cloud-platform", "project": "astronomer-dag-authoring", "dataset": "cosmos", "num_retries": 5}}'
```

7. Run the previously created Cosmos-powered DAG (5) to confirm (6) works

Co-authored-by: Tatiana Al-Chueyr <[email protected]>
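Hand-editing the nested quotes when hard-coding the keyfile content into the connection URI is error-prone. One way to avoid that (a sketch, not part of the PR itself, with placeholder credentials) is to assemble the environment variable value with `json.dumps`:

```python
import json
import os

# Sketch: build AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT programmatically so the
# nested service-account JSON is escaped and quoted correctly.
# Placeholder credentials; in practice this would be the real keyfile content.
keyfile_dict = {"type": "service_account", "project_id": "astronomer-dag-authoring"}

conn = {
    "conn_type": "google_cloud_platform",
    "extra": {
        # Serializing the credentials to a string keeps the outer JSON valid.
        "keyfile_dict": json.dumps(keyfile_dict),
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "project": "astronomer-dag-authoring",
        "dataset": "cosmos",
        "num_retries": 5,
    },
}

# Equivalent to exporting the variable in the shell before starting Airflow.
os.environ["AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT"] = json.dumps(conn)

# Parse it back to verify the value round-trips cleanly.
parsed = json.loads(os.environ["AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT"])
assert parsed["extra"]["project"] == "astronomer-dag-authoring"
```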
We ran into this problem upgrading from v0.6.x -> v0.7.x using the default `google_cloud_platform` Airflow connection, which uses a keyfile JSON and not the keyfile path that is currently supported in astronomer-cosmos. All of the profile var mapping of the JSON service account can be found in an older tag, before the refactor of profiles in 0.7, here: https://github.com/astronomer/astronomer-cosmos/blob/astronomer-cosmos-v0.6.8/cosmos/providers/dbt/core/profiles/bigquery.py as a place to start from.
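To make the distinction concrete (illustrative values only, not the library's actual connection objects), the two connection styles differ only in which extra field carries the credentials:

```python
import json

# Path-based connection: dbt later needs a real file on disk at key_path.
conn_with_path = {
    "conn_type": "google_cloud_platform",
    "extra": {"key_path": "/home/some-user/key.json"},
}

# Dict-based connection: the credentials themselves live in the extras,
# e.g. after being fetched from a secrets backend such as Secret Manager.
conn_with_dict = {
    "conn_type": "google_cloud_platform",
    "extra": {
        "keyfile_dict": json.dumps({"type": "service_account", "project_id": "demo"}),
    },
}

# Only the first shape was accepted by the 0.7.x profile mappings.
assert "key_path" in conn_with_path["extra"]
assert "key_path" not in conn_with_dict["extra"]
```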
When I have time I'd be happy to take a stab at this, but I'm opening this issue in case others also run into it during an upgrade.