# chore(metrics): init Metrics in app init, cleanup setup code to one b… #1167

Status: Open. Wants to merge 14 commits into base `master`.
9 changes: 5 additions & 4 deletions .github/workflows/ci.yaml
```diff
@@ -1,8 +1,9 @@
-# push will run on every pushed commit to any branch (so this will rerun the tests
-# once a branch gets merged to main in addition to any new commits on any branch)
-on: push
-
 name: CI
+on:
+  push:
+  pull_request:
+    types: [opened, reopened]
```
**Contributor** commented on lines +1 to -5:

There can be edge cases where the "automatic merge" messes something up, and the CI succeeds on the "branch run" but fails on the "PR run". In that case you have to update your branch manually and fix the conflict that git didn't see. It has happened to me a handful of times; not sure it's worth running the CI twice on every PR, but 🤷‍♀️

**Contributor (author):**

Does it happen in other repos? Which "automatic merge" are you referring to?

**Contributor:**

It can happen in any repo.

I can't find documentation about it, but this is my understanding from what I've observed: when the two checks run (one for the push and one for the PR), they're not necessarily running on the same code. The "push" check runs on the branch's code, while the "PR" check runs on the code that would be on master after merging the PR. If there are any changes on master that aren't on the branch, GitHub performs the merge automatically and runs the CI on the result. It's the same code that would end up on master if you chose to merge without updating your branch first: git/GitHub finds no conflict and merges. Occasionally that merge goes wrong, and the "PR" CI check is what catches it.

**Contributor (author):**

We should be protected against people merging without first updating their branch, at which point the push check should run again. I do vaguely remember issues in the past... but I can't find anything in the GitHub Actions documentation that leads me to believe it's necessary to have both of these. Unless you have a strong opinion, I would rather not run everything twice and just get feedback on the code on main when the branch creator merges it in and pushes those commits.


```diff
+
 concurrency:
   group: '${{ github.workflow }} @ ${{ github.event.pull_request.head.label || github.head_ref || github.ref }}'
   cancel-in-progress: true
```
4 changes: 2 additions & 2 deletions .secrets.baseline
```diff
@@ -121,7 +121,7 @@
       "filename": ".github/workflows/ci.yaml",
       "hashed_secret": "3e26d6750975d678acb8fa35a0f69237881576b0",
       "is_verified": false,
-      "line_number": 13
+      "line_number": 14
     }
   ],
   "deployment/scripts/postgresql/postgresql_init.sql": [
@@ -422,5 +422,5 @@
     }
   ]
 },
-  "generated_at": "2024-07-25T17:19:58Z"
+  "generated_at": "2024-07-26T20:11:41Z"
 }
```
2 changes: 1 addition & 1 deletion Dockerfile
```diff
@@ -41,7 +41,7 @@ RUN poetry config virtualenvs.create false \
 COPY . /$appname
 COPY ./deployment/uwsgi/uwsgi.ini /etc/uwsgi/uwsgi.ini
 COPY ./deployment/uwsgi/wsgi.py /$appname/wsgi.py
-COPY clear_prometheus_multiproc /$appname/clear_prometheus_multiproc
+COPY ./deployment/scripts/metrics/setup_prometheus /$appname/setup_prometheus
 
 # install fence
 RUN poetry config virtualenvs.create false \
```
12 changes: 11 additions & 1 deletion README.md
````diff
@@ -128,11 +128,21 @@ Whereas pre-signed URL is a cloud-agnostic solution, services and tools on Googl
 
 See [Fence and Google](docs/additional_documentation/google_architecture.md) for more details on data access methods specific to Google.
 
 ## Setup
 
 See detailed explanation [here](docs/additional_documentation/setup.md)
 
+## Unit testing
+
+The easiest way to ensure your environment is set up for tests the same way
+as the CI/CD is to use the setup and test run script the CI/CD uses.
+
+You can run unit tests (which are run with pytest behind the scenes) like this:
+
+```shell
+bash ./tests/ci_commands_script.sh
+```
+
 ## Additional documentation
 
 1. [Terminologies](docs/additional_documentation/terminology.md)
````
33 changes: 33 additions & 0 deletions deployment/scripts/metrics/setup_prometheus
```bash
#!/usr/bin/env bash
# This script is called by:
#   - uWSGI during startup (the Dockerfile copies it into the /fence directory).
#     It prepares the prometheus_multiproc_dir folder that stores the metrics
#     from the separate uwsgi workers (per PID).
#   - run.py, so local runs set up the necessary environment vars and folders
#     for prometheus metrics.
#   - The test framework in conftest, so test runs set up the necessary
#     environment vars and folders for prometheus metrics.
#
# Usage:
#   ./setup_prometheus [DIR] [true]

# Default directory if no argument is provided
DIR=${1:-/var/tmp/prometheus_metrics}

# Determine whether to wipe the directory (default is to wipe)
SETUP_DIR=${2:-true}

set -ex

if [[ "$SETUP_DIR" == "true" ]]; then
    echo "setting up $DIR: clearing existing files, ensuring it exists, chmod 755"
    rm -Rf "$DIR"
    mkdir -p "$DIR"
    chmod 755 "$DIR"
fi

if id -u nginx &>/dev/null; then
    chown "$(id -u nginx)":"$(id -g nginx)" "$DIR"
fi

export PROMETHEUS_MULTIPROC_DIR="$DIR"
echo "PROMETHEUS_MULTIPROC_DIR is $PROMETHEUS_MULTIPROC_DIR"
```
3 changes: 1 addition & 2 deletions deployment/uwsgi/uwsgi.ini
```diff
@@ -28,8 +28,7 @@ pythonpath = /usr/local/src/*
 # metrics setup
 stats = 127.0.0.1:9191
 stats-http = true
-env = prometheus_multiproc_dir=/var/tmp/uwsgi_flask_metrics
-exec-asap = /fence/clear_prometheus_multiproc /var/tmp/uwsgi_flask_metrics
+exec-asap = /fence/setup_prometheus /var/tmp/prometheus_metrics
 
 # Initialize application in worker processes, not master. This prevents the
 # workers from all trying to open the same database connections at startup.
```
2 changes: 1 addition & 1 deletion docs/additional_documentation/fence_create.md
```diff
@@ -53,7 +53,7 @@ curl --request POST https://FENCE_URL/oauth2/token?grant_type=client_credentials
 
 The optional `--expires-in` parameter allows specifying the number of *days* until this client expires. The recommendation is to rotate credentials with the `client_credentials` grant at least once a year (see [Rotate client credentials](#rotate-client-credentials) section).
 
-NOTE: In Gen3, you can grant specific access to a client the same way you would to a user. See the [user.yaml guide](https://github.com/uc-cdis/fence/blob/master/docs/user.yaml_guide.md) for more details.
+NOTE: In Gen3, you can grant specific access to a client the same way you would to a user. See the [user.yaml guide](./user.yaml_guide.md) for more details.
 
 NOTE: Client credentials tokens are not linked to a user (the claims contain no `sub` or `context.user.name` like other tokens). Some Gen3 endpoints that assume the token is linked to a user, or whose logic require there being a user, do not support them. For an example of how to adapt an endpoint to support client credentials tokens, see [here](https://github.com/uc-cdis/requestor/commit/a5078fae27fa258ac78045cf2bb89cb2104f53cf). For an example of how to explicitly reject client credentials tokens, see [here](https://github.com/uc-cdis/requestor/commit/0f4974c25343d2185c7cdb48dcdeb58f97800672).
```
2 changes: 1 addition & 1 deletion docs/additional_documentation/setup.md
```diff
@@ -154,4 +154,4 @@ saved by the OAuth client to use with
 
 ## Quickstart with Helm
 
 You can now deploy individual services via Helm!
-Please refer to the Helm quickstart guide HERE (https://github.com/uc-cdis/fence/blob/master/docs/quickstart_helm.md)
+Please refer to the Helm quickstart guide [HERE](./quickstart_helm.md)
```
2 changes: 1 addition & 1 deletion docs/additional_documentation/user.yaml_guide.md
````diff
@@ -16,7 +16,7 @@
 
 The `user.yaml` file is one way to get authorization information into Gen3. It is ingested via [Fence's `usersync` script](usersync.md). The format of this file is tightly coupled with the notions of resource, role and policy as defined by Gen3's policy engine, [Arborist](https://github.com/uc-cdis/arborist#arborist).
 
-For Gen3 Data Commons that do not use Arborist or that use the Google Data Access method of [Google Service Account Registration](https://github.com/uc-cdis/fence/blob/master/docs/google_architecture.md#google-account-linking-and-service-account-registration), refer to the [Deprecated format](#deprecated-format) section.
+For Gen3 Data Commons that do not use Arborist or that use the Google Data Access method of [Google Service Account Registration](./google_architecture.md#google-account-linking-and-service-account-registration), refer to the [Deprecated format](#deprecated-format) section.
 
 In a fully deployed Gen3 Commons using [Cloud Automation](https://github.com/uc-cdis/cloud-automation), the `user.yaml` file is usually hosted in S3 and configured via the `global.useryaml_s3path` setting of the Gen3 Data Commons manifest:
 ```
````
20 changes: 10 additions & 10 deletions docs/azure/azure_devops_pipeline.md
````diff
@@ -1,16 +1,16 @@
 # Azure DevOps Build Pipeline
 
-The purpose of this [Azure DevOps Pipeline](../../azure-devops-pipeline.yaml) is to build `fence`, run a test suite, and then push the `fence` container into an [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-get-started-portal).
+The purpose of this [Azure DevOps Pipeline](../../azure-devops-pipeline.yaml) is to build `fence`, run a test suite, and then push the `fence` container into an [Azure Container Registry](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-get-started-portal).
 
 ## Getting Started
 
-If you don't already have access, you can use the free sign up with [Azure Devops](https://docs.microsoft.com/en-us/azure/devops/pipelines/get-started/pipelines-sign-up?view=azure-devops).
+If you don't already have access, you can use the free sign up with [Azure Devops](https://learn.microsoft.com/en-us/azure/devops/pipelines/get-started/pipelines-sign-up?view=azure-devops).
 
-You can also import the [pipeline](../../azure-devops-pipeline.yaml), see these [doc notes](https://docs.microsoft.com/en-us/azure/devops/pipelines/get-started/clone-import-pipeline?view=azure-devops&tabs=yaml#export-and-import-a-pipeline) as a guide.
+You can also import the [pipeline](../../azure-devops-pipeline.yaml), see these [doc notes](https://learn.microsoft.com/en-us/azure/devops/pipelines/get-started/clone-import-pipeline?view=azure-devops&tabs=yaml#export-and-import-a-pipeline) as a guide.
 
 ### Setup Azure Container Registry
 
-[Create a Service Principal](https://docs.microsoft.com/en-us/cli/azure/create-an-azure-service-principal-azure-cli#password-based-authentication) in your Azure Subscription using [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli).
+[Create a Service Principal](https://learn.microsoft.com/en-us/cli/azure/create-an-azure-service-principal-azure-cli#password-based-authentication) in your Azure Subscription using [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli).
 
 First, log into `az` cli:
 
@@ -36,7 +36,7 @@ spTenantId=$(echo $spObject | jq -r ".tenant")
 
 > You will need to have appropriate permissions in the AAD directory. If you don't have access, please work with your Azure Subscription administrator to obtain a Service Principal.
 
-You can also create an **Azure Container Registry** using [azure cli](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-get-started-azure-cli) or the [portal](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-get-started-portal).
+You can also create an **Azure Container Registry** using [azure cli](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-get-started-azure-cli) or the [portal](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-get-started-portal).
 
 You can use the following `az` cli commands in `bash` for reference:
 
@@ -45,7 +45,7 @@ az group create --name myResourceGroup --location eastus
 az acr create --resource-group myResourceGroup --name myContainerRegistry --sku Basic
 ```
 
-Also, make sure that the **Service Principal** has rights to the [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-roles?tabs=azure-cli) to **acrPull** and **acrPush**.
+Also, make sure that the **Service Principal** has rights to the [Azure Container Registry](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-roles?tabs=azure-cli) to **acrPull** and **acrPush**.
 
 ```bash
 acrResourceId="$(az acr show -n myContainerRegistry -g myResourceGroup --query "id" -o tsv)"
@@ -65,7 +65,7 @@ az login --service-principal --username "$spClientId" --password "$spPassword" -
 az acr login --name myContainerRegistry
 ```
 
-You can also verify that this service principal has `ACRPush` and `ACRPull` permission with ACR; see the [getting started with docker guide](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-get-started-docker-cli?tabs=azure-cli) for more details.
+You can also verify that this service principal has `ACRPush` and `ACRPull` permission with ACR; see the [getting started with docker guide](https://learn.microsoft.com/en-us/azure/container-registry/container-registry-get-started-docker-cli?tabs=azure-cli) for more details.
 
 First, pull and tag an image:
 
@@ -99,13 +99,13 @@ az acr repository list --name mycontainerregistry
 
 You can set the variables on your **Azure DevOps pipeline**.
 
-First, make sure you have already [imported your Azure DevOps Pipeline](https://docs.microsoft.com/en-us/azure/devops/pipelines/get-started/clone-import-pipeline?view=azure-devops&tabs=yaml#export-and-import-a-pipeline).
+First, make sure you have already [imported your Azure DevOps Pipeline](https://learn.microsoft.com/en-us/azure/devops/pipelines/get-started/clone-import-pipeline?view=azure-devops&tabs=yaml#export-and-import-a-pipeline).
 
 Click on the pipeline and then click edit, which will let you update the variables in the Azure DevOps pipeline:
 
 ![Click on Variables](azure_devops_pipeline_config_1.png)
 
-Variable Name | Description
+Variable Name | Description
 ------ | ------
 SP_CLIENT_ID | This is your Service Principal Client ID.
 SP_CLIENT_PASS | This is your Service Principal Password. You can override this value when running the Azure DevOps pipeline.
@@ -121,4 +121,4 @@ After updating the variables, be sure to click **save**:
 
 You can run the pipeline to validate the `fence` build and push to ACR.
 
-![Run the pipeline](azure_devops_pipeline_config_3.png)
+![Run the pipeline](azure_devops_pipeline_config_3.png)
````
6 changes: 4 additions & 2 deletions fence/__init__.py
```diff
@@ -27,7 +27,7 @@
 )
 
 from fence.auth import logout, build_redirect_url
-from fence.metrics import metrics
+from fence.metrics import Metrics
 from fence.blueprints.data.indexd import S3IndexedFileLocation
 from fence.blueprints.login.utils import allowed_login_redirects, domain
 from fence.errors import UserError
@@ -97,6 +97,8 @@ def app_init(
     logger.info(
         f"Prometheus metrics are{'' if config['ENABLE_PROMETHEUS_METRICS'] else ' NOT'} enabled."
     )
+    # Initialize the Metrics instance
+    app.metrics = Metrics(enabled=config["ENABLE_PROMETHEUS_METRICS"])
 
 
 def app_sessions(app):
@@ -207,7 +209,7 @@ def metrics_endpoint():
     /!\ There is no authz control on this endpoint!
     In cloud-automation setups, access to this endpoint is blocked at the revproxy level.
     """
-    data, content_type = metrics.get_latest_metrics()
+    data, content_type = flask.current_app.metrics.get_latest_metrics()
     return flask.Response(data, content_type=content_type)
```
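The diff shows only the call sites, so for context, here is a minimal sketch of what an app-scoped `Metrics` wrapper with an `enabled` flag might look like in prometheus_client's multiprocess mode. Everything beyond the `Metrics` name, the `enabled` argument, and `get_latest_metrics` is an assumption; `fence/metrics.py` is the source of truth:

```python
from prometheus_client import (
    CONTENT_TYPE_LATEST,
    CollectorRegistry,
    generate_latest,
    multiprocess,
)


class Metrics:
    """Sketch of an app-scoped metrics wrapper; see fence/metrics.py for the real one."""

    def __init__(self, enabled=True):
        self.enabled = enabled
        self._registry = CollectorRegistry()
        if enabled:
            # Aggregate samples written by every uwsgi worker into the
            # PROMETHEUS_MULTIPROC_DIR prepared by setup_prometheus.
            multiprocess.MultiProcessCollector(self._registry)

    def get_latest_metrics(self):
        if not self.enabled:
            return b"", CONTENT_TYPE_LATEST
        return generate_latest(self._registry), CONTENT_TYPE_LATEST
```

Attaching the instance to `app` instead of keeping a module-level global means tests can build isolated apps with metrics disabled.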
3 changes: 1 addition & 2 deletions fence/blueprints/data/indexd.py
```diff
@@ -49,7 +49,6 @@
 from fence.resources.ga4gh.passports import sync_gen3_users_authz_from_ga4gh_passports
 from fence.resources.audit.utils import enable_audit_logging
 from fence.utils import get_valid_expiration_from_request
-from fence.metrics import metrics
 
 from . import multipart_upload
 from ...models import AssumeRoleCacheAWS, query_for_user, query_for_user_by_id
@@ -210,7 +209,7 @@ def _log_signed_url_data_info(
         f"acl={acl} authz={authz} bucket={bucket} user_sub={user_sub} client_id={client_id}"
     )
 
-    metrics.add_signed_url_event(
+    current_app.metrics.add_signed_url_event(
         action,
         protocol,
         acl,
```
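The argument names in the log line above suggest what `add_signed_url_event` records. A hedged sketch of such a helper, continuing the `Metrics` sketch from the previous file (the metric name and label set are assumptions, not fence's actual definitions):

```python
from prometheus_client import Counter


class Metrics:  # continuation of the sketch above; names are assumptions
    def __init__(self, enabled=True):
        self.enabled = enabled
        self._signed_url_counter = Counter(
            "fence_presigned_url_events",  # assumed metric name
            "Presigned URL requests by action and protocol",
            ["action", "protocol", "acl", "authz", "bucket", "client_id"],
        )

    def add_signed_url_event(
        self, action, protocol, acl, authz=None, bucket=None, client_id=None, **kwargs
    ):
        if not self.enabled:
            return
        # user_sub is logged but intentionally not a label here, to keep
        # the label cardinality bounded.
        self._signed_url_counter.labels(
            action=action,
            protocol=protocol,
            acl=str(acl),
            authz=str(authz),
            bucket=str(bucket),
            client_id=str(client_id),
        ).inc()
```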
3 changes: 1 addition & 2 deletions fence/blueprints/login/base.py
```diff
@@ -7,7 +7,6 @@
 from fence.blueprints.login.redirect import validate_redirect
 from fence.config import config
 from fence.errors import UserError
-from fence.metrics import metrics
 
 logger = get_logger(__name__)
 
@@ -134,7 +133,7 @@ def get(self):
 
     def post_login(self, user=None, token_result=None, **kwargs):
         prepare_login_log(self.idp_name)
-        metrics.add_login_event(
+        flask.current_app.metrics.add_login_event(
             user_sub=flask.g.user.id,
             idp=self.idp_name,
             fence_idp=flask.session.get("fence_idp"),
```
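One consequence of resolving metrics through `flask.current_app` instead of a module-level singleton: callers need an active application context. Request handlers like `post_login` get one automatically; scripts and tests must push one explicitly. A small usage sketch, assuming `app` is a Flask app that `app_init` has already configured:

```python
import flask


def print_current_metrics(app):
    # `app` is assumed to have been configured by fence's app_init(), so
    # app.metrics exists; app_context() makes flask.current_app resolve to it.
    with app.app_context():
        data, content_type = flask.current_app.metrics.get_latest_metrics()
        print(content_type)
        print(data.decode())
```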
2 changes: 1 addition & 1 deletion fence/config-default.yaml
```diff
@@ -727,7 +727,7 @@ INDEXD_PASSWORD: ''
 # - Support Azure Blob Data Access Methods
 # //////////////////////////////////////////////////////////////////////////////////////
 
-# https://docs.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&tabs=azure-portal#view-account-access-keys
+# https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&tabs=azure-portal#view-account-access-keys
 # AZ_BLOB_CREDENTIALS: 'fake connection string'
 AZ_BLOB_CREDENTIALS:
```