Merge pull request #55 from lsst-sqre/tickets/DM-38798
DM-38798: Debugging for USDF deployment
jonathansick authored Jun 23, 2023
2 parents df9844b + 47ecee8 commit fda92e3
Showing 16 changed files with 877 additions and 589 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/docs.yaml
@@ -31,7 +31,7 @@ jobs:
- name: Run tox
uses: lsst-sqre/run-tox@v1
with:
- python-version: ${{ matrix.python }}
+ python-version: "3.11"
tox-envs: "docs"

- name: Upload documentation
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
docs/_static/openapi.json

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,16 @@
# Change log

## 0.7.1 (2023-07-23)

Fixes:

- Add additional logging of JupyterLab spawning failures in workers.

Other changes:

- Added documentation for configuration environment variables.
- Added OpenAPI docs, rendered by Redoc, to the Sphinx documentation site.

## 0.7.0 (2023-05-22)

- The JupyterHub service's URL path prefix is now configurable with the `NOTEBURST_JUPYTERHUB_PATH_PREFIX` environment variable. The default is `/nb/`, which is the existing value.
2 changes: 1 addition & 1 deletion Dockerfile
@@ -14,7 +14,7 @@
# - Runs a non-root user.
# - Sets up the entrypoint and port.

- FROM python:3.11.3-slim-bullseye as base-image
+ FROM python:3.11.4-slim-bullseye as base-image

# Update system packages
COPY scripts/install-base-packages.sh .
5 changes: 5 additions & 0 deletions docs/api.rst
@@ -0,0 +1,5 @@
########
REST API
########

This is a stub page for the API.
7 changes: 7 additions & 0 deletions docs/documenteer.toml
@@ -5,6 +5,13 @@ copyright = "2021-2023 Association of Universities for Research in Astronomy, In
[project.python]
package = "noteburst"

[project.openapi]
openapi_path = "_static/openapi.json"
doc_path = "api"

[project.openapi.generator]
function = "noteburst.main:create_openapi"

[sphinx]
rst_epilog_file = "_rst_epilog.rst"

12 changes: 12 additions & 0 deletions docs/index.rst
@@ -1,10 +1,22 @@
:html_theme.sidebar_secondary.remove:

#########
Noteburst
#########

Noteburst is a Rubin Science Platform service that coordinates running Jupyter Notebooks in a JupyterLab context.
Noteburst can be used by CI and monitoring services, as well as by applications that need to compute and render Jupyter Notebooks programmatically.

To learn more about Noteburst's design, see :sqr:`065`.
Noteburst is designed to be deployed with Phalanx. `See Noteburst's operations documentation in Phalanx. <https://phalanx.lsst.io/applications/noteburst/index.html>`__

Services that use Noteburst:

- `Times Square <https://github.com/lsst-sqre/times-square>`__ (Parameterized Jupyter Notebook publishing application)

.. toctree::
   :hidden:

   user-guide/index
   api
   changelog
61 changes: 61 additions & 0 deletions docs/user-guide/configuring-workers.rst
@@ -0,0 +1,61 @@
##########################################
Configuring Noteburst's JupyterLab workers
##########################################

Noteburst works by operating a cluster of workers that each manage their own JupyterLab pod.
This page describes how these workers are configured.

Background: Kubernetes architecture
===================================

In Kubernetes, the workers are deployed as a Kubernetes Deployment.
A Deployment enables multiple Noteburst worker instances to run at the same time.
These workers share the same configuration (typically through a Kubernetes ConfigMap).
This means that individual workers can't be assigned specific RSP/JupyterLab user accounts.
Noteburst works around this by configuring the deployment of Noteburst workers with a pool of identities.
When a worker starts up, it picks an available identity from the pool and uses that identity to run the JupyterLab pod.
See the next section for details.

.. _worker-identities-yaml:

Worker identities
=================

Each Noteburst worker pod runs a JupyterLab server under a specific, unique identity.
These identities are bot accounts.
In some RSP environments, such as the USDF, these identities need to be associated with actual user accounts.

When a Noteburst worker pod starts up, it picks an available identity from the pool.
These identities are configured in a file, the path of which is specified with :envvar:`NOTEBURST_WORKER_IDENTITIES_PATH`.
This YAML-formatted file looks like this:

.. code-block:: yaml
   :caption: identities.yaml

   - username: "bot-noteburst00"
   - username: "bot-noteburst01"
   - username: "bot-noteburst02"
   - username: "bot-noteburst03"
   - username: "bot-noteburst04"
   - username: "bot-noteburst05"

The YAML file consists of a list of identities.
At a minimum, an identity requires a ``username`` field.

In some environments where Gafaelfawr cannot provide a uid for a user, a ``uid`` must be specified:

.. code-block:: yaml
   :caption: identities.yaml

   - username: "bot-noteburst00"
     uid: 90000
   - username: "bot-noteburst01"
     uid: 90001
   - username: "bot-noteburst02"
     uid: 90002
   - username: "bot-noteburst03"
     uid: 90003
   - username: "bot-noteburst04"
     uid: 90004
   - username: "bot-noteburst05"
     uid: 90005
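
The identity pool described above can be illustrated with a short Python sketch. This is not Noteburst's implementation: the ``Identity`` class and the ``parse_identities`` and ``claim_identity`` functions are hypothetical names, and an in-memory set stands in for whatever shared lock the real workers use to avoid claiming the same identity twice. The sketch assumes the YAML file has already been parsed into a list of dictionaries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Identity:
    """One entry from the identities file (hypothetical model)."""

    username: str
    # Only needed in environments where Gafaelfawr cannot provide a uid.
    uid: Optional[int] = None


def parse_identities(entries: list[dict]) -> list[Identity]:
    """Convert the parsed YAML list into Identity objects."""
    return [Identity(username=e["username"], uid=e.get("uid")) for e in entries]


def claim_identity(pool: list[Identity], claimed: set[str]) -> Identity:
    """Return the first identity not yet claimed and mark it as claimed."""
    for identity in pool:
        if identity.username not in claimed:
            claimed.add(identity.username)
            return identity
    raise RuntimeError("No unclaimed worker identities are available")


# The data a YAML parser would produce for the first two entries above.
entries = [
    {"username": "bot-noteburst00", "uid": 90000},
    {"username": "bot-noteburst01", "uid": 90001},
]
pool = parse_identities(entries)
claimed: set[str] = set()  # assumption: stand-in for a shared lock store
first = claim_identity(pool, claimed)
second = claim_identity(pool, claimed)
```

Because every worker reads the same file, the pool needs at least as many identities as there are worker replicas; in this sketch, a third call to ``claim_identity`` would raise ``RuntimeError``.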
92 changes: 92 additions & 0 deletions docs/user-guide/environment-variables.rst
@@ -0,0 +1,92 @@
#####################
Environment variables
#####################

Noteburst uses environment variables for configuration.
In practice, these variables are typically set as Helm values and 1Password/Vault secrets that are injected into the container as environment variables.
See the `Phalanx documentation for Noteburst <https://phalanx.lsst.io/applications/noteburst/index.html>`__ for more information on the Phalanx-specific configurations.
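
As a rough illustration of environment-based configuration, the following Python sketch reads three of the variables documented on this page and applies their documented defaults. The ``Settings`` class and ``load_settings`` function are hypothetical and are not Noteburst's actual configuration code; only the variable names and default values come from this page.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """A small, hypothetical subset of the application configuration."""

    name: str
    path_prefix: str
    worker_job_timeout: int  # seconds


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build settings from environment variables, using documented defaults."""
    return Settings(
        name=env.get("SAFIR_NAME", "Noteburst"),
        path_prefix=env.get("NOTEBURST_PATH_PREFIX", "/noteburst"),
        # Environment values are strings, so integers must be converted.
        worker_job_timeout=int(env.get("NOTEBURST_WORKER_JOB_TIMEOUT", "3000")),
    )


# Passing a dictionary instead of os.environ keeps the example reproducible.
settings = load_settings({"NOTEBURST_WORKER_JOB_TIMEOUT": "600"})
```

Accepting the environment as a mapping argument is a design choice for testability: production code would call ``load_settings()`` with no arguments.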

.. envvar:: SAFIR_NAME

(string, default: "Noteburst") The name of the application.
This is used in the metadata endpoint.

.. envvar:: SAFIR_PROFILE

(string enum: "production" [default], "development") The application run profile.
Use "production" to enable JSON-structured logging.

.. envvar:: SAFIR_LOG_LEVEL

(string enum: "debug", "info" [default], "warning", "error", "critical") The application log level.

.. envvar:: NOTEBURST_PATH_PREFIX

(string, default: "/noteburst") The path prefix for the Noteburst application.
This is used to configure the application's URL.

.. envvar:: NOTEBURST_ENVIRONMENT_URL

(string) The base URL of the Rubin Science Platform environment.
This is used for creating URLs to services, such as JupyterHub.

.. envvar:: NOTEBURST_JUPYTERHUB_PATH_PREFIX

(string, default: "/nb") The path prefix for the JupyterHub application.

.. envvar:: NOTEBURST_NUBLADO_CONTROLLER_PATH_PREFIX

(string, default: "/nublado") The path prefix for the Nublado controller service.

.. envvar:: NOTEBURST_GAFAELFAWR_TOKEN

(secret string) This token is used to make an admin API call to Gafaelfawr to get a token for the user.

.. envvar:: NOTEBURST_REDIS_URL

(string) The URL of the Redis server, used by the worker queue.

.. envvar:: NOTEBURST_ARQ_MODE

(string enum: "production" [default], "test") The Arq worker mode.
The production mode uses the Redis server, while the test mode mocks queue interactions for testing the application.

.. envvar:: NOTEBURST_WORKER_IDENTITIES_PATH

(string) The path to the Science Platform worker identities file.
See :ref:`worker-identities-yaml`.

.. envvar:: NOTEBURST_WORKER_QUEUE_NAME

(string) The name of the arq queue that the workers process.

.. envvar:: NOTEBURST_WORKER_LOCK_REDIS_URL

(Redis URL) The URL of the Redis server, used by the worker lock.

.. envvar:: NOTEBURST_WORKER_JOB_TIMEOUT

(integer, default: 3000) The timeout for a worker job, in seconds.

.. envvar:: NOTEBURST_WORKER_TOKEN_LIFETIME

(integer, default: 2419200) The worker auth token lifetime, in seconds.

.. envvar:: NOTEBURST_WORKER_TOKEN_SCOPES

(string, default: "exec:notebook") The worker (nublado pod) token scopes, as a comma-separated string.

.. envvar:: NOTEBURST_WORKER_IMAGE_SELECTOR

(string enum: "recommended" [default], "weekly", "reference") The method for selecting a Jupyter image to run.
For "reference" see :envvar:`NOTEBURST_WORKER_IMAGE_REFERENCE`.

.. envvar:: NOTEBURST_WORKER_IMAGE_REFERENCE

(string) The tag of the Jupyter image to run. This is used when :envvar:`NOTEBURST_WORKER_IMAGE_SELECTOR` is set to "reference".

.. envvar:: NOTEBURST_WORKER_KEEPALIVE

(string enum: "normal" [default], "fast", "disabled") The worker keep-alive mode.
The normal mode exercises the JupyterLab pod every 5 minutes. The fast mode exercises the pod every 30 seconds.
The disabled mode does not exercise the pod.
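
The three keep-alive modes can be summarized as a lookup from mode to interval. The Python sketch below is illustrative only: the ``KEEPALIVE_INTERVALS`` table and ``keepalive_interval`` function are hypothetical names, while the mode names and intervals are the ones documented above.

```python
import os
from typing import Mapping, Optional

# Seconds between keep-alive exercises; None means the pod is never exercised.
KEEPALIVE_INTERVALS: dict[str, Optional[int]] = {
    "normal": 5 * 60,
    "fast": 30,
    "disabled": None,
}


def keepalive_interval(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Return the keep-alive interval for the configured mode."""
    mode = env.get("NOTEBURST_WORKER_KEEPALIVE", "normal")
    if mode not in KEEPALIVE_INTERVALS:
        raise ValueError(f"Unknown keep-alive mode: {mode!r}")
    return KEEPALIVE_INTERVALS[mode]


default_interval = keepalive_interval({})
fast_interval = keepalive_interval({"NOTEBURST_WORKER_KEEPALIVE": "fast"})
```

Rejecting unknown modes up front, rather than silently falling back to the default, makes a misspelled Helm value visible at startup.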
10 changes: 10 additions & 0 deletions docs/user-guide/index.rst
@@ -0,0 +1,10 @@
##########
User guide
##########

.. toctree::
   :maxdepth: 2
   :caption: Deployment

   configuring-workers
   environment-variables
2 changes: 1 addition & 1 deletion requirements/dev.in
@@ -20,4 +20,4 @@ respx
types-PyYAML

# Documentation
- documenteer[guide]
+ documenteer[guide]==1.0.0a1