Building DMOD's Docker Images

Robert Bartel edited this page Oct 13, 2023 · 1 revision

TL;DR

# Step 0a: Ensure registry details are configured in .env, even if not actually running a Docker image registry

# Step 0b: If using the included Docker image registry stack, make sure it is running
./scripts/control_stack.sh --deploy-config docker-registry.yml dev_registry_stack start

# Step 1: Build image that prepares DMOD Python packages and makes them available to subsequent images
./scripts/control_stack.sh py-sources build push

# Step 2: Build service and worker images
./scripts/control_stack.sh main build push

Background

One of the primary focuses of DMOD is automating compute infrastructure. While DMOD is designed to be extensible and to support other platforms in the future, it currently relies on Docker containerization and Docker Swarm orchestration as the foundation of its infrastructure capabilities. DMOD makes considerable use of Docker stacks for configuring build and execution processes.

DMOD does not include pre-built image files, nor does OWP provide or publish DMOD Docker images at this time. However, DMOD includes the necessary Dockerfiles, Docker stack configurations, and customized scripts to build these images relatively easily. This page walks through the process of building all the necessary custom Docker images for a deployment.

The control_stack.sh Script

The script provided at ./scripts/control_stack.sh facilitates many Docker-stack-related actions. Here, we concentrate on those related to image building and publishing/pushing, though it is also used to start/stop the different Docker stacks. See more details using its "help" options:

# Simplest help output:
./scripts/control_stack.sh -h

# More descriptive help output:
./scripts/control_stack.sh -hh

# Descriptive help output plus additional details:
./scripts/control_stack.sh -hhh

Private Image Registry Details

DMOD requires that a Docker image registry be configured for the custom images. The registry must actually be running for multi-node deployments, though not for single-node deployments. The registry details are set in the local environment config, which is discussed further in the INSTALL.md document.

The example.env file explains the necessary values and provides reasonable defaults if it is used as the basis for the local env config.
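
As a quick sanity check before building, something like the following sketch can confirm that the registry-related values are present in the local env file. The helper name check_env_vars and the variable name EXAMPLE_REGISTRY_VAR are hypothetical and not part of DMOD; substitute the actual variable names documented in example.env.

```shell
# Sketch: report which of the named variables are missing or empty in an env file.
# Usage: check_env_vars FILE VAR [VAR...]
# Returns 0 if every variable has a non-empty value, 1 otherwise.
check_env_vars() {
    env_file="$1"
    shift
    missing=0
    for name in "$@"; do
        # Match "NAME=<at least one character>" at the start of a line
        if ! grep -Eq "^${name}=.+" "$env_file"; then
            echo "missing or empty: ${name}"
            missing=1
        fi
    done
    return "$missing"
}

# Example (EXAMPLE_REGISTRY_VAR is a placeholder, not a real DMOD setting):
# check_env_vars .env EXAMPLE_REGISTRY_VAR
```

This only checks for simple NAME=value lines; it does not evaluate the file or handle values defined through shell expansion.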

Starting the Provided Registry Stack

While you are free to run your own registry separately, DMOD also provides a stack config to run a simple private registry as part of the DMOD deployment itself. That can be started with the command:

# Substitute "start" with "stop" to stop the registry stack
# Note also that the registry stack needs to explicitly specify the deploy config because of the filename

./scripts/control_stack.sh --deploy-config docker-registry.yml dev_registry_stack start 
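
To confirm the registry is reachable before pushing, one option is to query the base endpoint of the Docker Registry HTTP API V2, as in the sketch below. The function names are hypothetical, and the address 127.0.0.1:5000 and the use of plain HTTP are assumptions; use the host, port, and scheme that match your local env config.

```shell
# Sketch: check that a Docker registry answers on its V2 API base endpoint.
# The default address below is an assumption; pass your own as the first argument.

registry_base_url() {
    # Build the V2 base URL for a "host:port" registry address (plain HTTP assumed)
    echo "http://${1:-127.0.0.1:5000}/v2/"
}

registry_is_up() {
    # Succeeds only if the endpoint responds with a non-error HTTP status
    curl --silent --fail --max-time 5 --output /dev/null "$(registry_base_url "$1")"
}

# Example:
# registry_is_up 127.0.0.1:5000 && echo "registry reachable"
```

A registry that requires authentication responds to this endpoint with an error status, so this check would report it as unavailable.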

Building the py-sources Stack Images

DMOD organizes most of the infrastructure into a main Docker stack, which we address shortly. At the time of this writing, there is a prerequisite stack called py-sources that must be built first. This builds DMOD's internal Python packages and makes them accessible to other image builds.

This can be done using DMOD's custom ./scripts/control_stack.sh tool. If you have a running registry to push to, then you can also run the push action, either as part of the same command or on its own.

# To build:
./scripts/control_stack.sh py-sources build

# To push:
./scripts/control_stack.sh py-sources push

# To build and push in one command:
./scripts/control_stack.sh py-sources build push

Aside: why is there py-sources?

Previously, DMOD's custom Docker images were based on Alpine Linux to attempt to reduce the size of the images. A side effect of this was that some more complex transitive dependencies - e.g., numpy, scikit-learn, pandas - required compilation from source, which was a rather lengthy process. Things were organized to have this in a separate stack to be able to provide flexibility for other parts of the build process while also minimizing how often these steps would need to be re-run.

DMOD has evolved, and some of this is no longer applicable, but at the time of this writing, the structure is still in place.

Building the main Stack Images

The main stack contains the image configurations for all the DMOD internal services, as well as the images for the different worker nodes that get started as part of jobs: e.g., the ngen image for NextGen model execution job workers.

These are built the same way as demonstrated above, using the control_stack.sh script:

# To build:
./scripts/control_stack.sh main build

# To push:
./scripts/control_stack.sh main push

# To build and push in one command:
./scripts/control_stack.sh main build push
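
Because the main stack images depend on py-sources, the two build steps can be chained so that a failure in the first stops the second. The following is a minimal sketch of that idea; the function build_and_push_all and the CONTROL_STACK override variable are hypothetical conveniences, not part of DMOD, and it assumes a registry is running so that push can succeed.

```shell
# Path to the script; override (e.g. CONTROL_STACK=echo) for a dry run.
CONTROL_STACK="${CONTROL_STACK:-./scripts/control_stack.sh}"

build_and_push_all() {
    # py-sources first, since the later image builds depend on it;
    # stop immediately if it fails
    "$CONTROL_STACK" py-sources build push || return 1
    # Then the service and worker images
    "$CONTROL_STACK" main build push
}

# Example, run from the repository root:
# build_and_push_all
```

Setting CONTROL_STACK=echo before calling the function prints the commands that would run instead of executing them.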

Optional: Specifying Images to Build

To build only a particular subset of images, add the --build-args flag followed by a quoted list of service names (i.e., names from the entries in the main stack [docker-build.yml](../blob/master/docker/main/docker-build.yml) file). For example, this will build only the images for the scheduler-service and the NextGen job worker:

# Note the quoted string

./scripts/control_stack.sh --build-args "scheduler-service ngen" main build

Note that this only works for the build action; e.g., if the push action is used, all main stack images will be pushed.
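
Since the service names must reach the script as one quoted string, a small wrapper can take them as separate arguments and join them. This is a sketch only; build_main_subset and the CONTROL_STACK override variable are hypothetical names, not part of DMOD.

```shell
# Path to the script; override for testing or a dry run.
CONTROL_STACK="${CONTROL_STACK:-./scripts/control_stack.sh}"

build_main_subset() {
    # "$*" joins all arguments with spaces into the single string
    # that --build-args expects
    "$CONTROL_STACK" --build-args "$*" main build
}

# Example: build only the scheduler service and NextGen worker images
# build_main_subset scheduler-service ngen
```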