Docker

In addition to Automatic Package Synchronization, you can also create a Coiled Python environment from a Docker image. Coiled supports images hosted on Docker Hub, Amazon ECR, or Google Artifact Registry.

Note

Your ability to use private images stored in Docker Hub or a cloud provider-specific registry is limited by which option you chose when initially setting up your Coiled account (see the Container Registry step for Google Cloud or AWS).

For example, if you chose to store your Coiled software environments in ECR, then you will not be able to use private Docker Hub images. If you would like to use both Docker Hub and ECR, reach out to us for help.

First, you’ll create a Coiled Python environment using the container= keyword argument of coiled.create_software_environment().

coiled.create_software_environment(
    name="dask-latest",
    container="daskdev/dask:latest",
)

For registries other than Docker Hub—such as Amazon ECR or Google Artifact Registry—you’ll need to specify the full registry URL. For example, to create an environment using the publicly available RAPIDS image in the NVIDIA container registry:

coiled.create_software_environment(
    name="my-docker-env",
    container="nvcr.io/nvidia/rapidsai/base:23.08-cuda11.8-py3.10",
)

The URL for an image in Amazon ECR, for example, would resemble 789111821368.dkr.ecr.us-east-2.amazonaws.com/prod/coiled.
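As a sketch, creating an environment from an ECR-hosted image looks the same as the examples above, just with the full registry URL (the account ID, region, and repository below come from the illustrative URL above, and the `:latest` tag is added for illustration; substitute your own):

```python
import coiled

# Create a software environment from a private Amazon ECR image.
# The registry URL is illustrative -- replace the account ID, region,
# repository, and tag with your own.
coiled.create_software_environment(
    name="my-ecr-env",
    container="789111821368.dkr.ecr.us-east-2.amazonaws.com/prod/coiled:latest",
)
```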

When using a custom container image, the image must be built with all of the software you need on the cluster. The pip and conda keyword arguments of coiled.create_software_environment() cannot be used to install additional packages.

Minimally, the container image must have dask and distributed installed.

By default, Coiled uses the container's entrypoint when attempting to run python. On some containers, the entrypoint prevents running commands; for example, it may start a Jupyter server instead. In such cases, you can ignore the entrypoint by setting use_entrypoint=False when creating your Coiled software environment.
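For instance, a call that skips the image's entrypoint might look like the following (the environment name and image name here are placeholders, not values from a real registry):

```python
import coiled

# Ignore the container's entrypoint (e.g. one that launches a Jupyter
# server) so Coiled can invoke python directly. The image name below
# is a placeholder -- substitute your own image.
coiled.create_software_environment(
    name="my-docker-env-no-entrypoint",
    container="my-registry/my-image:latest",
    use_entrypoint=False,
)
```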

Verification

To test that your container will run successfully on your Coiled cluster, you can run:

docker run --rm <your_container> python -m distributed.cli.dask_spec \
  --spec '{"cls":"dask.distributed.Scheduler", "opts":{}}'

If successful, this will start the dask.distributed scheduler (you can use CTRL+C to stop it). For example:

> docker run --rm daskdev/dask:latest python -m distributed.cli.dask_spec \
    --spec '{"cls":"dask.distributed.Scheduler", "opts":{}}'

2022-10-06 14:44:43,640 - distributed.scheduler - INFO - State start
2022-10-06 14:44:43,656 - distributed.scheduler - INFO - Clear task state
2022-10-06 14:44:43,658 - distributed.scheduler - INFO -   Scheduler at:    tcp://172.17.0.2:41089
2022-10-06 14:44:43,658 - distributed.scheduler - INFO -   dashboard at:                     :8787

If not, you will see an error like /opt/conda/bin/python: Error while finding module specification for 'distributed.cli.dask_spec' (ModuleNotFoundError: No module named 'distributed'). For example:

> docker run --rm continuumio/miniconda3:latest python -m distributed.cli.dask_spec \
    --spec '{"cls":"dask.distributed.Scheduler", "opts":{}}'

Unable to find image 'continuumio/miniconda3:latest' locally
latest: Pulling from continuumio/miniconda3
dc1f00a5d701: Already exists
a7a9c78d89b2: Already exists
44ac19016d77: Already exists
Digest: sha256:977263e8d1e476972fddab1c75fe050dd3cd17626390e874448bd92721fd659b
Status: Downloaded newer image for continuumio/miniconda3:latest
/opt/conda/bin/python: Error while finding module specification for 'distributed.cli.dask_spec' (ModuleNotFoundError: No module named 'distributed')

If the dask.distributed scheduler fails to start, check that distributed is installed and that the environment containing it has been activated.

You can also try bypassing the default container entrypoint by running:

docker run --rm --entrypoint "" <your_container> python -m distributed.cli.dask_spec \
  --spec '{"cls":"dask.distributed.Scheduler", "opts":{}}'

If it’s necessary to bypass the entrypoint, you can set use_entrypoint=False when creating your Coiled software environment.

If you’re having trouble running your Docker container on your Coiled cluster, feel free to reach out to us for help.