Using custom code on a Coiled cluster#
In this tutorial, you’ll learn more advanced techniques for installing custom Python code on an already running cluster, using a custom Docker image to create a Coiled software environment, or running additional commands after installing a package.
Replicate your local Python environment#
Package sync scans your local Python environment and replicates it on the cluster—even local packages and Git dependencies. It’s easier than building a Docker image, plus it launches significantly faster (see Package Synchronization). Package sync is used by default when you create a cluster.
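For example, a cluster created without specifying a software environment or container uses package sync automatically (a minimal sketch; the worker count is an arbitrary example value):

import coiled

# No software environment or Docker image specified, so package sync
# replicates the local Python environment on the cluster.
cluster = coiled.Cluster(n_workers=4)  # n_workers is an arbitrary example value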
Install a local Python module#
If you’re not using package sync but have custom Python packages you would like to use on your Coiled cluster, you can upload a Python module to all of your workers using Dask’s upload_file. This sends a local file to all worker nodes on an already running cluster. This method is particularly helpful when you want to use custom Python functions that haven’t been packaged up and installed in your cluster’s software environment.
import coiled
from dask.distributed import Client

# Create a cluster and connect a client to it
cluster = coiled.Cluster()
client = Client(cluster)

# Send my_script.py to every worker on the running cluster
client.upload_file("my_script.py")
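Once the file has been uploaded, functions defined in it can be used in tasks submitted to the cluster. As a sketch, assuming my_script.py defines a function named process (a placeholder name):

# Assumes my_script.py defines a function named `process` (placeholder)
from my_script import process

future = client.submit(process, 42)  # the argument is an arbitrary example
result = future.result()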
Docker images#
In addition to Package Synchronization, you can also create a Coiled software environment from a Docker image. Coiled supports images on Docker Hub, Amazon ECR, and Google Artifact Registry.
Note
Your ability to use private images stored in Docker Hub or a cloud provider-specific registry is limited by which option you chose when initially setting up your Coiled account (see the Container Registry step for Google Cloud or AWS).
For example, if you chose to store your Coiled software environments in ECR, then you will not be able to use private Docker Hub images. If you would like to be able to use both Docker Hub and ECR, reach out to us for help.
First, you’ll create a Coiled software environment using the container= keyword argument of coiled.create_software_environment(). For example, to create an environment using the publicly available RAPIDS image on Docker Hub:
import coiled

coiled.create_software_environment(
    name="my-docker-env",
    container="rapidsai/rapidsai-core:23.02-cuda11.5-runtime-ubuntu20.04-py3.8",
)
For Amazon ECR or Google Artifact Registry, you can pass the full registry URL; for Amazon ECR, for example, this would resemble 789111821368.dkr.ecr.us-east-2.amazonaws.com/prod/coiled.
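Once the software environment has been created, you can use it on a cluster by passing its name to the software= keyword argument of coiled.Cluster (a minimal sketch; the worker count is an arbitrary example value):

import coiled

cluster = coiled.Cluster(
    software="my-docker-env",  # the software environment created above
    n_workers=4,               # arbitrary example value
)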
Verification#
To test that your container will run successfully on your Coiled cluster, you can run:
docker run --rm <your_container> python -m distributed.cli.dask_spec \
--spec '{"cls":"dask.distributed.Scheduler", "opts":{}}'
If successful, this will start the dask.distributed scheduler (you can use CTRL+C to stop it). For example:
> docker run --rm daskdev/dask:latest python -m distributed.cli.dask_spec \
--spec '{"cls":"dask.distributed.Scheduler", "opts":{}}'
2022-10-06 14:44:43,640 - distributed.scheduler - INFO - State start
2022-10-06 14:44:43,656 - distributed.scheduler - INFO - Clear task state
2022-10-06 14:44:43,658 - distributed.scheduler - INFO - Scheduler at: tcp://172.17.0.2:41089
2022-10-06 14:44:43,658 - distributed.scheduler - INFO - dashboard at: :8787
If not, you will see an error like /opt/conda/bin/python: Error while finding module specification for 'distributed.cli.dask_spec' (ModuleNotFoundError: No module named 'distributed'). For example:
> docker run --rm continuumio/miniconda3:latest python -m distributed.cli.dask_spec \
--spec '{"cls":"dask.distributed.Scheduler", "opts":{}}'
Unable to find image 'continuumio/miniconda3:latest' locally
latest: Pulling from continuumio/miniconda3
dc1f00a5d701: Already exists
a7a9c78d89b2: Already exists
44ac19016d77: Already exists
Digest: sha256:977263e8d1e476972fddab1c75fe050dd3cd17626390e874448bd92721fd659b
Status: Downloaded newer image for continuumio/miniconda3:latest
/opt/conda/bin/python: Error while finding module specification for 'distributed.cli.dask_spec' (ModuleNotFoundError: No module named 'distributed')
If the dask.distributed scheduler fails to start, it’s good to check that distributed is installed and that the environment it is installed in has been activated. If you’re having trouble running your Docker container on your Coiled cluster, feel free to reach out to us for help.
Install pip-installable packages#
If you have a package that is pip-installable but not yet publicly available on PyPI or conda-forge, you can use Dask’s PipInstall Worker Plugin to pip install a set of packages. This is particularly useful for installing modules that are still in development.
You can install a package from a public GitHub repository:
from dask.distributed import PipInstall

# Install a pip-installable package directly from its Git repository
plugin = PipInstall(packages=["git+<github url>"])
client.register_worker_plugin(plugin, name="<dependency name>")
If you want to install from a private repository, you need to have a GitHub token set in your account, either by having signed up with GitHub or by adding your GitHub token to your profile. GitHub tokens are stored with Coiled and then used on the machine that’s building the software environment; the token is not saved in the software environment.
from dask.distributed import PipInstall

# GITHUB_TOKEN is used to authenticate against the private repository
plugin = PipInstall(packages=["git+https://{GITHUB_TOKEN}@github.com/<repo>"])
client.register_worker_plugin(plugin, name="<dependency name>")
Note
Using the name= argument allows you to call PipInstall more than once; otherwise you might see a message from workers like {'tls://10.4.1.170:38403': {'status': 'repeat'}}.
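To check that the install succeeded, you can run a small function on every worker with client.run; the package name below is a placeholder:

import importlib.metadata

def check_install():
    # Runs on each worker; raises PackageNotFoundError if the package is missing
    return importlib.metadata.version("my-package")  # placeholder package name

client.run(check_install)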
Upload a local directory#
Similar to the PipInstall plugin, you can upload a local directory to your cluster by using the UploadDirectory Nanny Plugin.
You can upload a local directory from your machine to the cluster using:
from distributed.diagnostics.plugin import UploadDirectory
client.register_worker_plugin(UploadDirectory("/path/to/directory"), nanny=True)
It’s worth noting that UploadDirectory will not install anything on its own. Ideally, you would package the code and use the PipInstall Worker Plugin mentioned above directly. However, if this is not possible, you can write your own plugin for uploading and installation:
from distributed.diagnostics.plugin import UploadDirectory


class InstallModulePlugin(UploadDirectory):
    """Use this plugin to upload a directory and install it on the workers."""

    def __init__(self, dir_path, module):
        """Initialize the plugin.

        Arguments
        ---------
        dir_path: str, path to the directory you want to upload
        module: directory name
        """
        super().__init__(dir_path, update_path=True)
        self.module = module

    async def setup(self, nanny):
        await super().setup(nanny)
        import os
        import subprocess

        path_to_module = os.path.join(nanny.local_directory, self.module)
        # or whatever shell command installs your package
        subprocess.call(["pip", "install", "-e", path_to_module])


plugin = InstallModulePlugin("path_to_directory", "directory_name")
client.register_worker_plugin(plugin, nanny=True)
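As with PipInstall, you can verify the editable install by importing the module on each worker (directory_name is the placeholder module name used above):

def check_module():
    import directory_name  # placeholder: the module installed by the plugin

    return directory_name.__file__

client.run(check_module)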