Software Environments#
It is important that the packages and versions installed on your local machine are the same as in the cloud computing environment where Dask clusters are created. Otherwise, you could get errors, or even incorrect results.
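As a rough illustration of the kind of mismatch described above, the following sketch compares two package-to-version mappings. It is not part of Coiled's API: the helper names `local_versions` and `find_mismatches` are hypothetical, and the `remote` mapping is hard-coded here on the assumption that you have collected the cluster-side versions by some other means.

```python
from importlib import metadata


def local_versions(packages):
    """Return {package: installed version, or None if not installed locally}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def find_mismatches(local, remote):
    """Return {package: (local_version, remote_version)} wherever the two differ."""
    mismatches = {}
    for name in sorted(set(local) | set(remote)):
        if local.get(name) != remote.get(name):
            mismatches[name] = (local.get(name), remote.get(name))
    return mismatches


# Hypothetical version maps, hard-coded purely for illustration
local = {"dask": "2023.2.0", "xarray": "2023.2.0", "numba": None}
remote = {"dask": "2023.2.0", "xarray": "2023.1.0", "numba": "0.56.4"}

for package, (here, there) in find_mismatches(local, remote).items():
    print(f"{package}: local={here} cluster={there}")
```

In practice you would likely build the local side with something like `local_versions(["dask", "distributed", "xarray"])` rather than a literal dictionary.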
We recommend using Package Synchronization, which scans your local Python environment and replicates it on the cluster. This may not work for every use case, though. In that case, you can instead recreate your local environment in the cloud using coiled.create_software_environment().
Creating software environments#
You can create your Python environment by installing dependencies from conda, pip, or by specifying a pre-built Docker image.
Conda#
You can create a conda environment using the conda keyword argument. You can pass a list of dependencies, for example:
import coiled

coiled.create_software_environment(
    name="my-conda-env", conda=["python=3.9", "dask", "coiled", "xarray"]
)
Note
When creating an environment with conda, it’s important to specify the Python version. Otherwise, the highest supported version will be used.
You can also pass a dictionary of channels and dependencies:
coiled.create_software_environment(
    name="my-conda-env",
    conda={
        "channels": ["conda-forge", "defaults"],
        "dependencies": ["python=3.9", "dask", "coiled", "xarray"],
    },
)
Or you can pass a local environment.yml file:
coiled.create_software_environment(
    name="my-conda-env",
    conda="environment.yml",
)
where environment.yml is a local file that might look something like the following (see the conda documentation for how to export your environment to an environment.yml file):
# environment.yml
channels:
- conda-forge
- defaults
dependencies:
- python==3.9
- dask==2023.2.0
- bokeh==2.4.3
- numba
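The note above about pinning the Python version can be sketched as a small helper that adds a pin matching the local interpreter when the dependency list has none. The function name `with_python_pin` is hypothetical, and the check is deliberately simple: it only recognizes entries that start with `python=`.

```python
import sys


def with_python_pin(packages):
    """Return the dependency list with a Python pin, adding one if it is missing."""
    # "python=3.9" and "python==3.9" both count as an existing pin;
    # names such as "python-dateutil" do not.
    if any(p.startswith("python=") for p in packages):
        return list(packages)
    pin = f"python={sys.version_info.major}.{sys.version_info.minor}"
    return [pin, *packages]


conda_spec = {
    "channels": ["conda-forge", "defaults"],
    "dependencies": with_python_pin(["dask", "coiled", "xarray"]),
}
print(conda_spec)
```

A dictionary built this way has the same shape as the channels and dependencies dictionary passed to the conda keyword argument earlier in this section.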
Pip#
Similarly, you can use the pip keyword argument to install dependencies using pip. You can pass a list of dependencies, for example:
coiled.create_software_environment(
    name="my-pip-env",
    pip=["dask[complete]", "coiled", "xarray"],
)
Note
When creating an environment with pip, your Python version will be detected automatically and used in your cluster.
Note
Pip does not automatically install distributed along with dask. Specify dask with dask[complete] or dask[distributed] to ensure distributed is installed.
Or you can pass a local requirements.txt file (see the pip documentation for more information on requirements files):
coiled.create_software_environment(
    name="my-pip-env",
    pip="requirements.txt",
)
where requirements.txt might look something like:
bokeh==2.4.3
click==8.1.3
cloudpickle==2.2.1
dask==2023.2.0
distributed==2023.2.0
fsspec==2023.1.0
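Since pip does not pull in distributed automatically, it can be worth checking a requirements list before building the environment. The sketch below is a simplified, hypothetical checker (`requirement_names` and `installs_distributed` are not Coiled or pip functions). It only understands plain `name[extras]<specifier>` lines and skips comments and option lines such as `-r other.txt`.

```python
import re

# Matches a package name and an optional [extras] group at the start of a line
_REQUIREMENT = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[([^\]]*)\])?")


def requirement_names(lines):
    """Yield (name, extras) for each simple requirement line."""
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = _REQUIREMENT.match(line)
        if match:
            name = match.group(1).lower()
            raw_extras = (match.group(2) or "").split(",")
            extras = {e.strip().lower() for e in raw_extras if e.strip()}
            yield name, extras


def installs_distributed(lines):
    """Return True if the requirements would install distributed."""
    for name, extras in requirement_names(lines):
        if name == "distributed":
            return True
        if name == "dask" and extras & {"distributed", "complete"}:
            return True
    return False


requirements = ["bokeh==2.4.3", "dask==2023.2.0", "distributed==2023.2.0"]
print(installs_distributed(requirements))
```

To check a file instead of a list, you could pass the result of `open("requirements.txt").read().splitlines()`.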
Private Repositories#
To use pip packages hosted in private repositories you must add a personal access token to your Coiled profile, which allows Coiled to pip install these packages on your behalf. To create a GitHub personal access token, follow the steps in GitHub’s guide. After you’ve created your access token, add it to your profile page at https://cloud.coiled.io/profile.
When specifying a pip package from a private repository, use the format:
git+https://GIT_TOKEN@github.com/<github_account>/<github_repo>.git
For example:
coiled.create_software_environment(
    name="my-pip-env",
    pip=[
        "dask[complete]",
        "git+https://GIT_TOKEN@github.com/coiled/private_package.git",
    ],
)
Attention
For security reasons, you should not use your actual personal access token when specifying pip requirements. Instead, use the literal string GIT_TOKEN, which acts as a placeholder for your personal access token. Your actual access token will be populated when Coiled builds the corresponding software environment.
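One way to keep real tokens out of requirement strings is to build the URLs with a helper and run a sanity check before submitting them. The following is only a sketch: `private_requirement` and `looks_like_real_token` are hypothetical helpers, and the check is a heuristic that assumes GitHub personal access tokens begin with `ghp_` or `github_pat_`.

```python
PLACEHOLDER = "GIT_TOKEN"
# Assumed prefixes of GitHub personal access tokens (classic and fine-grained)
TOKEN_PREFIXES = ("ghp_", "github_pat_")


def private_requirement(account, repo):
    """Build a pip requirement for a private GitHub repo using the placeholder."""
    return f"git+https://{PLACEHOLDER}@github.com/{account}/{repo}.git"


def looks_like_real_token(requirement):
    """Heuristic: True if the credential part of the URL resembles a real token."""
    if "://" not in requirement or "@" not in requirement:
        return False
    credential = requirement.split("://", 1)[1].split("@", 1)[0]
    return credential != PLACEHOLDER and credential.startswith(TOKEN_PREFIXES)


pip_requirements = ["dask[complete]", private_requirement("coiled", "private_package")]

# Refuse to continue if any entry appears to embed an actual token
leaked = [r for r in pip_requirements if looks_like_real_token(r)]
if leaked:
    raise ValueError("Replace the embedded token with the literal GIT_TOKEN")
print(pip_requirements)
```

The resulting list has the same form as the value passed to the pip keyword argument in the example above.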
Docker#
You can also build environments based on Docker images using the container keyword argument, for example:
coiled.create_software_environment(
    name="my-docker-env",
    container="rapidsai/rapidsai-core:23.02-cuda11.5-runtime-ubuntu20.04-py3.8",
)
This builds a software environment named “my-docker-env” from the RAPIDS image identified by the tag above. See Docker images for more details.