Advanced Usage

Local packages and Git dependencies

Package sync works with:

  • Locally installed packages, like those installed via pip install -e <some-directory>

  • Python files in your working directory. Any .py files in your working directory will be synced to your remote environment.

  • Packages installed from Git, like those installed via pip install git+ssh://git@github.com/dask/distributed.git@d74f5006.

Note

When installing packages from Git, you may need to add the --use-pep517 flag to pip, like:

pip install git+https://github.com/dask/distributed.git --use-pep517

Without this flag, pip may not record sufficient metadata to tell that the package was installed from Git.
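To see what pip recorded for an installed package, you can look for the `direct_url.json` file that pip writes into a package's metadata for direct-URL installs (PEP 610). The following is a minimal sketch that assumes this file is the relevant metadata; this page does not state which metadata package sync actually reads. The helper names `describe_direct_url` and `installed_origin` are hypothetical and not part of Coiled.

```python
import json
from importlib import metadata


def describe_direct_url(text):
    """Summarize the contents of a PEP 610 direct_url.json file.

    Returns None when there is no record (text is None).
    """
    if text is None:
        return None
    record = json.loads(text)
    vcs_info = record.get("vcs_info")
    if vcs_info is not None:
        # VCS installs (e.g. git+https://...) carry the resolved commit.
        return {
            "kind": vcs_info.get("vcs"),
            "url": record.get("url"),
            "commit": vcs_info.get("commit_id"),
        }
    if "dir_info" in record:
        # Local directory installs, including editable ones (pip install -e).
        return {
            "kind": "local",
            "url": record.get("url"),
            "editable": record["dir_info"].get("editable", False),
        }
    return {"kind": "archive", "url": record.get("url")}


def installed_origin(dist_name):
    """Return the recorded origin of an installed distribution, or None."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # read_text returns None when the file is absent from the metadata.
    return describe_direct_url(dist.read_text("direct_url.json"))
```

Calling `installed_origin("distributed")` after a Git install should show a `git` entry if pip recorded the origin; a result of `None` suggests the metadata is missing and the install may need to be repeated with --use-pep517.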

Warning

By default, your compiled local packages are uploaded to a private, Coiled-owned S3 bucket, then downloaded by your cluster. If having a copy of your source code in a Coiled S3 bucket violates your organization’s security policies, you can instead use your own S3 bucket. Reach out to us at support@coiled.io if you’d like to use this option.

Private PyPI URLs

If you use a private (or otherwise custom) PyPI index-url or extra-index-url, configure it in one of the following ways:

  1. Environment variables: PIP_INDEX_URL, PIP_PYPI_URL, or UV_INDEX_URL for the primary index URL; PIP_EXTRA_INDEX_URL or UV_EXTRA_INDEX_URL for extra index URL(s).

  2. A pyproject.toml file in your working directory: in one of the [[tool.poetry.source]], [tool.pixi.pypi-options], or [tool.uv.pip] sections.

  3. A uv.toml file in your working directory: in the [pip] section.

  4. A pixi.toml file in your working directory: in the [pypi-options] section.

  5. Running pip config set 'global.extra-index-url' <URL> or pip config set 'global.index-url' <URL>.
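For option 1, a quick local check of which index-related environment variables the current process sees can be sketched as follows. The helper `configured_indexes` is hypothetical and not part of Coiled, and this page does not describe how package sync prioritizes these variables when several are set, so the sketch only reports what is defined.

```python
import os

# Variable names taken from option 1 in the list above.
PRIMARY_INDEX_VARS = ("PIP_INDEX_URL", "PIP_PYPI_URL", "UV_INDEX_URL")
EXTRA_INDEX_VARS = ("PIP_EXTRA_INDEX_URL", "UV_EXTRA_INDEX_URL")


def configured_indexes(env=None):
    """Return the index-related variables that are set and non-empty."""
    env = os.environ if env is None else env
    found = {"primary": {}, "extra": {}}
    for name in PRIMARY_INDEX_VARS:
        if env.get(name):
            found["primary"][name] = env[name]
    for name in EXTRA_INDEX_VARS:
        if env.get(name):
            found["extra"][name] = env[name]
    return found


# Report what the current process sees. Note that the values may contain
# credentials if they are embedded in the URL, so avoid logging them.
print(configured_indexes())
```

If both dictionaries come back empty and none of the file-based options applies, the index URL is probably only being passed on the pip command line.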

If you instead pass --extra-index-url or --index-url only as an argument when you run pip install, the environment stores no record of where the package came from, so package sync will fail to install it on the cluster.

If you do not want to include the username and password in the URL locally, you can put them in a netrc file or use keyring credential storage. If you use keyring, the keyring package must be installed in your Python environment.
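To confirm that a netrc file actually holds credentials for your index host, a small check with Python's standard netrc module might look like the sketch below. The helper `netrc_has_credentials` is hypothetical, and the host used in the usage note is a placeholder.

```python
import netrc
from urllib.parse import urlsplit


def netrc_has_credentials(index_url, netrc_path=None):
    """Return True if the netrc file has a login and password for the index host.

    With netrc_path=None, Python reads the default ~/.netrc file.
    """
    host = urlsplit(index_url).hostname
    if host is None:
        return False
    try:
        entries = netrc.netrc(netrc_path)
    except (FileNotFoundError, netrc.NetrcParseError):
        return False
    # authenticators() returns (login, account, password), or None when
    # neither the host nor a "default" entry is present.
    auth = entries.authenticators(host)
    return auth is not None and bool(auth[0]) and bool(auth[2])
```

For example, `netrc_has_credentials("https://pypi.example.com/simple")` checks the default netrc file for an entry such as `machine pypi.example.com login <user> password <password>`.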

GPUs

If you don’t have a GPU locally but would like to use GPUs on your remote cloud VMs, package sync automatically translates between the CPU and GPU versions of commonly used GPU-accelerated packages (such as PyTorch). This lets you drive computations on cloud GPUs from any local hardware. See GPU Software for more details.

Extra conda packages

If you have conda packages that you would like installed on your cluster that you do not have installed locally (e.g., system packages that are required by your dependencies on Linux), you can list them in the package_sync_conda_extras argument to coiled.Cluster.

Use this option with caution: the dependencies of these packages are also installed via conda, which can introduce dependency conflicts.

Warning

This will not work for “noarch” conda packages, and should only be used for installing packages with platform-specific builds.
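A minimal sketch of passing extra conda packages to coiled.Cluster is shown below. The package name is a placeholder, and actually creating the cluster assumes the coiled package is installed and a Coiled account is configured, so the call is wrapped in a function rather than run on import.

```python
# Placeholder package name: substitute a platform-specific (non-noarch)
# conda package that your dependencies need on Linux.
cluster_kwargs = {
    "package_sync_conda_extras": ["some-linux-system-package"],
}


def start_cluster():
    """Create the cluster; requires the coiled package and a Coiled account."""
    import coiled  # imported lazily so the snippet loads without coiled installed

    return coiled.Cluster(**cluster_kwargs)
```

Keeping the keyword arguments in a dictionary makes it easy to review exactly which extras are requested before any cloud resources are started.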

Ignoring packages

If you have packages installed locally that you don’t want synced to the cluster, you can list them in the package_sync_ignore argument to coiled.Cluster. This is generally not needed, though, because package sync installation on the cluster is so fast that installing extra, unused packages has a negligible effect on cluster startup time.

Note that only these exact packages are ignored—their dependencies may still be installed. Additionally, if another package depends on them, they will still be installed.
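Because an ignored package is still installed when something else depends on it, it can help to check for dependents locally before relying on package_sync_ignore. The sketch below uses the standard importlib.metadata module to list installed distributions that declare a requirement on a given package. The helper names and the example package list are hypothetical, and the name parsing is a simplification that also counts optional (extra) requirements.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else None


def dependents_of(package, requirements_by_dist=None):
    """List installed distributions that declare a requirement on `package`."""
    if requirements_by_dist is None:
        requirements_by_dist = {}
        for dist in metadata.distributions():
            name = dist.metadata["Name"]
            if name:  # skip distributions with broken metadata
                requirements_by_dist[name] = dist.requires or []
    target = normalize(package)
    return sorted(
        name
        for name, reqs in requirements_by_dist.items()
        if any(requirement_name(r) == target for r in reqs)
    )


# Placeholder list: local-only tools you might not need on the cluster.
package_sync_ignore = ["pytest", "black"]
for pkg in package_sync_ignore:
    print(pkg, "is required by:", dependents_of(pkg))
```

The same list can then be passed as `package_sync_ignore=package_sync_ignore` to coiled.Cluster; any package that shows dependents here may still end up on the cluster.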

Cross-platform fuzzing

When using a macOS or Windows machine to launch clusters (which always run Linux), you may not get exactly the same versions of all packages on the cluster as you have locally. This is because packages sometimes require slightly different dependencies on different platforms, so package sync uses a looser version match with cross-platform clusters. If you have trouble using package sync for cross-platform clusters, we recommend creating a new environment and installing only the packages you need to run.

Mandatory packages

Package sync will refuse to start a cluster if you don’t have these basic packages for running Dask installed locally:

dask
distributed
tornado
cloudpickle
msgpack

Additionally, package sync ensures that the versions of these packages on the cluster exactly match what you have locally, even cross-platform.
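Before starting a cluster, you can check locally whether these packages are installed and which versions would need to match. This is a minimal sketch using the standard importlib.metadata module. It assumes the installed distribution names are the same as the names listed above, and the helper `local_versions` is hypothetical.

```python
from importlib import metadata

# Package names taken from the list above.
MANDATORY = ["dask", "distributed", "tornado", "cloudpickle", "msgpack"]


def local_versions(names):
    """Map each name to its locally installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = local_versions(MANDATORY)
missing = [name for name, version in versions.items() if version is None]
print("installed:", {n: v for n, v in versions.items() if v is not None})
print("missing:", missing)
```

An empty "missing" list means all five packages are present in the current environment.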