GPUs

Coiled supports running computations on GPU-enabled machines. In principle, this is as simple as setting worker_gpu=1 in the coiled.Cluster() constructor:

import coiled

cluster = coiled.Cluster(
    ...,
    worker_gpu=1,
)

But in practice there are additional considerations.

Getting Started

First, note that free individual accounts do not have GPU access enabled (see Account Access below). Also, GPUs are not available with the AWS ECS backend; you will need to use one of the VM backends.
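
For example, you can switch your account to a VM backend before creating GPU clusters. The sketch below assumes coiled.set_backend_options() accepts a backend keyword and that "vm_aws" names the AWS VM backend; check the backend documentation for the values your version of Coiled supports:

import coiled

# Switch the account to a VM-based backend (the backend name "vm_aws"
# is an assumption; see the backend documentation for supported values)
coiled.set_backend_options(backend="vm_aws")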

Next, you will need a suitable software environment. The specifics will vary with your needs; an example suitable for initial work and testing is given here:

import coiled

# Create a software environment with GPU accelerated libraries
# and CUDA drivers installed
coiled.create_software_environment(
    name="gpu-test",
    container="gpuci/miniconda-cuda:10.2-runtime-ubuntu18.04",
    conda={
        "channels": [
            "rapidsai",
            "conda-forge",
            "defaults",
        ],
        "dependencies": [
            "dask",
            "dask-cuda",
            "cupy",
            "cudatoolkit=10.2",
        ],
    },
)

More information on GPU software environments is given below.

With a suitable software environment, creating a cluster is straightforward: simply set worker_gpu=1. Note that Coiled currently permits only a single GPU per worker.

# Create a Coiled cluster that uses
# a GPU-compatible software environment
cluster = coiled.Cluster(
    scheduler_cpu=2,
    scheduler_memory="4 GiB",
    worker_cpu=4,
    worker_memory="16 GiB",
    worker_gpu=1,
    software="gpu-test",
)

If desired, the cluster specified above can be tested with the following computation:

from dask.distributed import Client


def test_gpu():
    import cupy as cp

    # Build a small array on the GPU, sum it, and copy the result
    # back to the host as a NumPy array
    x = cp.arange(6).reshape(2, 3).astype("f")
    return cp.asnumpy(x.sum())


client = Client(cluster)

f = client.submit(test_gpu)
f.result()

If successful, this should return array(15., dtype=float32).
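
You can also confirm that every worker sees its GPU. The sketch below uses cupy.cuda.runtime.getDeviceCount() to report the number of CUDA devices visible to each worker; the helper name count_gpus is just for illustration:

def count_gpus():
    import cupy as cp

    # Number of CUDA devices visible to this worker process
    return cp.cuda.runtime.getDeviceCount()


# Returns a dict mapping each worker address to its GPU count
client.run(count_gpus)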

Note

If you are a member of more than one team (remember, you are automatically a member of your own personal account), be aware that clusters are created under your personal account by default, so you must specify the team account if you want the cluster created there. You can do this with either the account= keyword argument or by adding the account as a prefix to the cluster name, such as name="<account>/<cluster-name>". Learn more about teams.
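
For example, the name-prefix form looks like this ("my-team" and "gpu-cluster" are placeholders for your own team account and cluster name):

cluster = coiled.Cluster(
    ...,
    worker_gpu=1,
    software="gpu-test",
    name="my-team/gpu-cluster",  # "my-team" is a placeholder team account
)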

Software Environments

When creating a software environment for GPUs, you will need to install the GPU-accelerated libraries you need (e.g. PyTorch, RAPIDS, XGBoost, Numba) and also ensure that the container in use has the correct CUDA drivers installed.

Coiled infrastructure generally runs with CUDA version 10.2. If you already have a Docker image with your desired software and the drivers match, then you should be good to go.
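
In that case, you can point a software environment directly at your image. A minimal sketch, where the image name is a placeholder for your own:

import coiled

# Use an existing Docker image that already contains your GPU libraries
# and CUDA 10.2 drivers ("my-org/my-gpu-image:latest" is a placeholder)
coiled.create_software_environment(
    name="gpu-from-image",
    container="my-org/my-gpu-image:latest",
)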

If you plan to build a software environment from conda or pip packages, we recommend basing it on a container that already has the correct drivers installed, for example gpuci/miniconda-cuda:10.2-runtime-ubuntu18.04:

import coiled

coiled.create_software_environment(
    name="gpu-test",
    container="gpuci/miniconda-cuda:10.2-runtime-ubuntu18.04",
    conda={
        "channels": ["conda-forge", "rapidsai", "defaults"],
        "dependencies": ["dask", "dask-cuda", "cupy", "cudatoolkit=10.2"],
    },
)
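
The same pattern works for the other GPU libraries mentioned above. As a sketch, a PyTorch environment might look like the following; the "pytorch" channel and the cudatoolkit=10.2 pin are assumptions and should be matched to the drivers in your base image:

import coiled

# A sketch of a PyTorch GPU environment; the "pytorch" channel and the
# cudatoolkit=10.2 pin are assumptions that must match the base image
coiled.create_software_environment(
    name="gpu-pytorch",
    container="gpuci/miniconda-cuda:10.2-runtime-ubuntu18.04",
    conda={
        "channels": ["pytorch", "conda-forge", "defaults"],
        "dependencies": ["dask", "pytorch", "cudatoolkit=10.2"],
    },
)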

Current Hardware

Currently, Coiled deploys cost-efficient NVIDIA T4 GPUs by default. If you are interested in using higher-performance GPUs, please contact us.

Account Access

Free individual accounts do not have GPU access enabled by default. If you are interested in testing out GPU access, please contact us.

If you have been granted access, it may be as part of a team account. If so, be aware that you will have to specify the account under which you want to create your cluster in the coiled.Cluster() constructor:

cluster = coiled.Cluster(
    scheduler_cpu=2,
    scheduler_memory="4 GiB",
    worker_cpu=4,
    worker_memory="16 GiB",
    worker_gpu=1,
    software="gpu-test",
    account="MY-TEAM-ACCOUNT",
)