Coiled clusters in 10 minutes

In this guide you will:

  1. Sign up for Coiled

  2. Install the Coiled Python library

  3. Set up Coiled with your cloud provider

  4. Run your Dask computation in your cloud account

1. Sign up

Sign up for Coiled using GitHub, Google, or your email address.

2. Install

pip install coiled 'dask[complete]'
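To confirm the installation worked before moving on, you can ask Python which versions it sees. This is a minimal sketch using only the standard library; the helper name installed_versions is made up for this example, and the package list simply mirrors the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# "coiled" and "dask" come from the pip command above; "distributed" is
# installed as part of the dask[complete] extra.
for name, ver in installed_versions(["coiled", "dask", "distributed"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

If any package is reported as not installed, check that you are running Python from the same environment in which you ran pip.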

3. Set up Coiled with your cloud provider

You can set up Coiled using the coiled setup command line tool:

$ coiled setup

The command directs you to https://cloud.coiled.io/profile in the Coiled web app, where you can create and manage API tokens. Copy a token from that page and paste it at the prompt:

Please login to https://cloud.coiled.io/profile to get your token
Token:

Your token will be saved to Coiled’s local configuration file.

Note

For Windows users

Log in first with coiled login --token <your-token>, because the Windows clipboard is not available at the “Token” prompt. Then run coiled setup to finish connecting Coiled to your cloud provider account. Unless you are using WSL, run these commands in a Command Prompt or PowerShell window within an environment that has coiled installed.

You can also log in with !coiled login --token <your-token> from a Jupyter notebook.
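If you would rather script the login step, for example in a setup script on Windows, the sketch below wraps the same coiled login --token command with Python's subprocess module. The environment variable name MY_COILED_TOKEN and both helper functions are hypothetical names chosen for illustration; only the coiled login --token command itself comes from this guide.

```python
import os
import shutil
import subprocess


def build_login_command(token, executable="coiled"):
    """Return the argument list for `coiled login --token <token>`."""
    if not token:
        raise ValueError("an API token is required")
    return [executable, "login", "--token", token]


def login_from_env(var_name="MY_COILED_TOKEN"):
    """Run `coiled login` with a token read from an environment variable.

    MY_COILED_TOKEN is a hypothetical variable name used in this sketch,
    not one that Coiled itself reads.
    Returns True if the command ran, False if it was skipped.
    """
    token = os.environ.get(var_name)
    executable = shutil.which("coiled")
    if not token or executable is None:
        return False  # no token set, or coiled is not on PATH
    subprocess.run(build_login_command(token, executable), check=True)
    return True


if login_from_env():
    print("Logged in; now run `coiled setup`.")
else:
    print("Skipped: set MY_COILED_TOKEN and make sure coiled is installed.")
```

Reading the token from the environment keeps it out of the script itself, so the script can be shared without exposing the token.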

You’ll then be prompted to configure your Google Cloud or AWS account (see Automatic Setup).

Terminal output prompting you to connect your AWS or Google Cloud account. The text reads: “Welcome to Coiled! To begin we need to connect Coiled to your cloud account. Select one of the following options:” followed by 1. Amazon Web Service (AWS), 2. Google Cloud Platform (GCP), 3. I don't have a cloud account, set me up with a free trial, and x (in red) Exit setup.

If you don’t have an AWS or GCP account and would like help choosing which to use, see Need a cloud provider account?

4. Run your Dask computation

Next, spin up a Dask cluster in your cloud account by creating a coiled.Cluster instance and connecting a Dask client to it.

import coiled

# create a remote Dask cluster with Coiled
cluster = coiled.Cluster(name="my-cluster")

# connect a Dask client to the cluster
client = cluster.get_client()

# link to Dask scheduler dashboard
print("Dask scheduler dashboard:", client.dashboard_link)

Note

If you’re using a Team account, be sure to specify the account= option when creating a cluster:

cluster = coiled.Cluster(account="<my-team-account-name>")

Otherwise, the cluster will be created in your personal Coiled account.
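If the same script is used with both personal and Team accounts, one option is to build the keyword arguments in one place and pass account= only when it is set. This is a small sketch; cluster_options and make_cluster are hypothetical helper names, and the only coiled.Cluster arguments used are name= and account= from the examples above.

```python
def cluster_options(name, account=None):
    """Build keyword arguments for coiled.Cluster.

    The account key is included only when a Team account name is given,
    so omitting it falls back to your personal Coiled account.
    """
    options = {"name": name}
    if account:
        options["account"] = account
    return options


def make_cluster(name, account=None):
    """Create the cluster; needs coiled installed and set up as described above."""
    import coiled  # imported here so cluster_options works without coiled

    return coiled.Cluster(**cluster_options(name, account))


print(cluster_options("my-cluster"))
print(cluster_options("my-cluster", account="<my-team-account-name>"))
```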

You will then see a widget showing the cluster state overview and progress bars as resources are provisioned (this may take a minute or two).

Terminal dashboard displaying the Coiled cluster status overview, configuration, and Dask worker states.

Once the cluster is ready, you can submit a Dask DataFrame computation for execution. Navigate to the Dask scheduler dashboard (see Dashboard Address in the widget) for real-time diagnostics on your Dask computations.

import dask

# generate random timeseries of data
df = dask.datasets.timeseries("2000", "2005", partition_freq="2w").persist()

# perform a groupby with an aggregation
df.groupby("name").aggregate({"x": "sum", "y": "max"}).compute()
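To make the result easier to read, here is what that aggregation computes, shown on a tiny hand-made table in plain Python: for each name, the sum of x and the maximum of y. The rows and the function name sum_x_max_y are invented for illustration; the real timeseries dataset is generated randomly and is far larger.

```python
from collections import defaultdict

# Hand-made rows standing in for the generated timeseries data.
rows = [
    {"name": "Alice", "x": 0.5, "y": -0.2},
    {"name": "Bob", "x": -0.25, "y": 0.9},
    {"name": "Alice", "x": 0.25, "y": 0.4},
    {"name": "Bob", "x": 0.75, "y": 0.1},
]


def sum_x_max_y(records):
    """Group records by name, summing x and taking the maximum of y."""
    groups = defaultdict(lambda: {"x": 0.0, "y": float("-inf")})
    for record in records:
        agg = groups[record["name"]]
        agg["x"] += record["x"]
        agg["y"] = max(agg["y"], record["y"])
    return dict(groups)


result = sum_x_max_y(rows)
print(result)
# {'Alice': {'x': 0.75, 'y': 0.4}, 'Bob': {'x': 0.5, 'y': 0.9}}
```

On the cluster, Dask performs the same kind of grouping across many partitions in parallel and combines the partial results for you.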

You can also monitor your cluster, access the Dask scheduler dashboard, and see cluster state and worker logs from https://cloud.coiled.io.

Cluster dashboard on the Coiled cloud web app with rows for each cluster and columns for cluster name, status, number of workers, software environment, last seen timestamp, and cost (in credits).

Cluster dashboard

Lastly, you can stop the running cluster using the following commands. By default, clusters will shut down after 20 minutes of inactivity.

# Close the cluster
cluster.close()

# Close the client
client.close()
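If your computation raises an error partway through, the close calls above may never run, and the cluster would stay up until the inactivity timeout. One way to guard against that is a try/finally wrapper like the sketch below. The helper name run_then_close is hypothetical; it relies only on the get_client() and close() methods already used in this guide and closes the cluster and client in the same order as above.

```python
def run_then_close(cluster, work):
    """Run work(client) on the cluster, then always close cluster and client.

    cluster: any object with get_client() and close(), such as the
             coiled.Cluster created earlier.
    work:    a function you supply that takes the client and returns a result.
    """
    client = None
    try:
        client = cluster.get_client()
        return work(client)
    finally:
        # Runs whether work() succeeded or raised.
        try:
            cluster.close()
        finally:
            if client is not None:
                client.close()
```

For example, run_then_close(cluster, lambda client: df.groupby("name").aggregate({"x": "sum", "y": "max"}).compute()) would return the aggregated result and shut everything down afterwards.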

Learn more about options for launching Dask clusters here.