Dask Clusters

Coiled manages cloud resources, networking, software environments, and everything else you need to scale Python in the cloud. You spin up a Dask cluster with Coiled by creating a coiled.Cluster instance; coiled.Cluster objects manage a Dask cluster much like other cluster objects (e.g. distributed.LocalCluster).
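
The steps below use the default configuration. As a minimal sketch, a cluster can also be created with explicit options; the n_workers and name keyword arguments shown here are an illustrative subset, and later sections cover customization in full:

    import coiled

    # Sketch of a customized cluster; assumes the n_workers and name
    # keyword arguments (see the sections on customizing clusters)
    cluster = coiled.Cluster(n_workers=10, name="quickstart-example")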

  1. Launch a Dask cluster with Coiled:

    import coiled
    
    cluster = coiled.Cluster()
    
  2. Connect to your cluster with Dask:

    from dask.distributed import Client
    
    client = Client(cluster)
    
  3. Run your Dask computation in the cloud:

    import dask
    
    # generate a random time series of data
    df = dask.datasets.timeseries("2000", "2005", partition_freq="2w").persist()
    
    # perform a groupby with an aggregation
    df.groupby("name").aggregate({"x": "sum", "y": "max"}).compute()
    
  4. Monitor your computation in real time using the Dask dashboard:

    print(cluster.dashboard_link)
    

    You can also generate Performance Reports for later inspection (see the sketch just after these steps) or use Analytics to monitor Dask performance across clusters.

    You can monitor your cluster status and infrastructure state using the Coiled web application by navigating to cloud.coiled.io/<account-name>/clusters and selecting the cluster name (see Coiled cloud):

    [Screenshot: Coiled cloud cluster dashboard]
  5. Once you’re done, close your cluster and the Dask client:

    cluster.close()
    client.close()
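
As mentioned in step 4, a performance report captures the dashboard's diagnostics in a standalone file you can inspect later. Here is a minimal sketch using Dask's performance_report context manager (the filename is arbitrary; run this before closing the cluster in step 5):

    from dask.distributed import performance_report

    # save the dashboard diagnostics for the wrapped computation to an HTML file
    with performance_report(filename="dask-report.html"):
        df.groupby("name").aggregate({"x": "sum", "y": "max"}).compute()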
    

This overview covered the basics of creating a cluster, running a computation, and monitoring both the computation and the cluster's status. The next sections provide a more comprehensive guide to customizing clusters and additional cluster methods.