Network Architecture

Coiled’s zero trust network architecture protects your cluster and your data; you don’t have to worry about keeping your cluster private and secure. Coiled uses mTLS to ensure that only you can connect to your cluster and all communication is encrypted. Coiled also puts your scheduler dashboard behind authentication.
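As a rough illustration of what mutual TLS means in general, the sketch below uses only Python's standard `ssl` module. It is not Coiled's implementation; the helper names and the file-path parameters are hypothetical. The point is that each side verifies the other's certificate, so a server configured this way refuses clients that do not present a valid certificate.

```python
import ssl
from typing import Optional


def make_server_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Server-side context that demands a client certificate (the "mutual" part)."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Reject any client that does not present a certificate signed by a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file is not None:
        # Hypothetical path to the CA that signed the client certificates.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


def make_client_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Client-side context that verifies the server and can present its own certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file is not None:
        # Hypothetical paths to the client's certificate and private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

With plain TLS only the client context performs verification; setting `verify_mode = ssl.CERT_REQUIRED` on the server side is what makes the handshake mutual.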

The network architecture is described below in terms of two different contexts:

  1. Communication related to users, the Coiled control plane, and your cloud provider account

  2. Communication related to the Dask client, scheduler, workers, and data sources

Communication with the Coiled control plane

Network communication with the Coiled control plane occurs when users log in to https://cloud.coiled.io, view their cluster or analytics dashboard, or request Dask clusters. When you create a cluster, the Coiled control plane communicates with your cloud provider’s API to provision the necessary cloud resources in your cloud provider account.

The cluster nodes (VMs) must be able to initiate outbound connections to the Coiled control plane. Some of the telemetry they send is optional and can be disabled, but at a minimum they must send status messages that allow us to detect unhealthy nodes, and workers receive the scheduler address when they phone home to the control plane.

The Coiled control plane does not require direct connectivity inbound to cloud resources within your cloud provider account. Rather, all communication from the Coiled control plane to your cloud provider account happens via the cloud provider’s API. Therefore, there is no requirement to open ports or to whitelist network traffic originating from the Coiled control plane at https://cloud.coiled.io.
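If you want to confirm that outbound HTTPS from your network is not blocked, a minimal sketch like the following can help. It only tests whether a TCP connection can be opened; it does not validate TLS or call any Coiled API, and the function name `can_reach` is a hypothetical helper, not part of the Coiled client.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example, run from a cluster VM or a host in the same subnet:
#     can_reach("cloud.coiled.io", 443)
```

A `False` result from inside your VPC suggests that an egress rule, proxy, or DNS setting is preventing nodes from phoning home to the control plane.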

Diagram of network communication with the Coiled control plane. The user requests a Dask cluster in the cloud; the Coiled control plane then communicates with your cloud provider API, which provisions the Dask cluster.

| Source | Target | Protocol (Port) | Description |
| --- | --- | --- | --- |
| User (browser) | Coiled control plane (cloud.coiled.io) | HTTPS (443) | Users accessing cluster dashboard, analytics, etc. |
| User (Coiled client) | Coiled control plane (cloud.coiled.io) | HTTPS (443) | Users creating clusters, environments, etc. |
| Coiled control plane | Cloud provider APIs (AWS and GCP) | HTTPS (443) | Creation and management of cloud infrastructure |
| Dask scheduler | Coiled control plane | HTTPS (443) | Runtime analytics and performance metrics for Dask clusters |

Communication with Dask clusters

Network communication with Dask clusters occurs when users connect to Dask clusters via the Dask client, submit Dask computations, and view the Dask cluster status on the Dask dashboard. Users communicate directly only with the Dask scheduler; the scheduler then handles all network communication with the Dask workers, which in turn communicate with data sources. Users therefore do not need direct network access to Dask workers or data sources.

All compute resources used by Dask clusters, Dask client-to-scheduler communication, access to sensitive data, storage of software environment images, and system logging reside or occur entirely within your cloud account. In other words, data from your data sources never flows through the Coiled control plane, because all network traffic related to the Dask client, scheduler, workers, and data access stays outside the Coiled network and on your own private cloud or network.

Diagram of network communication within a Dask cluster. The Dask client on your machine communicates with the scheduler in the cloud, which then communicates with the Dask workers, which can communicate with data stores as needed.

| Source | Target | Protocol (Port) | Description |
| --- | --- | --- | --- |
| User (Dask client) | Dask scheduler | TCP + TLS (443) | Users submitting Dask computations |
| User (browser) | Dask dashboard | HTTPS (443) | Users accessing Dask status dashboard |
| Dask workers | Dask scheduler | TCP (8786) | Dask workers communicating with scheduler |
| Dask scheduler | Dask workers | TCP (1024-65535) | Dask scheduler communicating with workers |
| Dask workers | Dask workers | TCP (1024-65535) | Dask workers communicating with other workers |
| Dask workers | Data sources | Depends on data source | Reading and writing data for user computations |
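When translating the table above into firewall or security group rules, it can help to treat the rows as data and ask which inbound ports each component must accept. The sketch below is illustrative only: the `Flow` class and `inbound_rules` function are hypothetical names, the data-source row is left out because its port depends on the data source, and the dashboard is assumed to be served from the scheduler.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    port_range: Tuple[int, int]  # inclusive (first_port, last_port)


# Rows copied from the table above, using the default ports.
CLUSTER_FLOWS = [
    Flow("client", "scheduler", (443, 443)),
    Flow("browser", "scheduler", (443, 443)),  # dashboard, assumed to run on the scheduler
    Flow("worker", "scheduler", (8786, 8786)),
    Flow("scheduler", "worker", (1024, 65535)),
    Flow("worker", "worker", (1024, 65535)),
]


def inbound_rules(target: str, flows: List[Flow] = CLUSTER_FLOWS) -> List[Tuple[str, int, int]]:
    """Return sorted (source, first_port, last_port) tuples that `target` must accept."""
    return sorted({(f.source, *f.port_range) for f in flows if f.target == target})
```

For example, `inbound_rules("worker")` lists the scheduler and other workers as the only sources, which matches the statement that users never need direct network access to workers.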

Note

The ports used by the Dask scheduler and Dask workers (listed in the table above) for intra-cluster communication are the defaults described in the Dask documentation. If desired, you can customize these ports by passing custom worker options when you create Dask clusters with Coiled.

For example, instead of using random ports within the unprivileged port range for the Dask workers, you can configure the Dask workers to use port 8000 as the Dask nanny port and port 9000 as the Dask computation port by specifying the following worker_options when creating a cluster:

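import coiled

# "port" pins the Dask nanny port and "worker_port" pins the Dask computation port,
# instead of letting each worker pick random unprivileged ports.
cluster = coiled.Cluster(worker_options={"port": 8000, "worker_port": 9000})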

If you configure your clusters in this manner, then you’ll need to update your firewall or security group rules to allow traffic on ports 8000 and 9000 for scheduler-to-worker communication as well as worker-to-worker communication.
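On AWS, one way to express such rules is as a list of permission dictionaries in the shape accepted by the `IpPermissions` argument of boto3's EC2 `authorize_security_group_ingress` call. The sketch below only builds that list and makes no AWS call. The helper name `ingress_permissions` and the group ID `sg-EXAMPLE` are placeholders, and it assumes the scheduler and workers share a single security group, which may not match your setup.

```python
from typing import Dict, List


def ingress_permissions(ports: List[int], source_group_id: str) -> List[Dict]:
    """Build one TCP ingress rule per port, allowing traffic from `source_group_id`."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            # Allow traffic only from members of the given security group.
            "UserIdGroupPairs": [{"GroupId": source_group_id}],
        }
        for port in ports
    ]


# "sg-EXAMPLE" is a placeholder, not a real security group ID.
rules = ingress_permissions([8000, 9000], "sg-EXAMPLE")
```

Restricting the source to the cluster's own security group keeps ports 8000 and 9000 reachable for scheduler-to-worker and worker-to-worker traffic without exposing them more broadly.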