Bring Your Own Network#
Usually Coiled creates all the cloud networking resources required for running a cluster. For customers who are hosting Coiled in their own AWS or GCP account, we also provide the option to have Coiled use an existing network which you have created.
While this means you’re responsible for managing more aspects of hosting Coiled, it also lets you run Coiled while meeting specific network security or configuration needs, such as:
you need to peer the VPC used for Coiled clusters with other networks
you need to configure additional network security, for example, routing traffic through a customer-managed firewall or limiting inbound connections to a VPN
you need to configure network access to your data sources, for example, using AWS PrivateLink
you need to limit the IAM permissions that you grant to Coiled
If you provide a network for Coiled to use, you’ll be responsible for:
VPC
subnet(s)
routing and internet access (including NAT for VMs without a public IP address)
security groups (AWS) or firewall rules (GCP)
If you provide a network, Coiled will still be responsible for creating VMs (and associated storage, network interface, and public IPs) as well as machine images and Docker images (for your software environments).
You can configure Coiled to use your network when setting your Cloud Backend Options on the Account page on cloud.coiled.io, or you can Configure network using the Python API as explained below.
Network requirements#
See Network Architecture for details about the networking needs of a Coiled cluster.
The network you provide for Coiled to use needn’t match the networks we create by default, but it does need to meet some minimal requirements.
Our default network allows public ingress to the scheduler on ports 8786 and 8787. Public ingress isn’t a requirement, so long as the machine running the Python client is able to connect to the scheduler. For instance, you could run the client on a machine inside a peered VPC, or go through a VPN that lets you connect to the private IP of the scheduler. Either way, ports 8786 and 8787 need to be open for ingress from the client so that it can connect to the scheduler.
The scheduler and workers must be able to download software (as well as any data used in your computations). This can be achieved by using a NAT Gateway set as the next hop for outbound connections, or by allowing us to assign public IP addresses to the workers as well as the scheduler.
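As a quick way to confirm the client-side requirement above, the following minimal sketch tries to open a TCP connection to the scheduler ports from the machine that will run the Python client. The function names and the example address are hypothetical and are not part of the Coiled API; the sketch uses only the Python standard library.

```python
import socket

# Scheduler ports named in the requirements above.
SCHEDULER_PORTS = (8786, 8787)


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_scheduler_ports(host: str, ports=SCHEDULER_PORTS, timeout: float = 3.0) -> dict:
    """Map each port to whether it is reachable from this machine."""
    return {port: can_connect(host, port, timeout) for port in ports}
```

For example, `check_scheduler_ports("10.0.1.23")` (a placeholder private IP) returns a dict keyed by port. A `False` value suggests that a security group, firewall rule, or route is blocking the client, or that nothing is listening on that port yet.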
Example for a single, public subnet#
One way to structure your network on AWS is to have a single public subnet that’s used for both schedulers and workers. Scheduler and workers would all use public IP addresses, and you could use a Security Group to block ingress to the workers from outside the cluster.
The main components involved are:
VPC with attached Internet Gateway and a subnet to use for Coiled clusters
Route Table for the subnet with a route for 0.0.0.0/0 to the Internet Gateway
Security Group for scheduler(s) that allows ingress on ports 8786 and 8787 from your Python client, which could be achieved by opening these ports to traffic from anywhere (0.0.0.0/0), or to a more limited IP range such as a VPN you’re using to connect to the scheduler from the Python client
Security Group for entire cluster(s) that allows all ingress specifically from that Security Group, which allows the scheduler and workers to connect to each other
When configuring Coiled to use your network, you’d specify the same subnet as both scheduler subnet and worker subnet. Since you aren’t using a NAT Gateway, you’ll need to configure Coiled to give the workers public IP addresses.
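The two Security Groups described above can be expressed as ingress rule definitions. The sketch below builds them in the `IpPermissions` shape that boto3's EC2 client accepts; the helper names are hypothetical, and the group IDs and CIDR range are placeholders you would replace with your own.

```python
import ipaddress


def scheduler_ingress_rules(client_cidr: str = "0.0.0.0/0") -> list:
    """Ingress rules for the scheduler Security Group: TCP 8786-8787 from the client range."""
    # Raises ValueError if client_cidr is not a valid network in CIDR notation.
    ipaddress.ip_network(client_cidr)
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": 8786,
            "ToPort": 8787,
            "IpRanges": [{"CidrIp": client_cidr}],
        }
    ]


def cluster_ingress_rules(cluster_sg_id: str) -> list:
    """Ingress rules for the cluster Security Group: all traffic from members of the same group."""
    return [
        {
            "IpProtocol": "-1",  # "-1" means all protocols and ports
            "UserIdGroupPairs": [{"GroupId": cluster_sg_id}],
        }
    ]
```

Assuming you manage the groups with boto3, each list could be passed as the `IpPermissions` argument of `authorize_security_group_ingress`, together with the `GroupId` of the group being modified. Narrowing `client_cidr` to your VPN range (for example `"10.8.0.0/16"`, a placeholder) limits who can reach the scheduler.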
Example for public and private subnets#
Another way to structure your network on AWS is to have a public subnet for the scheduler, and to use a private subnet and NAT Gateway for the workers. In this case, the workers will use NAT Gateway for egress (they need to be able to download things).
The main components involved are:
VPC with attached Internet Gateway
One public subnet with NAT Gateway and a route table for the subnet with a route for 0.0.0.0/0 to the Internet Gateway
One private subnet with a route table with a route for 0.0.0.0/0 to the NAT Gateway
Security Group for scheduler(s) that allows ingress on ports 8786 and 8787 from your Python client, which could be achieved by opening these ports to traffic from anywhere (0.0.0.0/0), or to a more limited IP range such as a VPN you’re using to connect to the scheduler from the Python client
Security Group for entire cluster(s) that allows all ingress specifically from that Security Group, which allows the scheduler and workers to connect to each other
The public subnet would be specified as the scheduler subnet, and the private subnet would be specified as the worker subnet.
Since the workers are in a private subnet and use NAT Gateway for egress, you can tell Coiled to not give the workers public IP addresses.
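To make the difference between the two layouts concrete, here is a small sketch that builds the `network` options for each one. The dict keys are the ones used in the Python API example in this document; the helper `network_options` and its `workers_use_nat` parameter are hypothetical, and all resource IDs are placeholders.

```python
def network_options(vpc_id, scheduler_subnet_id, worker_subnet_id,
                    scheduler_sg_id, cluster_sg_id, workers_use_nat):
    """Build a `network` dict; workers get public IPs only when they have no NAT route."""
    return {
        "network_id": vpc_id,
        "scheduler_subnet_id": scheduler_subnet_id,
        "worker_subnet_id": worker_subnet_id,
        "scheduler_firewall_id": scheduler_sg_id,  # security group used for scheduler
        "firewall_id": cluster_sg_id,              # security group used for whole cluster
        "give_workers_public_ip": not workers_use_nat,
    }


# Single public subnet: same subnet for scheduler and workers, no NAT Gateway.
single_public = network_options(
    "vpc-12345678", "subnet-12345678", "subnet-12345678",
    "sg-12345678", "sg-24680", workers_use_nat=False,
)

# Public and private subnets: workers sit in the private subnet behind the NAT Gateway.
public_private = network_options(
    "vpc-12345678", "subnet-12345678", "subnet-87654321",
    "sg-12345678", "sg-24680", workers_use_nat=True,
)
```

Either dict would then be passed as the `network` argument when configuring backend options.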
Configure network using the Python API#
While it’s easiest to configure your network using the UI for your account on cloud.coiled.io, it’s also possible to configure your backend options using our Python API.
If you want to have Coiled use a network you’ve created, you’ll need to specify the IDs for the VPC, the scheduler and worker subnets (which needn’t be distinct), and the Security Groups.
Optionally, you can specify the give_workers_public_ip option (defaults to True) to control whether workers get public IPs which they can use for egress without NAT. If you put workers in a private subnet and don’t have them assigned public IP addresses, you’ll need a route on that subnet that goes through NAT so they can still download required files over the internet.
For example:
import coiled

coiled.set_backend_options(
    backend="aws",
    aws_access_key_id="...",
    aws_secret_access_key="...",
    network={
        "network_id": "vpc-12345678",
        "scheduler_subnet_id": "subnet-12345678",
        "worker_subnet_id": "subnet-87654321",
        "scheduler_firewall_id": "sg-12345678",  # security group used for scheduler
        "firewall_id": "sg-24680",  # security group used for whole cluster
        "give_workers_public_ip": True,  # optional, defaults to True
    },
)
The resource IDs are not the full ARN, just the ID. Specify the Security Group for the scheduler as scheduler_firewall_id and the Security Group for the whole cluster as firewall_id.
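If your tooling hands you ARNs rather than bare IDs, a small helper can strip them down before you build the network options. This is a sketch that assumes the usual EC2 ARN layout, in which the resource ID follows the last `/`; the function name `resource_id` is hypothetical, and the account number and region in the usage note are placeholders.

```python
def resource_id(value: str) -> str:
    """Return the bare resource ID, stripping an ARN prefix if one is present."""
    value = value.strip()
    if value.startswith("arn:"):
        # EC2 ARNs end in "<resource-type>/<resource-id>", e.g. ".../vpc/vpc-12345678".
        _, sep, tail = value.rpartition("/")
        if not sep or not tail:
            raise ValueError(f"cannot find a resource ID in ARN: {value!r}")
        return tail
    # Already a bare ID such as "subnet-12345678"; return it unchanged.
    return value
```

For instance, `resource_id("arn:aws:ec2:us-east-1:123456789012:security-group/sg-12345678")` yields the bare ID `sg-12345678`, while a value that is already a bare ID passes through unchanged.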