How to limit Coiled’s access to your AWS resources

This article explains how to limit Coiled’s access to your AWS resources and how to handle permissions in different phases of your pipeline. It also briefly covers the permissions you configured initially (see Configuring AWS).

Permissions

The AWS section of the backends documentation includes an IAM policy template under Using your AWS account. This template contains two sets of permissions: Setup and Ongoing.

The Setup set is used to create the network resources and roles that Coiled uses when launching a cluster in your AWS account. Once you have entered your credentials and Coiled has successfully created a cluster in your account, and with it all the resources it needs, you can remove this set of permissions.

The Ongoing set contains the permissions that Coiled needs to launch clusters in your account. It includes permissions related to software environments, retrieving logs, and so on.
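As a rough illustration of removing the Setup set once the resources exist, the sketch below filters a policy document by statement identifier. The Sid values "Setup" and "Ongoing", the placeholder actions, and the helper name drop_statement are assumptions made for this example; check them against the actual template before using anything like this.

```python
import copy
import json

# Hypothetical policy document. The Sid values and the single placeholder
# action per statement are assumptions, not Coiled's published template.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "Setup", "Effect": "Allow", "Action": ["ec2:CreateVpc"], "Resource": "*"},
        {"Sid": "Ongoing", "Effect": "Allow", "Action": ["ec2:RunInstances"], "Resource": "*"},
    ],
}


def drop_statement(policy: dict, sid: str) -> dict:
    """Return a copy of `policy` without the statements whose Sid equals `sid`.

    Assumes "Statement" is a list of statement objects, as in the example above.
    """
    trimmed = copy.deepcopy(policy)  # leave the caller's document untouched
    trimmed["Statement"] = [
        stmt for stmt in trimmed["Statement"] if stmt.get("Sid") != sid
    ]
    return trimmed


ongoing_only = drop_statement(EXAMPLE_POLICY, "Setup")
print(json.dumps(ongoing_only, indent=2))
```

The resulting document could then replace the policy attached to the user or role that Coiled uses, through the AWS console or whichever tooling you already rely on.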

Note

Most of the resources that Coiled creates carry the tag owner: coiled so that you can identify what we created.
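One way to use this tag is to filter resource descriptions you have already retrieved. The sketch below assumes each resource is a dictionary with a Tags list of Key/Value pairs, which is the shape many AWS describe calls return; the sample records, the ResourceId field, and the helper name is_coiled_owned are made up for illustration.

```python
def is_coiled_owned(resource: dict) -> bool:
    """Return True if the resource carries the tag owner=coiled."""
    tags = resource.get("Tags") or []  # untagged resources may omit "Tags"
    return any(
        tag.get("Key") == "owner" and tag.get("Value") == "coiled"
        for tag in tags
    )


# Hypothetical sample records; real ones would come from your AWS tooling.
resources = [
    {"ResourceId": "vpc-0example1", "Tags": [{"Key": "owner", "Value": "coiled"}]},
    {"ResourceId": "vpc-0example2", "Tags": [{"Key": "team", "Value": "data"}]},
    {"ResourceId": "sg-0example3"},
]

coiled_ids = [r["ResourceId"] for r in resources if is_coiled_owned(r)]
print(coiled_ids)
```

Because only most resources are tagged, a filter like this may miss some of what Coiled created.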

Giving access

When you set up Coiled to use your AWS account, your workloads run within your VPC (see Resources). If you want to limit Coiled’s access to your AWS resources even further, you can do this with users and roles.

By creating a user within your AWS account, you can give access to only those resources that you are comfortable sharing. Then you can create different roles that have a more restricted set of permissions.
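To make the idea of a restricted permission set concrete, here is a minimal sketch that builds a read-only policy document for a single S3 bucket. The bucket name, the Sid labels, and the helper name read_only_bucket_policy are placeholders chosen for this example, and the two actions shown are only one reasonable reading of "read access"; adjust them to your own requirements.

```python
import json


def read_only_bucket_policy(bucket: str) -> dict:
    """Build an IAM policy document allowing list and read on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


# "my-example-bucket" is a placeholder name.
policy = read_only_bucket_policy("my-example-bucket")
print(json.dumps(policy, indent=2))
```

Since IAM denies anything that is not explicitly allowed, a user or role holding only this policy would have no access to other buckets.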

Note

If you have an AWS Organization, you might need to follow the AWS documentation on creating an account in your organization.

Example: S3 restrictions

Let’s assume that you have created a coiled user in your AWS account. This user has read permissions to an S3 bucket that you own, but you have also created a role that doesn’t allow access to the bucket.

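import coiled
import dask.dataframe as dd
from dask.distributed import Client

# Launch a Coiled cluster in your AWS account and connect a Dask client to it
cluster = coiled.Cluster()
client = Client(cluster)

# Read CSV data from your bucket; this requires read access to S3
df = dd.read_csv("s3://your-s3-url-here")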

If you switch to the role that doesn’t allow access to S3, the code above will fail with a permissions error.
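If you would rather report such a failure than let it stop your script, one option is to wrap the read in a small helper. The sketch below assumes the failure surfaces as Python's built-in PermissionError, which depends on the filesystem library in use; the helper name read_or_explain and the stand-in reader are hypothetical.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def read_or_explain(read: Callable[[], T]) -> Optional[T]:
    """Call `read` and return its result, or None if access is denied.

    Assumes a denied request is raised as PermissionError.
    """
    try:
        return read()
    except PermissionError as exc:
        print(f"Access denied; check which user or role is in use: {exc}")
        return None


def denied_read():
    # Stand-in for a read attempted with the restricted role.
    raise PermissionError("Access Denied")


result = read_or_explain(denied_read)
```

With the example above, the call might look like read_or_explain(lambda: dd.read_csv("s3://your-s3-url-here")).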