How to limit Coiled’s access to your AWS resources¶
This article provides guidance on limiting Coiled’s access to your AWS resources and handling permissions in different phases of your pipeline. It also briefly covers the permissions described in the Backends - AWS documentation.
In the Backends - AWS documentation, under Using your AWS account, you can find an IAM policy template. This template contains two sets of permissions: Setup and Ongoing.
The Setup set is used to create all the network resources and roles that Coiled needs when launching clusters in your AWS account. Once you have entered your credentials and Coiled has successfully created these resources, you can remove this set of permissions.
The Ongoing set contains the permissions that Coiled needs to launch clusters in your account, such as permissions related to software environments and retrieving logs. The template includes permissions for both ECS and EC2, since Coiled currently supports two methods of launching clusters: an ECS backend and a VM backend.
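As a rough illustration of this split (the action names below are examples only, not the exact contents of Coiled’s template), the two sets can be thought of as separate statements in one policy document, with the Setup statement removable once setup is complete:

```python
import json

# Illustrative only -- consult Coiled's IAM policy template for the real
# action lists. "Setup" actions are needed once; "Ongoing" actions are
# needed every time a cluster is launched.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Setup",
            "Effect": "Allow",
            "Action": ["ec2:CreateVpc", "iam:CreateRole"],  # example actions
            "Resource": "*",
        },
        {
            "Sid": "Ongoing",
            "Effect": "Allow",
            "Action": ["ec2:RunInstances", "ecs:RunTask", "logs:GetLogEvents"],  # example actions
            "Resource": "*",
        },
    ],
}

# After the initial setup succeeds, the "Setup" statement can be dropped:
ongoing_only = {
    **policy,
    "Statement": [s for s in policy["Statement"] if s["Sid"] != "Setup"],
}
print(json.dumps(ongoing_only, indent=2))
```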
Most of the resources that Coiled creates will contain an identifying tag, allowing you to see what we created.
When you use your own AWS credentials, your computations run within your VPC, as described in our Security section. If you want to limit Coiled’s access to your AWS resources even further, you can do so with users and roles.
By creating a user within your AWS account, you can give access to only those resources that you are comfortable sharing. Then you can create different roles that have a more restricted set of permissions.
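A restricted role can be created with boto3. The sketch below is illustrative, not Coiled’s setup procedure: the role name and account ID are placeholders, the trust policy simply allows principals in your own account to assume the role, and the `create_restricted_role` call requires valid AWS credentials:

```python
import json

def restricted_role_trust_policy(account_id: str) -> str:
    """Trust policy letting principals in your own account assume the role."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
            "Action": "sts:AssumeRole",
        }],
    })

def create_restricted_role(role_name: str, account_id: str):
    """Sketch: create the role (requires AWS credentials to actually run)."""
    import boto3  # third-party; imported lazily so the sketch loads without it
    iam = boto3.client("iam")
    return iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=restricted_role_trust_policy(account_id),
    )
```

You can then attach only the managed or inline policies you are comfortable granting to this role.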
If you have an AWS Organization, you might need to follow the AWS documentation on creating an account in your organization.
Example: S3 restrictions¶
Let’s assume that you have created a
coiled user in your AWS account. This
user has read permissions to an S3 bucket that you own, but you created a role
that doesn’t allow access to the bucket.
```python
import coiled
import dask.dataframe as dd
from dask.distributed import Client

cluster = coiled.Cluster()
client = Client(cluster)

df = dd.read_csv("s3://your-s3-url-here")
```
If you switch to the role that doesn’t allow access to S3, the code above will fail with a permissions error.
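One way to build a role with this behavior is an explicit Deny statement (the example above may simply omit the Allow; either way the read fails with a permissions error). The sketch below is illustrative and the bucket name is a placeholder:

```python
import json

# Placeholder bucket name -- substitute your own bucket.
deny_s3 = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyBucketRead",
        "Effect": "Deny",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": [
            "arn:aws:s3:::your-bucket-name",
            "arn:aws:s3:::your-bucket-name/*",
        ],
    }],
}
print(json.dumps(deny_s3, indent=2))
```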
Example: Launch a cluster from a coiled job¶
Let’s assume that you don’t have any roles created that narrow the scope of permissions,
but you have a
coiled user in your AWS account. We will attempt to read from that
same S3 bucket, but this time we will do so from within a job.
When you launch a job, Coiled will create a
task-role with your account name. This
role will not get the same set of permissions as your user. The job will fail with the
same permissions error. The way to fix this is by attaching the S3 policy to the role
that Coiled created.
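Attaching the policy can be done with boto3. This is a minimal sketch: the role name and policy ARN below are placeholders, and the call requires AWS credentials with IAM permissions:

```python
def attach_s3_policy(role_name: str, policy_arn: str) -> None:
    """Sketch: attach an existing S3 policy to the role Coiled created.

    role_name and policy_arn are placeholders; requires AWS credentials.
    """
    import boto3  # third-party; imported lazily so the sketch loads without it
    iam = boto3.client("iam")
    iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)

# Example usage (placeholders):
# attach_s3_policy(
#     "your-account-name-task-role",
#     "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
# )
```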
See the tutorial on using a job to launch a cluster.