Run your Python functions in the cloud
Sometimes you just need extra computing resources for key parts of your workload. Common reasons include:
You want to run close to your data
You want a GPU
You want a bigger machine
You want many machines
This is easy with the @coiled.function decorator. Coiled runs your decorated function on the infrastructure of your choosing.
    import coiled
    import pandas as pd

    df = pd.read_csv("data.csv")      # Read local data

    @coiled.function()                # This function runs remotely
    def process(df):
        df = df[df.name == "Alice"]
        return df

    result = process(df)              # Process on remote VM
    print(result)

    import coiled
    import pandas as pd

    @coiled.function(
        region="us-west-2",               # Run close to data
    )
    def process(filename):
        df = pd.read_parquet(filename)    # Read S3 data from EC2
        df = df[df.name == "Alice"]       # Filter data on cloud
        return df                         # Return subset

    result = process("s3://my-bucket/data.parquet")  # Runs remotely
    print(result)                                    # Runs locally

    import coiled

    @coiled.function(
        vm_type="p3.2xlarge",  # Run on a GPU instance
    )
    def train():
        import torch
        device = torch.device("cuda")
        ...
        return model

    model = train()  # Runs remotely

    import coiled

    @coiled.function(
        memory="512 GB",
        cpu=128
    )
    def my_func(...):
        ...

    import coiled
    import pandas as pd

    @coiled.function(region="us-west-2")  # Run close to the data
    def process(filename):
        output_filename = filename[:-4] + ".parquet"  # Replace the ".csv" suffix
        df = pd.read_csv(filename)
        df.to_parquet(output_filename)
        return output_filename

    # result = process("s3://my-bucket/data.csv")  # one file
    results = process.map(filenames)                # many files in parallel
    for filename in results:
        print("Finished", filename)
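The slice `filename[:-4]` in the example above assumes every input name ends in a four-character extension such as `.csv`. A slightly more defensive variant is sketched below. The helper name `to_parquet_name` is hypothetical and not part of Coiled; it only uses the Python standard library.

```python
import posixpath


def to_parquet_name(filename: str) -> str:
    """Swap the file extension for ".parquet" (hypothetical helper)."""
    # posixpath.splitext splits on the last dot of the final path component,
    # so it also handles "s3://bucket/key.csv" style names.
    root, _ext = posixpath.splitext(filename)
    return root + ".parquet"


print(to_parquet_name("s3://my-bucket/data.csv"))
```

Inside the decorated function, `output_filename = to_parquet_name(filename)` could then replace the slice, provided the helper is defined where the function can see it.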
coiled.function() benefits from the standard Coiled features:
Easy to use API
Creates and manages VMs for you in the cloud
Automatically synchronizes your local software and credentials
Gives access to any cloud hardware (like GPUs) in any region
Auto-scales out if needed
It’s easy to customize the hardware and software resources your function needs. You can select any VM type available on your cloud (see VM Size and Type). For example:
    @coiled.function(
        memory="256 GiB",    # Bigger VM
        region="us-east-2",  # Specific region
        arm=True,            # Change architecture
    )
    def my_func(...):
        ...
Software environments for @coiled.function work the same as with Coiled clusters.
By default, Coiled will automatically synchronize your local software environment. This works well in most cases, but you can also specify a manual software environment or Docker image:
    @coiled.function(container="nvcr.io/nvidia/rapidsai/base:23.08-cuda11.8-py3.10")
    def my_gpu_func(...):
        ...

    @coiled.function(software="my-software-env")
    def my_func(...):
        ...
The first time a @coiled.function decorated function is called, a VM with the specified resources will be created. This initial startup typically takes ~1–2 minutes and then your VM will be available for the rest of the Python session.
By default, if your VM has been idle for 24 hours, it will automatically be shut down (you can use the idle_timeout parameter to control this timeout period). All cloud resources are automatically deprovisioned and cleaned up on Python interpreter shutdown (controlled by the keepalive parameter, see Warm Start).
By default Coiled Functions will run sequentially, just like normal Python functions. However, they can also easily run in parallel.
If you want to run your function many times across different inputs in parallel, you can use the .map() method:
    @coiled.function()
    def simulate(trial: int=0):
        return ...

    result = simulate(1)                  # run the function once on cloud hardware
    results = simulate.map(range(1000))   # Run 1000 times. Returns iterator immediately.
    list(results)                         # Wait until done. Retrieve results.
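The calling pattern of `.map()` can be tried locally without any cloud resources. The sketch below is a stand-in that uses the standard library's `ThreadPoolExecutor` instead of Coiled, so it says nothing about how Coiled schedules work or orders results. The body of `simulate` is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(trial: int = 0) -> int:
    # Placeholder workload; a real simulation would go here.
    return trial * trial


with ThreadPoolExecutor(max_workers=4) as pool:
    results = pool.map(simulate, range(10))  # returns an iterator right away
    squares = list(results)                  # blocks until every call has finished

print(squares)
```

As in the Coiled example, nothing blocks until the iterator is consumed.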
The .submit() method for Coiled Functions returns a Dask Future immediately, while your computation runs in parallel with other futures in the background. You can then call the .result() method on the future to block until the function is done running and retrieve the result.
    @coiled.function()
    def process(data):
        return ...

    futures = []
    for filename in filenames:
        future = process.submit(filename)  # Returns immediately
        futures.append(future)

    results = [f.result() for f in futures]  # Block until each result is ready
Dask futures are powerful and flexible. For more information on their full API, see the Dask Futures Documentation.
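Dask futures extend the interface of Python's `concurrent.futures`, so the submit-then-collect pattern can be rehearsed locally. The sketch below is a standard-library stand-in, not Coiled code. The `process` body and the file names are made up for illustration. It collects results in completion order rather than submission order.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def process(filename: str) -> int:
    # Placeholder work: pretend the "result" is the length of the name.
    return len(filename)


filenames = ["a.csv", "bb.csv", "ccc.csv"]  # hypothetical inputs

with ThreadPoolExecutor(max_workers=3) as pool:
    # submit() returns a future immediately for each input.
    future_to_name = {pool.submit(process, name): name for name in filenames}
    sizes = {}
    # as_completed() yields each future as soon as it has finished.
    for future in as_completed(future_to_name):
        sizes[future_to_name[future]] = future.result()

print(sizes)
```

Handling results as they arrive is useful when run times vary a lot between inputs. `dask.distributed` provides a comparable `as_completed` helper for Dask futures.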
When using the .map() or .submit() methods to run Coiled Functions in parallel, the number of VMs used to run your functions will adaptively scale up and down depending on your workload (see Adaptive Deployments in the Dask docs for more information).
By default, Coiled Functions running in parallel will adaptively scale between 0 and 100 VMs. You can customize this range by using the .cluster.adapt() method:
    @coiled.function()
    def process(data):
        return ...

    # Have parallel Coiled Function scale between 10-300 VMs
    process.cluster.adapt(minimum=10, maximum=300)
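To make the role of `minimum` and `maximum` concrete, here is a toy model of the bounds only. It is an illustration, not Dask's or Coiled's actual scaling logic, which also weighs the workload itself. The function name `clamp_vm_count` is hypothetical.

```python
def clamp_vm_count(desired: int, minimum: int = 0, maximum: int = 100) -> int:
    """Toy model: keep a desired VM count inside the adaptive range."""
    return max(minimum, min(desired, maximum))


# With the range from the example above (minimum=10, maximum=300):
idle = clamp_vm_count(0, minimum=10, maximum=300)       # never below the minimum
moderate = clamp_vm_count(120, minimum=10, maximum=300) # inside the range, unchanged
huge = clamp_vm_count(5000, minimum=10, maximum=300)    # capped at the maximum

print(idle, moderate, huge)
```

A non-zero minimum keeps some VMs available even when no work is queued, which trades idle cost for faster starts.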
You can reuse cloud VMs between scripts and interactive runs with the keepalive parameter:
    import coiled

    @coiled.function(
        ...
        keepalive="5 minutes",
    )
    def my_function(...):
        return ...
In this example, the VM will stay running for 5 minutes after your Python session closes.
This means repeated runs of this function start in about a second rather than a minute or two.
Like with Coiled clusters, you can still set
idle_timeout, which defaults to 24 hours.
This way, you benefit from an already running VM in the short term, while still avoiding
costs from an idle VM in the long term.
@coiled.function is under active development. We highly
value feedback from users and encourage you to play with this functionality
and then let us know about your experience on
this issue tracker
or by reaching out to firstname.lastname@example.org.
- coiled.function(*, software=None, container=None, vm_type=None, cpu=None, memory=None, gpu=None, account=None, region=None, arm=None, disk_size=None, shutdown_on_close=True, spot_policy=None, idle_timeout='24 hours', keepalive='30 seconds')
Decorate a function to run on cloud infrastructure
This creates a Function object that executes its code on a remote cluster with the hardware and software specified in the arguments to the decorator. It can run either as a normal function, or it can return Dask Futures for parallel computing.
software – Name of the software environment to use; this allows you to use and re-use existing Coiled software environments, and should not be used with package sync or when specifying a container to use for this specific cluster.
container – Name or URI of container image to use; when using a pre-made container image with Coiled, this allows you to skip the step of explicitly creating a Coiled software environment from that image. Note that this should not be used with package sync or when specifying an existing Coiled software environment.
vm_type – Instance type, or list of instance types, that you would like to use. You can use coiled.list_instance_types() to see a list of allowed types.
cpu – Number, or range, of CPUs requested. Specify a range by using a list of two elements.
memory – Amount of memory to request for each VM. Coiled will use a +/- 10% buffer from the memory that you specify. You may specify a range of memory by using a list of two elements.
disk_size – Size of persistent disk attached to each VM instance, specified in GiB.
gpu – Whether to attach a GPU; this would be a single NVIDIA T4.
region – The cloud provider region in which to run the cluster.
arm – Whether to use ARM instances for cluster; default is x86 (Intel) instances.
keepalive – Keep your cluster running for the specified time, even if your Python session closes. Default is “30 seconds”.
spot_policy – Purchase option to use for workers in your cluster, options are “on-demand”, “spot”, and “spot_with_fallback”; by default this is “on-demand”. (Google Cloud refers to this as “provisioning model” for your instances.) Spot instances are much cheaper, but can have more limited availability and may be terminated while you’re still using them if the cloud provider needs more capacity for other customers. On-demand instances have the best availability and are almost never terminated while still in use, but they’re significantly more expensive than spot instances. For most workloads, “spot_with_fallback” is likely to be a good choice: Coiled will try to get as many spot instances as we can, and if we get fewer than you requested, we’ll try to get the remaining instances as on-demand. For AWS, when we’re notified that an active spot instance is going to be terminated, we’ll attempt to get a replacement instance (spot if available, but could be on-demand if you’ve enabled “fallback”). Dask on the active instance will attempt a graceful shutdown before the instance is terminated so that computed results won’t be lost.
idle_timeout – Shut down the cluster after this duration if no activity has occurred. Default is “24 hours”.
See the coiled.Cluster docstring for additional parameter descriptions.
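As a worked example of the +/- 10% memory buffer mentioned above, the arithmetic for a 256 GiB request can be sketched as follows. This assumes the buffer is a simple symmetric percentage of the requested amount. How Coiled actually matches instance types is not described here, and `memory_window` is a hypothetical helper.

```python
def memory_window(requested_gib: float, buffer: float = 0.10) -> tuple[float, float]:
    """Return the (low, high) memory range implied by a symmetric buffer."""
    return requested_gib * (1 - buffer), requested_gib * (1 + buffer)


low, high = memory_window(256)
print(f"{low:.1f} GiB to {high:.1f} GiB")
```

Under that assumption, a request for 256 GiB could be satisfied by a VM with roughly 230.4 to 281.6 GiB of memory.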
    >>> import coiled
    >>> @coiled.function()
    ... def f(x):
    ...     return x + 1

    >>> f(10)  # calling the function blocks until finished
    11
    >>> f.submit(10)  # immediately returns a future
    <Future: pending, key=f-1234>
    >>> f.submit(10).result()  # Call .result to get result
    11

    >>> futures = [f.submit(i) for i in range(1000)]  # parallelize with a for loop
    >>> [future.result() for future in futures]
    ...