Coiled User Guide
Cloud Computing for Data People
Coiled helps you use Python on the cloud easily and efficiently. Coiled provides tools that let you:
Scale your code to any cloud hardware, even GPUs.
Run your code close to your cloud data.
Sync packages, files, and credentials. No Docker required.
And save money while doing it.
Meet the Tools
With Batch Jobs, run any code (doesn’t need to be Python) in the cloud from the comfort of your terminal.
$ coiled batch run --memory 64GB --region us-west-2 my_script.sh
With Coiled Functions, decorate a function to run it in the cloud. Scale out with the .map method.
import coiled
import pandas as pd

@coiled.function(region="us-west-2")  # Run close to the data
def process(filename):
    # Assumes a 4-character extension such as ".csv"
    output_filename = filename[:-4] + ".parquet"
    df = pd.read_csv(filename)
    df.to_parquet(output_filename)
    return output_filename

# result = process("s3://my-bucket/data.csv")  # one file
results = process.map(filenames)  # many files in parallel; filenames is a list of CSV paths
for filename in results:
    print("Finished", filename)
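Before paying for cloud time, it can help to dry-run the same map pattern on your own machine. The sketch below is a local stand-in, not part of Coiled: it uses the standard library's ThreadPoolExecutor in place of .map, and the names parquet_name, process_stub, and the input paths are hypothetical. It also derives the output name with posixpath.splitext rather than the filename[:-4] slice, so extensions of any length are handled.

```python
from concurrent.futures import ThreadPoolExecutor
import posixpath


def parquet_name(filename):
    """Swap the file extension for .parquet, whatever its length."""
    root, _ext = posixpath.splitext(filename)
    return root + ".parquet"


def process_stub(filename):
    # Stand-in for the real work: no files are read or written here.
    return parquet_name(filename)


# Hypothetical input paths, for illustration only.
filenames = [
    "s3://my-bucket/a.csv",
    "s3://my-bucket/b.csv",
]

# Local analogue of mapping a function over many inputs.
# Executor.map yields results in the same order as the inputs.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(process_stub, filenames))

for filename in results:
    print("Finished", filename)
```

Once the logic behaves as expected locally, the stub can be swapped for the decorated function shown above.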
With Coiled Clusters, deploy and scale a Dask cluster. Pandas is shown below, but for more applications see Dask examples.
import coiled

cluster = coiled.Cluster(
    n_workers=100,
)
client = cluster.get_client()

# Use Dask + Pandas together
import dask.dataframe as dd

df = dd.read_parquet("s3://bucket/lots-of-data.parquet")
df.groupby("name").amount.sum().compute()
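To build intuition for why a grouped sum like the one above spreads well across many workers, here is a conceptual sketch in plain Python. It is not how Dask is implemented; it only illustrates the general idea that each partition can be summed independently and the partial results merged afterwards. The partitions and the helper names partial_sums and combine are made up for illustration.

```python
from collections import Counter


def partial_sums(rows):
    """Sum `amount` per `name` within a single partition."""
    totals = Counter()
    for name, amount in rows:
        totals[name] += amount
    return totals


def combine(partials):
    """Merge per-partition totals into one overall result."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values key by key
    return dict(total)


# Two hypothetical partitions of (name, amount) rows.
partitions = [
    [("alice", 10), ("bob", 5)],
    [("alice", 7), ("carol", 3)],
]

# Each partition could be handled by a different worker.
result = combine(partial_sums(rows) for rows in partitions)
print(result)
```

Because the per-partition step needs no data from other partitions, adding workers mostly adds throughput; only the small partial results have to be brought together at the end.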
Run Jupyter on large cloud-based VMs. Synchronize your files back to your local hard drive.
$ coiled notebook start --sync --vm-type m6i.16xlarge
How Does This Work?
Coiled quickly creates cloud VMs that match your local environment. This lets you run on bigger, faster, or more numerous machines while keeping the ease and familiarity of normal local development.
Code locally: You write normal Python wherever you do today (like your laptop) and submit that code to run on Coiled.
Launch VMs: Coiled rapidly creates ephemeral VMs to run your code (this takes about a minute).
Environment synchronization: Coiled inspects your machine for packages, scripts, and credentials, and then installs those quickly on your remote machines so that they match your development environment.
Execute and monitor: Your code runs at scale while metrics are collected in the background to help you debug and optimize.
Robust cleanup: Everything cleans up when you’re done, leaving you with a clean slate and low costs.
Coiled’s approach of environment scraping and rapid deployment of raw VMs gives a compute stack that endeavors to be easy, powerful, and cheap.
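To make the environment synchronization step more concrete, the sketch below shows one way a tool could take stock of the Python packages installed locally, using only the standard library. This is an illustration under stated assumptions, not Coiled's actual mechanism, and the function name local_package_versions is hypothetical.

```python
from importlib.metadata import distributions


def local_package_versions():
    """Return a {package name: version} mapping for the current environment."""
    versions = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing metadata
            versions[name.lower()] = dist.version
    return versions


versions = local_package_versions()

# Show a handful of entries; a sync tool would pin these on the remote side.
for name in sorted(versions)[:5]:
    print(name, versions[name])
```

A snapshot like this is the kind of information a remote machine would need in order to install matching package versions.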