Coiled Documentation
What does this do?
Dask runs sophisticated Python code and integrates with many libraries. The example below uses pandas; for more applications, see Dask examples.
# Make Dask cluster
import coiled
cluster = coiled.Cluster(
    n_workers=100,
)
client = cluster.get_client()
# Use Dask + Pandas together
import dask.dataframe as dd
df = dd.read_parquet("s3://bucket/lots-of-data.parquet")
df.groupby("name").amount.sum().compute()
Run simple functions on cloud hardware close to your data. Easily scale up with the .map method. For more information and examples see Coiled Functions.
import coiled, random
@coiled.function()
def estimate_pi(n: int) -> float:
    total = 0
    for _ in range(n):
        x = random.random()
        y = random.random()
        if x ** 2 + y ** 2 < 1:
            total += 1
    return total / n * 4
pi = estimate_pi(100_000)
print(pi)
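The example above calls the function once. The map pattern mentioned earlier applies the same function to many inputs and combines the results. The sketch below illustrates that pattern locally with a thread pool from the standard library as a stand-in; it does not call Coiled, and the `seed` parameter and the averaging step are assumptions added for illustration. With a decorated Coiled function, the analogous call would go through its `.map` method; see Coiled Functions for the exact signature.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def estimate_pi(n: int, seed: int) -> float:
    """Monte Carlo estimate of pi from n random points in the unit square."""
    rng = random.Random(seed)  # per-task generator so runs are reproducible
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x ** 2 + y ** 2 < 1:
            inside += 1
    return inside / n * 4


# Local stand-in for a remote map: run the same function over many inputs.
seeds = range(8)
with ThreadPoolExecutor() as pool:
    estimates = list(pool.map(lambda s: estimate_pi(50_000, s), seeds))

# Averaging independent estimates reduces the variance of the result.
pi_estimate = sum(estimates) / len(estimates)
print(pi_estimate)
```

Each task is independent, which is what makes this kind of workload a good fit for fanning out across many machines.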
Run an executable on a cloud VM. Dead simple.
See Interactive CLI Jobs for more information.
coiled run echo "Hello, world"
Run Jupyter on large cloud-based VMs. Synchronize your files back to your local hard drive.
For more information see Jupyter Notebooks.
coiled notebook start --sync --vm-type m6i.16xlarge
How does this work?
Coiled quickly creates cloud VMs that match your local environment. This lets you run on bigger/faster/more hardware, but with the ease and familiarity of normal development.
Code Locally: You write normal Python wherever you do today (like your laptop) and submit that code to run on Coiled.
Launch VMs: Coiled rapidly creates ephemeral VMs to run your code (this takes about a minute).
Environment synchronization: Coiled inspects your machine for packages, scripts, and credentials, and then installs those quickly on your remote machines so that they match your development environment.
Execute and monitor: Your code runs at scale with loads of metrics running in the background to help you debug and optimize.
Robust Cleanup: Everything cleans up when you’re done, leaving you with a clean slate and low costs.
Coiled’s approach of environment scraping and rapid deployment of raw VMs gives a compute stack that endeavors to be easy, powerful, and cheap.
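To make the environment synchronization step more concrete, here is a minimal sketch of what inspecting a local machine for packages could look like, using only the Python standard library. This is not Coiled's implementation: the function names `snapshot_packages` and `to_requirements` are hypothetical, and the sketch covers only installed Python packages, not scripts or credentials.

```python
from importlib.metadata import distributions


def snapshot_packages() -> dict[str, str]:
    """Return {package name: version} for the current Python environment."""
    packages = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip distributions with incomplete metadata
            packages[name] = version
    return packages


def to_requirements(packages: dict[str, str]) -> list[str]:
    """Format a snapshot as pinned requirement lines, sorted by name."""
    ordered = sorted(packages.items(), key=lambda item: item[0].lower())
    return [f"{name}=={version}" for name, version in ordered]


snapshot = snapshot_packages()
# Show the first few pinned packages a remote machine would need to match.
for line in to_requirements(snapshot)[:10]:
    print(line)
```

Pinning exact versions is the simplest way to make a remote environment match a local one; a production tool would also need to handle conda packages, local editable installs, and platform differences.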
Examples & Use Cases
Train, predict, and track on cloud hardware with Coiled.
Dask is faster and easier to use than Spark.
Run lightweight data pipelines on a schedule on the cloud.
Process TBs of geospatial data with Coiled and Xarray.
Easily run Jupyter notebooks on cloud hardware.
Parallelize custom Python functions on the cloud.