Integrations#

It is easy to integrate Coiled with many different systems, including:

  • Developer environments like cloud notebooks and IDEs

  • Production systems like Airflow/Prefect/Dagster/Cron/Jenkins

  • CI/CD systems like GitHub Actions

  • … and much more

This is because Coiled is API-first and is driven primarily through its Python interface:

import coiled
cluster = coiled.Cluster(...)

This means that Coiled can be invoked from anywhere that runs Python. Coiled feels more like using a library (like pandas or numpy) than it feels like using a platform (like Databricks or Kubernetes).

Example: Cloud Notebooks#

It is easy to integrate Coiled with your favorite cloud notebook provider. The steps are the same ones you would follow locally.

  1. Install the coiled Python library

    pip install coiled
    
  2. Authenticate your cloud notebook environment with Coiled

    coiled login
    

    This stores an API token on your notebook’s persistent drive.

  3. Use Coiled in your notebook

    import coiled
    
    cluster = coiled.Cluster(...)
    

Your software packages and cloud credentials will be copied from your cloud notebook environment and replicated on the remote machines.

Users familiar with Coiled will notice that this process is identical to using Coiled from a personal laptop. Coiled works the same way across locations (cloud, local, HPC) and development environments (notebooks, VS Code, workflow managers).

Example: AWS Lambda#

As an example, let’s say that you want to integrate Coiled with AWS Lambda so that your Lambda functions can use larger cloud hardware (Lambda jobs are constrained in the hardware available to them).

You start with your normal AWS Lambda Python script:

def lambda_handler(event, context):
    # List filenames to process
    filenames = event.get('filenames')

    # Process all filenames (This is slow and runs out of memory sometimes)
    results = []
    for filename in filenames:
        data = load(filename)
        result = process(data)
        results.append(result)

    # Return a response
    return {
        'statusCode': 200,
        'body': results,
    }

Because we have access to Python in this Lambda function, we can invoke Coiled and ask for more and larger hardware:

import coiled

def lambda_handler(event, context):
    # List filenames to process
    filenames = event.get('filenames')

    # Ask for larger machines
    cluster = coiled.Cluster(worker_memory="512 GiB")
    client = cluster.get_client()

    # Process all files on these larger machines, gather results back
    def compute(filename):
        data = load(filename)
        return process(data)

    tasks = client.map(compute, filenames)
    results = client.gather(tasks)

    # Return a response
    return {
        'statusCode': 200,
        'body': results,
    }
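
The handler above never shuts the cluster down. A minimal sketch of one way to guarantee cleanup is shown here, assuming the cluster object exposes get_client() and close() (as Dask cluster objects generally do). The helper name run_on_cluster and the make_cluster factory are hypothetical names introduced for illustration and are not part of Coiled.

```python
from typing import Any, Callable, Iterable, List


def run_on_cluster(
    make_cluster: Callable[[], Any],
    compute: Callable[[Any], Any],
    items: Iterable[Any],
) -> List[Any]:
    """Map `compute` over `items` on a cluster and always close the cluster.

    `make_cluster` is a hypothetical zero-argument factory, for example
    `lambda: coiled.Cluster(worker_memory="512 GiB")`.
    """
    cluster = make_cluster()
    try:
        client = cluster.get_client()
        # Submit one task per item, then wait for all results.
        futures = client.map(compute, list(items))
        return client.gather(futures)
    finally:
        # Release the cloud machines even if a task raised an exception.
        cluster.close()
```

Inside lambda_handler this could be called as run_on_cluster(lambda: coiled.Cluster(worker_memory="512 GiB"), compute, filenames). Passing the factory in, rather than constructing the cluster inside the helper, also makes the handler logic testable without starting real machines.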

This pattern is common. We use Coiled together with existing production systems to augment those systems with greater scale on the cloud. There are a few common questions with these systems:

  • Q: How do I set a Coiled API key?
    A: Set the environment variable DASK_COILED__TOKEN to your API key in the environment that invokes Coiled.

  • Q: How do I specify a production software environment for the Coiled workers?
    A: You’ve already set a production software environment in the location where you’re invoking the Coiled cluster (like the AWS Lambda environment above). Coiled will automatically replicate that exact locked-down environment on the remote machines.

  • Q: How do I ensure Coiled has the right permissions to access my data?
    A: If your data is stored on cloud storage then Coiled will replicate the credentials in the hosting environment (AWS Lambda in this case) and forward those credentials to the workers.

    If your permissions depend on environment variables (common with databases like Snowflake for example) you can send those to the workers with the coiled.Cluster.send_private_envs method.
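
The first and third answers can be sketched in plain Python. This is a minimal sketch rather than Coiled's own tooling: require_coiled_token and collect_private_envs are hypothetical helper names, and the SNOWFLAKE_ prefix is only an example. The first helper fails early with a clear message when DASK_COILED__TOKEN is missing; the second gathers the variables you may want to forward to workers.

```python
import os
from typing import Dict, Mapping, Optional


def require_coiled_token(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the token from DASK_COILED__TOKEN or raise a clear error."""
    environ = os.environ if environ is None else environ
    token = environ.get("DASK_COILED__TOKEN", "")
    if not token:
        raise RuntimeError(
            "DASK_COILED__TOKEN is not set in this environment"
        )
    return token


def collect_private_envs(
    prefix: str, environ: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Collect variables whose names start with `prefix` (e.g. "SNOWFLAKE_")."""
    environ = os.environ if environ is None else environ
    return {name: value for name, value in environ.items() if name.startswith(prefix)}
```

The resulting dictionary could then be passed along as cluster.send_private_envs(collect_private_envs("SNOWFLAKE_")), assuming that method accepts a dictionary of variable names to values; check the Coiled API reference for the exact signature.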

Conclusion#

Coiled integrations are easy because Coiled operates as just a Python API. In this way Coiled feels more like a library (like pandas or numpy) than a platform (like Databricks or Kubernetes). Coiled is designed to be used anywhere you can run Python.

So in most cases the answer to the question of “Can I integrate Coiled with X” is “yes, just import coiled in that system and go.”