Best Practices#
Cloud computing is incredibly powerful when done well. Coiled makes cloud computing easy, but doing things well still requires some experience.
This page contains suggestions for Coiled best practices and includes solutions to common Coiled problems.
Run computations close to your data#
Cloud computing can be surprisingly cheap when done well. When processing large amounts of cloud-hosted data, it’s important to have your compute VMs in the same cloud region where your data is hosted. This avoids expensive data transfer costs that scale with the size of your data.
Avoid data transfer costs by using the region= option to provision your cloud VMs in the same region as the data you’re processing.
# Create Dask cluster in `us-west-2`
import coiled
cluster = coiled.Cluster(region="us-west-2", ...)
client = cluster.get_client()
# Load dataset that lives in `us-west-2`
import dask.dataframe as dd
df = dd.read_parquet("s3://...")
Create a fresh software environment#
By default Coiled inspects your local environment for Python packages and replicates that same environment on cloud VMs. This usually works well but can run into issues when your local software environment has inconsistent versions of packages installed (as tends to happen with long-lived environments that evolve organically over time).
In these cases, creating a fresh local software environment with just the libraries you need almost always resolves these package consistency issues.
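For example, with conda you might create and activate a clean environment containing only what your workflow needs (the environment name and package list below are illustrative):
# Create a fresh environment with only the packages you need
conda create -n coiled-env -c conda-forge python=3.11 coiled dask
conda activate coiled-env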
Use conda for non-Python libraries#
By default Coiled inspects your local environment for Python packages and replicates that same environment on cloud VMs. For pure Python packages installed with pip or conda, this usually works great.
However, for more complex libraries that involve installing additional system packages like gdal, graphviz, etc., automatic package synchronization can fail if those additional system packages weren’t installed with pip or conda (e.g. apt or brew were used instead).
In these cases, we recommend using conda to install more complex libraries, as Coiled’s package synchronization will handle these properly.
Using graphviz as an example, replace this:
brew install graphviz # Install system graphviz
pip install graphviz # Install Python bindings
with this:
conda install python-graphviz # Install system graphviz and Python bindings
Use Dask best practices#
When using Dask on Coiled, continue to follow normal Dask best practices; see the Best Practices guides in the Dask documentation for details.
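For example, one common Dask best practice is to load data directly on the cluster rather than loading it locally and then handing it to Dask, which funnels everything through your local machine (the bucket path below is a placeholder):
import dask.dataframe as dd
# Good: workers read directly from cloud storage
df = dd.read_parquet("s3://mybucket/data/")
# Avoid: loading locally with pandas, then converting to Dask
# import pandas as pd
# df = dd.from_pandas(pd.read_parquet("local-copy.parquet"), npartitions=10)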
Set your default Coiled workspace#
You can use the workspace= option to switch between different Coiled accounts:
import coiled
cluster_dev = coiled.Cluster(workspace="company-dev", ...)
cluster_prod = coiled.Cluster(workspace="company-prod", ...)
However, it can be easy to forget to do this, especially for new users who were recently added to an existing account.
If you use multiple Coiled accounts, we recommend setting your default account from your profile page to the account you use most often.
See Manage Users for more information on managing Coiled accounts.
GPU availability#
GPUs are in high demand and can be hard to find, leading to availability issues. There are two things you can do to help address this:
1. Avoid the larger GPU instance types if you can, in particular the A100 and H100 types. These GPUs have the large amounts of memory necessary to run LLMs, and so are in particularly high demand. If you can get away with it, we recommend using smaller GPUs, like the A10s available in the g5.xlarge instance type used above, which are generally more available.
2. Search different regions where GPUs may be more or less available. You should avoid large amounts of cross-region data transfer (this can quickly become expensive), but if you’re not moving around large volumes of data, then trying different regions can open you up to more availability than would otherwise be possible. To do this, use the region= keyword for @coiled.function or coiled.Cluster, as in the sketch below.
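As a rough sketch of both suggestions together (the instance type and region are illustrative, and the vm_type= keyword for @coiled.function is assumed here):
import coiled

@coiled.function(
    vm_type="g5.xlarge",  # smaller A10 GPU, generally easier to find
    region="us-east-2",   # try another region if yours is short on GPUs
)
def run_model(batch):
    ...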
Use S3 for data, not your local hard drive#
You can keep your files and notebooks and models on your local drive (Coiled is good about synchronizing these to cloud machines) but it’s usually best to keep your data in the cloud, especially when it’s large. Fortunately, with libraries like s3fs, gcsfs, and the AWS CLI this is pretty straightforward.
import s3fs
s3 = s3fs.S3FileSystem()
s3.get("s3://mybucket/myfile", "./local/file")
# or
with s3.open("s3://mybucket/myfile", mode="rb") as f:
    data = f.read()
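Going the other direction, you can also upload local data to S3 so it lives alongside your compute; a small sketch with s3fs (the bucket name is a placeholder):
import s3fs
s3 = s3fs.S3FileSystem()
s3.put("./local/file", "s3://mybucket/myfile")  # upload a local file to S3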
For more information, read Connect to remote data in the Dask documentation.
Third-party platform authentication#
Commonly used platforms like Hugging Face, MLflow, etc. have their own authentication systems for accessing private assets. When using these platforms on Coiled, cloud VMs also need to be authenticated with the platform. Most platforms support authentication through environment variables (e.g. Hugging Face uses HF_TOKEN).
We recommend using the environ= keyword in @coiled.function or coiled.Cluster to securely pass authentication information to Coiled cloud VMs.
import coiled
from transformers import pipeline

@coiled.function(environ={"HF_TOKEN": "<your-token>"})
def train(file):
    transcriber = pipeline(task="automatic-speech-recognition")
    ...
    return result
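The same pattern works when creating a cluster, since coiled.Cluster accepts the same environ= keyword; a minimal sketch:
import coiled

# Pass the token to every VM in the cluster
cluster = coiled.Cluster(environ={"HF_TOKEN": "<your-token>"}, ...)
client = cluster.get_client()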