Spot Instances

AWS and Google Cloud both offer Spot instances at substantially lower cost than on-demand instances (see the AWS Spot and Google Cloud Spot documentation). To control whether Spot instances are requested, pass one of "on-demand" (default), "spot", or "spot_with_fallback" to the spot_policy keyword argument. With "spot_with_fallback", on-demand instances are used as needed when the requested Spot instances are unavailable.
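As a minimal sketch of the three accepted values, the snippet below checks a spot_policy string before it is handed to a cluster. The helper name spot_kwargs and the constant SPOT_POLICIES are hypothetical and not part of the Coiled API; only the spot_policy argument and its three values come from the text above.

```python
# The three values named above; "on-demand" is the documented default.
SPOT_POLICIES = ("on-demand", "spot", "spot_with_fallback")


def spot_kwargs(spot_policy="on-demand"):
    """Hypothetical helper: validate spot_policy and return it as keyword arguments."""
    if spot_policy not in SPOT_POLICIES:
        raise ValueError(
            f"spot_policy must be one of {SPOT_POLICIES}, got {spot_policy!r}"
        )
    return {"spot_policy": spot_policy}


kwargs = spot_kwargs("spot_with_fallback")

# With Coiled installed and credentials configured, these could be passed as:
#   import coiled
#   cluster = coiled.Cluster(**kwargs)
```

Validating the string locally catches a misspelled policy before any cloud resources are requested.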


On AWS, Spot instances are gracefully shut down and replaced to minimize interruptions. This feature is still in development for Google Cloud, in part because Google Cloud gives shorter notice before termination.

Spot instances can be harder to get. You can set use_best_zone=True when creating a Coiled cluster to allow your cloud provider to pick the best availability zone (inside your selected region). This argument also helps increase the chances of obtaining harder-to-get instance types.





spot_policy: Purchase option to use for workers in your cluster. Options are "on-demand", "spot", and "spot_with_fallback". Google Cloud refers to this as the "provisioning model" for your instances.



use_best_zone: Allow the cloud provider to pick the zone (in your specified region) that has the best availability for your requested instances. We'll keep the scheduler and workers in a single zone to avoid any interzone network traffic (which would be billed).


You can combine these arguments to minimize cost and maximize availability:

import coiled

cluster = coiled.Cluster(
    use_best_zone=True, spot_policy="spot_with_fallback"