Posts by Sarah Johnson
Scaling AI-Based Data Processing with Hugging Face + Dask
- 09 October 2024
Sarah Johnson, James Bourbeau, Quentin Lhoest, Daniel van Strien
Processing a 250 TB dataset with Coiled, Dask, and Xarray
- 05 September 2023
We processed 250 TB of geospatial cloud data in twenty minutes on the cloud with Xarray, Dask, and Coiled. We did this to demonstrate scale and to think about costs.
Just in time Python environments
- 23 February 2023
Docker is a great tool for creating portable software environments, but we found it too slow for interactive exploration. Clusters that depend on Docker images often take 5+ minutes to launch. Ouch.