Posts by James Bourbeau
Large Scale Geospatial Benchmarks: First Pass
- 16 October 2024
We implement several large-scale geo benchmarks. Most break. Fun!
Scaling AI-Based Data Processing with Hugging Face + Dask
- 09 October 2024
Sarah Johnson, James Bourbeau, Quentin Lhoest, Daniel van Strien
Easy Scalable Production ETL
- 08 April 2024
We show a lightweight, scalable data pipeline that runs large Python jobs on a schedule in the cloud.
Processing Terabyte-Scale NASA Cloud Datasets with Coiled
- 01 November 2023
We show how to run existing NASA data workflows on the cloud, in parallel, with minimal code changes using Coiled. We also discuss cost optimization.
Parallel Serverless Functions at Scale
- 07 September 2023
The cloud offers amazing scale, but it can be difficult for Python data developers to use. This post walks through how to use Coiled Functions to run your existing code in parallel on the cloud with minimal code changes.
Data-proximate Computing with Coiled Functions
- 10 August 2023
Coiled Functions make it easy to improve performance and reduce costs by moving your computations next to your cloud data.
Coiled notebooks
- 14 June 2023
We recently pushed out a new, experimental notebooks feature for easily launching Jupyter servers in the cloud from your local machine. We’re excited about Coiled notebooks because they:
Distributed printing
- 18 May 2023
Dask makes it easy to print whether you’re running code locally on your laptop or remotely on a cluster in the cloud.
Upstream testing in Dask
- 18 April 2023
Dask has deep integrations with other libraries in the PyData ecosystem like NumPy, pandas, Zarr, PyArrow, and more. Part of providing a good experience for Dask users is making sure that Dask continues to work well with this community of libraries as they push out new releases. This post walks through how Dask maintainers proactively ensure that Dask keeps working with its surrounding ecosystem.