Just-in-time Python environments

Docker is a great tool for creating portable software environments, but we found it too slow for interactive exploration. Clusters that depend on Docker images often take 5+ minutes to launch. Ouch.

In the latest Coiled release, version 0.4.0, you can instead create software environments on the fly using only Mamba. We’re seeing start times about 3x faster, at roughly 1-2 minutes.

Why we dropped Docker for Python environments

Docker is a great tool for managing software environments, but we found that it’s just too slow, especially for exploratory data workflows where users change their Python environments frequently.

This article goes into the challenges we faced, the solution we chose, and the performance impacts of that choice. Here are the results:

[Figure: build, push, and pull times (senvs2_build_push_pull.svg)]

If you’re a Coiled user, these changes take effect once you update to coiled>=0.4.

Background

In a previous blog post, we discussed the importance of matching your local Python packages and versions to those installed on remote machines running in the cloud. The easiest way to do this with a Coiled cluster is using package sync, which scans your local environment and replicates it on your cluster. This is faster, easier, and safer than building a Docker image each time.

However, some folks still want statically defined software environments stored in the cloud, like what Docker offers. For them we maintain the create_software_environment API, which used to build Docker images from conda/pip specifications but now does something much faster.

import coiled

# Define a software environment from conda and pip package lists
coiled.create_software_environment(
    conda=["dask", "pandas", "pytorch", "arrow"],
    pip=["sqlalchemy", "dask-snowflake"],
)

Conda and Docker are slow

Previously, Coiled used conda to solve Python environments and Docker to build those environments. We then pushed those Docker images to users’ container registries and on cluster start, pulled these images onto the cloud instances that formed a Dask/Coiled cluster.

Benchmarking these steps using a complex, real-world software environment from the wild, we noticed that the first bottleneck is the environment solve. The customer reported anecdotally that this could sometimes take 30 minutes! We fixed this simply by switching to Mamba, which is much faster.

That still leaves the build, push, and pull steps. Together these took about 8 minutes, which is painfully slow for users. The environment file and a breakdown of that time follow:

environment.yml
name: base
channels:
  - conda-forge
  - defaults
dependencies:
  - gcc
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=2_gnu
  - abseil-cpp=20210324.2=h9c3ff4c_0
  - aiobotocore=2.3.4=pyhd8ed1ab_0
  - aiohttp=3.8.1=py39hb9d737c_1
  - aioitertools=0.10.0=pyhd8ed1ab_0
  - aiosignal=1.2.0=pyhd8ed1ab_0
  - alembic=1.8.1=pyhd8ed1ab_0
  - alsa-lib=1.2.6.1=h7f98852_0
  - anyio=3.6.1=py39hf3d152e_0
  - appdirs=1.4.4=pyh9f0ad1d_0
  - argon2-cffi=21.3.0=pyhd8ed1ab_0
  - argon2-cffi-bindings=21.2.0=py39hb9d737c_2
  - arrow-cpp=8.0.0=py39h811ffd7_4_cpu
  - asn1crypto=1.5.1=pyhd8ed1ab_0
  - asttokens=2.0.5=pyhd8ed1ab_0
  - async-timeout=4.0.2=pyhd8ed1ab_0
  - attr=2.5.1=h166bdaf_0
  - attrs=21.4.0=pyhd8ed1ab_0
  - aws-c-cal=0.5.11=h95a6274_0
  - aws-c-common=0.6.2=h7f98852_0
  - aws-c-event-stream=0.2.7=h3541f99_13
  - aws-c-io=0.10.5=hfb6a706_0
  - aws-checksums=0.1.11=ha31a3da_7
  - aws-sdk-cpp=1.8.186=hb4091e7_3
  - babel=2.10.3=pyhd8ed1ab_0
  - backcall=0.2.0=pyh9f0ad1d_0
  - backports=1.0=py_2
  - backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0
  - beautifulsoup4=4.11.1=pyha770c72_0
  - bleach=5.0.1=pyhd8ed1ab_0
  - blinker=1.4=py_1
  - blosc=1.21.1=h83bc5f7_3
  - bokeh=2.4.3=py39hf3d152e_0
  - botocore=1.24.21=pyhd8ed1ab_1
  - brotli=1.0.9=h166bdaf_7
  - brotli-bin=1.0.9=h166bdaf_7
  - brotlipy=0.7.0=py39hb9d737c_1004
  - bzip2=1.0.8=h7f98852_4
  - c-ares=1.18.1=h7f98852_0
  - ca-certificates=2022.6.15=ha878542_0
  - certifi=2022.6.15=py39hf3d152e_0
  - cffi=1.15.1=py39he91dace_0
  - click=8.1.3=py39hf3d152e_0
  - cloudpickle=2.1.0=pyhd8ed1ab_0
  - colorama=0.4.5=pyhd8ed1ab_0
  - configparser=5.2.0=pyhd8ed1ab_0
  - cryptography=37.0.4=py39hd97740a_0
  - cycler=0.11.0=pyhd8ed1ab_0
  - cytoolz=0.12.0=py39hb9d737c_0
  - dask=2022.7.1=pyhd8ed1ab_0
  - dask-core=2022.7.1=pyhd8ed1ab_0
  - databricks-cli=0.17.0=pyhd8ed1ab_0
  - dbus=1.13.6=h5008d03_3
  - debugpy=1.6.0=py39h5a03fae_0
  - decorator=5.1.1=pyhd8ed1ab_0
  - defusedxml=0.7.1=pyhd8ed1ab_0
  - distributed=2022.7.1=pyhd8ed1ab_0
  - docker-py=5.0.3=py39hf3d152e_2
  - docker-pycreds=0.4.0=py_0
  - entrypoints=0.4=pyhd8ed1ab_0
  - executing=0.9.0=pyhd8ed1ab_0
  - expat=2.4.8=h27087fc_0
  - fftw=3.3.10=nompi_h77c792f_102
  - flask=2.1.3=pyhd8ed1ab_0
  - flit-core=3.7.1=pyhd8ed1ab_0
  - font-ttf-dejavu-sans-mono=2.37=hab24e00_0
  - font-ttf-inconsolata=3.000=h77eed37_0
  - font-ttf-source-code-pro=2.038=h77eed37_0
  - font-ttf-ubuntu=0.83=hab24e00_0
  - fontconfig=2.14.0=h8e229c2_0
  - fonts-conda-ecosystem=1=0
  - fonts-conda-forge=1=0
  - fonttools=4.34.4=py39hb9d737c_0
  - freetype=2.10.4=h0708190_1
  - frozenlist=1.3.0=py39hb9d737c_1
  - fsspec=2022.5.0=pyhd8ed1ab_0
  - gettext=0.19.8.1=h73d1719_1008
  - gflags=2.2.2=he1b5a44_1004
  - giflib=5.2.1=h36c2ea0_2
  - gitdb=4.0.9=pyhd8ed1ab_0
  - gitpython=3.1.27=pyhd8ed1ab_0
  - glib=2.72.1=h6239696_0
  - glib-tools=2.72.1=h6239696_0
  - glog=0.6.0=h6f12383_0
  - greenlet=1.1.2=py39h5a03fae_2
  - grpc-cpp=1.45.2=h3b8df00_4
  - gst-plugins-base=1.20.3=hf6a322e_0
  - gstreamer=1.20.3=hd4edc92_0
  - gunicorn=20.1.0=py39hf3d152e_2
  - heapdict=1.0.1=py_0
  - htmlmin=0.1.12=py_1
  - icu=70.1=h27087fc_0
  - idna=3.3=pyhd8ed1ab_0
  - imagehash=4.2.1=pyhd8ed1ab_0
  - importlib-metadata=4.11.4=py39hf3d152e_0
  - importlib_metadata=4.11.4=hd8ed1ab_0
  - importlib_resources=5.9.0=pyhd8ed1ab_0
  - ipykernel=6.15.1=pyh210e3f2_0
  - ipython=8.4.0=py39hf3d152e_0
  - ipython_genutils=0.2.0=py_1
  - itsdangerous=2.1.2=pyhd8ed1ab_0
  - jack=1.9.18=h8c3723f_1002
  - jedi=0.18.1=py39hf3d152e_1
  - jinja2=3.1.2=pyhd8ed1ab_1
  - jmespath=1.0.1=pyhd8ed1ab_0
  - joblib=1.1.0=pyhd8ed1ab_0
  - jpeg=9e=h166bdaf_2
  - json5=0.9.5=pyh9f0ad1d_0
  - jsonschema=4.7.2=pyhd8ed1ab_0
  - jupyter_client=7.3.4=pyhd8ed1ab_0
  - jupyter_core=4.11.1=py39hf3d152e_0
  - jupyter_server=1.18.1=pyhd8ed1ab_0
  - jupyterlab=3.4.4=pyhd8ed1ab_0
  - jupyterlab_pygments=0.2.2=pyhd8ed1ab_0
  - jupyterlab_server=2.15.0=pyhd8ed1ab_0
  - keyutils=1.6.1=h166bdaf_0
  - kiwisolver=1.4.4=py39hf939315_0
  - krb5=1.19.3=h3790be6_0
  - lcms2=2.12=hddcbb42_0
  - ld_impl_linux-64=2.36.1=hea4e1c9_2
  - lerc=3.0=h9c3ff4c_0
  - libblas=3.9.0=15_linux64_openblas
  - libbrotlicommon=1.0.9=h166bdaf_7
  - libbrotlidec=1.0.9=h166bdaf_7
  - libbrotlienc=1.0.9=h166bdaf_7
  - libcap=2.64=ha37c62d_0
  - libcblas=3.9.0=15_linux64_openblas
  - libclang=14.0.6=default_h2e3cab8_0
  - libclang13=14.0.6=default_h3a83d3e_0
  - libcrc32c=1.1.2=h9c3ff4c_0
  - libcups=2.3.3=hf5a7f15_1
  - libcurl=7.83.1=h7bff187_0
  - libdb=6.2.32=h9c3ff4c_0
  - libdeflate=1.12=h166bdaf_0
  - libedit=3.1.20191231=he28a2e2_2
  - libev=4.33=h516909a_1
  - libevent=2.1.10=h9b69904_4
  - libffi=3.4.2=h7f98852_5
  - libflac=1.3.4=h27087fc_0
  - libgcc-ng=12.1.0=h8d9b700_16
  - libgfortran-ng=12.1.0=h69a702a_16
  - libgfortran5=12.1.0=hdcd56e2_16
  - libglib=2.72.1=h2d90d5f_0
  - libgomp=12.1.0=h8d9b700_16
  - libgoogle-cloud=1.40.2=habd0e3a_0
  - libiconv=1.16=h516909a_0
  - liblapack=3.9.0=15_linux64_openblas
  - libllvm11=11.1.0=hf817b99_3
  - libllvm14=14.0.6=he0ac6c6_0
  - libnghttp2=1.47.0=h727a467_0
  - libnsl=2.0.0=h7f98852_0
  - libogg=1.3.4=h7f98852_1
  - libopenblas=0.3.20=pthreads_h78a6416_0
  - libopus=1.3.1=h7f98852_1
  - libpng=1.6.37=h753d276_3
  - libpq=14.4=hd77ab85_0
  - libprotobuf=3.20.1=h6239696_0
  - libsndfile=1.0.31=h9c3ff4c_1
  - libsodium=1.0.18=h36c2ea0_1
  - libssh2=1.10.0=ha56f1ee_2
  - libstdcxx-ng=12.1.0=ha89aaad_16
  - libthrift=0.16.0=h519c5ea_1
  - libtiff=4.4.0=hc85c160_1
  - libtool=2.4.6=h9c3ff4c_1008
  - libudev1=249=h166bdaf_4
  - libutf8proc=2.7.0=h7f98852_0
  - libuuid=2.32.1=h7f98852_1000
  - libvorbis=1.3.7=h9c3ff4c_0
  - libwebp=1.2.3=h522a892_1
  - libwebp-base=1.2.3=h166bdaf_2
  - libxcb=1.13=h7f98852_1004
  - libxkbcommon=1.0.3=he3ba5ed_0
  - libxml2=2.9.14=h22db469_3
  - libzlib=1.2.12=h166bdaf_2
  - llvmlite=0.38.1=py39h7d9a04d_0
  - locket=1.0.0=pyhd8ed1ab_0
  - lz4=4.0.0=py39h029007f_2
  - lz4-c=1.9.3=h9c3ff4c_1
  - mako=1.2.1=pyhd8ed1ab_0
  - markdown=3.4.1=pyhd8ed1ab_0
  - markupsafe=2.1.1=py39hb9d737c_1
  - matplotlib=3.5.2=py39hf3d152e_0
  - matplotlib-base=3.5.2=py39h700656a_0
  - matplotlib-inline=0.1.3=pyhd8ed1ab_0
  - missingno=0.4.2=py_1
  - mistune=0.8.4=py39h3811e60_1005
  - mlflow=1.27.0=py39ha39b057_0
  - msgpack-python=1.0.4=py39hf939315_0
  - multidict=6.0.2=py39hb9d737c_1
  - multimethod=1.4=py_0
  - munkres=1.1.4=pyh9f0ad1d_0
  - mysql-common=8.0.29=haf5c9bc_1
  - mysql-libs=8.0.29=h28c427c_1
  - nbclassic=0.4.3=pyhd8ed1ab_0
  - nbclient=0.6.6=pyhd8ed1ab_0
  - nbconvert=6.5.0=pyhd8ed1ab_0
  - nbconvert-core=6.5.0=pyhd8ed1ab_0
  - nbconvert-pandoc=6.5.0=pyhd8ed1ab_0
  - nbformat=5.4.0=pyhd8ed1ab_0
  - ncurses=6.3=h27087fc_1
  - nest-asyncio=1.5.5=pyhd8ed1ab_0
  - networkx=2.8.5=pyhd8ed1ab_0
  - notebook=6.4.12=pyha770c72_0
  - notebook-shim=0.1.0=pyhd8ed1ab_0
  - nspr=4.32=h9c3ff4c_1
  - nss=3.78=h2350873_0
  - numba=0.55.2=py39h66db6d7_0
  - numpy=1.22.4=py39hc58783e_0
  - oauthlib=3.2.0=pyhd8ed1ab_0
  - openjpeg=2.4.0=hb52868f_1
  - openssl=1.1.1q=h166bdaf_0
  - orc=1.7.5=h6c59b99_0
  - packaging=21.3=pyhd8ed1ab_0
  - pandas=1.4.3=py39h1832856_0
  - pandas-profiling=3.2.0=pyhd8ed1ab_0
  - pandoc=2.18=ha770c72_0
  - pandocfilters=1.5.0=pyhd8ed1ab_0
  - parquet-cpp=1.5.1=2
  - parso=0.8.3=pyhd8ed1ab_0
  - partd=1.2.0=pyhd8ed1ab_0
  - patsy=0.5.2=pyhd8ed1ab_0
  - pcre=8.45=h9c3ff4c_0
  - pexpect=4.8.0=pyh9f0ad1d_2
  - phik=0.12.2=py39ha791e8c_0
  - pickleshare=0.7.5=py_1003
  - pillow=9.2.0=py39hae2aec6_0
  - pip=22.2=pyhd8ed1ab_0
  - ply=3.11=py_1
  - portaudio=19.6.0=h57a0ea0_5
  - prometheus_client=0.14.1=pyhd8ed1ab_0
  - prometheus_flask_exporter=0.20.2=pyhd8ed1ab_0
  - prompt-toolkit=3.0.30=pyha770c72_0
  - protobuf=3.20.1=py39h5a03fae_0
  - psutil=5.9.1=py39hb9d737c_0
  - pthread-stubs=0.4=h36c2ea0_1001
  - ptyprocess=0.7.0=pyhd3deb0d_0
  - pulseaudio=14.0=h7f54b18_8
  - pure_eval=0.2.2=pyhd8ed1ab_0
  - pybind11-abi=4=hd8ed1ab_3
  - pycparser=2.21=pyhd8ed1ab_0
  - pydantic=1.9.1=py39hb9d737c_0
  - pygments=2.12.0=pyhd8ed1ab_0
  - pyjwt=2.4.0=pyhd8ed1ab_0
  - pyopenssl=22.0.0=pyhd8ed1ab_0
  - pyparsing=3.0.9=pyhd8ed1ab_0
  - pyqt=5.15.7=py39h18e9c17_0
  - pyqt5-sip=12.11.0=py39h5a03fae_0
  - pyrsistent=0.18.1=py39hb9d737c_1
  - pysocks=1.7.1=py39hf3d152e_5
  - python=3.9.13=h9a8a25e_0_cpython
  - python-dateutil=2.8.2=pyhd8ed1ab_0
  - python-fastjsonschema=2.16.1=pyhd8ed1ab_0
  - python_abi=3.9=2_cp39
  - python~=3.9
  - pytz=2022.1=pyhd8ed1ab_0
  - pywavelets=1.3.0=py39hd257fcd_1
  - pyyaml=6.0=py39hb9d737c_4
  - pyzmq=23.2.0=py39headdf64_0
  - qt-main=5.15.4=ha5833f6_2
  - querystring_parser=1.2.4=py_0
  - re2=2022.06.01=h27087fc_0
  - readline=8.1.2=h0f457ee_0
  - s2n=1.0.10=h9b69904_0
  - s3fs=2022.5.0=pyhd8ed1ab_0
  - scikit-learn=1.1.1=py39h4037b75_0
  - scipy=1.8.1=py39he49c0e8_0
  - seaborn=0.11.2=hd8ed1ab_0
  - seaborn-base=0.11.2=pyhd8ed1ab_0
  - send2trash=1.8.0=pyhd8ed1ab_0
  - setuptools=63.2.0=py39hf3d152e_0
  - shap=0.41.0=py39h1832856_0
  - sip=6.6.2=py39h5a03fae_0
  - six=1.16.0=pyh6c4a22f_0
  - slicer=0.0.7=pyhd8ed1ab_0
  - smmap=3.0.5=pyh44b312d_0
  - snappy=1.1.9=hbd366e4_1
  - sniffio=1.2.0=py39hf3d152e_3
  - sortedcontainers=2.4.0=pyhd8ed1ab_0
  - soupsieve=2.3.2.post1=pyhd8ed1ab_0
  - sqlalchemy=1.4.39=py39hb9d737c_0
  - sqlite=3.39.2=h4ff8645_0
  - sqlparse=0.4.2=pyhd8ed1ab_0
  - stack_data=0.3.0=pyhd8ed1ab_0
  - statsmodels=0.13.2=py39hd257fcd_0
  - tabulate=0.8.10=pyhd8ed1ab_0
  - tangled-up-in-unicode=0.2.0=pyhd8ed1ab_0
  - tblib=1.7.0=pyhd8ed1ab_0
  - terminado=0.15.0=py39hf3d152e_0
  - threadpoolctl=3.1.0=pyh8a188c0_0
  - tinycss2=1.1.1=pyhd8ed1ab_0
  - tk=8.6.12=h27826a3_0
  - toml=0.10.2=pyhd8ed1ab_0
  - toolz=0.12.0=pyhd8ed1ab_0
  - tornado=6.1=py39hb9d737c_3
  - tqdm=4.64.0=pyhd8ed1ab_0
  - traitlets=5.3.0=pyhd8ed1ab_0
  - typing-extensions=4.3.0=hd8ed1ab_0
  - typing_extensions=4.3.0=pyha770c72_0
  - unicodedata2=14.0.0=py39hb9d737c_1
  - urllib3=1.26.10=pyhd8ed1ab_0
  - visions=0.7.4=pyhd8ed1ab_0
  - wcwidth=0.2.5=pyh9f0ad1d_2
  - webencodings=0.5.1=py_1
  - websocket-client=1.3.3=pyhd8ed1ab_0
  - werkzeug=2.1.2=pyhd8ed1ab_1
  - wheel=0.37.1=pyhd8ed1ab_0
  - wrapt=1.14.1=py39hb9d737c_0
  - xcb-util=0.4.0=h166bdaf_0
  - xcb-util-image=0.4.0=h166bdaf_0
  - xcb-util-keysyms=0.4.0=h166bdaf_0
  - xcb-util-renderutil=0.3.9=h166bdaf_0
  - xcb-util-wm=0.4.1=h166bdaf_0
  - xorg-libxau=1.0.9=h7f98852_0
  - xorg-libxdmcp=1.1.3=h7f98852_0
  - xz=5.2.5=h516909a_1
  - yaml=0.2.5=h7f98852_2
  - yarl=1.7.2=py39hb9d737c_2
  - zeromq=4.3.4=h9c3ff4c_1
  - zict=2.2.0=pyhd8ed1ab_0
  - zipp=3.8.0=pyhd8ed1ab_0
  - zlib=1.2.12=h166bdaf_2
  - zstd=1.5.2=h8a70e8d_2
  - pip:
      - aenum==3.1.11
      - affinegap==1.12
      - altair==4.2.0
      - awswrangler==2.16.1
      - backoff==1.11.1
      - bcrypt==3.2.2
      - black==22.6.0
      - blis==0.7.8
      - boto3==1.21.21
      - btrees==4.10.0
      - cachetools==5.2.0
      - catalogue==2.0.7
      - categorical-distance==1.9
      - charset-normalizer==2.0.12
      - coiled==0.2.15
      - colored==1.4.3
      - commonmark==0.9.1
      - coverage==6.4.2
      - cymem==2.0.6
      - dateparser==1.1.1
      - datetime-distance==0.1.3
      - dedupe==2.0.13
      - dedupe-hcluster==0.3.9
      - dedupe-variable-datetime==0.1.5
      - dedupe-variable-name==0.0.14
      - doublemetaphone==1.1
      - elasticsearch==7.13.4
      - et-xmlfile==1.1.0
      - faiss-cpu==1.7.2
      - fastcluster==1.2.6
      - flake8==4.0.1
      - floret==0.10.2
      - future==0.18.2
      - gender-guesser==0.4.0
      - google-auth==2.9.1
      - google-auth-oauthlib==0.5.2
      - gremlinpython==3.6.0
      - gspread==5.4.0
      - gspread-pandas==3.2.2
      - haversine==2.6.0
      - highered==0.2.1
      - iniconfig==1.1.1
      - ipywidgets==7.7.1
      - isodate==0.6.1
      - jsondiff==2.0.0
      - jsonpath-ng==1.5.3
      - jupyterlab-widgets==1.1.1
      - langcodes==3.3.0
      - levenshtein-search==1.4.5
      - lxml==4.9.1
      - mccabe==0.6.1
      - murmurhash==1.0.7
      - mypy-extensions==0.4.3
      - openpyxl==3.0.10
      - opensearch-py==1.1.0
      - paramiko==2.11.0
      - parseratorvariable==0.0.18
      - pathspec==0.9.0
      - pathy==0.6.2
      - pdfkit==1.0.0
      - persistent==4.9.0
      - pg8000==1.29.1
      - platformdirs==2.5.2
      - pluggy==1.0.0
      - preshed==3.0.6
      - probableparsing==0.0.1
      - probablepeople==0.5.4
      - progressbar2==4.0.0
      - psycopg2-binary==2.9.3
      - py==1.11.0
      - pyarrow==7.0.0
      - pyasn1==0.4.8
      - pyasn1-modules==0.2.8
      - pycodestyle==2.8.0
      - pydeck==0.7.1
      - pyflakes==2.4.0
      - pyhacrf-datamade==0.2.6
      - pylbfgs==0.2.0.14
      - pympler==1.0.1
      - pymysql==1.0.2
      - pynacl==1.5.0
      - pytest==7.1.2
      - pytest-cov==3.0.0
      - python-crfsuite==0.9.8
      - python-dotenv==0.20.0
      - python-utils==3.3.3
      - pytz-deprecation-shim==0.1.0.post0
      - ratelimit==2.2.1
      - redshift-connector==2.0.908
      - regex==2022.3.2
      - requests==2.28.0
      - requests-aws4auth==1.1.2
      - requests-oauthlib==1.3.1
      - rich==12.5.1
      - rlr==2.4.6
      - rsa==4.9
      - s3transfer==0.5.2
      - scramp==1.4.1
      - semver==2.13.0
      - simplecosine==1.2
      - smart-open==5.2.1
      - spacy==3.4.0
      - spacy-legacy==3.0.9
      - spacy-loggers==1.0.3
      - srsly==2.4.4
      - sshtunnel==0.4.0
      - streamlit==1.11.0
      - syrupy==2.3.1
      - textual==0.1.18
      - thinc==8.1.0
      - tomli==2.0.1
      - ttictoc==0.5.6
      - typer==0.4.2
      - tzdata==2022.1
      - tzlocal==4.2
      - unidecode==1.3.4
      - validators==0.20.0
      - wasabi==0.9.1
      - watchdog==2.1.9
      - widgetsnbextension==3.6.1
      - zope-index==5.2.0
      - zope-interface==5.4.0

Step                                        Time in seconds (mean ± σ)
------------------------------------------  --------------------------
Build image                                 155 ± 1.7
Push image to registry                      126
Pull image to cloud instance (t3.medium)    163 ± 21
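
As a quick check on the breakdown, the mean times can be added up. This is only a sketch of the arithmetic; the dictionary keys are labels chosen here, not names from the benchmark.

```python
# Mean step times in seconds, copied from the timing breakdown above.
step_seconds = {
    "build": 155,
    "push": 126,
    "pull": 163,
}

total_seconds = sum(step_seconds.values())
total_minutes = total_seconds / 60

print(f"total: {total_seconds} s (~{total_minutes:.1f} min)")
for step, seconds in step_seconds.items():
    # Share of the end-to-end time spent in this step.
    share = 100 * seconds / total_seconds
    print(f"{step:>5}: {seconds:>3} s ({share:.0f}% of total)")
```

The means sum to 444 seconds, a little under seven and a half minutes, which is in the same range as the roughly 8 minutes quoted above. No single step dominates: each accounts for about a third of the total.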

Why is this slow? Mostly compression.

Docker image layers are compressed with gzip, which produces small files but is slow because it trades CPU time for a smaller image. Caching helps a little, but most of each image is Python packages, which vary wildly and are hard to cache effectively. Additionally, image layers are decompressed sequentially, which slows things down even with a multithreaded gzip implementation.
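
The CPU-time-for-size trade-off is easy to see on a small scale. The sketch below uses Python’s standard zlib module, which implements the same DEFLATE algorithm that gzip uses, and compares compression levels on a synthetic payload. It does not measure Docker itself. LZ4 is not in the standard library, so the lowest zlib level stands in here for a "fast" setting, and the payload is made up for illustration.

```python
import time
import zlib

# Synthetic, moderately compressible payload (an assumption for illustration;
# real package contents compress differently).
payload = b"".join(
    f"module_{i % 5000}.py: def f_{i}(x): return x + {i}\n".encode()
    for i in range(100_000)
)


def measure(level: int) -> tuple[int, float]:
    """Return (compressed size in bytes, seconds spent compressing)."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    # Sanity check: the data must round-trip unchanged.
    assert zlib.decompress(compressed) == payload
    return len(compressed), elapsed


for level in (1, 6, 9):
    size, seconds = measure(level)
    ratio = len(payload) / size
    print(f"level {level}: {size:>9} bytes, ratio {ratio:.1f}x, {seconds:.3f} s")
```

Exact numbers depend on the machine and the data, so none are shown here. The pattern to look for is that higher levels spend more time for progressively smaller gains in size.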

So we dropped Docker.

A brief word from our sponsors: Coiled makes it very easy to scale up your Python code in the cloud. We deploy in your AWS or Google Cloud account, and we configure the infrastructure so that you get sensible security and cost-savings by default. If you have Python code that you need to run at scale, give us a try!

Switching to mamba

Mamba environments normally aren’t portable (and in fact Mamba has had to pull some tricks to allow package installation in the first place). We can cheat, though: Mamba environments are portable if you keep them in the same location on the filesystem, and we control the filesystem on our clusters. Our new build system instead does the following:

  1. Creates an environment on a large machine

  2. Turns it into lz4-compressed tarballs

  3. Uploads them to Amazon S3

  4. Downloads and extracts them from S3 directly onto the cluster
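
The pack and unpack steps above can be sketched with the standard library alone. This is not Coiled’s implementation: the function names are hypothetical, gzip at its fastest level stands in for LZ4 (which is not in the standard library), and the S3 upload and download are left out because they need credentials and a third-party client.

```python
import tarfile
import tempfile
from pathlib import Path


def pack_environment(env_dir: Path, archive: Path) -> None:
    """Archive env_dir with member names relative to the environment root."""
    # compresslevel=1 favors speed over size (a stand-in for LZ4).
    with tarfile.open(archive, "w:gz", compresslevel=1) as tar:
        tar.add(env_dir, arcname=".")


def unpack_environment(archive: Path, target_dir: Path) -> None:
    """Extract the archive into target_dir.

    For a real environment, target_dir should be the same absolute path
    that was used at build time, since installed files embed that prefix.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target_dir)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # A tiny fake "environment" with one file in bin/.
    env = root / "build" / "env"
    (env / "bin").mkdir(parents=True)
    (env / "bin" / "tool").write_text("#!/bin/sh\necho hello\n")

    archive = root / "env.tar.gz"
    pack_environment(env, archive)

    # The demo extracts to a different path only because the fake file
    # contains no absolute paths; a real environment would not allow this.
    restored = root / "cluster" / "env"
    unpack_environment(archive, restored)
    restored_text = (restored / "bin" / "tool").read_text()
    print(restored_text)
```

Archiving with names relative to the environment root keeps the tarball independent of where it was built, while the extraction path is what has to match for the environment to work.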

We can tune each of these stages using more modern techniques than Docker provides.

Tuning Mamba performance

First, let’s understand how Mamba works:

  • Download repodata.json for each channel in use (40MB+)

  • Solve package dependencies (single-core, RAM hungry)

  • Download and extract the required packages

  • Symlink those packages into the new environment

  • Run any post_install steps for packages that have them

  • Run pip install on any additional pip packages

This process presents some challenges to overcome:

  • Solving the environment can be slow and require lots of memory (10 GB or more)

  • Downloading packages can be limited by network and disk speeds

To address these we perform a few basic optimizations:

  • Download packages concurrently

  • Use a big machine and have conda and pip store packages on a RAM disk (S3 is our eventual persistent storage, so local disk is slow and not that useful anyway)

  • Compress with LZ4 rather than gzip

  • Stream into S3 in 16 MB chunks, achieving multi-gigabit upload and download speeds
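
The chunked, concurrent streaming idea can be illustrated as follows. This is a minimal sketch, not the code Coiled runs: `iter_chunks`, `upload_stream`, and `fake_upload_part` are hypothetical names, and the upload target is an in-memory dictionary rather than S3. A real uploader would also limit how many chunks are held in memory at once.

```python
import io
from concurrent.futures import ThreadPoolExecutor
from typing import BinaryIO, Callable, Iterator

CHUNK_SIZE = 16 * 1024 * 1024  # 16 MB, the chunk size mentioned above


def iter_chunks(
    stream: BinaryIO, chunk_size: int = CHUNK_SIZE
) -> Iterator[tuple[int, bytes]]:
    """Yield (part_number, data) pairs, reading at most chunk_size bytes each."""
    part_number = 1
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield part_number, data
        part_number += 1


def upload_stream(
    stream: BinaryIO,
    upload_part: Callable[[int, bytes], int],
    chunk_size: int = CHUNK_SIZE,
    max_workers: int = 4,
) -> list[int]:
    """Upload chunks concurrently and return the part numbers in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Note: this submits every chunk up front, so it buffers the whole
        # stream; acceptable for a sketch, not for multi-gigabyte archives.
        futures = [
            pool.submit(upload_part, number, data)
            for number, data in iter_chunks(stream, chunk_size)
        ]
        return [future.result() for future in futures]


# Hypothetical stand-in for an object-store client: parts land in a dict.
store: dict[int, bytes] = {}


def fake_upload_part(part_number: int, data: bytes) -> int:
    store[part_number] = data
    return part_number


payload = bytes(range(256)) * 40  # 10,240 bytes
parts = upload_stream(io.BytesIO(payload), fake_upload_part, chunk_size=4096)
reassembled = b"".join(store[number] for number in sorted(store))
print(parts, reassembled == payload)
```

Numbering the parts lets them be uploaded in any order and reassembled afterwards, which is what makes parallel transfer possible.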

Results: Faster builds

Our new system provides the same user experience, but is significantly faster:

[Figure: build, push, and pull times (senvs2_build_push_pull.svg)]

That’s under 2 minutes total, over three times faster than using a Docker image.

If you’re using coiled>=0.4, you’ll automatically use the new environment build system; to keep using the old system, pin an older version of Coiled. Although we no longer use Docker to build software environments, you can still use pre-built Docker images in your clusters. See our docs for more details.

If you’re not using Coiled then what are you waiting for? The Coiled free tier is enough for most people’s workloads, and it’s easy to get started.