How many PEPs does it take to install a package?

A few months ago we released package sync, a feature that takes your Python environment and replicates it in the cloud with zero effort.

The reaction has been very positive; people enable the feature and then mostly forget about their previous woes with Docker or version mismatches.

However, to make this feature really stand out, it needs to work the first time, every time. The feature worked great when you used Python exactly like we did, but there are so many ways of using Python that this was rarely the case.

The first and most critical step for package sync is to identify every installed package, its installed version, and where it came from. This is critical information if we’re going to reverse engineer your environment! Some existing tools try this, like pip list, but none provided all the information we needed.
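As a rough illustration of that first step, the standard library can already enumerate installed distributions. The sketch below is a minimal example and not package sync's actual implementation; the helper name list_installed is hypothetical, and it only reports what importlib.metadata can see.

```python
from importlib import metadata


def list_installed(path=None):
    """Return sorted (name, version, location) tuples for discovered distributions.

    `path` is an optional list of directories to search; when omitted,
    the current interpreter's sys.path is used.
    """
    if path is None:
        dists = metadata.distributions()
    else:
        dists = metadata.distributions(path=path)

    found = []
    for dist in dists:
        name = dist.metadata.get("Name")
        if name is None:
            # Skip metadata directories that do not declare a name.
            continue
        # locate_file("") resolves to the directory that holds the metadata,
        # typically a site-packages folder.
        location = str(dist.locate_file(""))
        found.append((name, dist.version, location))
    return sorted(found)
```

Calling list_installed() with no arguments lists the current environment. This yields names, versions, and locations, but says nothing yet about which tool or source each package came from.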

To achieve this we needed to do a deep dive on not only current PEP standards for Python environments, but every previous iteration of them (including some from before the time of PEPs) and every operating system we intended to support.

There’s a surprisingly large number of ways of getting a Python package in position so import foo works.

Some of these also have different variants. For example, PEP 376 has an addendum, PEP 610, that allows a JSON file (direct_url.json) describing where the package was installed from.
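To make the JSON file concrete: PEP 610 names it direct_url.json and places it next to the package's other metadata. A minimal sketch of reading it with importlib.metadata follows; the helper names read_direct_url and direct_url_records are hypothetical.

```python
import json
from importlib import metadata


def read_direct_url(dist):
    """Return the parsed direct_url.json for a distribution, or None if absent."""
    text = dist.read_text("direct_url.json")
    if text is None:
        return None
    return json.loads(text)


def direct_url_records(path=None):
    """Map distribution name -> parsed direct_url.json, where one exists."""
    if path is None:
        dists = metadata.distributions()
    else:
        dists = metadata.distributions(path=path)

    records = {}
    for dist in dists:
        record = read_direct_url(dist)
        if record is not None:
            records[dist.metadata.get("Name")] = record
    return records
```

Packages installed by name from an index normally have no such file, so an empty result is expected in many environments.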

While Python has a relatively powerful tool to discover the names of all the packages installed, we have to find out where you got those packages from and how we might be able to repeat what you did. Did you install dask from conda, Git, PyPI or perhaps a local editable directory?
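One way to start answering that question is to classify the PEP 610 record, when there is one. The sketch below is an assumption-laden illustration rather than our actual logic: the function name and the labels it returns are hypothetical, it relies only on the vcs_info, archive_info, and dir_info keys defined by PEP 610, and it does not attempt to detect conda packages.

```python
def classify_origin(direct_url):
    """Guess how a package was installed from its parsed direct_url.json.

    `direct_url` is the parsed JSON as a dict, or None when the file is absent.
    The returned labels are illustrative only.
    """
    if direct_url is None:
        # PEP 610 says no file is written for a plain "name==version"
        # install, so absence suggests an index such as PyPI. This assumes
        # the installer implements PEP 610 at all.
        return "index"
    if "vcs_info" in direct_url:
        # For example "vcs:git" for an install from a Git repository.
        return "vcs:" + direct_url["vcs_info"].get("vcs", "unknown")
    if "dir_info" in direct_url:
        editable = direct_url["dir_info"].get("editable", False)
        return "editable" if editable else "local-directory"
    if "archive_info" in direct_url:
        return "archive"
    return "unknown"
```

For example, a record containing {"dir_info": {"editable": true}} is labelled "editable", while a package with no record at all falls back to "index".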

Every package manager also implements its own version of these standards. For example, Poetry prior to version 1.2.0 implemented PEP 610 incorrectly. PDM uses PEP 582, which is still a draft, and installs everything to a __pypackages__ folder.

Each operating system also has its own quirks. We recently had to deal with importlib.metadata on Windows reporting paths to files in egg packages with / separators instead of \, as they are technically in a zip file.
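A hedged sketch of one way to cope with mixed separators: pathlib's PureWindowsPath accepts both / and \ as separators, so it can bring either form to a single spelling before comparison. The helper names are hypothetical, and this assumes the paths come from a Windows environment, since a backslash is a legal filename character on POSIX systems.

```python
from pathlib import PureWindowsPath


def normalize_separators(raw_path):
    """Rewrite a path that may mix '/' and '\\' so it uses '/' throughout."""
    return PureWindowsPath(raw_path).as_posix()


def same_path(left, right):
    """Compare two reported paths while ignoring which separator was used."""
    return normalize_separators(left) == normalize_separators(right)
```

With this, a path reported as foo.egg/foo/__init__.py and one reported as foo.egg\foo\__init__.py compare equal.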

So how do we make sure package sync works and keeps working in all of these cases?

With all this knowledge, we built out a large integration test suite, where we install a dummy version of every package variant we know of using a bunch of different package managers on a bunch of different operating systems.
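The real suite installs packages with real package managers, which is too large to show here. As a much smaller sketch of the same idea, the example below fabricates dummy package metadata on disk in two layouts (.dist-info and the older .egg-info) and checks that discovery finds both. All names in it (make_dummy_dist, discover, the dummy package names) are hypothetical.

```python
import json
from importlib import metadata
from pathlib import Path


def make_dummy_dist(root, name, version, layout="dist-info", direct_url=None):
    """Create a bare metadata directory for a fake package under `root`.

    `layout` is "dist-info" (current standard) or "egg-info" (legacy).
    """
    folder = Path(root) / f"{name.replace('-', '_')}-{version}.{layout}"
    folder.mkdir(parents=True)
    # dist-info directories store metadata in METADATA, egg-info in PKG-INFO.
    meta_file = "METADATA" if layout == "dist-info" else "PKG-INFO"
    (folder / meta_file).write_text(
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n",
        encoding="utf-8",
    )
    if direct_url is not None:
        (folder / "direct_url.json").write_text(
            json.dumps(direct_url), encoding="utf-8"
        )
    return folder


def discover(root):
    """Return {name: version} for every distribution found under `root`."""
    return {
        dist.metadata["Name"]: dist.version
        for dist in metadata.distributions(path=[str(root)])
    }
```

A test can then create a temporary directory, call make_dummy_dist once per variant, and assert that discover returns every name and version it expects.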

It currently has 29 different combinations of operating systems, versions, and package managers.

OS:

  • Ubuntu

  • Windows

  • macOS

  • Debian

  • amazonlinux

Package managers:

  • apt

  • yum

  • venv

  • pyenv

  • mamba

Python versions:

  • 3.11

  • 3.10

  • 3.9

  • 3.8

  • 3.7

During development of the test suite we squashed 7 different bugs, some minor and some show stoppers.

Our test suite will grow as we spot more ways of managing packages and as the Python community creates even more. I created package sync with a firm belief that our solution should “just work” for everyone.