Python Dependency Management

Applications vs Libraries

In Python, it is useful to draw a clear distinction between applications and libraries.

A library is a package which will be used within an application, but does not provide substantial functionality on its own.
For example, our UKRDC SQLAlchemy models are in a library. The package has no use on its own, but is used elsewhere.

On the other hand, an application is a package which will be deployed and run on its own.
Importantly, unlike with a library, we have full control over the environment an application runs in, such as the specific Python version and the underlying OS.

How does this affect dependency management?

When writing a library, your dependencies should not be pinned too specifically.
Doing so will place severe restrictions on any applications using your library, which may cause downstream issues.

For example, if we were to require a specific patch version of SQLAlchemy in our models, then any application using our model library would be forced onto that exact version, and everything else in the application that depends on SQLAlchemy would have to work with it too. This is unrealistic in practice and can make the application's dependencies impossible to resolve.
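
The effect can be illustrated with a minimal sketch in plain Python. The release list, the version ranges, and the helper names (parse, allowed, compatible) are all hypothetical, and real resolvers such as pip handle far more cases. The point is only that an exact pin in a library can leave no version that also satisfies the application, while a range does.

```python
def parse(version):
    """Turn '1.3.22' into (1, 3, 22) so versions compare as tuples."""
    return tuple(int(part) for part in version.split("."))


def allowed(version, lower, upper):
    """True if lower <= version < upper (a simple half-open range)."""
    return parse(lower) <= parse(version) < parse(upper)


def compatible(candidates, ranges):
    """Return the candidate versions accepted by every (lower, upper) range."""
    return [v for v in candidates if all(allowed(v, lo, hi) for lo, hi in ranges)]


# Hypothetical list of available releases of one dependency.
releases = ["1.3.20", "1.3.22", "1.3.23", "1.4.0"]

# A library that pins an exact patch version accepts only 1.3.20.
pinned_library = ("1.3.20", "1.3.21")
# A library that declares a range accepts any 1.3.x release.
loose_library = ("1.3.0", "1.4.0")
# The application needs a fix that (hypothetically) landed in 1.3.22.
app_needs = ("1.3.22", "2.0.0")

print(compatible(releases, [pinned_library, app_needs]))  # []
print(compatible(releases, [loose_library, app_needs]))   # ['1.3.22', '1.3.23']
```

With the exact pin there is nothing left to install; with the range, the application is free to pick either of the two remaining releases and pin it itself.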

Likewise, your library should ideally support several Python versions. The exact set changes over time, but at the time of writing (early 2021) it would be reasonable to expect library support on Python 3.6, 3.7, 3.8, and 3.9.

Conversely, applications absolutely should have specifically pinned dependencies. An application needs to be deployed in a specific reproducible environment which we control.
Managing pinned dependencies by hand, and doing it right, can be challenging, but many tools are available to handle this for you.

For example, pip-tools can read an unpinned requirements.in file, and generate a fully resolved and pinned requirements.txt.
More modern tools such as Poetry handle both cases: a lock file pins exact versions when you are running as an application, while the looser constraints declared in pyproject.toml are what get published with a library.
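
As a rough illustration of the difference between the two files, the sketch below flags requirement lines that are not pinned to an exact version. The function name and the file contents are hypothetical and abbreviated, and the parsing is deliberately simplified (it only looks for ==, and would mishandle URLs containing #); it is not how pip-tools itself works.

```python
def unpinned_requirements(lines):
    """Return requirement lines that are not pinned to an exact version.

    Simplified on purpose: blank lines, comments, and option lines
    (starting with '-') are skipped, and a line counts as pinned only
    if it contains '=='.
    """
    loose = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            loose.append(line)
    return loose


# Hypothetical contents of a hand-written requirements.in ...
requirements_in = ["sqlalchemy>=1.3,<1.4", "requests"]
# ... and an abbreviated requirements.txt that a pinning tool might generate.
requirements_txt = [
    "# generated file",
    "certifi==2020.12.5    # via requests",
    "requests==2.25.1",
    "sqlalchemy==1.3.23",
]

print(unpinned_requirements(requirements_in))   # both input lines are unpinned
print(unpinned_requirements(requirements_txt))  # nothing is left unpinned
```

A check along these lines could run in CI for an application, so that a deployment never installs from a file that still contains loose ranges.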

Git dependencies

In some situations it may be beneficial to use Git dependencies instead of a package repository such as PyPI.

This can suit libraries undergoing rapid development with multiple active branches, or those without a well-defined release schedule.

In these cases, both pip's requirements.txt and the more modern pyproject.toml support Git dependencies.
The target repo/branch/commit is cloned locally, then installed using whatever package manager you're running.

It is possible to pin your dependency to a specific branch, tag, or commit. Which of these you choose depends on the development workflow.

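To add a Git dependency with Poetry's add command, use the following format:

git+https://some-vcs.tld/someorgname/pkg-repo-name#commit-hash-or-tag

Note that pip separates the branch, tag, or commit from the repository URL with @ rather than #.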
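
For a pip requirements.txt, the reference is appended to the URL with @, and the line can be written as a direct reference of the form "name @ git+URL@ref". The helper below is a minimal sketch that builds such a line; the function name is hypothetical, the tag v1.0.0 is made up, and the package name is assumed to match the placeholder repository name used above.

```python
def pip_git_requirement(name, repo_url, ref=None):
    """Build a pip requirement line of the form 'name @ git+URL@ref'.

    With ref=None the dependency is unpinned and follows the
    repository's default branch.
    """
    line = f"{name} @ git+{repo_url}"
    if ref:
        line += f"@{ref}"
    return line


# Placeholder repository from the format above; the tag is hypothetical.
repo = "https://some-vcs.tld/someorgname/pkg-repo-name"

print(pip_git_requirement("pkg-repo-name", repo, ref="v1.0.0"))
print(pip_git_requirement("pkg-repo-name", repo))
```

The ref argument can be a branch, a tag, or a commit hash; only a commit hash is guaranteed to keep pointing at the same code.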

Note that tools such as Poetry can resolve an unpinned Git dependency and pin it to a specific commit in their lock file.
For reproducible application deployment, this is a solid option.
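
Pinning to a commit matters because branches move and tags can be re-pointed, whereas a commit hash always identifies the same code. As a small sketch, the hypothetical helper below checks whether a reference looks like a full commit hash; it assumes 40-character SHA-1 hashes and cannot tell a hash from a branch that happens to be named like one.

```python
import re

# A full Git commit hash (SHA-1) is 40 hexadecimal characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_commit_pinned(ref):
    """True if ref looks like a full commit hash rather than a branch or tag."""
    return ref is not None and FULL_SHA.fullmatch(ref.lower()) is not None


print(is_commit_pinned("main"))    # False
print(is_commit_pinned("v1.0.0"))  # False
print(is_commit_pinned("0123456789abcdef0123456789abcdef01234567"))  # True
```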