Standalone, single-file, editable Python scripts WITH DEPENDENCIES

How badly did I want something like that!

The problem: Python for scripting

Besides programming and data science, I find Python to be a very useful glue language; I think it's great as a shell replacement when bash/zsh scripts get too complex. But there's one caveat: as long as you can work with the standard library, you're in the sweet spot. As soon as you'd like to use an external dependency, things get harder, because if you don't want to contaminate your system with extra packages, you'll either a) hope that your distribution packages a proper version of that library, or b) start needing a virtualenv and so on.

Both options are OK for manual development, but a bit less OK if you want to deliver such scripts to multiple servers to automate some kind of process.

Sure, there are many options to fully package a Python executable: PyInstaller comes to mind, but others exist. But then you've got a kind of "build process" for your script, and you cannot edit it directly on a server. And I find that, very often, for internal tasks and scripts, my process is exactly that: I edit the script on the server, then, when I get it right, I copy it into my version control system and deliver it to other machines. Yes, I wouldn't do the same for "real" software, but as I said, these are often internal scripts used for reporting, cron jobs and other small automated tasks.

The solution: editable Python scripts with isolated dependencies

So, that's what I baked: not a perfect solution, but a decent one. Just have python and pip on your system, add a REQUIREMENTS string (equivalent to the contents of requirements.txt), then import everything.

This will install the dependencies in a separate location inside a temporary directory on first use, then reuse them on subsequent runs.

So: just copy & paste the following snippet, edit the two USER SERVICEABLE sections, then start writing your desired code at the bottom. The snippet includes an example of how to use requests, so you can just delete the requirements, the import and the requests call at the bottom if you don't need that.

#!/usr/bin/python3
import os
import sys
from tempfile import gettempdir, NamedTemporaryFile
import hashlib

# USER SERVICEABLE: paste here your requirements.txt
# the recommendation is to create a development virtualenv,
# install the deps with pip inside it, then do a `pip freeze`
# and paste the output here
REQUIREMENTS = """
certifi==2018.11.29
chardet==3.0.4
idna==2.8
requests==2.21.0
urllib3==1.24.1
"""
# USER SERVICEABLE end

def add_custom_site_packages_directory(raise_if_failure=True):
    digest = hashlib.sha256(REQUIREMENTS.encode("utf8")).hexdigest()
    dep_root = os.path.join(gettempdir(), "pyallinone_{}".format(digest))
    os.makedirs(dep_root, exist_ok=True)

    # pip --prefix installs packages under <prefix>/lib/pythonX.Y/site-packages
    # (the exact layout varies by platform), so walk the tree to find that dir.
    for dirpath, dirnames, filenames in os.walk(dep_root):
        if dirpath.endswith(os.path.sep + "site-packages"):
            # that's our dir!
            sys.path.insert(0, os.path.abspath(dirpath))
            return dep_root

    if raise_if_failure:
        raise ValueError("could not find our site-packages dir")

    return dep_root

dep_root = add_custom_site_packages_directory(False)

deps_installed = False

# Try the imports; if any fails, install the requirements once and try again.
while True:
    try:
        # USER SERVICEABLE: import all your required deps in this block! and keep the break at the end!
        import requests
        # USER SERVICEABLE end

        break
    except ImportError:
        if deps_installed:
            raise ValueError("Something was broken, could not install dependencies")
        # pip's main() has moved around between releases and is not a public
        # API; these imports cover the pip versions this recipe was written for.
        try:
            from pip import main as pipmain
        except ImportError:
            from pip._internal import main as pipmain

        with NamedTemporaryFile() as req:
            req.write(REQUIREMENTS.encode("utf-8"))
            req.flush()
            pipmain(["install", "--prefix", dep_root, "--upgrade", "--no-cache-dir", "--no-deps", "-r", req.name])

        add_custom_site_packages_directory()
        deps_installed = True

# HERE you can start writing the actual code of your script

r = requests.get("https://www.google.com")
print(r.status_code)

How does this work?

  • It creates a subdirectory inside your system's temporary directory and installs the dependencies there using pip. The subdirectory name is generated from a hash of your requirements, so if your requirements change, a new directory is used.
  • Then it adds that directory to sys.path, allowing Python to find modules and packages there.
  • When you run the script again, it first looks for the libraries that were previously downloaded, and only if that fails does it download them again (the sketch after this list shows how to recompute the cache directory's path).
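
If you ever want to check where the dependencies were cached for a given script - for example, to delete the directory by hand after a requirements change - you can recompute the path with the same hashing logic. A minimal sketch, assuming you paste in the exact same REQUIREMENTS string used by the script:

#!/usr/bin/python3
import hashlib
import os
from tempfile import gettempdir

# Must match the script's REQUIREMENTS string exactly (whitespace included),
# otherwise the hash, and therefore the directory name, will differ.
REQUIREMENTS = """
certifi==2018.11.29
chardet==3.0.4
idna==2.8
requests==2.21.0
urllib3==1.24.1
"""

digest = hashlib.sha256(REQUIREMENTS.encode("utf8")).hexdigest()
dep_root = os.path.join(gettempdir(), "pyallinone_{}".format(digest))
print(dep_root)
print("already populated:", os.path.isdir(dep_root))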

CAVEATS:

  • Of course the target system must have internet access, at least to PyPI or GitHub or other VCS hosts (depending on your requirements format), and python and pip must be installed (see the note after this list if calling pip the way the snippet does ever gives you trouble).
  • If your requirements change, nothing deletes the old files in your temp directory. That directory is usually swept at system boot or by cron jobs, so it's not a real problem. BUT: if a cron job only partially sweeps files inside the subdirectory, it could break something (check whether anything like that exists on your system; I remember some older RedHat/CentOS setups doing that).
  • If your packages have binary dependencies and/or require building extensions, you still need the shared libraries (at runtime) AND the proper header files / -dev packages (at install time). There's no silver bullet for that in this recipe.
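
One more note: the snippet above calls pip through its main() function, which is a pip internal that has moved between releases. If that import dance ever breaks on a newer pip, a variant that shells out to pip via the current interpreter should keep working. A sketch, assuming pip is installed for the same python3 that runs the script (pip_install is just a hypothetical helper name):

import subprocess
import sys
from tempfile import NamedTemporaryFile

def pip_install(requirements, dep_root):
    # Run "python -m pip" as a subprocess instead of importing pip internals.
    with NamedTemporaryFile() as req:
        req.write(requirements.encode("utf-8"))
        req.flush()
        subprocess.check_call([
            sys.executable, "-m", "pip", "install",
            "--prefix", dep_root, "--upgrade", "--no-cache-dir", "--no-deps",
            "-r", req.name,
        ])

You would then call pip_install(REQUIREMENTS, dep_root) in place of the pipmain([...]) call.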
