In distri, packages (e.g.
emacs) are hermetic. By hermetic, I mean that the dependencies a package uses (e.g.
libusb) don’t change, even when newer versions are installed.
For example, if package
libusb-amd64-1.0.22-7 is available at build time, the package will always use that same version, even after the newer
libusb-amd64-1.0.23-8 is installed into the package store.
Another way of saying the same thing is: packages in distri are always co-installable.
This makes the package store more robust: additions to it will not break the system. On a technical level, the package store is implemented as a directory containing distri SquashFS images and metadata files, into which packages are installed in an atomic way.
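To make co-installability a bit more concrete, here is a minimal Python sketch that splits a full package name into its parts. The name-arch-version-revision layout is inferred from the examples in this post, not taken from distri's source. The `Package` type, the `parse_package` helper and the field name `revision` are hypothetical, and the sketch assumes that the upstream version itself never contains a hyphen.

```python
from typing import NamedTuple


class Package(NamedTuple):
    name: str
    arch: str
    version: str
    revision: int


def parse_package(full_name: str) -> Package:
    """Split e.g. 'libusb-amd64-1.0.22-7' into its four parts.

    Assumption: the last three hyphen-separated fields are architecture,
    upstream version and revision; everything before them is the name.
    """
    name, arch, version, revision = full_name.rsplit("-", 3)
    return Package(name, arch, version, int(revision))


old = parse_package("libusb-amd64-1.0.22-7")
new = parse_package("libusb-amd64-1.0.23-8")
# Same package name, but different full names: both versions can sit
# side by side in the package store without overwriting each other.
same_name_distinct_entries = old.name == new.name and old != new
```

Because the full name (and hence the directory name) differs for every build, installing the newer libusb never touches the files the older one provides.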
Out of scope: plugins are not hermetic by design
One exception where hermeticity is not desired is plugin mechanisms: optionally loading out-of-tree code at runtime is, by definition, not hermetic.
Debian ships about a dozen NSS libraries for a variety of purposes, and enterprise setups might add their own into the mix.
Having packages be as hermetic as possible remains a worthwhile goal despite any exceptions: I will gladly use a 99% hermetic system over a 0% hermetic system any day.
Side note: Xorg’s driver model (which can be characterized as a plugin mechanism) does not fall under this category because of its tight API/ABI coupling! For this case, where drivers are only guaranteed to work with precisely the Xorg version for which they were compiled, distri uses per-package exchange directories.
Implementation of hermetic packages in distri
On a technical level, the requirement is: all paths used by the program must always result in the same contents. This is implemented in distri via the read-only package store mounted at
/ro, e.g. files underneath
/ro/emacs-amd64-26.3-15 never change.
In practice, three strategies cover most of the paths a program uses:
ELF interpreter and dynamic libraries
Programs on Linux use the ELF file format, which contains two kinds of references:
First, the ELF interpreter (
PT_INTERP segment), which is used to start the program. For dynamically linked programs on 64-bit systems, this is typically
Many distributions use system-global paths such as
/lib64/ld-linux-x86-64.so.2, but distri compiles programs with
-Wl,--dynamic-linker=/ro/glibc-amd64-2.31-4/out/lib/ld-linux-x86-64.so.2 so that the full path ends up in the binary.
The ELF interpreter is shown by
file(1), but you can also use
readelf -a $BINARY | grep 'program interpreter' to display it.
And secondly, the rpath, a run-time search path for dynamic libraries. Instead of storing full references to all dynamic libraries, we set the rpath so that
ld.so(8) will find the correct dynamic libraries.
Originally, we set a long rpath containing one entry for each dynamic library dependency. However, we have since switched to using a single
lib subdirectory per package as its rpath, and placing symlinks with full path references into that
lib directory, e.g. using
-Wl,-rpath=/ro/grep-amd64-3.4-4/lib. This is better for performance, as
ld.so uses a per-directory cache.
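As an illustration of the symlink approach, the following Python sketch fills a package's lib directory with one symlink per dynamic library. This is not distri's actual build code: `populate_lib_dir` is a hypothetical helper, and the mapping from library name to full path is assumed to be known from the build.

```python
import os


def populate_lib_dir(lib_dir: str, deps: dict) -> None:
    """Create one symlink per shared library in lib_dir.

    deps maps a library name (e.g. 'libc.so.6') to the full path of the
    library inside the package store.
    """
    os.makedirs(lib_dir, exist_ok=True)
    for soname, target in sorted(deps.items()):
        link = os.path.join(lib_dir, soname)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link from an earlier run
        os.symlink(target, link)
```

Called with e.g. `{"libc.so.6": "/ro/glibc-amd64-2.31-4/out/lib/libc-2.31.so"}`, this results in a single directory that ld.so needs to search, while each symlink still pins the exact version of the library.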
Note that program load times are significantly influenced by how quickly you can locate the dynamic libraries. distri uses a FUSE file system to load programs from, so getting proper
-ENOENT caching into place drastically sped up program load times.
Instead of compiling software with the
-Wl,-rpath flags, one can also modify these fields after the fact using
patchelf(1). For closed-source programs, this is the only possibility.
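A minimal sketch of how such a patchelf call could be scripted from Python follows. The helper names `patchelf_argv` and `patch_binary` are made up for this example. The only parts taken from patchelf itself are its `--set-interpreter` and `--set-rpath` options, and the sketch assumes patchelf is installed and found via PATH.

```python
import subprocess


def patchelf_argv(binary: str, interpreter: str, rpath: str) -> list:
    """Build a patchelf command line that rewrites both ELF fields."""
    return [
        "patchelf",
        "--set-interpreter", interpreter,
        "--set-rpath", rpath,
        binary,
    ]


def patch_binary(binary: str, interpreter: str, rpath: str) -> None:
    """Run patchelf on binary; raises CalledProcessError if it fails."""
    subprocess.run(patchelf_argv(binary, interpreter, rpath), check=True)
```

Keeping the argument construction separate from the actual call makes it easy to log or review the command line before modifying a binary in place.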
The rpath can be inspected by using e.g.
readelf -a $BINARY | grep RPATH.
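If readelf is not at hand, the ELF interpreter can also be read directly from the file. The following Python sketch walks the program headers of a 64-bit little-endian ELF image and returns the contents of the PT_INTERP segment. The function name `elf_interpreter` is my own, and other ELF classes and byte orders are deliberately not handled.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def elf_interpreter(data: bytes):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image.

    Returns None if the image has no PT_INTERP segment, which is the
    case for statically linked programs.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("sketch only handles 64-bit little-endian ELF")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # Elf64_Phdr starts with p_type, p_flags, p_offset, p_vaddr,
        # p_paddr, p_filesz.
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", data, off
        )
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None
```

For a file on disk, pass in the bytes read from it, e.g. `elf_interpreter(open(path, "rb").read())`.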
Environment variable setup wrapper programs
Many programs are influenced by environment variables: to start another program, said program is often found by checking each directory in the
PATH environment variable.
Such search paths are prevalent in scripting languages, too, to find modules. Python has
PYTHONPATH, Perl has
PERL5LIB, and so on.
To set up these search path environment variables at run time, distri employs an indirection. Instead of e.g.
teensy-loader-cli, you run a small wrapper program that calls precisely one
execve system call with the desired environment variables.
Initially, I used shell scripts as wrapper programs because they are easily inspectable. This turned out to be too slow, so I switched to compiled programs. I’m linking them statically for fast startup, and I’m linking them against musl libc for significantly smaller file sizes than glibc (per-executable overhead adds up quickly in a distribution!).
Note that the wrapper programs prepend to the
PATH environment variable; they
don’t replace it in its entirety. This is important so that users have a way to extend the
PATH (and other variables) if they so choose. This doesn’t hurt
hermeticity because it is only relevant for programs that were not present at
build time, i.e. plugin mechanisms which, by design, cannot be hermetic.
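To illustrate what such a wrapper does, here is a rough Python sketch of the logic. The real wrappers are small compiled programs, as described above, so this only models the behavior. `TARGET` and `PREPEND` are illustrative values, and `prepend_env` and `main` are hypothetical names.

```python
import os
import sys

# Illustrative values; a real wrapper would have these baked in at build time.
TARGET = "/ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli"
PREPEND = {
    "PATH": ["/ro/bash-amd64-5.0-4/bin"],
}


def prepend_env(environ, prepend):
    """Return a copy of environ with the given directories prepended.

    Existing values are kept at the end, so users can still extend
    PATH and friends themselves.
    """
    env = dict(environ)
    for name, dirs in prepend.items():
        old = env.get(name)
        parts = list(dirs) + ([old] if old else [])
        env[name] = ":".join(parts)
    return env


def main():
    # Exactly one execve call: the wrapper process is replaced by the
    # wrapped program, which sees the adjusted environment.
    os.execve(TARGET, [TARGET] + sys.argv[1:], prepend_env(os.environ, PREPEND))
```

Calling `main()` replaces the current process, so nothing after it runs. The interesting part is `prepend_env`, which leaves the caller's environment untouched and only puts the package's own directories in front.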
Shebang interpreter patching
The Shebang of scripts contains a path, too, and hence needs to be changed.
We don’t do this in distri yet (the number of packaged scripts is small), but we should.
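Since distri does not implement this yet, the following Python sketch is purely hypothetical. It shows one way the first line of a script could be rewritten to a full package store path. The `patch_shebang` helper and the mapping from interpreter name to store path are assumptions, and `env` options such as `-S` are not handled.

```python
import os


def patch_shebang(script: bytes, store_paths: dict) -> bytes:
    """Rewrite the shebang line to a full package store path.

    store_paths maps an interpreter basename (e.g. 'sh') to its full
    path. Scripts without a shebang, or with an interpreter that is not
    in store_paths, are returned unchanged.
    """
    if not script.startswith(b"#!"):
        return script
    first, sep, rest = script.partition(b"\n")
    words = first[2:].split()
    if not words:
        return script
    # "#!/usr/bin/env python3" names the interpreter in the second word.
    if os.path.basename(words[0]) == b"env" and len(words) > 1:
        words = words[1:]
    name = os.path.basename(words[0]).decode()
    if name not in store_paths:
        return script
    new_first = b"#!" + b" ".join([store_paths[name].encode()] + words[1:])
    return new_first + sep + rest
```

Interpreter arguments (such as `-e` for a shell script) are carried over, so only the path component of the shebang changes.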
The performance improvements in the previous sections are not just good to have, but practically required when many processes are involved: without them, you’ll encounter second-long delays in magit which spawns many git processes under the covers, or in dracut, which spawns one
cp(1) process per file.
Downside: rebuild of packages required to pick up changes
Linux distributions such as Debian consider it an advantage to roll out security
fixes to the entire system by updating a single shared library package.
The flip side of that coin is that changes to a single critical package can break the entire system.
With hermetic packages, all reverse dependencies must be rebuilt when a
library’s changes should be picked up by the whole system. E.g., when
curl must be rebuilt to pick up the new version of
This approach trades off using more bandwidth and more disk space (temporarily) against reducing the blast radius of any individual package update.
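Which packages need a rebuild can be computed from the dependency graph. Here is a small Python sketch, assuming the build dependencies are available as a plain mapping. The package names in the usage example are made up, and `reverse_dependencies` is not a distri function.

```python
from collections import deque


def reverse_dependencies(deps: dict, changed: str) -> set:
    """Return all packages that directly or indirectly depend on changed.

    deps maps each package to the list of packages it depends on.
    """
    # Invert the graph: for each package, who depends on it?
    rdeps = {}
    for pkg, pkg_deps in deps.items():
        for dep in pkg_deps:
            rdeps.setdefault(dep, set()).add(pkg)
    # Breadth-first walk over the inverted graph.
    to_rebuild = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for pkg in rdeps.get(current, ()):
            if pkg not in to_rebuild:
                to_rebuild.add(pkg)
                queue.append(pkg)
    return to_rebuild
```

For a graph like `{"tool-a": ["libexample"], "tool-b": ["tool-a"]}`, a change to `libexample` means rebuilding both `tool-a` and `tool-b`: the latter needs to pick up the rebuilt `tool-a`.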
Downside: env wrapper long paths are cumbersome to deal with
TODO: describe the results of trying to mitigate this issue by removing empty directories at build time to reduce the number of components
The implementation outlined above works well in hundreds of packages, and only a small handful exhibited problems of any kind. Here are some issues I encountered:
Issue: accidental ABI breakage in plugin mechanisms
NSS libraries built against glibc 2.28 and newer cannot be loaded by glibc 2.27. In all likelihood, such changes do not happen too often, but it does illustrate that glibc’s published interface spec is not sufficient for forwards and backwards compatibility.
In distri, we could likely use a per-package exchange directory for glibc’s NSS mechanism to prevent the above problem from happening in the future.
Issue: wrapper bypass when a program re-executes itself
Some programs try to arrange for themselves to be re-executed outside of their
current process tree. For example, consider building a program with the
meson first configures the build, it generates
ninja files (think Makefiles) which contain command lines that run the
ninja is called as a separate process, so it will not have the environment which the
meson wrapper sets up.
ninja then runs the previously persisted
meson command line. Since the command line uses the full path to
meson (not to its wrapper), it bypasses the wrapper.
Luckily, not many programs try to arrange for other process trees to run them. Here is a table summarizing how affected programs might try to arrange for re-execution, whether the technique results in a wrapper bypass, and what we do about it in distri:
| technique to execute itself | uses wrapper | mitigation |
| run-time: find own basename in
| compile-time: embed expected path | no; bypass! | configure or patch |
Misc smaller issues
- zsh does not detect it is a login shell when using a wrapper
- TODO: also file an issue on GitHub, this is not yet debugged
- TODO: what are the minimum steps to reproduce?
- LDFLAGS leaked to pkgconfig
- TODO: file bugs upstream with 4 packages
- mozjs tries to run autoconf with the shell directly, but should use autoconf’s wrapper
Appendix: Could other distributions adopt hermetic packages?
At a very high level, adopting hermetic packages will require two steps:
Using fully qualified paths whose contents don’t change (e.g.
/ro/emacs-amd64-26.3-15) generally requires rebuilding programs, e.g. with
Once you use fully qualified paths you need to make the packages able to exchange data. distri solves this with exchange directories, implemented in the
/ro file system which is backed by a FUSE daemon.
The first step is pretty simple, whereas the second step is where I expect controversy around any suggested mechanism.
Appendix: demo (in distri)
This appendix contains commands (run on distri version
supersilverhaze) and their output.
/bin directory contains symlinks for the union of all package’s
distri0# readlink -f /bin/teensy_loader_cli
The wrapper program in the
bin subdirectory is small:
distri0# ls -lh $(readlink -f /bin/teensy_loader_cli)
-rwxr-xr-x 1 root root 46K Apr 21 21:56 /ro/teensy-loader-cli-amd64-2.1+g20180927-7/bin/teensy_loader_cli
Wrapper programs execute quickly:
distri0# strace -fvy /bin/teensy_loader_cli |& head | cat -n
1 execve("/bin/teensy_loader_cli", ["/bin/teensy_loader_cli"], ["USER=root", "LOGNAME=root", "HOME=/root", "PATH=/ro/bash-amd64-5.0-4/bin:/r"..., "SHELL=/bin/zsh", "TERM=screen.xterm-256color", "XDG_SESSION_ID=c1", "XDG_RUNTIME_DIR=/run/user/0", "DBUS_SESSION_BUS_ADDRESS=unix:pa"..., "XDG_SESSION_TYPE=tty", "XDG_SESSION_CLASS=user", "SSH_CLIENT=10.0.2.2 42556 22", "SSH_CONNECTION=10.0.2.2 42556 10"..., "SSHTTY=/dev/pts/0", "SHLVL=1", "PWD=/root", "OLDPWD=/root", "=/usr/bin/strace", "LD_LIBRARY_PATH=/ro/bash-amd64-5"..., "PERL5LIB=/ro/bash-amd64-5.0-4/ou"..., "PYTHONPATH=/ro/bash-amd64-5.0-4/"...]) = 0
2 arch_prctl(ARCH_SET_FS, 0x40c878) = 0
3 set_tid_address(0x40ca9c) = 715
4 brk(NULL) = 0x15b9000
5 brk(0x15ba000) = 0x15ba000
6 brk(0x15bb000) = 0x15bb000
7 brk(0x15bd000) = 0x15bd000
8 brk(0x15bf000) = 0x15bf000
9 brk(0x15c1000) = 0x15c1000
10 execve("/ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli", ["/ro/teensy-loader-cli-amd64-2.1+"...], ["USER=root", "LOGNAME=root", "HOME=/root", "PATH=/ro/bash-amd64-5.0-4/bin:/r"..., "SHELL=/bin/zsh", "TERM=screen.xterm-256color", "XDG_SESSION_ID=c1", "XDG_RUNTIME_DIR=/run/user/0", "DBUS_SESSION_BUS_ADDRESS=unix:pa"..., "XDG_SESSION_TYPE=tty", "XDG_SESSION_CLASS=user", "SSH_CLIENT=10.0.2.2 42556 22", "SSH_CONNECTION=10.0.2.2 42556 10"..., "SSHTTY=/dev/pts/0", "SHLVL=1", "PWD=/root", "OLDPWD=/root", "=/usr/bin/strace", "LD_LIBRARY_PATH=/ro/bash-amd64-5"..., "PERL5LIB=/ro/bash-amd64-5.0-4/ou"..., "PYTHONPATH=/ro/bash-amd64-5.0-4/"...]) = 0
Confirm which ELF interpreter is set for a binary using
distri0# readelf -a /ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli | grep 'program interpreter'
[Requesting program interpreter: /ro/glibc-amd64-2.31-4/out/lib/ld-linux-x86-64.so.2]
Confirm the rpath is set to the package’s lib subdirectory using
distri0# readelf -a /ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli | grep RPATH
0x000000000000000f (RPATH) Library rpath: [/ro/teensy-loader-cli-amd64-2.1+g20180927-7/lib]
…and verify the lib subdirectory has the expected symlinks and target versions:
distri0# find /ro/teensy-loader-cli-amd64-*/lib -type f -printf '%P -> %l\n'
libc.so.6 -> /ro/glibc-amd64-2.31-4/out/lib/libc-2.31.so
libpthread.so.0 -> /ro/glibc-amd64-2.31-4/out/lib/libpthread-2.31.so
librt.so.1 -> /ro/glibc-amd64-2.31-4/out/lib/librt-2.31.so
libudev.so.1 -> /ro/libudev-amd64-245-11/out/lib/libudev.so.1.6.17
libusb-0.1.so.4 -> /ro/libusb-compat-amd64-0.1.5-7/out/lib/libusb-0.1.so.4.4.4
libusb-1.0.so.0 -> /ro/libusb-amd64-1.0.23-8/out/lib/libusb-1.0.so.0.2.0
To verify the correct libraries are actually loaded, you can set the
environment variable for
distri0# LD_DEBUG=libs teensy_loader_cli
678: find library=libc.so.6 ; searching
678: search path=/ro/teensy-loader-cli-amd64-2.1+g20180927-7/lib (RPATH from file /ro/teensy-loader-cli-amd64-2.1+g20180927-7/out/bin/teensy_loader_cli)
678: trying file=/ro/teensy-loader-cli-amd64-2.1+g20180927-7/lib/libc.so.6
NSS libraries that distri ships:
find /lib/ -name "libnss_*.so.2" -type f -printf '%P -> %l\n'
libnss_myhostname.so.2 -> ../systemd-amd64-245-11/out/lib/libnss_myhostname.so.2
libnss_mymachines.so.2 -> ../systemd-amd64-245-11/out/lib/libnss_mymachines.so.2
libnss_resolve.so.2 -> ../systemd-amd64-245-11/out/lib/libnss_resolve.so.2
libnss_systemd.so.2 -> ../systemd-amd64-245-11/out/lib/libnss_systemd.so.2
libnss_compat.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_compat.so.2
libnss_db.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_db.so.2
libnss_dns.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_dns.so.2
libnss_files.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_files.so.2
libnss_hesiod.so.2 -> ../glibc-amd64-2.31-4/out/lib/libnss_hesiod.so.2