Build Reproducibility Workshop Report
cal at macports.org
Tue Dec 8 14:48:35 PST 2015
Hello fellow MacPorts people,
Brace yourselves, this mail is going to be long.
As some of you may know, I recently represented MacPorts at a
workshop on build reproducibility in Athens organized by the Debian
folks. A wide range of projects and package management systems was
represented; the full list is available on the website. The *BSDs
and Homebrew (Mike McQuaid) are probably most relevant to our interests.
If you are not sure why we should care about reproducible builds, let me
assure you there are plenty of reasons. A more detailed rationale was
prepared at the workshop and should be available at  soon, but I'll
take the time to mention a few points. Note that "reproducible builds"
in this context always means bit-by-bit reproducibility, unless
explicitly stated otherwise.
- Source <-> Binary Correspondence: Reproducible builds allow
developers to verify that what our buildbot is serving us is actually
what it claims to be.
- Attack Surface Reduction: Having reproducible builds reduces the
motivation to attack our buildbot setup, because modifications can be
detected.
- Caching: You can avoid rebuilds of packages that build reproducibly
if the inputs haven't changed. This doesn't seem to be *that*
important to us at the moment, but is a big selling point, especially
for commercial software development.
- Delta Reduction: With reproducible builds, small changes in source
will be more likely to cause small changes in the resulting binary.
This could be used to allow binary-delta updating, reducing download
time, bandwidth requirements, and update time.
- Support Burden Reduction: Build reproducibility can provide
confidence that a user's build is exactly what a packager intended
and rule out a whole class of bugs.
How to Build Reproducibly
The process to get reproducible builds is pretty well-understood. The
reproducible-builds.org documentation outlines the most common problems
and issues that prevent reproducible builds. For the most part, all
distributions face the same issues, which allows us to build on the
effort of projects with more manpower, like Debian. There are a
couple of points that might not be obvious or are easily overlooked that
I'd
like to point out:
- Filesystem ordering and locale-dependent sorting: Relying on the
order of files that readdir(3) returns makes builds unreproducible.
Sorting those files will only help if the sort result doesn't differ
between locales.
- Timestamps are everywhere and are responsible for a large part of
unreproducible builds. Using __DATE__, __TIME__, __TIMESTAMP__, or
similar macros should be avoided. Version numbers or version control
system information are much better replacements: If your build is
reproducible, it does not matter *when* it happened. However, lots of
tools include timestamps by default, such as gzip(1) when compressing
our manpages (r143068) or tar(1) when creating our binary archives.
Strategies to solve these problems exist, e.g. by providing a ceiling
value for all time stamps while creating a tarball or using the
environment variable SOURCE_DATE_EPOCH for date-dependent macros.
- Well-defined build environments: Pretty much the rest of the world
has good OS-level support for a chroot(2)-like mechanism that can be
used to provide a build environment that only contains inputs from a
controlled list of dependencies. FreeBSD has jails, Linux has
namespaces, but the only thing OS X supports in this direction are
chroots, and those have a reputation of breaking some of Apple's
tools like xcodebuild (a reputation I may set out to verify or
falsify). Trace mode is a step in the right direction, but
doesn't catch everything and is very slow compared to other methods.
To make matters more complicated, we rely on Apple's
toolchain, which can be updated and/or changed independent of
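
The first two points above (sorted file order and a ceiling value for
timestamps) can be sketched in a few lines of Python. This is only an
illustration, not what MacPorts base does: the function name
reproducible_tarball and its parameters are made up for this example,
and file modes are left untouched, so a differing umask would still
show up in the result.

```python
import gzip
import os
import tarfile


def reproducible_tarball(src_dir, out_path, epoch):
    """Write src_dir to out_path as a .tar.gz whose bytes depend only on
    file names, contents, modes, and the ceiling timestamp `epoch`."""

    def normalize(info):
        # Clamp mtimes to the ceiling and drop builder-specific ownership.
        info.mtime = min(int(info.mtime), epoch)
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    # Collect all paths and sort them by their raw bytes, so that neither
    # the readdir(3) order nor the current locale influences the archive.
    paths = []
    for root, dirs, files in os.walk(src_dir):
        for name in dirs + files:
            paths.append(os.path.relpath(os.path.join(root, name), src_dir))
    paths.sort(key=os.fsencode)

    with open(out_path, "wb") as raw:
        # mtime=0 and an empty filename keep build-time data out of the
        # gzip header.
        with gzip.GzipFile(filename="", mode="wb", fileobj=raw, mtime=0) as gz:
            with tarfile.open(fileobj=gz, mode="w",
                              format=tarfile.GNU_FORMAT) as tar:
                for rel in paths:
                    tar.add(os.path.join(src_dir, rel), arcname=rel,
                            recursive=False, filter=normalize)
```

Calling this twice on the same tree with the same epoch should yield
byte-identical archives, even if the files were touched in between.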
Testing Build Reproducibility
In order to find out whether a build can be reproduced, it should be
done multiple times, with possibly varying input settings. The more
input and environment settings can be modified without the build result
changing, the higher the reproducibility. Debian has a couple of
machines available and runs a Jenkins setup that will build each package
twice but vary a couple of settings for the second build, such as:
hostname, domainname, environment variables (TZ, LANG, LC_ALL, PATH),
UID/GID, Kernel version, umask, CPU type, current time (by a large
amount to trigger changes in year, month and day regardless of
timezone), and filesystem sort order (by using a FUSE filesystem that
will make readdir(3) return different results). While Debian's setup is
available for use by other projects, it is of little use to us because
OS X cannot be virtualized on non-Apple hardware without violating the
EULA. One of the biggest hurdles towards systematic testing for build
reproducibility in MacPorts (and Homebrew as well, btw) is thus the
availability of Apple hardware.
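
The "build twice and compare" idea can be sketched on a much smaller
scale. The following is not Debian's Jenkins setup; it is a minimal
illustration that only varies environment variables (not hostname,
UID/GID, kernel, clock, or filesystem order), and the names
builds_identically, command, artifact and variations are hypothetical.

```python
import hashlib
import os
import subprocess


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at path."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def builds_identically(command, artifact, variations):
    """Run `command` once per environment variation and report whether
    `artifact` has the same SHA-256 digest after every run."""
    digests = set()
    for extra_env in variations:
        # Start from the current environment and override selected values.
        env = dict(os.environ, **extra_env)
        subprocess.run(command, env=env, check=True)
        digests.add(sha256_of(artifact))
    return len(digests) == 1
```

A call could pass something like [{"TZ": "UTC", "LANG": "C"},
{"TZ": "Asia/Tokyo", "LANG": "fr_FR.UTF-8"}] as variations; a result of
False means that at least one of the varied settings leaks into the
artifact.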
To track down the differences that cause builds to be non-reproducible,
a couple of people from the Debian reproducible builds effort have
written diffoscope, a Python diff tool that will interpret file
formats and try hard to give you a human-readable difference between two
files. Support for Mach-O binaries is available as a patch at  (and I
hope to push it upstream soon). This tool could also be helpful to look
at differences in stealth updates.
State of Reproducible Builds in MacPorts
Despite the several obstacles mentioned above, build reproducibility in
MacPorts is actually not a lost cause. This is partly because we have
historically always tried to keep a clean and similar build environment
across machines, e.g. by using privilege separation, removing all but a
few white-listed environment variables and trace mode. Timestamps are
our biggest issue on the road towards reproducible tarballs at the
moment. In a sloppy test done by Marius Schamschula and me, we managed
to reproduce our builds of bash down to timestamp issues in gzip headers
and tarball metadata. Unfortunately, generating statistics on
reproducibility requires buildserver support.
To fix the timestamp issues, I am looking for a suitable value to use as
SOURCE_DATE_EPOCH. I would then add a find statement before creating the
archive that puts an upper mtime limit on all files to be packaged.
I am not yet sure what a good (reproducible!) timestamp might be:
- The Portfile mtime would be perfect, but is not preserved by
Subversion, so we cannot rely on it. It is preserved by our rsync
sync, but the mtime in that is probably meaningless since it's the
one generated on the rsync server during svn update.
- The newest timestamp inside a source code tree is a good choice (and
https://github.com/0-wiz-0/findnewest could easily give us that
timestamp), but sources fetched from version control systems do not
always set it to the time of the commit (AFAIK Git doesn't, for
example).
- A fixed value of 0 or 1 is not a very good choice.
We could put an additional piece of metadata into Portfiles to be used
as timestamp (e.g. just like we have checksums). It is my understanding
that FreeBSD will choose to go this route.
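
To make the timestamp idea concrete, here is a minimal sketch of two
helpers: one that finds the newest mtime in a source tree (the same
idea as findnewest, but not its code) and one that puts an upper mtime
limit on everything below a directory. The function names are made up
for this example, symlinks are skipped for simplicity, and the access
time is set to the same value as the mtime.

```python
import os


def newest_mtime(tree):
    """Return the largest mtime (in whole seconds) of any file in tree."""
    newest = 0
    for root, _dirs, files in os.walk(tree):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            newest = max(newest, int(st.st_mtime))
    return newest


def clamp_mtimes(tree, epoch):
    """Lower every mtime in tree that is newer than epoch to epoch."""
    paths = [tree]
    for root, dirs, files in os.walk(tree):
        paths.extend(os.path.join(root, name) for name in dirs + files)
    for path in paths:
        if os.path.islink(path):
            continue  # assumption: symlink timestamps are left alone
        if os.lstat(path).st_mtime > epoch:
            os.utime(path, (epoch, epoch))
```

The value chosen this way could then be exported as a decimal number of
seconds, e.g. os.environ["SOURCE_DATE_EPOCH"] = str(epoch), before the
build and archive steps run.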
I've learned that our builds of GHC and all Haskell modules are likely
ABI-incompatible when downloaded from the buildbot vs. built locally.
We should disable parallel building for Haskell to fix this until
upstream provides a better solution. Luckily, this hasn't largely
affected us yet, because binary availability in the Haskell land is
Homebrew achieves good binary package coverage for non-default prefixes
by scanning the build results for $prefix. In library load commands, the
path is changed using install_name_tool(1) when installing locally; in
text files, the path is simply replaced. If $prefix is found in a binary
file, the archive is marked as non-prefix-invariant and ignored by
non-default prefix installations.
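
The scanning step could look roughly like the sketch below. This is not
Homebrew's code; classify_prefix_use is a hypothetical name, and the
"contains a NUL byte" test for binary files is an assumption made for
this example. It also does not parse Mach-O load commands, so a library
that mentions the prefix only there would show up as a binary hit here,
although it could be fixed with install_name_tool(1).

```python
import os


def classify_prefix_use(tree, prefix):
    """Return (text_hits, binary_hits): files under tree that mention
    prefix in text data and in binary data, respectively."""
    needle = os.fsencode(prefix)
    text_hits, binary_hits = [], []
    for root, _dirs, files in os.walk(tree):
        for name in files:
            path = os.path.join(root, name)
            if os.path.islink(path):
                continue
            with open(path, "rb") as f:
                data = f.read()
            if needle not in data:
                continue
            # Heuristic (an assumption): a NUL byte marks a binary file.
            if b"\0" in data:
                binary_hits.append(path)
            else:
                text_hits.append(path)
    return sorted(text_hits), sorted(binary_hits)
```

An archive with an empty binary_hits list would be a candidate for
installation into a different prefix after rewriting the files in
text_hits.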
Homebrew has methods to provide compiler wrappers that ensure that build
systems are UsingTheRightCompiler, and additionally ensure that the
compiler flags are set as expected (e.g. -arch flags, -stdlib flag for
Google's Blaze (Open Source: Bazel) build system supports license
annotation on build results and license compatibility analysis. Their
approach to the problem might be interesting input for the set of
scripts we use to determine whether a binary archive is distributable.
I'd like to thank portmgr@ for giving me the chance to represent the
MacPorts Project at this event.
Travel and Accommodation have been sponsored by the Linux Foundation.
Conference Location and Moderation have been sponsored by the Open
Dinner has been provided by Google ;-)