Build Reproducibility Workshop Report

Clemens Lang cal at macports.org
Wed Dec 9 12:31:39 PST 2015


Hello Rainer, hello Mojca,

On Wed, Dec 09, 2015 at 12:14:42PM +0100, Rainer Müller wrote:
> > To track down the differences that cause builds to be non-reproducible,
> > a couple of people from the Debian reproducible builds effort have
> > written diffoscope [5], a python diff tool that will interpret file
> > formats and try hard to give you a human-readable difference between two
> > files. Support for Mach-O binaries is available as a patch at [6] (and I
> > hope to push it upstream soon). This tool could also be helpful to look
> > at differences in stealth updates.
> 
> This looks interesting. Anyone up for creating a port? :-)

I actually started to write the Portfile, but abandoned my efforts after
I noticed that I had missed dependencies and that diffoscope would run
easily in a virtualenv. Look for py-diffoscope in the repository history,
though: I did commit the port without its dependencies (and then deleted
it again).

https://github.com/Homebrew/homebrew/blob/master/Library/Formula/diffoscope.rb
might help in getting you started.

We'd probably want to apply my Mach-O patch. Upstream plans to merge it
but expects some delay.


> Subversion can be configured to apply the commit timestamps to files
> (use-commit-times=yes). We could use that on the server creating the
> port tarball for rsync. Although I would prefer a more generic
> solution.

I've checked the portindex generation code, and this would actually work
on the server. Enabling this setting in local Subversion working copies,
however, would lead to ports not being reindexed when they should be in
some situations, so it is not really a good solution there.


> Previously I wanted to abandon keywords, but taking the date from the
> expanded $Id$ in the Portfile would probably be quite handy for
> this...

Yes. It is, however, a solution that would only work for Subversion. We
could also enable the timestamp exporting on the rsync server (where it
wouldn't cause problems with the portindex), and decide depending on the
type of the ports tree which method to use:
 - if it's git/git-svn, use git
 - if it's svn, use svn info
 - if it's neither, use file mtime synced from rsync server
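
As a rough sketch of that decision (the function name and the check for
.git/.svn entries at the tree root are my assumptions, not existing
MacPorts code), something like this would do:

```python
import os

def detect_tree_type(tree_root):
    """Guess how a ports tree was obtained from its metadata entries."""
    # A git or git-svn checkout has a .git entry at its root
    # (a directory, or a plain file for worktrees).
    if os.path.exists(os.path.join(tree_root, ".git")):
        return "git"
    # Assumption: the Subversion working copy root is the tree root.
    if os.path.isdir(os.path.join(tree_root, ".svn")):
        return "svn"
    # Neither: assume the tree was synced from the rsync server,
    # so file mtimes are the only timestamp source.
    return "rsync"
```

The returned string would then select between asking git, running svn
info, or falling back to the file mtime.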

> >  - The newest timestamp inside a source code tree is a good choice (and
> >    https://github.com/0-wiz-0/findnewest could easily give us that
> >    timestamp), but sources fetched from version control systems do not
> >    always set it to the time of the commit (AFAIK Git doesn't, for
> >    example).
> >  - A fixed value of 0 or 1 is not a very good choice.
> > We could put an additional piece of metadata into Portfiles to be used
> > as timestamp (e.g. just like we have checksums). It is my understanding
> > that FreeBSD will choose to go this route.
> 
> Hm, that would be an additional value that always needs to be
> increased by the maintainer? That sounds error-prone.

Well, as long as the value is newer than any source files, not
incrementing it would not cause problems. I agree that manually managing
this sounds tedious, though.

Maybe this would work: Depending on the fetch type, choose a method to
determine the SOURCE_DATE_EPOCH. For
 - tarballs, use findnewest to get the newest source file mtime
 - svn, use svn info to look up the date of the newest commit (or
   checkout preserving timestamps)
 - git, use the commit date of the latest commit
 - ...
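
For the tarball case, here is a minimal sketch of what "newest source
file mtime" means. It is a stand-in for findnewest, not its actual
implementation, and the function name is made up:

```python
import os

def newest_mtime(source_root):
    """Return the newest file mtime below source_root as whole seconds
    since the epoch, or None if the tree contains no files."""
    newest = None
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat, so that a dangling symlink does not raise an error
            mtime = int(os.lstat(path).st_mtime)
            if newest is None or mtime > newest:
                newest = mtime
    return newest
```

The result, converted to a string, could then be exported as
SOURCE_DATE_EPOCH for the build of the extracted source directory.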


> > Homebrew achieves good binary package coverage for non-default prefixes
> > by scanning the build results for $prefix. In library load commands, the
> > path is changed using install_name_tool(1) on installation locally, in
> > text files, the path is simply changed. If $prefix is found in a binary
> > file, the archive is marked as non-prefix-invariant and ignored by
> > non-default prefix installations.
> 
> Such a check to verify the build results is something I always wanted
> post-destroot. Not necessarily to fix it, but just to validate the build
> result. But the implementation to do such analyses is still only in the
> diverged branch gsoc11-post-destroot.

ACK

> We already have been using -headerpad_max_install_names for a few
> years, but never dared to go further and actually fix this in a
> compiled port archive. I am surprised this works for them.

So was I, which is why I included the information. Mike told me they
actually get decent coverage using this approach.
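
To make the scanning step concrete, here is a simplified sketch. The
function name and the NUL-byte heuristic for telling text from binary
files are my assumptions, and it does not parse Mach-O load commands, so
it is more conservative than what Homebrew does:

```python
import os

def classify_prefix_hits(destroot, prefix):
    """Map each file below destroot that contains prefix to 'text' or
    'binary'. Keys are paths relative to destroot."""
    needle = prefix.encode()
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(destroot):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as f:
                data = f.read()
            if needle not in data:
                continue
            # Heuristic (assumption): a NUL byte means the file is binary.
            kind = "binary" if b"\0" in data else "text"
            hits[os.path.relpath(path, destroot)] = kind
    return hits
```

Files reported as 'text' could have the path rewritten on installation.
Any 'binary' hit would need a closer look to see whether the prefix only
occurs in load commands that install_name_tool(1) can change.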


> On a related side note, all of this would be good additions for
> MacPorts base. However, currently our trunk is not in the best shape
> after merging the feature branches from last GSoC. The remaining bugs
> in the interactive mode and the new reclaim should be fixed first
> before beginning new work. Otherwise I fear we will not reach a state
> where we could split off a new release. Maybe we should make a real
> release plan with milestone targets to make progress?

Yes, I'm aware of the instabilities. I try to work on these things from
time to time, but other tasks have been keeping me from fixing more of
them. I would certainly welcome a plan, but I'd say our infrastructure
has priority at the moment.


On Wed, Dec 09, 2015 at 12:55:16PM +0100, Mojca Miklavec wrote:
> If we stick with SVN, one could extract "Last Changed Date" from "svn
> info" on the directory. An example where this wouldn't work too well
> are subports (where only one subport changes, but not the others) or
> random edits, even if only for whitespace changes on the Portfile. I
> don't know how problematic those random edits are in general.

There are already a couple of cases where we see non-reproducibility
even though nothing important changed. For example, if you upgrade a
library dependency to the next minor version (where a rebuild against
the new version isn't required), you would not expect any differences in
the binary if you rebuild. However, due to Apple's decision to always
copy the current version of a library you link against into the binary's
load commands, your build results will actually differ.
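
You can see this in the output of otool -L. The following sketch parses
that output and optionally drops the current version, so that two builds
can be compared without it. It assumes the usual "(compatibility version
X, current version Y)" line format; the function name is made up:

```python
import re

# Matches one dependency line of `otool -L` output, for example:
#   /opt/local/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.8)
_DEP_RE = re.compile(
    r"^\s+(?P<path>.+) \(compatibility version (?P<compat>[\d.]+), "
    r"current version (?P<current>[\d.]+)\)$"
)

def linked_libraries(otool_output, keep_current=True):
    """Return a list of (path, compat[, current]) tuples, one per
    dependency line found in otool_output."""
    deps = []
    for line in otool_output.splitlines():
        m = _DEP_RE.match(line)
        if m is None:
            continue  # e.g. the header line naming the inspected file
        entry = (m.group("path"), m.group("compat"))
        if keep_current:
            entry += (m.group("current"),)
        deps.append(entry)
    return deps
```

Comparing the results for two builds with keep_current=False ignores
exactly the difference described above.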

> Clemens, I guess we can count on a talk (and/or discussion/hacking
> session) about all this at the meeting?
> http://trac.macports.org/wiki/Meetings/MacPortsMeeting2016
> Feel free to make an entry somewhere on the page.

Yes.

-- 
Clemens

