Force build on wrong builder
ryandesign at macports.org
Thu Oct 12 12:37:06 UTC 2017
On Oct 9, 2017, at 11:49, Zero King wrote:
> Did you trigger a force build on the wrong builder? I started
> another one on 10.13.
>  https://build.macports.org/builders/ports-10.12_x86_64-watcher/builds/8476
>  https://build.macports.org/builders/ports-10.13_x86_64-watcher/builds/427
It was intentional, but didn't have the effect I expected.
The High Sierra builder has been busy building ports as commits come in. It builds and deploys binaries not only for the port being modified, but also for all of its dependencies recursively. This is a good way to get started, so that we provide binaries for things that are actively being updated.
During lulls, I've been scheduling small batches of additional ports to be built. For example I built all the gcc and clang compilers because they are large and time-consuming to compile; I did all the postgresql and mysql databases; I did xorg since that would recursively build all xorg ports; and I did all the third-party php modules since there are just a few of them and they're mostly standalone. Then I moved on to batches of perl modules, python modules and libraries beginning with a, then b, then c, etc. Each small batch takes some hours to do, but afterward builds from commits can resume. In this way, builds from commits are somewhat delayed, but not too much. So that's one reason why I try to schedule small batches of ports rather than large ones, but there's another reason too.
The buildbot (specifically, portwatcher) is smart enough that, when it is asked to build a batch of ports, it only schedules (portbuilder) builds for those ports and subports which haven't already been built. Then it works through that set of scheduled builds before moving on to the next batch. The check for whether a port is already built happens only once per batch (in portwatcher), before any port from that batch is built, not at any other time during that batch (i.e. not in portbuilder).
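To make the once-per-batch check concrete, here is a rough Python sketch. It is not the actual portwatcher code: the function name schedule_batch, the subports mapping and the already_built set are all hypothetical stand-ins for whatever the buildbot really uses.

```python
def schedule_batch(requested, subports, already_built):
    """Return the ports a portwatcher-like step would schedule for one batch.

    Hypothetical model: the "already built?" check runs exactly once,
    here, before any build from the batch starts.
    """
    candidates = []
    for port in requested:
        candidates.append(port)
        # Subports of a requested port are considered too.
        candidates.extend(subports.get(port, []))
    # Only ports with no existing build get a portbuilder build scheduled.
    return [p for p in candidates if p not in already_built]


if __name__ == "__main__":
    subports = {"zlib": ["minizip"]}
    # Fresh builder: nothing built yet, so everything is scheduled.
    print(schedule_batch(["zlib", "xz"], subports, set()))
    # If xz already exists, it is skipped at scheduling time.
    print(schedule_batch(["zlib", "xz"], subports, {"xz"}))
```

The point of the sketch is only that the filter runs at scheduling time; nothing re-checks the list once the individual builds begin.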
For this reason, it is more efficient to build small batches of ports at a time. Imagine a new builder which has not yet built any ports, and you want it to build zlib and xz. zlib has an extract dependency on xz, and xz has a few library dependencies. zlib also has a subport, minizip, which has dependencies on several ports including zlib.
Suppose you schedule a single batch and list both zlib and xz in the portlist field. This can be inefficient. First, buildbot will check if zlib or minizip or xz have already been built. They haven't, so all three ports are scheduled to be built. Currently, builds are inadvertently scheduled in somewhat random order, so it could be that the minizip build occurs first. It will build its dependencies first, which include zlib and xz, then build minizip, and then all installed active ports will be deactivated. Finally, all the binaries will be distributed. Maybe the zlib build got scheduled second, and it installs (well, reactivates) its dependencies. It then wants to build zlib, except it's already been built, so it just reactivates the already-built zlib. It then deactivates all installed ports. All the binaries have already been distributed, so nothing needs to be done there either. This build was a waste of time, since all the work this build would have done was already done by the previous build. Finally we get to xz which happened to be scheduled third. Much the same thing happens: dependencies reactivate, xz reactivates, everything deactivates, nothing needs to be distributed. Waste of time.
Now suppose that instead, you schedule two batches of ports: First zlib, then xz. The first batch proceeds similarly to the above, with two builds being scheduled, one for zlib and one for minizip. Sure, there's still some wasted time there, if minizip goes first. But when it comes time for the second batch, portwatcher realizes xz has already been built, and does not schedule a build for it. Time saved.
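The two scenarios above can be replayed with a small Python simulation. This is only a toy model under stated assumptions, not buildbot or mpbb code: the dependency table is cut down to the three ports in the example (xz's own library dependencies and minizip's other dependencies are left out), the build order is pinned to minizip, zlib, xz to mirror the unlucky ordering described, and all names (DEPS, SUBPORTS, ORDER, run_batches) are made up for illustration.

```python
# Simplified, hypothetical dependency data for the example ports.
DEPS = {"zlib": ["xz"], "minizip": ["zlib", "xz"], "xz": []}
SUBPORTS = {"zlib": ["minizip"]}
# Assumed unlucky build order within a batch: minizip first.
ORDER = ["minizip", "zlib", "xz"]


def build(port, built):
    """Build a port and, recursively, its dependencies.

    Returns how many ports were newly built (0 means nothing new was done).
    """
    newly = 0
    for dep in DEPS[port]:
        newly += build(dep, built)
    if port not in built:
        built.add(port)
        newly += 1
    return newly


def run_batches(batches):
    """Return (builds scheduled, builds that produced nothing new)."""
    built = set()
    scheduled_total = 0
    wasted = 0
    for batch in batches:
        # The "already built?" check happens once, before the batch starts.
        candidates = []
        for port in batch:
            candidates.append(port)
            candidates.extend(SUBPORTS.get(port, []))
        scheduled = [p for p in candidates if p not in built]
        scheduled_total += len(scheduled)
        for port in sorted(scheduled, key=ORDER.index):
            if build(port, built) == 0:
                wasted += 1
    return scheduled_total, wasted


if __name__ == "__main__":
    print("one batch:  ", run_batches([["zlib", "xz"]]))
    print("two batches:", run_batches([["zlib"], ["xz"]]))
```

In this toy model the single batch schedules three builds, two of which do nothing new, while the two-batch version schedules two builds with one wasted, because xz is filtered out before the second batch begins.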
You can imagine that this gets worse the more ports you schedule at once, and the more interdependencies they have. And we have lots of python and perl modules, and they tend to have lots of interdependencies. So your forced build of all python modules on the High Sierra builder scheduled 2400 builds, which have so far taken 68 hours, during which no builds from commits could take place. We can let it run, because it's almost done now, and we do need to attempt those builds sooner or later, but I suspect many of those builds could have been avoided if they had been scheduled in smaller batches.
So why then did I schedule a build of all python modules on the Sierra builder? We have never scheduled a build of all ports on the Sierra builder, because when reimplementing the buildbot after moving away from macOS forge we never added the capability to just type the word "all" into the portlist field, and because I suspect attempting to build all ports would take a much longer time than on the old system, on which it already took weeks. We have a ticket about adding that capability, and I wanted to see what happened if I requested to build a larger-than-usual number of ports that was still fewer than all ports. I expected that since the Sierra builder has existed for about a year already, most python modules would already have been built due to updates or as dependencies, so I expected maybe a few hundred builds to get scheduled.
What I found was that our implementation of checking whether ports have already been built is inefficient and slow. And I was surprised that 2500 builds got scheduled, which have been building for the past 100 hours. Maybe tons of our python modules are unused and don't get updated. Probably part of the problem was that I had redelivered hook messages that GitHub said failed to deliver (though they probably did deliver) and as a result, some ports that had been built got completely uninstalled when mpbb inadvertently believed they were too old. (They were, in fact, "too new" -- newer than the old commit whose hook message was redelivered.)
Some Sierra binaries are being produced, but not as many as you might hope with 2500 scheduled builds. Maybe most of the ports being built here aren't distributable.
I have some ideas for changes to buildbot/mpbb which might help. I'll bring those up later.