GSoC Proposal

Mojca Miklavec mojca at macports.org
Wed Apr 12 22:37:55 UTC 2017


Hi,

Just to clarify one thing (also for any other GSoC applicant): this
discussion does not provide any indication of the ranking of the
submitted proposals, or of whether this or any other proposal might or
might not be accepted. There might be better or worse applications
that need less clarification. Independent of whether or not the
proposal is accepted, I felt that this particular idea needed some
further discussion & clarification.

Another disclaimer to Zero King: I'm clearly not an authority, and my
thoughts might not fully match the views of other developers, so
please don't take all of my words or ideas at face value. This is
supposed to trigger discussion, not to be suggestions that one should
blindly follow.

I'm also aware that it makes absolutely no sense to convince someone
to go in directions that don't sound like fun to them, so consider
this just brainstorming for now.

On 2 April 2017 at 14:21, Zero King wrote:
>
> Travis won't keep the VMs up. Every build gets a fresh VM (no need to
> cleanup) and could only run for 50 minutes so if a build takes too long it
> will fail anyway.
>
> You're right about high server load. Since we already have the Buildbot
> infrastructure I'll only do install tests in PRs. Lint tests will be done
> for both commits and PRs. I'll limit the number of ports to test so if a PR
> touches too many ports it'll only do lint tests. We can also use tags like
> [ci-skip] in PR title or label to skip some tests.
>
> Installing from packages.macports.org would avoid rebuilding distributable
> ports after minor edits.
>
> Travis is unlikely to ban us but if that happens we can keep only macOS
> 10.12 or only the lint tests (which can even be done on Linux) and contact
> them to unban us.

I was talking to someone from the HB community. Apparently they tried
very hard to make Travis CI work. One of the major problems they had
other than the time limit (the builder is relatively slow and jobs get
killed after 50 minutes, so any builds like Qt would fail of course)
was the fact that it was impossible to upload the binaries from
"untrusted pull requests" (not sure what that means, in any case, it
didn't work for them). They set up their own build infrastructure
almost from scratch. We clearly don't need to upload the result
anywhere, but the limitations will stay in any case. I would find it
valuable if we put our effort into something that can be expanded.

Another option that has been mentioned is CircleCI (no restriction on
uploading files), but that one has an additional limitation of only
500 free minutes per month for open-source organisations, so most
likely nowhere near enough.

I was thinking about two aspects:
(a) It would in fact be helpful to have some build jobs running
automatically for PRs
(b) But we also need something to test complex builds: exactly those
that maintainers can hardly afford to build and test on their own
machines.


>> The alternative can always be to use our own server for builds, but of
>> course that brings other problems to the table (mostly security) and
>> requires a completely different approach.
>
> Yes, mostly security considerations. We'll have to use VMs and revert to
> snapshot after every build in our own servers. Managing VMs and sending data
> around securely would be painful to setup.

From what I understand, there are two different kinds of security issues:

(a) People being able to submit arbitrary code as pull requests

(b) Most committers are probably not really inspecting the diffs or
full sources of packages. In theory, someone could write a genuinely
useful open-source tool that we package. Once people start using it,
some problematic code gets introduced, we happily upgrade the port,
and both our main builder and any user installing that port would be
affected.


We currently don't have a problem with (a) because we don't build pull
requests, and with Travis we would simply outsource the problem to
someone else. For (b) we currently don't have any reliable solution,
but using Travis would not actually help since a maintainer would
probably eventually merge the code (if it looks reasonable). And then
the code from official repository would be built on the main builder.


I'm thinking of an alternative approach that would cover another use
case (which cannot be solved by Travis CI due to limitations).

Use cases I have in mind:
- developer testing a new release of Qt; or a bunch of KDE apps
- developer trying to build all the 1000+ perl modules once perl5.26
gets released
- developer without access to older machines, trying to figure out why
the build fails on 10.6; having some clue why that could be, but
trying to avoid doing ten semi-random commits to the main repository
just to test the hypothesis

What we *could* do is to set up a few more build slaves on the
"existing" infrastructure (the build slaves could actually be
anywhere, people could even volunteer to provide access to their
machines; but I guess the idea would be to eventually set up a few new
ones next to the existing slaves). The build master could either run
in the same instance or separately.

The idea would be that people with commit access would have to
manually approve building of a particular pull request (and then that
pull request would get a green icon of course once the build is
complete). When a pull request gets updated/modified, the build could
be tested again, but it would have to be triggered manually again.

For each manually approved/triggered build the build slave would:
- clone the repository
- (?) most likely rebase that on top of master to make sure that the
latest packages of everything would be used at the time of build
- determine which ports have been modified
- maybe run lint as an initial quick test (?)
- build those ports in a very similar way to what the buildbot does at
this moment, with a few exceptions:

  (a) packages from other/existing ports would be fetched from the
main builders, perhaps even when they are not distributable (we could
have private copies of binary packages)
  (b) the resulting packages would be stored between individual builds
that belong to the same PR, but they would not be uploaded anywhere
and both sources and binaries (basically anything that has been
fetched or modified) would be deleted once PR testing is complete
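For illustration, the "determine which ports have been modified" step
could be sketched roughly like this in Python (assuming the
macports-ports layout of category/portname/Portfile; the function name
is made up, not existing mpbb code):

```python
# Hypothetical sketch: given the file paths touched by a PR (e.g. from
# `git diff --name-only master...pr-branch`), derive the set of ports
# that need to be rebuilt.  Assumes the macports-ports tree layout:
# category/portname/Portfile, category/portname/files/...
def modified_ports(changed_paths):
    ports = set()
    for path in changed_paths:
        parts = path.split("/")
        # A port lives two levels deep; top-level files like the
        # README are ignored.
        if len(parts) >= 3:
            ports.add("/".join(parts[:2]))
    return ports

if __name__ == "__main__":
    paths = [
        "lang/perl5/Portfile",
        "lang/perl5/files/patch-Makefile.diff",
        "aqua/qt5/Portfile",
        "README.md",
    ]
    print(sorted(modified_ports(paths)))  # ['aqua/qt5', 'lang/perl5']
```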


All we need to implement this is:
- The code (Bash/Tcl) that removes anything that has been downloaded
and built while testing the pull request. As a first approximation we
could actually remove *all* binaries and all distfiles, provided we
have a nearby mirror from which we can fetch an up-to-date version of
everything.
- Some glue code that knows how to communicate between GitHub and
BuildBot: which jobs to submit, how to report success etc.
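As a very rough sketch of that first approximation (the subdirectory
names below are my assumption of a typical MacPorts prefix layout, not
necessarily the exact paths on the builders):

```python
import os
import shutil

# Blunt first-approximation cleanup after a PR build: wipe everything
# that was fetched or built, relying on a nearby mirror to re-fetch it
# next time.  The subdirectory names are assumptions modelled on a
# typical MacPorts prefix; adjust them for the real installation.
def clean_build_leftovers(prefix, subdirs=("var/macports/distfiles",
                                           "var/macports/software")):
    removed = []
    for sub in subdirs:
        path = os.path.join(prefix, sub)
        if os.path.isdir(path):
            shutil.rmtree(path)   # delete the whole tree
            os.makedirs(path)     # recreate it empty for the next build
            removed.append(path)
    return removed
```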

The glue code might take some time to implement, but it sounds like
something in the same direction as what you already proposed with that
"go" bot. To be honest, I don't yet know exactly what would be needed
on that part.
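Just to give an idea of the GitHub side of such glue code, here is a
sketch using the commit-status API (POST
/repos/:owner/:repo/statuses/:sha); the repo name, context string and
helper names are invented for illustration, and token/error handling
is omitted:

```python
import json
import urllib.request

# Sketch: report a build result back to a pull request's head commit
# via GitHub's commit-status API.  The "context" string distinguishes
# our status from other CI systems; it is a made-up name here.
def status_payload(state, description, target_url):
    assert state in ("pending", "success", "failure", "error")
    return {
        "state": state,
        "description": description,
        "target_url": target_url,      # e.g. a link to the buildbot log
        "context": "buildbot/pr-test",
    }

def post_status(repo, sha, token, payload):
    # repo would be something like "macports/macports-ports"
    req = urllib.request.Request(
        "https://api.github.com/repos/%s/statuses/%s" % (repo, sha),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": "token " + token,
                 "Content-Type": "application/json"},
        method="POST")
    return urllib.request.urlopen(req)
```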

One of the quick & dirty options would even be to add an additional
custom field (like the "Port list" field on
https://build.macports.org/builders that allows rebuilding an
arbitrary port) saying "PR Number", where a developer would enter the
number of the pull request to be built. Another alternative would be
to write a simple web interface where developers could log in with
OAuth and click a build button for open pull requests. Just
brainstorming.
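For that quick & dirty variant, a master.cfg fragment along these
lines might be all that is needed (a sketch assuming the Buildbot
0.8-style forcesched API; the scheduler and builder names are
invented):

```python
# master.cfg fragment (sketch): a force scheduler with a custom
# "PR Number" field, analogous to the existing "Port list" field.
from buildbot.schedulers.forcesched import ForceScheduler, StringParameter

c['schedulers'].append(ForceScheduler(
    name="pr-test",
    builderNames=["ports-10.12-x86_64"],   # hypothetical builder name
    properties=[
        # A developer fills in the pull request number; a build step
        # would then fetch and check out that PR's head before building.
        StringParameter(name="pr_number", label="PR Number", default=""),
    ]))
```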



One of the things I really miss is a website with easy access to a
build summary on a per-port basis:
- https://trac.macports.org/ticket/51995#Statistics
(This might even be possible to implement as a special view right on
the buildbot.)


Please don't get me wrong: I'm actually all for the idea of getting
builds running on Travis. It would be super helpful to have an
"alternative implementation". But I would find it even more intriguing
if we kept the liberty of being able to run PRs on exotic setups like
10.6/libc++ and to do test builds of Qt, KDE and other huge projects.
(Or have both? :)) But maybe other developers disagree?

Many of the building blocks (other than the integration with GitHub)
are ready or nearly ready. There are also a number of buildbot-related
issues that are waiting for someone to pick up the work.

Working with buildbot requires Python. Fixing the mpbb scripts
requires some Bash and Tcl in any case, but probably to the same
extent as working with Travis would.

Independently of that: it would be helpful to hear a better
justification/defence of selecting Go as the language for the "helper
bot".

Also, we would still have to find the main mentor for this project in
case it gets selected (or rather the other way around: having a mentor
assigned before the end of the evaluation period is a strict
prerequisite to even qualify). That's mainly a question for the other
developers on the list. We do have some volunteers for backup mentors.

Mojca
