ports with bootstrapping dependencies

Fri Jun 22 12:12:14 PDT 2007

Hello!  I'd like to make a port for MIT Scheme, an implementation of
the Scheme programming language.  The port is somewhat complicated,
though, because of a bootstrapping process.  However, I think that I
have found a way to make the bootstrapping process work without
introducing circular dependencies, which requires only a small change
to MacPorts' dependency specifications.

MIT Scheme is a large program written primarily in Scheme, and based
on a native-code compiler.  Ordinarily this would mean that building
it is impossible without an older version of itself, but its compiler also
has a C back end, and the next snapshot to be released very shortly
will support the construction of distributions with pre-generated C
code by which the whole system can be bootstrapped.  This means that
it is easy to write a port that just fetches a tarball with the
generated C code and bootstraps the whole system from that.

The trouble with this is that it is very slow, and only the C back end
can be built from generated C code.  I'd like to have another port for
the native back end, so I'll call the port with the C back end
`mit-scheme-c', and the port with the native back end
`mit-scheme-native'.  Building Scheme from scratch is slow (requiring
about an hour or two on my 2 GHz MacBook), and building either back
end with the other back end is also slow (C -> native or native -> C).
If the user already has an older version of either back end, then
upgrading to a newer version can run much faster (about ten or twenty
minutes for me).

In an ideal world, I could have two portfiles, something like
<http://people.csail.mit.edu/riastradh/tmp/liarc-Portfile> and
<http://people.csail.mit.edu/riastradh/tmp/native-Portfile>:

mit-scheme-c has three variants, all of which are mutually exclusive,
and which I'll write with a `+' prefix just to visually distinguish
them from port names:

  +from-scratch requires nothing but a C compiler and an operating
    system, fetches a tarball containing pre-generated C files,
    and bootstraps the whole system from that.  Slow.

  +from-c requires mit-scheme-c (usually an older version), fetches a
    pristine source tarball, and just builds the system, without any
    extra intermediate bootstrapping stages.  Decently fast.

  +from-native requires mit-scheme-native, fetches a pristine source
    tarball, and bootstraps the system with an intermediate
    cross-compiler.  (This one isn't actually very useful; it does
    more work than +from-scratch, because it has to prepare what was
    pre-generated for +from-scratch.)

mit-scheme-native has two variants, both of which depend on an
existing installation:

  +from-c requires mit-scheme-c, fetches a pristine source tarball,
    and bootstraps a native compiler from that.  Slow, but this is the
    only way to get a native system to begin with.

  +from-native requires mit-scheme-native (usually an older version),
    fetches a pristine source tarball, and builds the system without
    any extra intermediate bootstrapping stages.  Fast.
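
To make the variant lists above concrete, here is a rough Python
sketch (not MacPorts code) of how one might pick the cheapest variant
given what is already installed.  The function name `pick_variant'
and the idea of choosing automatically are assumptions made purely
for illustration; in the proposal the user picks the variant by hand.

```python
def pick_variant(port, have_c, have_native):
    """Return the cheapest variant of `port`.

    have_c / have_native say whether a working mit-scheme-c or
    mit-scheme-native is already installed.  Hypothetical helper.
    """
    if port == "mit-scheme-c":
        if have_c:
            # Plain build with an existing C-back-end system: decently fast.
            return "+from-c"
        # +from-native does more work than +from-scratch, so it is
        # never preferred here.
        return "+from-scratch"
    if port == "mit-scheme-native":
        if have_native:
            # Plain build with an existing native system: fast.
            return "+from-native"
        if have_c:
            # Slow, but the only way to get a first native system.
            return "+from-c"
        raise LookupError("install mit-scheme-c first")
    raise ValueError("unknown port: " + port)
```

For a first native installation on a clean machine, this gives
`mit-scheme-c +from-scratch' followed by `mit-scheme-native +from-c'.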

Unfortunately, `mit-scheme-c +from-c' and `mit-scheme-native
+from-native' don't work, because each makes a port depend on itself.
I discussed the issue with Juan Manuel Palacios and Eric Hall on IRC,
and was informed that the dependency engine would require an
inordinate amount of work before this kind of nearly circular
dependency would work at all.
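
The problem can be seen by writing the variants down as a table, as in
this small Python sketch.  It assumes that dependencies are tracked
per port name rather than per installed version; the table and the
function name are illustrative and are not part of MacPorts.

```python
# (port, variant) -> port required to build it, or None for no port.
REQUIRES = {
    ("mit-scheme-c", "+from-scratch"): None,
    ("mit-scheme-c", "+from-c"): "mit-scheme-c",
    ("mit-scheme-c", "+from-native"): "mit-scheme-native",
    ("mit-scheme-native", "+from-c"): "mit-scheme-c",
    ("mit-scheme-native", "+from-native"): "mit-scheme-native",
}

def self_dependent(requires):
    """List the (port, variant) pairs whose build requires the port itself."""
    return sorted((port, variant)
                  for (port, variant), dep in requires.items()
                  if dep == port)

for port, variant in self_dependent(REQUIRES):
    print(port, variant)
```

The two pairs it reports are exactly the fast upgrade paths, which is
why simply dropping them is not attractive.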

However, it occurred to me that we could somewhat dangerously work
around the *port* dependency by rendering it into a *binary*
dependency -- that is, the +from-c variants depend on the existence of
a binary called `mit-scheme-c', and the +from-native variants depend
on the existence of a binary called `mit-scheme-native'.  MacPorts do
have a way to specify a dependency on a binary file by some name --
`depends_* bin:foo:bar' --, but it requires a port to be associated
with the file.  What I need for MIT Scheme is *almost* this: if the
file isn't there, I want MacPorts not to try to build the port
associated with it, but just to give up; users shouldn't use the
+from-c or +from-native variants if they aren't available.

What I propose, then, is to allow a dependency of the form
`bin:<filename>', without an associated port (or perhaps with
`I-know-what-I-am-doing' or `just-complain-to-the-user-if-unavailable'
instead of a port).  Although it is slightly dangerous to do this,
it's a very small change that lets us work around what would otherwise
appear to be circular dependencies, and it would mean that upgrading
the MIT Scheme port could take fifteen to thirty minutes, rather than
anywhere from two to four hours.
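
The behaviour I have in mind could be sketched roughly as follows, in
Python rather than in MacPorts' own Tcl.  Everything here is an
assumption for illustration: the function and exception names are
made up, and looking the file up on PATH with `shutil.which' only
stands in for whatever search MacPorts really performs.

```python
import shutil

class DependencyError(Exception):
    """The binary is missing and no port is named to provide it."""

def resolve_bin_dependency(spec, which=shutil.which):
    """Resolve 'bin:<filename>' or 'bin:<filename>:<port>'.

    Returns ("found", path) if the binary exists, ("build", port) if
    it is missing but a port is named, and raises DependencyError if
    it is missing and no port is named (the proposed new case).
    """
    parts = spec.split(":")
    if parts[0] != "bin" or len(parts) not in (2, 3) or not all(parts):
        raise ValueError("malformed dependency: " + spec)
    filename = parts[1]
    port = parts[2] if len(parts) == 3 else None

    path = which(filename)
    if path is not None:
        return ("found", path)
    if port is not None:
        # Existing behaviour: fall back to building the named port.
        return ("build", port)
    # Proposed behaviour: do not build anything, just give up.
    raise DependencyError(filename + " not found; this variant needs "
                          "an existing installation")
```

With this, `bin:mit-scheme-c' in the +from-c variants either finds an
existing binary or stops with an error, and never asks the dependency
engine to build mit-scheme-c in order to build mit-scheme-c.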

Thoughts?  Does this sound reasonable?  Is there another way to go
about this that I haven't thought of?