[GSoC][Binaries support] Architecture

Jordan K. Hubbard jkh at apple.com
Sun Mar 27 13:21:12 PDT 2011

Whoops, this time it was my turn for a premature Send!  Continuing from where I left off...

On Mar 27, 2011, at 12:42 PM, Jordan K. Hubbard wrote:

> On Mar 27, 2011, at 1:58 AM, Anders F Björklund wrote:
>> (continued, from premature send)
>> Jordan K. Hubbard wrote:
>>> That deceptively simple little command is the reason we don't have binary packages *already*.  Seriously.  If we hadn't hit this issue and gotten into a big argument about how best to solve it, there would have been binary package support in the very first version of MacPorts that Apple released!   The problem isn't the specific post-install rule that ccal is using, either - it's very simple, yes?  A simple ui_msg to tell the user how to use ccal.  The problem is that you can do *anything* from this rule and over 115 ports in the current collection do.  They use it for creating custom users in the accounts database (if you install postgres, for example, it needs a postgres user account to run as), they use it for emitting helpful "post installation comments" as you see above, they can create special symbolic links or create custom configuration files, etc.  Go grep for post-install in all of the ports today and you will quickly come to realize the scope of the problem, and that's just today with a comparatively small 7870 ports in the collection (FreeBSD has over 22000).
>> There could have been archives (pre-compiled destroots), even if the ports weren't ready for binary archives.
>> But in the first versions of MacPorts, it did build .rpm packages of all of the ports. Don't recall scripts.
> Jeff semi-flamed me in a private email (Hi Jeff!) over not making specific mention of RPM in my little screed, but as I said to him, sure we created a bunch of files that ended with the .rpm suffix but we never actually created "RPMs" in the full sense of the word.

... And what I meant by "in the full sense of the word" was that the RPMs (and .debs) we created were pretty much just for testing purposes, where we completely avoided the post-install / activate issues I talked about in my earlier post.  It was therefore never possible for us to create a collection of RPMs (or .debs or .pkgs or any of the other formats we "supported") and say that "pkg install foo" was completely trustworthy in the sense that it would always create the equivalent of "port install foo" and you (the user) could now go either way based purely on personal preference.   If we had, I suspect those early .rpm collections that Jeff put together, or the .pkg collections that *I* put together (with even less in the way of postflight functionality or even the ability to automatically install dependent packages), would have attracted an enthusiastic user base, particularly amongst the disk-space-deprived folks for whom installing all of DevTools was a big deal.

>> 1. pkg needs to be able to instantiate a tcl runtime environment that can pull in all of the port 1.0 Tcl files such that the post-install (and any other special post installation procedures) can run with the full environment, e.g. commands like "ui_msg" will be present and variables like ${name} will expand properly.
> Maybe this is better handled by "port -b", if it needs the full environment anyway ?

I'm not sure what "port -b" is, but I'll assume it's "install from binary package."  Let me tell you a little story that, I think, will also make my point nicely:

When I initially checked in that first version of bsd.port.mk, which implemented the BSD make macros that became the "FreeBSD ports collection", I already had a small collection of a few dozen ports to go with it which I'd used to validate the concept, and all of them simply used a make install rule (in effect) to install their bits on the system.  That initial few dozen ports quickly grew into 200-300, at which point I turned day-to-day responsibility for the whole mess over to Satoshi Asami, who then proceeded to make the ports collection the success it is today, but now I'm getting ahead of myself.  Freed of the basic day-to-day responsibilities for actually running the bake-it-yourself pizza factory, I listened to what the users were asking for and, not surprisingly, the #1 thing they were asking for was the freedom to stop building software and start using the end results directly.  I then wandered off into a corner and wrote the completely bletcherous pkg_install suite of tools, which used tar as a transport with some embedded metadata (+CONTENTS) to drive the installation process via a very minimal little "DSL" (really just an installation state machine that mutated and grew out of control).  The package tool suite was intended to essentially replace the "make install" phase I had just written over in bsd.port.mk, nothing more or less really, and this showed itself in the design.  Once I had the package format more or less done, I then proceeded to behead the install rule in bsd.port.mk to simply install the package that previous build rules had created.  Now I had successfully chopped the pizza factory into two parts:  The part that made the pizzas (ports) and the part that "ate" the pizzas [Ed Note:  This analogy has gone completely off the rails], the package installer tool.
This also enabled us to find and fix a large number of problems with semi-deterministic installation procedures that had otherwise gone undetected due to the extra facilities available to make(1) essentially hiding our sins, and ever since then, ports has not been in the business of installing software, it's been in the business of creating packages and that's it.  The "make install" bit in ports is just a shim to pkg_add, for backwards-compatibility reasons.
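
To make the "+CONTENTS as a tiny state machine" idea concrete, here is a minimal sketch in Python.  It is not the actual pkg tools code: the directive names (@name, @cwd, @exec, @unexec, @comment) follow the style of the FreeBSD packing list, but the semantics are deliberately simplified, and the function name plan_install, the sample package name, and its version are made up for illustration.

```python
# Hypothetical sketch of a +CONTENTS-style packing list interpreter.
# Directive handling is simplified; this is an assumption-laden toy,
# not a reimplementation of the real package tools.

def plan_install(contents_text, default_prefix="/usr/local"):
    """Turn packing-list text into an ordered list of install actions."""
    cwd = default_prefix          # the only "state" in this tiny state machine
    actions = []
    for raw in contents_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@"):
            directive, _, arg = line[1:].partition(" ")
            arg = arg.strip()
            if directive == "cwd":
                cwd = arg or default_prefix       # later files land under here
            elif directive == "exec":
                actions.append(("exec", arg))     # arbitrary command at install
            elif directive in ("unexec", "comment", "name"):
                continue                          # not relevant when installing
            else:
                raise ValueError(f"unknown directive: @{directive}")
        else:
            # A plain line is a file path relative to the current directory.
            actions.append(("copy", f"{cwd}/{line}"))
    return actions


# Sample packing list; the package name and version are invented.
sample = """\
@name ccal-4.0
@cwd /opt/local
bin/ccal
share/man/man1/ccal.1
@exec echo "ccal installed"
@unexec echo "ccal removed"
"""

for action in plan_install(sample):
    print(action)
```

The point of the sketch is that the installer only ever sees the packing list, so anything the port's install rule used to do implicitly has to show up there as a file or an explicit command.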

My little story doesn't actually help us to decide whether to express package installation commands as Tcl procedures or a mixture of shell commands and property settings which are consumed by some pkg tool, of course, but it does demonstrate that you can actually re-converge the evolution of these two separate problems by brute-force if you really want to. :-)

>> 2. We need to create a different set of "installation commands or scripts" that are associated with a port and specifically designed to run at post-installation time, e.g. instead of a post-install command in Tcl, we would have some sort of "install script" that is bundled with the port and merely relies on a few key environment variables to be set in order to run.  Then all 115 ports which currently declare post-install need to be modified such that they use the external script instead, and the port(1) system needs to stop running (or looking for) the post-install command and instead chain to the script so that "port install blah" still does the same thing as "pkg install blah".
> This would probably be better, either @exec/@unexec or something much more "tailored" ?

If you're willing to "sin" from a security standpoint by allowing arbitrary @exec/@unexec operations to occur, then you can just skip the tailoring and leave things open-ended (or defined by policy) since you have no realistic hope of controlling what packages do if that's your sin of choice.  Of course, if you think you can actually express the full set of installation operations as a set of properties which folks who are used to "make install" can also realistically cope with then you're a smarter / better man than myself and I wish you all the luck in the world, seriously, because I will be observing with great interest! :-)
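
For what option #2 might look like in practice, here is a rough Python sketch of an installer handing a post-install script nothing but a few environment variables.  The variable names (PORT_NAME, PORT_VERSION, PORT_PREFIX), the function name run_install_script, and the version number are all hypothetical; nothing like them is defined by MacPorts or the FreeBSD tools as far as this thread says.

```python
import os
import subprocess


def run_install_script(script, port_name, port_version, prefix):
    """Run a shell snippet with only a few port-related variables exported."""
    env = {
        # Keep PATH so the script can find ordinary utilities.
        "PATH": os.environ.get("PATH", "/usr/bin:/bin"),
        # Hypothetical variable names, chosen for this example only.
        "PORT_NAME": port_name,
        "PORT_VERSION": port_version,
        "PORT_PREFIX": prefix,
    }
    result = subprocess.run(
        ["/bin/sh", "-c", script],
        env=env,                 # the script sees nothing else from the caller
        capture_output=True,
        text=True,
        check=True,              # raise if the script exits non-zero
    )
    return result.stdout


# Rough equivalent of a ui_msg-only post-install rule, written as shell.
script = 'echo "To use ${PORT_NAME}, run ${PORT_PREFIX}/bin/${PORT_NAME}"'
print(run_install_script(script, "ccal", "4.0", "/opt/local"), end="")
```

If both port(1) and pkg(1) chained to the same script with the same variables, "port install blah" and "pkg install blah" would at least run identical post-install logic, which is the property option #2 is after.  Note that this is exactly the open-ended "arbitrary @exec" model: the script can still do anything.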

>> Option #2 is the simplest in terms of not having to instantiate a full Tcl environment instead of pkg(1), but you'd then have to change the way all ports/pkgs are installed.
> I think it would be best to *not* have a Tcl interpreter (and thus ignore +PORTFILE), as long as it can interact with the registry. It should have "enough" information in the +CONTENTS, even if it requires modifying those 115 out of 7870 ports that require it. Otherwise there isn't much of a difference between port(1) and pkg(1) ? But you would still need to re-implement all the other bits - like compression, signatures, depsolving, etc. etc.

As long as the registry can be interacted with by @commands in the +CONTENTS file as well, sure.  Say an @exec command creates a configuration file on the package's behalf at install time - I need to be able to then declare what that file name was to the registry even though it wasn't in the original packing list.  Similarly, an @unexec might want to remove something.  Keeping the registry in sync is one of the MacPorts-specific challenges that actually having a registry database entails.  I don't know if the FreeBSD folks ever added one (I suspect they did), but they must have figured this out in the pkg tools if so.
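
A minimal sketch of that registry bookkeeping, in Python.  The @register and @unregister directives are invented for this example (they are not real +CONTENTS commands that I know of), and the Registry class is a toy stand-in, not the MacPorts registry API.

```python
class Registry:
    """Toy stand-in for a package registry: port name -> set of owned files."""

    def __init__(self):
        self.files = {}

    def register(self, port, path):
        self.files.setdefault(port, set()).add(path)

    def unregister(self, port, path):
        self.files.get(port, set()).discard(path)


def apply_packing_list(registry, port, lines):
    """Record file ownership for one port from packing-list lines."""
    for line in lines:
        if line.startswith("@register "):        # hypothetical directive
            registry.register(port, line.split(" ", 1)[1])
        elif line.startswith("@unregister "):    # hypothetical directive
            registry.unregister(port, line.split(" ", 1)[1])
        elif line.startswith("@"):
            continue  # @exec / @unexec would be run by a real tool, not here
        else:
            registry.register(port, line)        # ordinary packed file


reg = Registry()
apply_packing_list(reg, "postgresql", [
    "bin/postgres",
    "@exec touch etc/postgresql.conf",    # creates a file not in the package
    "@register etc/postgresql.conf",      # ...so declare it to the registry
])
print(sorted(reg.files["postgresql"]))

apply_packing_list(reg, "postgresql", [
    "@unexec rm etc/postgresql.conf",
    "@unregister etc/postgresql.conf",
])
print(sorted(reg.files["postgresql"]))
```

The design question this leaves open is who is trusted to make the declaration: an explicit directive like the one above relies on the packager being honest about what the @exec did.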

>> I honestly have no preference at this point given that Landon and I managed to argue ourselves into a stalemate over the question of port runtime versioning, how to deal with different architectures (there's not just a Tcl piece, but also Pextlib) and blah blah blah, and in hindsight I think we vastly overcomplicated the problem in our minds.  The actual implementation, particularly now that we've learned that the port system never versions (we're still at 1.0 how many years later? :)) and there's really only one architecture worth caring about now (x86_64), of option #1, should you choose to go that way, should be pretty easy to get to "good enough" status.
> MacPorts spent a couple of years discussing the optimal logging format, instead of actually implementing it. So that seems rather common. Hopefully at least archives can be delivered this time, before moving on to packages.

I'm still not sure I understand the distinction if archives become "smart enough" to also be packages.  I think that's also a good thing.

- Jordan

More information about the macports-dev mailing list