Packages Not [was Re: ambivalence about fortran (was Re: numpy & non-Apple gcc?)]

Tue Sep 21 01:29:41 PDT 2010

Jordan K. Hubbard wrote:

> On Sep 20, 2010, at 2:20 PM, Ryan Schmidt wrote:
>
>> I have been interested in binary packages but have not really  
>> understood how people wanted to do it, so this thread is  
>> instructive to me.
>>
>> I had assumed that this "tool" you talk about being run would be  
>> MacPorts: the port command: "sudo port install foo", that it would  
>> just be enhanced to be able to download pre-compiled binaries that  
>> matched the OS, architecture and variants the user requested, if  
>> such existed on our hypothetical binaries server, and if not, it  
>> would build from source as it always has so far.
>
> So, here's the 10,000 foot view of packages in the hope that it may  
> be somewhat instructive (sorry, Jeremy, I know you're all about the  
> archives but archives just don't interest me since they cannot be  
> used stand-alone and separate from MacPorts ;-).
>
> First, and to answer your specific question, if you have to fall  
> back to building from source then you're really talking about a  
> hybrid system and I think hybrids basically suck because they fail  
> to keep church and state separate. [...]

> So, to finish the "mission statement" side of this, builders should  
> build and installers should install.  That is the separation of  
> church and state which allows people like me to sleep at night,  
> knowing that the entire "assembly line" has been put together as  
> robustly, and with as many clean points of separation, as possible  
> and I'm not going to have anything actually permute a system unless  
> I explicitly ask for that to happen.

This is what RPM does already... rpmbuild(8) builds, rpm(8) installs.

Building has BuildRequires, installing has Requires. Otherwise both
use .rpm packages (for digests, signatures, etc). There are even
separate types of binary packages. Some packages have really long and
complex needs for building from source, but installing the binary
does not. So having "foo" and "foo-devel" helps cut down the number
of dependencies when you don't always want to install the developer
files too (as is done with ports). Libraries can also be split off
from programs, for the same reason.

Fink is an example of how a "package" system works with a "port" one.

> So, as complicated as everyone seems to like to make this every  
> time the subject comes up, the actual task to be done is fairly  
> simple.   MacPorts needs to work essentially like this:
>
> port install foo
> entails:
> 	fetch foo
> 	extract foo
> 	configure / build foo
> 	"install" foo to destroot
> 	make package from destroot (say, just for discussion's sake, that  
> package goes into /tmp in this scenario)
> 	cleanup destroot and basically "finish" the macports phase
>
> 	package-install /tmp/foo.xpkg

Where the only difference from what it does now is that the final
install step is separate from the others, so that you can start with
just "pkg(1)" and "*.xpkg" and not have to do all the rest? Archives
are different in that they only allow you to shortcut the "destroot"  
phase, but do need all of the others. Packages would need to have all  
runtime information available, without peeking in the port. I think  
you are missing the "activate" step above, but then again I don't  
think that step is very useful.
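
The split can be sketched in a few lines. This is only an illustration,
assuming a package is nothing more than a gzipped tar of the destroot;
make_package and package_install are hypothetical names, not MacPorts
or pkg(1) commands, and a real package would carry metadata as well.
The point is that the install function takes only the package file and
a prefix, and never looks at the port.

```python
import tarfile
from pathlib import Path


def make_package(destroot, package_path):
    """Builder side: archive the files of an already-populated destroot."""
    destroot = Path(destroot)
    with tarfile.open(str(package_path), "w:gz") as tar:
        for item in sorted(destroot.rglob("*")):
            if item.is_file():
                # Store paths relative to the destroot, e.g. "bin/foo".
                tar.add(str(item), arcname=item.relative_to(destroot).as_posix())
    return Path(package_path)


def package_install(package_path, prefix):
    """Installer side: needs the package file and a prefix, nothing else."""
    prefix = Path(prefix)
    installed = []
    with tarfile.open(str(package_path), "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            # A real installer would also reject names that escape the prefix.
            target = prefix / member.name
            target.parent.mkdir(parents=True, exist_ok=True)
            with tar.extractfile(member) as src:
                target.write_bytes(src.read())
            installed.append(member.name)
    return sorted(installed)
```

Anything that has to survive from the port (dependencies, scripts)
would need to be written into the package by the first function, since
the second one has no other source for it.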

>> From an end-user perspective, as Jeremy has already described,  
>> "port install foo" is still all that anyone using MacPorts  needs  
>> to know and it's still just at arms-length as it always was, it  
>> just makes MacPorts simpler to only have to generate packages as  
>> its end-goal.  It does not have to worry about installation,  
>> rollback, upgrades or any of that stuff since that is the job of  
>> package-install / package-delete / package-upgrade, and the  
>> creators of those tools can take on all of the security / auditing  
>> requirements that any software which actively changes your system  
>> ought to at least pay lip service to - macports does not have to  
>> worry about it because it is not, in a large sense, macports'  
>> problem anymore.
>
> Now, if that all seems maybe a bit *too* simplistic, it's because  
> it is.  There is some stuff that needs to be portaged across from  
> the Portfile (or some other source of metadata which lives in  
> MacPorts) in the form of package metadata because, as I've already  
> pointed out to Jeremy et al, there are things which don't simply  
> fit into the archive.  You probably want a list of checksums to  
> compare the target files against, just to make sure the package  
> file was not corrupted/tampered with in transit, and then there are  
> files which don't actually get installed with the package but need  
> to be preserved across upgrades (configuration files that the
> end-user creates, mostly).
>
> There are also the usual post-install actions ("install user  
> postgres for this postgres database package", etc) which need to be  
> expressed in some form - preferably not just by throwing a shell  
> script at the problem, either, since a shell script is hard to  
> introspect and audit.  A list of "requirements" as an XML file  
> could probably do the job just as well, assuming you designed the  
> requirements with some reasonable degree of care and had the  
> requisite code in package-install to parse and act on that data as  
> a final package install step.   We (Landon and I) got ourselves  
> wrapped around the axle in the past trying to figure out a scheme  
> where the existing Tcl procs (post-install, et al) could be run by  
> the package runtime, making the Portfile the single authoritative  
> source for all of this information, but maybe that was setting the  
> bar too high and we should have aimed at something far more  
> simplistic.  It's never too late to do that.
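
To make the "list of requirements as XML" idea concrete, here is a
minimal sketch. The schema is entirely hypothetical: the element names
(requirements, user, preserve) and their attributes are invented for
illustration and do not come from MacPorts or any existing tool. The
parser only turns the file into a list of actions that can be inspected
before anything is run, which is the property a shell script lacks.

```python
import xml.etree.ElementTree as ET

# Hypothetical requirements file; all names and paths are illustrative only.
EXAMPLE = """
<requirements>
  <user name="postgres" home="/opt/local/var/db/postgres"/>
  <preserve path="etc/postgresql.conf"/>
</requirements>
"""

# The installer only understands a fixed, auditable set of requirement kinds.
KNOWN_KINDS = {"user", "preserve"}


def parse_requirements(xml_text):
    """Return (kind, attributes) pairs without executing anything."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "requirements":
        raise ValueError("unexpected root element: " + root.tag)
    actions = []
    for child in root:
        if child.tag not in KNOWN_KINDS:
            raise ValueError("unknown requirement: " + child.tag)
        actions.append((child.tag, dict(child.attrib)))
    return actions
```

Rejecting unknown kinds up front means a package cannot ask the
installer to do something it was not designed to do.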

It just seems like a lot of work, making a new package manager and
reinventing digests, signatures, embedded scripts and storage
formats?

At least that was one of the main reasons for making
http://rpm4darwin.sourceforge.net/, all of that was already available
elsewhere (RPM).
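
For reference, the per-file checksum comparison mentioned above can be
sketched as follows. This covers only the digest check, not signatures,
scripts or a storage format. It assumes a hypothetical manifest that
maps relative paths to SHA-256 digests; the function names are made up
for this example and are not taken from RPM or MacPorts.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(str(path), "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Builder side: record a digest for every file under root."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Installer side: return the paths that are missing or do not match."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return sorted(bad)
```

A manifest like this only detects corruption; detecting tampering
would also require the manifest itself to be signed.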

> Daniel also chastises me somewhat for "having a lot to say" every  
> time this topic comes up, and I apologize if I've seemed overly  
> verbose over the years, but I hope folks will cut me at least a few  
> inches of slack given that I have actually written all of this crap  
> from scratch before and know how it actually works in practice, at  
> least. :-)

Having switched from Darwin to FreeBSD, I would say I prefer it...

Could just be that the FreeBSD OS is easier to handle than the Darwin  
OS (talking about the "pure" variant!), or that Ruby is nicer than  
Tcl for writing code other than the build recipes. Perhaps it is the  
available releases with binary packages, or the stricter porting. It  
is definitely _not_ the KDE, as I would normally be using Xfce with  
either.

I just wish that something much better was available for Mac OS X.

>> I had no idea there was a goal for users not to have MacPorts  
>> installed.
>>
>> We just recently added the hard requirement in the MacPorts  
>> installer that Xcode be installed, since not having Xcode  
>> installed was causing so many various bug reports. But if MacPorts  
>> gets a binaries server, and the ability to download and install  
>> from it, that requirement could be relaxed and the check moved  
>> into the port command, so that the port command only errors out and
>> notifies the user Xcode is required if a binary package matching  
>> the user's request was not found.
>
> You could do that, or you could just punt and say "file a request  
> asking for a binary package for foo" and not even try to make  
> MacPorts do any complicated fall-back behavior (since, again, this  
> behavior would actually have to be in package-install, and the  
> church/state line starts getting blurry again).  That said, I see  
> no reason why package-install couldn't ask a server "in the cloud"  
> somewhere to actually build a package to spec, and get this package  
> once done.  The very act of making the request could cause the  
> server to create the package on demand, in which case all of your  
> abstraction boundaries remain nicely preserved.

That would be a nice addition, either way (archives or packages).
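
The lookup side of this could look roughly like the sketch below,
matching on OS, architecture and variants as Ryan described. Everything
here is an assumption for illustration: PackageKey, resolve, the
in-memory index and the example.org URL are hypothetical, and a real
server would be queried over the network.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageKey:
    """What has to match before a pre-built binary can be used."""
    name: str
    os_version: str
    arch: str
    variants: frozenset


def resolve(key, server_index):
    """Return ("download", url) on an exact match, else ("request-build", key)."""
    url = server_index.get(key)
    if url is not None:
        return ("download", url)
    # No local fall-back build: the miss is handed back to the server side.
    return ("request-build", key)


# Illustrative index with a single pre-built package.
index = {
    PackageKey("foo", "10.6", "x86_64", frozenset({"universal"})):
        "https://packages.example.org/foo-10.6-x86_64-universal.xpkg",
}
```

On a miss the caller only learns that a build has to be requested; it
never builds anything itself, which keeps the installer free of any
dependency on the build system.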

The major *user* annoyance is the requirement to build, not Xcode.

> The reason not to have MacPorts installed is largely one of  
> purity.  If you're an end-user who does not care, then you use the  
> packages collection through some Cocoa / Web front-end and never  
> have to even know MacPorts exists.  If you're a developer /  
> propeller-head type and actually do want to build stuff yourself,  
> then you can download MacPorts and use it the same as always, the
> actual package step being embedded into/hidden by "port install" as  
> described above.  I just see no reason why a user who cares only  
> about binary packages should have to download MacPorts since the  
> two are logically distinct.

And I didn't think MacPorts cared about making binary packages...

So I see no reason for that average user to use MacPorts either.

> You also ask why they have to be separate, and I guess if I had to  
> give a one-word answer other than "purity", it would be security.   
> The smaller your attack surface is, and getting bits onto a user's  
> system in an untrusted/unauditable/one-way fashion is definitely an  
> attack surface, the better off you are.  It doesn't even have to be  
> a malicious type of attack, but simply one caused by inattention to  
> some detail.   If the set of operations can be constrained to not  
> calling out to a 3rd party build system, that's a lot less attack  
> surface to worry about and maintain control of.

And of course, the Developer Tools DVD is quite a big download?

Offering a GUI would also allow fewer foot-shooting opportunities.

--anders