talkin' 'bout GSoC 2011

Tue Mar 29 15:51:22 PDT 2011

Hello,

I've been somewhat active on MacPorts previously and am interested in becoming more active via GSoC 2011 in particular.

Here's a little bit about me: I worked for nine years at a financial sector multinational in a variety of roles in Unix infrastructure. Ours was a pretty interesting Unix environment, supporting tens of thousands of Unix servers running highly standardised Linux and Solaris builds. The distributed architecture was derived in significant part from the Athena and Andrew work done at MIT and Carnegie Mellon. Software distribution was built around an in-house integration of AFS, which let us run almost everything out of a network filesystem and push updates to dozens of sites globally in as little as a few minutes. (If you're familiar with AFS, what we had was a system for sync'ing contents across cells and brokered delegations of afsadmin access for particular content areas within a cell.) Part of that framework was a structured namespace that supported multiple versions of software projects and cross-platform builds (without fat binaries but with the availability of AFS @sys, so the mechanics were substantially different to what is possible on OS X). We had a system somewhat like MacPorts that was used to build FOSS software for multiple platforms and run it out of AFS, although it consisted largely of an additional layer of GNU Makefile infrastructure rather than something Tclish. I did some work in porting software, as well as writing system integration tools in Perl and C and looking after security architecture and tooling.

I've recently started an MSc program in computer security and forensics. I've spent some time recently learning Python, as well as getting more familiar with C, which I've used pretty sporadically. I don't have much experience with Tcl, but I've been learning it lately to understand MacPorts.

My interests in MacPorts are pretty diverse, so I'll start with some more blue sky ideas I have based on previous experience. I'd like to see MacPorts become a more distributed system with broader capabilities across the development lifecycle (e.g. able to do automated building and testing, able to build and distribute packages, or able to provide software builds for running off of network storage). I'd like to see a profile, test, fingerprint, and sign implementation to provide more robust comparisons between builds or even update decisions (e.g. I built on this platform, with these development tools, against a code base with this signature, using these parameters, got binaries with platform/section signatures that look like this, had dependencies that look like this, ran the following test suite, and got the following results; deploy the latest version of X as soon as test suite Y is published under Z's signature; if I build using these parameters vs. these parameters, I can see that the changes look like this and limit test coverage accordingly; if I rebuild from scratch, I can confirm that I get identical results; I built this port just like someone else but have dependencies that are built differently, which may explain why things don't work the same). I'd like to see the namespace support more strict dependency information (e.g. rather than using -L /opt/local/lib and -I /opt/local/include, I'd prefer to see something closer to -L /opt/local/var/macports/software/db48/4.8.30_0/lib, etc., as well as symlinks to direct children of /opt/local limited to things like /opt/local/bin and maybe /opt/local/man) and simplify support for concurrent version deployment (even if it's just to overlap through testing and deployment and simplify back-out). (The best public reference I know on how to do the namespace design is OpenEFS, see http://openefs.org/content/introduction-basic-structure-efs.)

I suppose my observation is that MacPorts works great if I'm using it as I am, running on a single system to get the latest and greatest FOSS products. On the other hand, I can't imagine the implicit port lifecycle envisioned by MacPorts being adequate for more demanding production environments where there's more focus on testing and stability, not to speak of the kind of controls people are looking for in the context of regulatory audits. I don't know how much that overlaps with the target audience for MacPorts, but I would like to understand in any case what that looks like.
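
To make the fingerprinting idea above a bit more concrete, here is a rough Python sketch of what a build record might look like. Everything in it is an assumption for illustration: the field names, the platform string, and the placeholder digests are made up, and nothing like this exists in MacPorts as far as I know. The point is only that a canonical record can be hashed, and that two records can be compared field by field to see where builds diverge.

```python
import hashlib
import json


def build_fingerprint(platform, toolchain, source_digest, parameters, dependencies):
    """Return (record, digest) for one build.

    All field names are hypothetical; `dependencies` maps a dependency
    name to that dependency's own fingerprint digest.
    """
    record = {
        "platform": platform,
        "toolchain": toolchain,
        "source_digest": source_digest,
        "parameters": sorted(parameters),
        "dependencies": dict(sorted(dependencies.items())),
    }
    # Serialise with sorted keys so identical inputs always hash identically.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return record, hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def differing_fields(record_a, record_b):
    """List the top-level fields on which two build records disagree."""
    keys = set(record_a) | set(record_b)
    return sorted(k for k in keys if record_a.get(k) != record_b.get(k))


# Two builds of the same port that differ only in how a dependency was built.
mine, mine_digest = build_fingerprint(
    "darwin-10-x86_64", "gcc-4.2", "aaaa", ["--enable-shared"], {"db48": "1111"})
theirs, theirs_digest = build_fingerprint(
    "darwin-10-x86_64", "gcc-4.2", "aaaa", ["--enable-shared"], {"db48": "2222"})
print(differing_fields(mine, theirs))
```

A record like this says nothing about who produced it; signing the digest (GPG or otherwise) would be a separate step that this sketch leaves out.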

I kinda think security's going to be a profound problem for MacPorts as long as so many commands are run as root under sudo, and the Portfiles are scripts rather than configuration files per se, running without any kind of sandbox or sanitisation. This makes it really flexible because it's endlessly extensible, and pretty endlessly exploitable for the same reason. If I want to exploit a MacPorts user, all I need to do is MITM the connection to rsync.macports.org (which doesn't use ssh to provide transport security or end-point verification), take something I know my target uses, throw in a script I'd like to run under files for that port, make it executable, and call it via system from the Portfile. If I want to screw a lot of people, I break into the repository and modify a popular port there (most obviously the macports port, if that project comes through ;->) or find a host that allows me to intercept and modify the traffic. None of that is going to register with the port checksums (and checksums are nice, but I'm not particularly convinced that it's better to have three different checksums run against the source tarball but no verification of patches or preference for signature-based checks like GPG to make sure that it's the right code from the right source). This strikes me as a little crazy. I realise that there is a high degree of inherent trust being placed in a system that manages software installation, but MacPorts places trust in a lot of places that seem difficult to justify and easy to exploit. For one, I don't really see why scripting is required in most of the cases where it's used for non-standard software in the first place, as the simplest answer seems to be patching in the same operations as a Makefile that MacPorts can drive; the current system allows but doesn't force that kind of layering and compartmentalisation.
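
As a rough illustration of covering the whole port directory rather than just the source tarball, here is a small Python sketch that hashes every file under a port (the Portfile plus anything under files) and reports what changed against a previously trusted manifest. The function names and the manifest layout are my own invention, not anything MacPorts does today. On its own this doesn't stop a MITM, since an attacker who can rewrite the port can rewrite an unsigned manifest too; it only becomes useful if the trusted manifest is signed and checked against a key obtained out of band, which is not shown here.

```python
import hashlib
import os


def port_manifest(port_dir):
    """Map each file path (relative to port_dir) to its SHA-256 hex digest."""
    manifest = {}
    for root, dirs, files in os.walk(port_dir):
        dirs.sort()  # walk subdirectories in a stable order
        for name in sorted(files):
            path = os.path.join(root, name)
            rel = os.path.relpath(path, port_dir).replace(os.sep, "/")
            with open(path, "rb") as handle:
                manifest[rel] = hashlib.sha256(handle.read()).hexdigest()
    return manifest


def changed_files(trusted, current):
    """Return paths that were added, removed, or modified since `trusted`."""
    paths = set(trusted) | set(current)
    return sorted(p for p in paths if trusted.get(p) != current.get(p))
```

With this, a script dropped into files for a port would show up in `changed_files` even though the tarball checksums in the Portfile still match.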

Also on the topic of security: as I see it, pretty much the only place that root is consistently needed is for the install step (as opposed to fetch, extract, configure, build, etc.), but the way the system currently works, people use it as root most of the time. Other than that, running the code as a non-privileged user, who would own most of the filesystem data, seems far more sane, and running client-server might allow much more of the current start-up time to be passed along to the server, making the client a lot snappier.
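
A minimal sketch of that split, in Python and purely hypothetical: the phase names are the ones I listed above, but the `wrap_command` helper and the idea of prefixing only the install command with sudo are assumptions about how a driver could work, not a description of how port behaves today.

```python
# Phases as named above; only install is assumed to need root.
PHASES = ["fetch", "extract", "configure", "build", "install"]
PRIVILEGED_PHASES = {"install"}


def wrap_command(phase, argv):
    """Return the argv to run for a phase, escalating only where required.

    The driver is assumed to run as an ordinary user, so unprivileged
    phases run the command unchanged and install is prefixed with sudo.
    """
    if phase not in PHASES:
        raise ValueError("unknown phase: %s" % phase)
    if phase in PRIVILEGED_PHASES:
        return ["sudo"] + list(argv)
    return list(argv)


print(wrap_command("build", ["make"]))
print(wrap_command("install", ["make", "install"]))
```

The design choice here is that escalation is the exception that has to be asked for per phase, rather than the default state the whole run starts in.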

Maybe I'm missing something with these conclusions, but it seems to me that the current system has been designed with a strong bias to ease of maintenance and use and not a lot of consideration for security. I've been using the system with much higher expectations of security, and it's only when I've spent some time with the code recently that I've come to appreciate how considerable the gap is from those expectations.

As far as interest in listed projects, some of the blue-sky ideas I've mentioned tie into the work that's listed for dependency analysis and testing. I've been learning how to use muniversal as a port group, so this is something I'd be happy to do. I've also been trying to get NSS to build as universal (at least 32- and 64-bit), allowing for universal xulrunner and xpcom, most of which is in hopes of providing pyjamas for OS X, which doesn't seem to particularly exist at the moment. I've also started hacking on getting a current port of StrongSwan VPN for IKEv2 functionality on Darwin (the last working port of StrongSwan for OS X looks to be two years old). If providing additional ports is work that could be considered part of an application, those are things I'd be happy to bring to the table (and want to sort out in any case). I appreciate that a lot of what I've laid out isn't a summer project, but I don't see talking more broadly as a problem in this context; I'd like to be engaged longer term with the project, just as I'd like to get a sense of where people see it going. I also understand that GSoC encourages having discussions about ideas before applying, so I thought I'd start in that spirit.

Anyway, hello, and I hope this gives us some topics for further conversation.

Cheers,
Bayard