Codesigning everything and combatting malicious code

Tue Mar 22 01:20:26 UTC 2022

Sorry, this got long.

I want to move a discussion to the macports-dev list that began with a user's question to macports-mgr. This user ran a third-party utility which reported that some files installed by MacPorts were not codesigned and asked if it was a concern.

I replied that the files installed by the MacPorts installer are codesigned but that MacPorts predates the existence of codesigning and nobody has yet added code to MacPorts that arranges for files installed by ports to be codesigned. Individual ports might codesign their files if that's how their build system was written. On Apple Silicon all (code?) files must be codesigned and the linker (?) takes care of ad-hoc codesigning the files automatically. I also said I wasn't really clear on what purpose codesigning serves; MacPorts and macOS got by fine without it for years.

The user replied asking if codesigning might reduce the risk of compromised open source packages, referencing this incident in which the developer of node-ipc deliberately released a version containing malicious code:

https://www.schneier.com/blog/archives/2022/03/developer-sabotages-open-source-software-package.html

This is a hypothetical question since node-ipc is not in MacPorts and as far as I know a similar situation has not happened with any software that we do have in MacPorts. And since then node-ipc appears to have been forked by a different developer and a new non-malicious version has been released.

I felt that the broader MacPorts developer community might care to hear a reply to this or to discuss whether and how we should modify MacPorts base to codesign everything installed by ports.

~

Certainly the situation with node-ipc was undesirable but I don't see how MacPorts codesigning its files would solve it. I'll talk about that in a minute. First let me discuss what safeguards MacPorts already contains.

MacPorts ports already fetch specific maintainer-tested versions of the source code. If a developer releases a new version, nothing changes in MacPorts until a MacPorts contributor updates the Portfile to use that new version. Before doing so the MacPorts contributor is expected to verify that the port at least compiles on one macOS version; that the port's test phase runs, if the port has one; that "port lint" doesn't show any easily fixable mistakes; and ideally that some basic functionality of the installed files works. If the port has subports, this applies to all subports as well.

MacPorts already checksums distfiles. The MacPorts master build server saves a copy of distfiles (assuming the checksum matches at the time) and these are mirrored on servers around the world. If a developer replaces a distfile on their server with something else (maliciously or not), MacPorts will not install the port using that distfile; it will issue a checksum mismatch error and tell the user how to report it to us. Or MacPorts might download the previous "good" file from a MacPorts mirror, bypassing the developer's new replacement file. When we notice such a "stealth update" has occurred, proper procedure is to compare the old and new file to see whether something malicious has happened so that we can decide whether we should modify the port to use the new file or to bypass it. (We might also bypass a new developer distfile if it contains no relevant differences from the old one.)
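To illustrate the checksum idea, here is a minimal sketch in Python. It is not how MacPorts base implements it; the function names and the sample file contents are hypothetical, and SHA-256 is used only as an example digest. The point is that a digest recorded when the maintainer tested a distfile will no longer match if the bytes on the server are later replaced.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_distfile(data: bytes, expected_sha256: str) -> bool:
    """Hypothetical helper: True only if the data matches the recorded digest."""
    return sha256_of(data) == expected_sha256.lower()


# Made-up stand-ins for a distfile before and after a "stealth update".
original = b"contents of foo-1.0.tar.gz as tested by the maintainer"
replaced = b"contents of foo-1.0.tar.gz after a stealth update"

# The digest recorded at the time the maintainer tested the file.
recorded = sha256_of(original)

print(verify_distfile(original, recorded))  # the tested file is accepted
print(verify_distfile(replaced, recorded))  # the replaced file is rejected
```

Note that a check like this only proves the file is the one whose digest was recorded; it says nothing about whether that file was trustworthy in the first place.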

Ports that fetch their sources from a revision control system do not enjoy the protection of checksums. Although ports that fetch source from a revision control system specify which tag or commit hash to fetch, it is conceivable that a developer with sufficient access to that repository could delete an old tag and replace it with a new tag of the same name that contains different software. This is one of the reasons why we recommend ports fetch using distfiles, and the vast majority do. We might consider recommending that ports that fetch directly from a git repository (fetch.type git) never use a tag, and always use the commit hash corresponding to that tag, since replacing the contents of the repository while keeping the same sha1 hash is, as far as I know, still impossible in the general case. (Yes, it is possible to engineer a sha1 collision, but only if you can carefully control both the old and new files. Generating a sha1 collision against some existing old file is a very different matter.)
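As a rough sketch of what checking for that recommendation could look like, the Python below distinguishes a full 40-character hex commit hash from anything else, such as a tag name or an abbreviated hash. The function name and the sample values are hypothetical. Nothing like this exists in "port lint" as far as I know, and it assumes sha1-based git object names.

```python
import re

# A full git sha1 object name is exactly 40 hexadecimal characters.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def is_pinned_commit(ref: str) -> bool:
    """Hypothetical check: True only if ref looks like a full commit hash."""
    return FULL_SHA1.fullmatch(ref.lower()) is not None


# Made-up example values: a tag, an abbreviated hash, and a full hash.
for ref in ("v1.2.3", "0123456", "0123456789abcdef0123456789abcdef01234567"):
    print(ref, is_pinned_commit(ref))
```

A tag named with 40 hex characters would fool this check, so it is a heuristic at best.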

Some ports' build systems fetch additional files from the Internet at build time. This too is discouraged in MacPorts. Who knows whether such build systems download specific versions of those files or whether their integrity is verified in any way? This represents another avenue by which unverified files (which might have been replaced with malicious versions) could make their way into a build of a port. And if the one server that the build system is programmed to download from is down, the installation fails. Ports should specify all their needed files as distfiles and allow MacPorts to be in charge of downloading, mirroring, and checksumming them. To my annoyance, developers are increasingly adopting build systems that download their own files. The difficulties that this poses for verifying the integrity of the final product should be communicated to those developers by interested MacPorts contributors. Developers should continue to employ the time-tested practice of offering a single distfile that contains all the code needed to build the project (excluding external libraries).

For most ports, most users can receive binary archives from us and do not need to obtain the source code and compile it themselves. These archives are produced on build servers that I control; nobody else has access or could compromise them (except for the Apple Silicon build machine which is colocated in a reputable data center). All build machines transfer the archives they build to my master build server which signs them with our private key. Nobody has access to this private key except me at the moment. Prior to 2017 when our build servers were hosted by Apple, Apple server administrators had access to the private key. Just like with the distfiles, the archives and their signatures are mirrored on servers around the world. When MacPorts on a user's computer downloads a binary archive it verifies the signature using our public key before installing it.

So nobody could replace distfiles or binary archives on any server without error messages appearing that would stop an installation.

The ports tree that users get from our rsync server as a tarball is also signed with our private key and verified with our public key, so a malicious mirror operator could not, for example, replace Portfile code without verification of the tarball failing.

~

That brings us to node-ipc. On March 7 a developer committed this malicious change to node-ipc:

https://github.com/node-ipc/node-ipc/commit/847047cf7f81ab08352038b2204f0e7633449580#diff-c2dd3b497ae886cfb8f5bf8c66c649fc2ae4afaa6660d9bbf3105d69884679c6

and version 10.1.1 was released shortly thereafter. We don't have node-ipc in MacPorts but if we did and if someone noticed that this new version was available, they would have updated the Portfile's version line, fetched the new distfile, updated the Portfile's checksums to match it, and tried to install the port. They might have run the test suite and/or verified the basic functionality of the files that got installed. Then they would have committed the update. They would most likely not have known about or noticed the malicious code since it was programmed to take effect only under certain conditions.

So now the checksums of the malicious distfile would be in MacPorts. MacPorts servers would mirror the malicious distfile. MacPorts build servers would build it and create binary archives of the malicious version which would be signed with our private key. If MacPorts contained any code that codesigns files, it would codesign the malicious files before they went into the binary archives. So I don't see how the addition of codesigning to MacPorts would prevent malicious code from getting into MacPorts via this method.

It is not MacPorts policy to require contributors to examine every upstream change between the old software version and the new one in order to verify that it is free from malware or for any other purpose. I don't think anybody could be expected to do that. I would guess most MacPorts contributors are not familiar with the upstream code of the projects they maintain, may not even be proficient in the programming language they're written in, and might not be able to recognize a malicious change if one were there. Some projects do not even have public source code repositories so you couldn't review each individual commit and would have to examine a single aggregate diff between an old and new version which could be tens or hundreds of thousands of lines long; asking a MacPorts contributor to ascertain whether such a diff contains malicious changes seems like an impossibility. (node-ipc does have a public repository, and it is a small project and the diff between versions 10.1.0 and 10.1.1 was under 500 lines, so a review of that change would have been possible.)

It is not even our policy that contributors must read the NEWS or CHANGELOG files to see what changed, though I sometimes do this as it helps me discover things such as whether dependencies need to be changed. This malicious change in node-ipc was obfuscated so the code's purpose could not be seen at a glance. The change was in a JavaScript file and JavaScript files are often "minified" (obfuscated in order to reduce their size for lower bandwidth use) so this might not have been seen as suspicious. The description of the change in the commit message did not seem suspicious, and that project does not maintain NEWS or CHANGELOG files, nor would I imagine that a malicious change would be described as such in them if it did. It seems like more and more projects are ditching NEWS and CHANGELOG files. Some developers suggest reading the commit messages between versions as a substitute, which strikes me as inadequate.

~

So what can we do to combat malicious code in upstream projects? Remove it as soon as we learn about it. Revert the port back to the previous good version. Or patch out the malicious code. Write a comment in the port advising future contributors to exercise special care when updating it and to actually check the differences for potential malware. If the software developer is an ongoing threat, consider removing the port from MacPorts or switching to a fork by a different developer if one exists.

I'll also say that all contributions to MacPorts from the public are reviewed by trusted contributors. Proposed changes are submitted either as an attachment in a Trac ticket or as a pull request on GitHub, where potential problems can be identified and fixed prior to those changes being merged into the repository and becoming available to everyone else. Pull requests automatically run continuous integration. Currently we check that the port builds on two macOS versions and run it through "port lint". Additional checks could be added. For example, if there is a tool that checks a directory of files for known malware, we could run that.

Once the changes land in the repository, an email of the changes is sent to the macports-changes mailing list where I and others review them again. All MacPorts contributors are encouraged to subscribe to the macports-changes mailing list to see what's changing in MacPorts, to learn common Portfile development techniques, and to help spot and fix mistakes before they are noticed by our users. This review process would not have caught an update of a hypothetical node-ipc port to 10.1.1 since the bad code was not in MacPorts but in the upstream project, but it constantly helps uncover and fix other issues.

~

Could MacPorts codesign everything installed by ports? If so, should we? What benefits would that bring? How would we do it?