We need a v2 Golang Portgroup
Austin Ziegler
halostatue at gmail.com
Fri Aug 30 02:35:40 UTC 2024
Several months ago, I posted a message asking about the Golang portgroup
and offline builds[1] and got one response, which suggested that this was a
broader ecosystem problem[2]. But there are problems unique to how the
MacPorts Golang portgroup handles `go.vendors`, which differs from how the
Cargo portgroup handles `cargo.crates`. As a result, using `go.vendors` for
Go packages is wasteful, inconvenient, and IMO the wrong choice with the v1
portgroup.
For those not familiar with Golang portgroup internals, the
handle_set_go_vendors proc attempts to translate each provided `go.vendors`
package into a source repository archive, with whole project and subproject
support for GitHub-based packages (this is broken; see below) and whole
project support on Bitbucket, GitLab, salsa.debian.org, git.sr.ht, and
go.googlesource.com, with no support for other domains. This means that a
module hosted anywhere else cannot be resolved into a meaningful dependency
by the portgroup, even though `go install` will work for a cloned repository
because of the way that `go mod` works (this is the way forward; also see
below). That leaves online builds as the only viable option for such ports.
The fact that project and subproject support is entirely repository based
is a much bigger problem with the v1 portgroup than it might appear,
because it means that a separate archive file must be downloaded for each
tag. In the case of `gopkg.in/yaml`, if I refer to both `v2` and `v3`, then
I’ve downloaded the same repository twice. It gets worse when I start
adding AWS dependencies:
```
$ go get github.com/aws/aws-sdk-go-v2/aws
$ go get github.com/aws/aws-sdk-go-v2/config
$ go get github.com/aws/aws-sdk-go-v2/service/dynamodb
```
I’m now downloading the *full repository archive* for each and every direct
and transitive dependency in that repository. In February, I found that it
was downloading more than 10 copies of the Google Cloud repository, and it
would be doing the same for the AWS and Azure SDKs as well. If GitHub
allowed one to request a *partial* archive, this would be less problematic,
but it does not.
However, Go has already solved this problem and provides authenticated
sources for those subprojects through GOPROXY:
```
$ curl https://proxy.golang.org/github.com/aws/aws-sdk-go-v2/service/dynamodb/@v/v1.34.6.info
{"Version":"v1.34.6","Time":"2024-08-22T18:48:12Z","Origin":{"VCS":"git","URL":"https://github.com/aws/aws-sdk-go-v2","Subdir":"service/dynamodb","Hash":"d1d210dd11afc2ef0efa2e980d6b6c09cac0f5cb","Ref":"refs/tags/service/dynamodb/v1.34.6"}}
$ curl -sSL https://proxy.golang.org/github.com/aws/aws-sdk-go-v2/service/dynamodb/@v/v1.34.6.zip -O
$ ls -l v1.34.6.zip
.rw-r--r--@ 315k austin 22 Aug 15:17 -I v1.34.6.zip
```
Pulling down these files is only a little more complicated when referencing
hashes. Either way, Google is providing the package infrastructure while
making it *look* like it's just a Git clone (it sometimes is, but not
often).
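As a rough sketch of how a portgroup helper could derive those download URLs, the Go snippet below builds the `.info`, `.mod`, and `.zip` URLs for a module version. The names `escapeModulePath` and `proxyURL` are made up for illustration. The only rule relied on is the proxy's documented case-encoding (an uppercase letter becomes `!` plus its lowercase form); the path validation that `golang.org/x/mod/module` performs is skipped here.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeModulePath (hypothetical helper) applies the module proxy's
// case-encoding: each uppercase ASCII letter becomes '!' followed by its
// lowercase form. Unlike golang.org/x/mod/module, it does no validation.
func escapeModulePath(path string) string {
	var b strings.Builder
	for _, r := range path {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// proxyURL (hypothetical helper) builds the URL of one artifact for a
// module version. ext is expected to be "info", "mod", or "zip".
func proxyURL(proxy, module, version, ext string) string {
	return fmt.Sprintf("%s/%s/@v/%s.%s",
		strings.TrimSuffix(proxy, "/"),
		escapeModulePath(module),
		escapeModulePath(version),
		ext)
}

func main() {
	const proxy = "https://proxy.golang.org"
	const mod = "github.com/aws/aws-sdk-go-v2/service/dynamodb"
	for _, ext := range []string{"info", "mod", "zip"} {
		fmt.Println(proxyURL(proxy, mod, "v1.34.6", ext))
	}
}
```

The same construction works for a module on any host, since the proxy keys on the module path and not on the repository layout.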
This, by the way, is really close to what the `rust::handle_crates` proc
does, with explicit support for `cargo.crates_github`.
I’d like to explore how this might be approached. It would allow us to
take advantage of the same sort of caching that the Cargo portgroup
provides, and it should result in smaller build directories (only the
required code from each subresource is included), faster builds, and, as far
as I can tell, support for *any* dependency module URI.
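To make the idea a little more concrete, here is a minimal Go sketch of reading a `go.sum` file and keeping only the entries that refer to module zips (lines whose version ends in `/go.mod` hash just the `go.mod` file). The `modVersion` type and `parseGoSum` function are names invented for this example, and the `h1:` values in the sample input are placeholders, not real hashes. This is not a proposal for how the portgroup should be written, only an illustration of how little input it would need.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// modVersion is a hypothetical record for one module zip listed in go.sum.
type modVersion struct {
	Path, Version, Hash string
}

// parseGoSum (hypothetical helper) returns the module-zip entries of a
// go.sum file. Each go.sum line has three fields: module path, version,
// and hash. Lines whose version ends in "/go.mod" are skipped because
// they cover only the go.mod file, not the module zip.
func parseGoSum(data string) []modVersion {
	var out []modVersion
	sc := bufio.NewScanner(strings.NewReader(data))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) != 3 || strings.HasSuffix(f[1], "/go.mod") {
			continue
		}
		out = append(out, modVersion{Path: f[0], Version: f[1], Hash: f[2]})
	}
	return out
}

func main() {
	// The hashes below are placeholders for illustration only.
	const sample = `github.com/aws/aws-sdk-go-v2/service/dynamodb v1.34.6 h1:PLACEHOLDER-ZIP-HASH=
github.com/aws/aws-sdk-go-v2/service/dynamodb v1.34.6/go.mod h1:PLACEHOLDER-GOMOD-HASH=
`
	for _, m := range parseGoSum(sample) {
		fmt.Printf("%s@%s %s\n", m.Path, m.Version, m.Hash)
	}
}
```

Note that `go.sum` can list more modules than a build actually needs, so a real tool would likely have to cross-check against `go.mod` or the build list.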
There are some challenges to this:
1. Go's hash calculations are stable based on the *contents* of the
dependency zipfile[3], not the zipfile itself. (An approach *similar* to
this would likely be advisable for MacPorts itself, as we were affected by
the GitHub archive apocalypse[4]. It would require changing every hash
calculation, though.)
2. go2port would need to be updated, but I think that this would
*simplify* the approach by allowing it to parse only `go.mod` and/or
`go.sum` for the target repo. Not all ports would be able to use even
this new offline approach.
3. I don't really know where to begin to explore this, and if I get it
working for a few test cases, I don't know the process for upstreaming this.
-a
[1] https://marc.info/?l=macports-dev&m=170796760520486&w=2
[2] It might be, but many ecosystems—including Go—handle offline build
requirements with vendoring.
[3] https://cs.opensource.google/go/x/mod/+/refs/tags/v0.20.0:sumdb/dirhash/hash.go
[4] https://github.blog/changelog/2023-01-30-git-archive-checksums-may-change/
--
Austin Ziegler • halostatue at gmail.com • austin at halostatue.ca
http://www.halostatue.ca/ • http://twitter.com/halostatue