WACZ (Web Archive Collection Zipped) software

akierig akierig at fastmail.de
Tue Mar 28 03:29:52 UTC 2023


On 2023-03-27 (KW 13) at 16:07:45 (-0500) Eric Gallager via 
macports-users wrote:

> So, the Internet Archive has recently added an "Email me a WACZ file
> with the results" option to their "Save Page Now" service in the
> Wayback Machine, so I tried that out and got some WACZ files, although
> now I don't know what to do with them. Is anyone aware of any software
> for handling WACZ files that's available in MacPorts? Or, if there
> isn't any yet, could some be added?
> More info on the format can be found here:
> https://replayweb.page/docs/wacz-format
> There are some python tools for interacting with the format, but I
> couldn't get pypi2port to generate a Portfile for me for them, and
> plus there are kind of too many python things in MacPorts anyways:
> https://github.com/webrecorder/py-wacz
> Anything else?
> Thanks,
> Eric Gallager

I’m a librarian who does a fair bit with web archives. The short 
version is this:

replayweb.page will ‘play’ a web archive (WARC/WACZ). There is also a 
desktop application (Electron) that you can grab from GitHub. I find it 
better than trying to load something like that into Firefox. I don’t 
know what the policy is on adding an Electron app to MacPorts, but 
speaking as a maintainer of an Electron app on a Linux distro... I’d 
personally avoid it.

py-wacz is great for converting WARC files into WACZ. The primary 
difference is that the latter are zipped packages (the WARC data 
bundled with an index and metadata). That conversion is the primary 
function it has.
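
Since a WACZ file is a ZIP archive, Python's standard zipfile module can look inside one without installing anything extra. Here is a minimal sketch, assuming the layout described in the format docs linked above (WARC data stored under archive/). The function names and the sample.wacz file name are made up for illustration:

```python
import zipfile


def list_wacz_members(path):
    """Return (name, uncompressed size) pairs for every member of a WACZ file."""
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def extract_warcs(path, dest_dir):
    """Extract only the members under archive/ into dest_dir; return their names."""
    with zipfile.ZipFile(path) as archive:
        names = [
            name
            for name in archive.namelist()
            # Assumption: WARC data lives under archive/ inside the package.
            if name.startswith("archive/") and not name.endswith("/")
        ]
        archive.extractall(dest_dir, members=names)
        return names


# Hypothetical usage:
#   for name, size in list_wacz_members("sample.wacz"):
#       print(size, name)
#   extract_warcs("sample.wacz", "unpacked")
```

The extracted WARC files can then be handed to whatever WARC tooling you already have.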

One tool in MacPorts for creating WARC files is wget, which works with 
something like:

    wget -pkrm --warc-cdx --warc-file=foo -e robots=off https://foo.org

I did a write-up about it back in 2020.
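
If you want to script that wget call, here is a rough sketch that builds the same command with the short flags spelled out as their long forms. The function names are my own invention, and it assumes a wget built with WARC support is on your PATH:

```python
import shlex
import subprocess


def build_wget_warc_command(url, warc_name):
    """Build the argument list for the wget invocation shown above."""
    return [
        "wget",
        "--page-requisites",  # -p: also fetch images, CSS, etc.
        "--convert-links",    # -k: rewrite links for local viewing
        "--recursive",        # -r: follow links
        "--mirror",           # -m: mirroring defaults (recursion, timestamps)
        "--warc-cdx",         # also write a CDX index
        f"--warc-file={warc_name}",
        "-e", "robots=off",   # ignore robots.txt, as in the original command
        url,
    ]


def archive_site(url, warc_name):
    """Run wget; raises CalledProcessError if wget exits with an error."""
    command = build_wget_warc_command(url, warc_name)
    print(shlex.join(command))
    subprocess.run(command, check=True)


# Hypothetical usage:
#   archive_site("https://foo.org", "foo")
```

Passing the arguments as a list avoids shell quoting problems with unusual URLs.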

I hope that helps a little bit.

ander