MacPorts Statistics (was Re: usage numbers for macports vs. homebrew?)

Clemens Lang cal at macports.org
Sun Mar 23 07:08:31 PDT 2014


Hi,

> On 22 Mar 2014, at 20:55, Clemens Lang <cal at macports.org> wrote:
> > I do not log any post data along with the IP address, so I cannot tie IPs
> > to UUIDs, except maybe by correlating the last-update timestamp and the
> > log file.
> 
> We are talking NSA-like meta-data, which could be linked to a real person…

As I said I would in a part of my last mail that you didn't quote, I have
changed my webserver logging setup to anonymize the IP address in the logfiles
and removed all previous logs. A successful stats submission now looks like
this to me:

 0.0.0.0 - - [23/Mar/2014:14:37:01 +0100] "POST /submissions HTTP/1.1" 302 968 "-" "MacPorts/2.2.99 libcurl/7.30.0"
 0.0.0.0 - - [23/Mar/2014:14:37:08 +0100] "GET /submissions HTTP/1.1" 200 2603 "-" "MacPorts/2.2.99 libcurl/7.30.0"
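
For illustration only, here is a minimal Python sketch of the same idea applied
as a post-processing step: it overwrites the client address field of a
combined-format access log line with a placeholder. This is not the webserver
configuration actually in use, which is not shown in this mail; the function
name anonymize_log_line and the sample address 203.0.113.7 are made up for the
example.

```python
import re

# The client address is the first whitespace-delimited field of a
# combined-format access log line (works for IPv4 and IPv6 alike).
_CLIENT_FIELD = re.compile(r"^\S+")


def anonymize_log_line(line: str, placeholder: str = "0.0.0.0") -> str:
    """Return the log line with its client address replaced by a placeholder.

    Hypothetical helper: it only rewrites the leading field and leaves the
    timestamp, request, status, size and user agent untouched.
    """
    return _CLIENT_FIELD.sub(placeholder, line, count=1)


if __name__ == "__main__":
    # 203.0.113.7 is a documentation-only address used as sample input.
    sample = (
        '203.0.113.7 - - [23/Mar/2014:14:37:01 +0100] '
        '"POST /submissions HTTP/1.1" 302 968 "-" '
        '"MacPorts/2.2.99 libcurl/7.30.0"'
    )
    print(anonymize_log_line(sample))
```

Rewriting at logging time (rather than afterwards, as in this sketch) has the
advantage that the real address never reaches the disk at all.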

I don't know what "NSA-like meta-data" you mean in this case. Without the
IP addresses there is absolutely no way to link a user to his UUID. And
honestly, I find it disrespectful and inappropriate to compare this (and
me) with the NSA's actions. An opt-in based system that submits specific
data (and it's no secret what it submits either, the code is open source
and the FAQ on the stats site clearly explains it) is hardly comparable to
global mass surveillance without the victim's knowledge. Please refrain
from using such inappropriate comparisons in the future.

> Well, I wanted to bring up these concerns, since - back then when the project
> was being worked on in GSoC - we had a big discussion about whether the
> stats project should go live at all (or rather not) because of privacy
> issues.

And from what I recall the result of the discussion was that the system had
to be opt-in, and that's what it currently is. You might not see the point
of having package statistics, but nobody is forcing you to participate. Frankly,
I see the benefit of these statistics and I have put quite some work into them –
if you think you can do better or improve the privacy, submit patches.

> No one should have UUID linked content. This should be usage statistics and
> not user statistics, may it be anonymised or not.

The mpstats port currently submits statistics data automatically up to four
times per month. However, we only store one set of data per user per month. To
do that, we need a way to link multiple submissions from the same installation
together. I think the UUID is a good way to achieve this. If you
disagree, back your criticism with a patch and a better way to do this. I think
what we have is sufficient (especially when compared to all the involuntary(!)
tracking anybody using a web browser out there is subject to) and I will not be
putting more of my time into it.
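
To make the role of the UUID concrete, here is a minimal Python sketch of
"one set of data per installation per month". It is an illustration under
assumptions, not the code running on the stats site: the function name
store_submission, the in-memory dict, and the choice to keep the most recent
submission of a month (rather than the first) are all made up for the example.

```python
from datetime import datetime


def store_submission(store: dict, uuid: str, submitted_at: datetime, ports: list) -> None:
    """Keep at most one submission per (uuid, year, month).

    Hypothetical helper: a later submission in the same month replaces the
    earlier one. Without a stable per-installation key such as the UUID,
    repeated submissions could not be told apart from new installations.
    """
    key = (uuid, submitted_at.year, submitted_at.month)
    current = store.get(key)
    if current is None or submitted_at >= current["submitted_at"]:
        store[key] = {"submitted_at": submitted_at, "ports": list(ports)}


if __name__ == "__main__":
    store = {}
    uuid = "00000000-0000-0000-0000-000000000001"  # made-up installation UUID
    store_submission(store, uuid, datetime(2014, 3, 2), ["zlib"])
    store_submission(store, uuid, datetime(2014, 3, 23), ["zlib", "curl"])
    store_submission(store, uuid, datetime(2014, 4, 1), ["zlib", "curl"])
    # Two March submissions collapse into one entry; April gets its own.
    print(len(store))
```

Note that the key contains nothing but the UUID and the month; no address or
other identifying detail is needed to collapse the repeated submissions.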

-- 
Clemens Lang

