GSoC 2019 [Collect build statistics]

Saagar Jha saagar at saagarjha.com
Sat Mar 9 11:55:50 UTC 2019


Saagar Jha

> On Mar 8, 2019, at 08:12, Craig Treleaven <ctreleaven at macports.org> wrote:
> 
>> On Mar 8, 2019, at 9:45 AM, Arjun Salyan via macports-dev <macports-dev at lists.macports.org> wrote:
>> 
>> Thank you Mojca.
>> 
>> The provided references have cleared up a lot of my doubts, and I am really interested in doing this project: 'Collect build statistics'.
>> 
>> Here is what I have understood so far:
>> 1. A dynamic page for each port displaying basic information (description, version, etc.), installation stats, build history, etc.
>> 2. From the suggested ideas, I found the following to be added to each page:
>>    - whether the current version of the port built on each particular OS/arch
>>    - when the port was last built on that OS/arch
>>    - links to all builds
>>    - list of installed files, and differences in installed files on different OS versions
>>    - perhaps include some basic functionality to allow checking for build reproducibility
>>    - what the latest version of the port is (in case it's already outdated)
>>    I do not understand: "perhaps include some basic functionality to allow checking for build reproducibility".
>> 
>> 3. I would further want to take up the task of migrating a redesigned website (or some components) into the same Django* app.
>> 
>> Please help me with that 'build reproducibility' point, and also with how I should plan from here (I know Django, but I am still learning about MacPorts).
> 
> 
> Have you seen the following wiki page?
> 
> https://trac.macports.org/wiki/StatisticsIdeas
> 
> There are some deficiencies in the current data collected.  A key issue is whether a port was requested or installed as a dependency.  That then leads to the need for a versioned API.  Other data elements need a re-think.
> 
> My interpretation is that some key MacPorts people had privacy concerns related to collecting such information.  As such, there was no appetite to strongly encourage users to participate in submitting the data.  In crude terms, the “opt-out” v. “opt-in” question.  I don’t know whether that has changed.
> 
> I fall firmly on the side that it is fair to ask users for such data as it helps us to understand how MacPorts is used.  That can then guide us, as MacPorts contributors, into where to channel our available time.  I note that Homebrew collects such information and there does not seem to be much resistance, if any.

This isn’t true at all. Homebrew’s introduction of this feature was vehemently contested, especially because it (1) introduced analytics without reasonable warning or communication with users, (2) used a third party (Google Analytics), and (3) was enabled by default (opt-out rather than opt-in). The debate was silenced by the core team, who locked the associated thread <https://github.com/Homebrew/brew/issues/142> on GitHub because they felt that the decision was not up for discussion.

FWIW, while I am not opposed to MacPorts adding clearly communicated, opt-in, self-hosted analytics, I would be very strongly against doing this if any of these conditions were not met. Doubly so if the initial decision to do this is made without users being able to provide input, triply if the feature stays largely unchanged even when users complain, and quadruply if the discussion is forcefully closed, unresolved, because “the feature has already shipped”.

> I think relying on opt-in would mean poor data quality, and that implementing such a collection and reporting system would largely be a waste of time.  IMHO.
> 
> Craig
> 
> PS My mail archive shows me that we’ve been talking about such a facility for more than 5 years!
