GSoC 2019 [Collect build statistics]

Arjun Salyan arjun.salyan.che17 at itbhu.ac.in
Tue Apr 2 13:12:12 UTC 2019


Hi Mojca,

On Tue, Apr 2, 2019 at 3:14 AM Mojca Miklavec <mojca at macports.org> wrote:

> The drawbacks may include:
> - some ports will be skipped on the builder, for various reasons (port
> is known not to build on a particular builder, it may not be
> distributable, ...)
> - the buildbot master may be down or experience problems, so data
> might go missing
>

Thanks. I will consider these factors when improving upon this.


> A strange observation from your source code: you synced portindex and
> ran the conversion, but then loaded the data from another json file?
> Am I missing something?
>

No, the conversion "tclsh portindex2json.tcl portindex" writes to the file
"syncedportindex.json", and I read from that same file. I am really sorry
that I did not submit a PR, which made the code difficult for you to
review.


> There are various ways to achieve the goal. Note that if you run
> portindex yourself, it will detect which files have been updated and
> only ever touch data of those ports. The portindex command could be
> modified to only output the file with changes (when you pass some
> options to it). This will still miss deletes, but it would be an
> efficient way with almost no dependencies.
>

Does this imply that we will keep a clone of macports-contrib locally and
run a modified 'portindex' command to generate a file with only the updated
ports?


> One way would be to generate portindex yourself and always remember
> what git shasum has been used, and store that shasum to the database.
> Next time when you update, check and store the latest shasum, then ask
> git which paths have changed between the two commits, and only update
> ports whose paths match the paths reported by git as changed.
>
> It could also help if you stored a "complete" git history to the
> database (shasum, which ports changed at that point, timestamp,
> parents). Not sure if that's really so helpful, just as an option.
>
> What might be an interesting approach would be to try to squeeze the
> git shasum to the PortIndex. This could also help when submitting
> statistics as it would be easier to determine how old the database is
> / when the user last synced. (It would not work for people with their
> own modifications of the tree.) If you had the shasum in portindex,
> you could still run git independently to check for the difference.
>

These methods are not very clear to me yet, as I haven't dealt with shasums
before. I will discuss them after doing some research.
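
For reference, a minimal sketch of the git-based approach quoted above:
remember the last synced commit, ask git which paths changed since then,
and map those paths to port names. It assumes the ports tree uses a
category/portname/... layout; the function names and the rule for skipping
top-level and "_" or "." directories are illustrative assumptions, not
existing MacPorts code.

```python
import subprocess


def changed_paths(repo_dir, old_sha, new_sha):
    """Return the paths that differ between two commits, as reported by git."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "diff", "--name-only", old_sha, new_sha],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def ports_from_paths(paths):
    """Map changed paths to port names.

    Assumption: ports live at category/portname/...; top-level files and
    directories starting with "_" or "." are not ports and are skipped.
    """
    ports = set()
    for path in paths:
        parts = path.split("/")
        if len(parts) >= 3 and not parts[0].startswith(("_", ".")):
            ports.add(parts[1])
    return ports
```

Only the ports returned by ports_from_paths() would then need to be
updated in the database, and the new commit would be stored as the
starting point for the next run.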


> Just some random ideas.


Thank you so much.

> Regarding updates of builds: just ask the database about which build
> you synced last, and then sync any builds newer than that, up to the
> last one. You may need to check whether a build was complete when you
> last enquired.
>

Thanks, I am already using the same method.
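
A rough sketch of that incremental sync, for illustration only. It assumes
a hypothetical "builds" table with a "complete" flag and assumes build
numbers on a builder start at 1; the table, column, and function names are
made up for this example and are not taken from the actual project.

```python
import sqlite3

# Hypothetical schema: one row per build already fetched from the buildbot.
SCHEMA = """
CREATE TABLE IF NOT EXISTS builds (
    builder      TEXT    NOT NULL,
    build_number INTEGER NOT NULL,
    complete     INTEGER NOT NULL,
    PRIMARY KEY (builder, build_number)
)
"""


def builds_to_sync(conn, builder, latest_on_master):
    """Return the build numbers that should be fetched (or re-fetched).

    Builds stored as incomplete are fetched again, followed by every build
    newer than the last one stored, up to latest_on_master.
    """
    (last,) = conn.execute(
        "SELECT COALESCE(MAX(build_number), 0) FROM builds WHERE builder = ?",
        (builder,),
    ).fetchone()
    incomplete = [
        number
        for (number,) in conn.execute(
            "SELECT build_number FROM builds "
            "WHERE builder = ? AND complete = 0 ORDER BY build_number",
            (builder,),
        )
    ]
    return incomplete + list(range(last + 1, latest_on_master + 1))
```

Re-fetching the incomplete builds covers the case where a build was still
running when the database last asked about it.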

Arjun