Gsoc 18 Project | Collect build statistics

Rainer Müller raimue at macports.org
Sat May 12 12:17:11 UTC 2018


On 2018-05-12 10:34, Vishnu wrote:
> I am not saying that my DB and the existing DB would be interdependent.
> Rather, I am saying that I can copy the content to my database just
> once.
> Then I write the code to keep updating it whenever something is added,
> deleted, or modified.
> 
> Like, for example, maintainers.
> You already have a table of maintainers and their ports. So I can copy
> the data to my database.
> Then I will make a script similar to PortIndex2PGSQL.tcl to keep
> updating my database independently of the existing database.

Our database contains exactly what the portindex2postgres.tcl script
produces (note the different naming). You can just generate the data
locally by running it against PortIndex.

You need to have the ports tree in a directory named "ports", then run
the following command from the parent directory:

  port-tclsh /path/to/macports-infra/jobs/portindex2postgres.tcl

This will create a PortIndex.sql in the current directory. Ignore or
remove the macports.conf and sources.conf files that are also created.

> Processing the 15 MB JSON file again and again would be bothersome.
> We definitely need some way of getting a differential JSON,
> to update only the changes caused by a commit to a Portfile.
> 
> Or maybe we could make some changes in buildbot so that whenever there
> is a change in a Portfile, it could update my port table as well with
> the changes.
> Not sure how to do this. But just an idea as of now.

Yes, the idea would be that macports-webapp is updated after every commit.

To achieve that, you could use a job in buildbot, but then the result
needs to be transferred to the macports-webapp. Alternatively, you could
deliver a WebHook from GitHub to do this within macports-webapp. It
largely depends on what kind of data you want to gather and how.
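
As a rough sketch of the WebHook variant: GitHub signs each delivery
with an HMAC of the raw request body (sent in the X-Hub-Signature-256
header), and a push event lists the added, modified, and removed paths
per commit. The function names and the idea of filtering for Portfile
paths are assumptions for illustration, not existing macports-webapp
code.

```python
import hashlib
import hmac


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Compare GitHub's X-Hub-Signature-256 header with our own HMAC of the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, header)


def changed_portfiles(payload: dict) -> set:
    """Collect the Portfile paths touched by the commits of a push event payload."""
    paths = set()
    for commit in payload.get("commits", []):
        for key in ("added", "modified", "removed"):
            for path in commit.get(key, []):
                # ports live at <category>/<name>/Portfile in the ports tree
                if path.endswith("/Portfile"):
                    paths.add(path)
    return paths
```

The webapp would then only need to refresh the ports whose Portfile
shows up in that set, instead of reprocessing everything.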

> And regarding working on a separate branch: Rainer suggested not
> to, because daily pull requests have to be merged by the mentor.

I did not argue against working on a separate branch. I just wanted to
point out that pull request reviews on GitHub might pose a problem as
they cannot depend on each other. It is up to you and your mentors to
work out the details.

> One doubt I still have is whether portindex2postgres.tcl processes the
> entire MacPorts ports tree again and again,
> or just the port that has been changed?

The PortIndex file itself is generated with the 'portindex' command,
which only updates the data for Portfiles that have a modification time
that is newer than the existing PortIndex file.
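
To illustrate just the modification time rule described here, a
minimal Python sketch follows. It is not how 'portindex' is actually
implemented (that is Tcl code in MacPorts base); the function name and
the directory walk are assumptions for illustration only.

```python
import os


def portfiles_needing_reindex(ports_dir: str, index_path: str) -> list:
    """Return Portfiles whose modification time is newer than the index file."""
    if os.path.exists(index_path):
        index_mtime = os.path.getmtime(index_path)
    else:
        # no index yet, so every Portfile counts as newer
        index_mtime = float("-inf")
    stale = []
    for root, _dirs, files in os.walk(ports_dir):
        if "Portfile" in files:
            path = os.path.join(root, "Portfile")
            if os.path.getmtime(path) > index_mtime:
                stale.append(path)
    return sorted(stale)
```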

The portindex2postgres.tcl script merely converts the PortIndex file
from the custom format based on Tcl lists to SQL. The output will always
contain SQL statements with the full data for every port.
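
Since the output is always a full dump, a consumer that wants only the
differences has to compute them itself. One possible approach, sketched
below, is to compare two snapshots keyed by port name. The dictionary
layout and the field names in the usage example are hypothetical and do
not reflect the actual PortIndex or SQL schema.

```python
def diff_snapshots(old: dict, new: dict) -> dict:
    """Compare two {port name: record} snapshots and list what differs."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(
        name for name in old.keys() & new.keys() if old[name] != new[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Usage with made-up records:
before = {"foo": {"version": "1.0"}, "bar": {"version": "2.0"}}
after = {"foo": {"version": "1.1"}, "baz": {"version": "0.1"}}
print(diff_snapshots(before, after))
```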

Rainer


More information about the macports-dev mailing list