GSoC 18 Project | Collect build statistics

Vishnu vishnum1998 at gmail.com
Sat May 12 15:26:30 UTC 2018


Hi

"The PortIndex file itself is generated with the 'portindex' command,
which only updates the data for Portfiles that have a modification time
that is newer than the existing PortIndex file."

So where can I see the code that checks the modification time?
Suppose our ports tree contains three ports A, B, and C, each last
modified at 1:00 UTC.
The present port index:
[ A
  B
  C
] @ 1:00 UTC

Then Port A is edited at 1:30 UTC, so something checks the modification
times of all the Portfiles and updates the port index entries for any
that are newer than the index.

So the new port index:
[ A
  B
  C
] @ 1:30 UTC
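
A minimal sketch of what such a modification-time check could look like,
written in Python only for illustration. This is not the actual portindex
code: the function name stale_portfiles, the directory layout, and the
strict "newer than" comparison are all assumptions.

```python
import os
import tempfile
from pathlib import Path


def stale_portfiles(ports_root, index_path):
    """Return the Portfiles whose mtime is newer than the index file's mtime."""
    index_path = Path(index_path)
    # Without an existing index, every Portfile counts as stale.
    index_mtime = index_path.stat().st_mtime if index_path.exists() else float("-inf")
    return sorted(
        portfile
        for portfile in Path(ports_root).rglob("Portfile")
        if portfile.stat().st_mtime > index_mtime
    )


def demo():
    """Rebuild the A/B/C example: everything at 1:00, then A edited at 1:30."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "ports"
        for name in ("A", "B", "C"):
            portfile = root / "category" / name / "Portfile"
            portfile.parent.mkdir(parents=True)
            portfile.write_text("")
            os.utime(portfile, (3600, 3600))  # 1:00 UTC, seconds since epoch
        index = root / "PortIndex"
        index.write_text("")
        os.utime(index, (3600, 3600))
        # Port A is edited at 1:30 UTC.
        os.utime(root / "category" / "A" / "Portfile", (5400, 5400))
        return [p.parent.name for p in stale_portfiles(root, index)]
```

Under these assumptions only port A would be re-indexed; B and C keep
their existing entries.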


"The portindex2postgres.sql script merely converts the PortIndex file
from the custom format based on Tcl lists to SQL. The output will always
contain SQL statements with the full data for every port."

Then, after the port index is updated, does this script run, flush the
db, and fill it with the new PortIndex data?
Or does it just update the db with the data of the modified ports?

Which one occurs?

I am trying to understand the present mechanism.


So after every commit the entire 15 MB PortIndex is processed and its
data is uploaded to the db.
I think this is very inefficient.
It would be better if, after every commit, the buildbot or a webhook
updated the database directly with only the changed ports.
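
To make the difference concrete, here is a rough sketch contrasting the
two strategies. It uses Python's sqlite3 module as a stand-in for
PostgreSQL, and the table ports(name, version) as well as the function
names are hypothetical; the real schema produced by the script is not
shown in this thread.

```python
import sqlite3


def full_reload(conn, index):
    """Flush the table and refill it from the complete index."""
    with conn:  # one transaction: delete everything, then insert everything
        conn.execute("DELETE FROM ports")
        conn.executemany(
            "INSERT INTO ports (name, version) VALUES (?, ?)", index.items()
        )
    return len(index)  # number of rows written


def incremental_update(conn, changed, removed=()):
    """Touch only the ports affected by a commit."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO ports (name, version) VALUES (?, ?)",
            changed.items(),
        )
        conn.executemany(
            "DELETE FROM ports WHERE name = ?", [(name,) for name in removed]
        )
    return len(changed) + len(removed)  # number of rows touched


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ports (name TEXT PRIMARY KEY, version TEXT)")
    full_reload(conn, {"A": "1.0", "B": "2.0", "C": "3.0"})
    incremental_update(conn, {"A": "1.1"})  # only port A changed
    return conn.execute("SELECT name, version FROM ports ORDER BY name").fetchall()
```

The full reload writes one row per port on every run, while the
incremental variant writes one row per changed port, which is the saving
I have in mind.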


Thanks

On 12 May 2018 at 17:47, Rainer Müller <raimue at macports.org> wrote:

> On 2018-05-12 10:34, Vishnu wrote:
> > I am not saying that my db and the existing db would be interdependent.
> > Rather, I am saying that I can copy the content to my database just
> > once.
> > Then I write the code to keep updating it whenever something is added,
> > deleted, or modified.
> >
> > Take maintainers, for example.
> > You already have a table of maintainers and their ports, so I can copy
> > the data to my database.
> > Then I will write a script similar to PortIndex2PGSQL.tcl to keep
> > updating my database independently of the existing database.
>
> Our database contains exactly what the portindex2postgres.tcl script
> produces (note the different naming). You can just generate the data
> locally by running it against PortIndex.
>
> You need to have the ports tree in a directory named "ports", then run
> the following command from the parent directory:
>
>   port-tclsh /path/to/macports-infra/jobs/portindex2postgres.tcl
>
> This will create a PortIndex.sql in the current directory. Ignore or
> remove the macports.conf and sources.conf files that are also created.
>
> > Processing the 15 MB JSON file again and again would be bothersome.
> > We definitely need some way of getting a differential JSON that
> > contains only the changes caused by a commit to a Portfile.
> >
> > Or maybe we could make some changes in buildbot so that whenever there
> > is some change in a Portfile, it also updates my port table with the
> > changes.
> > I am not sure how to do this, but it is just an idea as of now.
>
> Yes, the idea would be that macports-webapp is updated after every commit.
>
> To achieve that, you could use a job in buildbot, but then the result
> needs to be transferred to the macports-webapp. Alternatively, you could
> deliver a WebHook from GitHub to do this within macports-webapp. It
> largely depends on what kind of data you want to gather and how.
>
> > And regarding working on a separate branch: Rainer suggested not
> > to, because daily pull requests have to be merged by the mentor.
>
> I did not argue against working on a separate branch. I just wanted to
> point out that pull request reviews on GitHub might pose a problem as
> they cannot depend on each other. It is up to you and your mentors to
> work out the details.
>
> > One doubt I still have is whether portindex2postgres.tcl runs the
> > entire MacPorts ports tree against the database again and again,
> > or just the port that has been changed?
>
> The PortIndex file itself is generated with the 'portindex' command,
> which only updates the data for Portfiles that have a modification time
> that is newer than the existing PortIndex file.
>
> The portindex2postgres.tcl script merely converts the PortIndex file
> from the custom format based on Tcl lists to SQL. The output will always
> contain SQL statements with the full data for every port.
>
> Rainer
>