Gsoc 18 Project | Collect build statistics

Mojca Miklavec mojca at macports.org
Sun Apr 29 20:44:53 UTC 2018


On 29 April 2018 at 19:34, Vishnu wrote:
> Hey
>
> I wanted to work on finalising the database. What should the final
> columns in each table be?

This is a creative process that should in the first place be carefully
crafted by you - of course with sufficient input and feedback from our
side.

> Take your final suggestions.

I'm pretty sure that things will change during the course of the
summer once you figure out that there might be additional
requirements, that you might need additional fields to achieve what
you wanted etc. Django usually lets you transition from one schema to
a slightly different one in a relatively painless way.

But there's one thing that we should probably discuss together with
other members, and that is, to what extent to support:
- historic information about ports (as versions, dependencies,
maintainers etc. change, ports become obsolete, ...)
- the fact that different OS versions might support different ports or
different versions of ports
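
To make the two questions above more concrete, here is a minimal
sketch (not the project's schema) of one possible way to keep historic
information and per-OS support. It assumes that every change to a port
adds a new record with a validity interval; all class, field and
function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional, Tuple


@dataclass
class PortRevision:
    """One historic state of a port (hypothetical model, for illustration)."""
    port: str
    version: str
    maintainers: Tuple[str, ...]
    valid_from: date
    valid_to: Optional[date] = None            # None means "still current"
    os_versions: FrozenSet[str] = frozenset()  # OS versions this state supports


def revision_at(history, port, day, os_version):
    """Return the state of `port` on `day` for `os_version`, or None."""
    for rev in history:
        if rev.port != port or os_version not in rev.os_versions:
            continue
        if rev.valid_from <= day and (rev.valid_to is None or day < rev.valid_to):
            return rev
    return None


# Made-up sample data: version 2.0 dropped support for 10.12.
history = [
    PortRevision("foo", "1.0", ("alice",), date(2017, 1, 1), date(2018, 3, 1),
                 frozenset({"10.12", "10.13"})),
    PortRevision("foo", "2.0", ("alice", "bob"), date(2018, 3, 1), None,
                 frozenset({"10.13"})),
]
```

With a layout like this, "which version of foo was available for
10.12 on a given day?" becomes a lookup by date and OS version. The
cost is that every change adds a row, which is the trade-off to
discuss.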

Vishnu, I would suggest putting your Excel schema into some
easy-to-read-and-edit document in that GitHub repository. This could
be a Markdown table, or anything else. You can search a bit for
programs that let you easily create ERDs (entity-relationship
diagrams). This document will then serve at least three purposes:
- brainstorming with us about the most reasonable schema, improvements etc.
- helping you enormously when you start coding your app
- serving as documentation that will allow anyone else to understand and
improve the program later on without reading the code

> https://www.macports.org/ports.php?by=platform&substr=linux
> How does this work?
> where is the database?
> how are queries relayed?
> where is the actual code for this?

There is a PostgreSQL database on the server run by Clemens and Rainer
in Germany (most likely the same server where your app would eventually
run).

I would be grateful if someone from the infrastructure team could
correct me here in case I'm giving you wrong pointers, but I suspect
that the database gets populated with this script:
    https://github.com/macports/macports-infrastructure/blob/master/jobs/PortIndex2PGSQL.tcl
by converting data from PortIndex to SQL and filling the database with it.
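
As a rough illustration of that idea (not the actual Tcl script), the
sketch below loads a few port records into a table. It makes several
assumptions: the entries are already parsed into dictionaries (parsing
the real PortIndex format is not shown), the table and column names
are made up, and Python's built-in sqlite3 module is used only as a
stand-in so that the example runs without a PostgreSQL server.

```python
import sqlite3

# Hypothetical, already-parsed entries; the real PortIndex holds more fields.
entries = [
    {"name": "foo", "version": "1.0", "categories": "devel",
     "maintainers": "alice"},
    {"name": "bar", "version": "2.3", "categories": "net",
     "maintainers": "nomaintainer"},
]


def populate(conn, entries):
    """Create the (hypothetical) ports table and insert or update all entries."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ports ("
        "name TEXT PRIMARY KEY, version TEXT, "
        "categories TEXT, maintainers TEXT)"
    )
    # Named placeholders keep the values out of the SQL string itself.
    # INSERT OR REPLACE is SQLite syntax; PostgreSQL would use
    # INSERT ... ON CONFLICT ... DO UPDATE instead.
    conn.executemany(
        "INSERT OR REPLACE INTO ports (name, version, categories, maintainers) "
        "VALUES (:name, :version, :categories, :maintainers)",
        entries,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
populate(conn, entries)
rows = conn.execute("SELECT name, version FROM ports ORDER BY name").fetchall()
```

Running `populate` again with a newer PortIndex overwrites the stored
version of each port, so this layout on its own keeps no history.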

I suspect the rendering is done with
    https://github.com/macports/macports-www/blob/master/ports.php

You may optionally take a glance at the various repositories at
    https://github.com/macports/
(ignore user repositories), quickly look at what is there and ask if
there's something more that you would like to know.

This can give you some ideas, but I assume you would populate the
database in pretty much the same way as you did during your coding
challenge, and according to the database design developed during
proposal writing and during the community bonding period.

> Also, for my project, what database should I use?
> I recall PostgreSQL. Or something else?

Yes, PostgreSQL is probably the best open-source relational database
(or at least it was some years ago), so my preference would be to use
that one. SQLite is too limited. MySQL (or its derivatives?) would be
acceptable, but PG sounds better. Commercial databases are out of the
question.

> So should I start creating tickets?
> Like ... at least putting up my basic milestones.

Yes, I would like to see the tickets with milestones once we have the
repository. Things like "transfer project proposal to documentation"
with a "bonding period" milestone could also be stuff that ends up
there.

> Also you said :
> (c.3) Plan the API to get the data from the database in JSON format
> (so that someone else could write an independent app with the same
> display functionality).
> More in a separate email, I guess.
>
> I'll do this once i finish my database.

OK.

Things that need to be planned are:
- database schema (the one from the proposal was quite OK already, but
we need to refine to what extent to support historic entries etc.)
- the list of different sites (first the list of all sites with
approximate URLs, then add what content you want to end up on each one
of them)
- ideally the API (probably not that much different from the list of sites)
- potentially refine / go into some more detail about the different
building blocks used to collect the data for the database
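
To illustrate how the list of sites and the API could mirror each
other, here is a minimal sketch of two JSON endpoints. The URL paths,
function names and field names are assumptions made for this example
only, and a plain dictionary stands in for the database.

```python
import json

# Hypothetical in-memory stand-in for the database.
PORTS = {
    "foo": {"name": "foo", "version": "1.0", "maintainers": ["alice"]},
}


def api_port_list():
    """Return (status, JSON body) for a hypothetical /api/ports/ URL."""
    return 200, json.dumps(sorted(PORTS))


def api_port_detail(name):
    """Return (status, JSON body) for a hypothetical /api/ports/<name>/ URL."""
    port = PORTS.get(name)
    if port is None:
        return 404, json.dumps({"error": "port not found"})
    return 200, json.dumps(port, sort_keys=True)
```

A page such as a hypothetical /ports/foo/ could then render the same
data that /api/ports/foo/ returns, which is what would let someone
else write an independent app with the same display functionality.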

I'll try to explain a bit more about the API ...

Let's just put everything into a (set of) document(s) in the repository.
It could be Markdown, AsciiDoc or whatever other format that's
easy to edit and can easily be viewed in your repository. As I said, I
feel that this would be the easiest way to collect more feedback and
gather all ideas in a single place. It would serve as a reference while
you are developing, and as documentation that would be useful
to other developers after GSoC is over.

> I tried to understand a bit about Heroku. It'll work when needed. Not an
> issue, I guess.

I'm not even sure that Heroku is the best choice since I have no
experience with the service. It's just nice to have some temporary,
flexible way to show progress (as a functional website rather than
just pure code) and get feedback from others while developing stuff.

Mojca

