Gsoc 18 Project | Collect build statistics

Vishnu vishnum1998 at gmail.com
Mon Mar 26 12:47:04 UTC 2018


OK.
I will write up the milestones separately very soon.
What is your IRC nick, and on which channel are you available? I use freenode.
Also, rather than IRC, could we use comments in Google Docs to interact? You
could then see the changes as they are made, which would speed up the process.


I have good prior experience with SQL. I had planned to use that instead
of a log file. The idea of creating tables had already occurred to me, as it
would really simplify the statistics section.
But I can't decide whether creating tables for so many ports would be a good
idea or not.

I can learn Django in no time. As I mentioned, I am a quick learner.
So I guess it would be wise to build everything in Django.

I will try to create the static website. If possible, I will submit it
before the deadline.


I wanted to know if we can chat somewhere.

Thanks
Vishnu

On 26 March 2018 at 17:23, Mojca Miklavec <mojca at macports.org> wrote:

> Dear Vishnu,
>
> First some general remarks, then the answers to your questions below.
>
> - One of the most important things that we should discuss and improve
> is the timeline. This is very little work, but it requires some
> coordination to make sure that the milestones are defined in the most
> sensible way. One option is to discuss over IRC, but email is also fine.
> I'll write about that separately.
>
> - I suggest writing the milestones in a separate section, for example
> something along these lines:
>   May X: The first version of database design is ready, finish the
> first import of portindex.
>   May Y: The prototype website gets deployed at a temporary location
> xyz.com. It is possible to show the most basic properties of any port
> "foo" on a plain html page via xyz.com/index/port/foo and list all
> ports via xyz.com/index/list
>   ...
>   June Z: The website is accepting installation statistics submissions
> from users.
>
> - At the moment your schedule says "June 11-15: Phase 1 Evaluation",
> with no plans to code during those days and no explicit criteria that
> would define whether the project passed or not. It makes sense to put
> evaluation on the schedule, so that you see where it is, but note that
> this doesn't take any of your time, that's when mentors are busy and
> that's when you need to make sure to show results that were agreed on.
> (It can happen that there are valid reasons why the milestones would
> not be met, for example if the plan gets changed during the coding
> since a better solution or idea is found etc., so the milestones are
> not set in stone, but in case of disagreement between a student and
> mentor it helps a lot if it's straightforward to check for outsiders
> or admin as well.)
> I didn't check if those days are weekend, but don't make it sound like
> you would be spending all your time on evaluation. You only need to
> fill in a simple webform some time during the evaluation period,
> that's all.
>
> - Please do take a look at how relational databases work and get
> familiar with some basic Object-oriented programming in case you don't
> know those concepts already. This is of crucial importance for success
> of the project. In case you would work in Django (this is not the only
> option, there are other possibilities as well), you should be familiar
> with
>     https://docs.djangoproject.com/en/2.0/intro/tutorial02/
> (this is part of a slightly longer tutorial). We need a single uniform
> website rather than something scraped together from random chunks of
> scripts which do individual tasks without knowing about each other.
>
> - After the proposal gets submitted (because that's the highest
> priority task at the moment), we would still want to verify your
> skills with a simple coding challenge before we request slots (so
> ideally the challenge should be complete cca. one week after the
> deadline for proposal submission). Other organisations would ask for a
> pull request, but in this case it probably makes more sense to
> demonstrate a simple prototype. You may pick a slightly different
> task, but it should demonstrate comparable skillset or ability to
> learn. My suggestion would be to do these two tasks:
> (a) Create a simple (static) website for one single port of your
> choice and include nearly everything that will be part of the final
> product. You can copy some data from a Portfile to help you, but you
> can just as well make up the information, statistics etc. I would
> actually consider this to be part of the proposal. The page should
> demonstrate what the final product will look like (only with fake data)
> and will also help you while developing the actual functionality of
> the website: you would know where you are heading. This
> should not take you more than a few hours to do, you basically need to
> visualize what should already be in the proposal.
> (b) Pick a framework to use and create a super simple hello-port
> application. In case you pick Django, go to
>     https://docs.djangoproject.com/en/2.0/intro/tutorial01/
> or
>     https://devcenter.heroku.com/articles/getting-started-with-python#introduction
> or any other tutorial of your choice, install the framework and the
> database, follow the steps towards creating a simple app and modify
> the app in such a way that it will implement *any* super simple
> functionality from the final application. You could for example import
> three ports, create a listing of those three ports with a link to the
> port page which will do nothing else but print port name and
> description. You may ask for help in case you are struggling with
> installation of dependencies.
> (c - fully optional) If you still have time and motivation, try to
> make the page from (a) look nice with some basic styling (you could use
> any existing framework for frontend development or ask a friend to help
> you with this particular one) :)
>
> On 26 March 2018 at 11:39, Vishnu wrote:
> >
> > Can you help me find out where that JSON API is served from? Where is the
> > code that sends requests to the backend to get that data?
>
> You mean in the planned website or on our buildbot setup?
>
> For the planned website you would need to write it yourself.
>
> For our buildbot setup I honestly have absolutely no idea. The source
> code is here if you want to check:
>     https://github.com/buildbot/buildbot/tree/eight
>
> > Once I get to know that, it will simplify my job of sending requests to the
> > backend for a port.
>
> But why would you talk to backend directly? When I said that there is
> a JSON api, it means that you simply fetch a JSON file from the server
> directly, no need to talk to backend, that's what API is made for.
>
> OK, one way to solve this (if you really needed super efficiency)
> would be to copy the complete database and all the additional log
> files to your computer and do the analysis straight on the database,
> but I don't think that's worth the effort.
>
> > Also, is there any existing method to show all ports of a maintainer,
> > or to show all ports of a category, say python?
>
> There is a port command that can list you all the ports from a
> particular maintainer or all ports from a particular category, but it
> would be way too inefficient to use that for the website. You need to
> support that in the website app.
>
> Once you read the 15 MB ports.json file (created with portindex2json),
> you'll have all the information about who maintains a particular port
> and what categories the port belongs to and you need to store that
> information. You cannot afford to read the file and iterate through
> all entries to check whether maintainer A maintains that port every
> single time when someone loads the website.
>
> Asking `port` to retrieve that information for you will not only be
> equally slow, but it will also make the website difficult to deploy.
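
The idea of reading ports.json once and storing the maintainer and category information could be sketched roughly as follows. This is only a minimal illustration using Python's built-in sqlite3 module: the JSON field names (`name`, `maintainers`, `categories`), the sample ports and maintainers, and the table names are all assumptions, since the real output of portindex2json may be laid out differently.

```python
import json
import sqlite3

# Hypothetical sample of what the portindex2json output might look like;
# the real ports.json layout may differ.
SAMPLE_PORTS_JSON = """
[
  {"name": "py-numpy", "categories": ["python", "math"],
   "maintainers": ["alice", "bob"]},
  {"name": "py-scipy", "categories": ["python", "science"],
   "maintainers": ["alice"]},
  {"name": "vim", "categories": ["editors"],
   "maintainers": ["carol"]}
]
"""

# Assumed layout: one row per port, plus one row per (port, maintainer)
# and per (port, category) pair, indexed for fast lookups.
SCHEMA = """
CREATE TABLE port (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE port_maintainer (
    port_id    INTEGER NOT NULL REFERENCES port(id),
    maintainer TEXT NOT NULL,
    PRIMARY KEY (port_id, maintainer)
);
CREATE TABLE port_category (
    port_id  INTEGER NOT NULL REFERENCES port(id),
    category TEXT NOT NULL,
    PRIMARY KEY (port_id, category)
);
CREATE INDEX idx_maintainer ON port_maintainer(maintainer);
CREATE INDEX idx_category ON port_category(category);
"""


def import_ports(conn, ports):
    """Store every port once, together with its maintainers and categories."""
    for entry in ports:
        cur = conn.execute("INSERT INTO port (name) VALUES (?)", (entry["name"],))
        port_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO port_maintainer (port_id, maintainer) VALUES (?, ?)",
            [(port_id, m) for m in entry.get("maintainers", [])],
        )
        conn.executemany(
            "INSERT INTO port_category (port_id, category) VALUES (?, ?)",
            [(port_id, c) for c in entry.get("categories", [])],
        )
    conn.commit()


def ports_by_maintainer(conn, maintainer):
    """Indexed lookup; no need to re-read or scan the JSON file."""
    rows = conn.execute(
        "SELECT port.name FROM port "
        "JOIN port_maintainer ON port_maintainer.port_id = port.id "
        "WHERE port_maintainer.maintainer = ? ORDER BY port.name",
        (maintainer,),
    )
    return [name for (name,) in rows]


def ports_by_category(conn, category):
    """Same idea for categories."""
    rows = conn.execute(
        "SELECT port.name FROM port "
        "JOIN port_category ON port_category.port_id = port.id "
        "WHERE port_category.category = ? ORDER BY port.name",
        (category,),
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
import_ports(conn, json.loads(SAMPLE_PORTS_JSON))
print(ports_by_maintainer(conn, "alice"))
print(ports_by_category(conn, "python"))
```

With such a layout the JSON file is parsed only when the port index changes, and each page load becomes a single indexed query. A web framework's ORM would normally generate equivalent tables; the raw SQL is used here only to keep the sketch self-contained.
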
>
> > It would be best if I start working on my GSoC project with the 3 most
> > important things.
> > 1) Basic Port information through portindex2json
>
> Yes, this is probably the most important piece of information that
> will lay the foundation for everything else.
>
> > 2) Port build history. It would use the JSON API.
>
> OK. But just to make sure that we are on the same page: you should use
> the JSON api to retrieve information from the build master *once* per
> build, not each time when someone takes a look at your website. You
> should then store the results internally.
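
The "fetch once per build, then store" approach could look roughly like the sketch below. It is a minimal illustration under stated assumptions: `fake_fetch` is a hypothetical stand-in for an HTTP request to the build master's JSON API, and the builder name, the payload fields, and the table layout are made up for the example.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE build (
    builder TEXT NOT NULL,
    number  INTEGER NOT NULL,
    payload TEXT NOT NULL,          -- raw JSON as returned by the build master
    PRIMARY KEY (builder, number)
);
"""


def get_build(conn, builder, number, fetch):
    """Return build info, asking the build master only if it is not stored yet."""
    row = conn.execute(
        "SELECT payload FROM build WHERE builder = ? AND number = ?",
        (builder, number),
    ).fetchone()
    if row is not None:
        return json.loads(row[0])   # served from the local database
    data = fetch(builder, number)   # the only place that would hit the network
    conn.execute(
        "INSERT INTO build (builder, number, payload) VALUES (?, ?, ?)",
        (builder, number, json.dumps(data)),
    )
    conn.commit()
    return data


# Hypothetical stand-in for a request to the build master; it records
# every call so that we can see how often the "network" is used.
calls = []


def fake_fetch(builder, number):
    calls.append((builder, number))
    return {"builder": builder, "number": number, "result": "success"}


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = get_build(conn, "ports-10.13", 42, fake_fetch)   # fetched and stored
second = get_build(conn, "ports-10.13", 42, fake_fetch)  # read from the database
print(len(calls))
```

However many visitors look at the page for that build, the build master is asked only once. A real implementation would additionally have to avoid storing a build as final while it is still running.
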
>
> > 3) Buildbot code update to send updates to the logs.
>
> I would say that for the time being, if you implement nr. 2 anyway,
> you could check the buildbot master for any updates and use the same
> JSON api to, say, update the status once per hour or every 10 minutes.
>
> As I already said, I would put a higher priority on statistics
> collection. You should be able to accept and store installation
> statistics submissions by the first milestone, else there will be no
> time to improve the code based on all the problems you might find in
> the process. Point 3 (buildbot code update) is merely an
> implementation detail to allow more efficient collection of build
> statistics and can be implemented later.
>
> > And as you said, logs would be ineffective for storage. For the history of
> > builds of a port, I think a log file would be the best way to store it.
>
> I still cannot imagine an efficient implementation of a scalable
> website with lots of information to be based on data stored in plain
> text files.
>
> You REALLY REALLY should be looking at Object-oriented design of the
> software and have a closer look at relational databases and try to
> understand the basic concepts there (tables, rows, primary index,
> foreign keys etc).
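
To make the concepts named above (tables, rows, primary keys, foreign keys) a bit more concrete, here is a small sketch of how the build history of a port might be stored relationally instead of in a per-port log.txt. The table names, columns, builder name, and sample rows are assumptions for illustration only, not an agreed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE port (
    id   INTEGER PRIMARY KEY,                       -- primary key
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE build_history (
    id       INTEGER PRIMARY KEY,
    port_id  INTEGER NOT NULL REFERENCES port(id),  -- foreign key
    builder  TEXT NOT NULL,
    status   TEXT NOT NULL,
    finished TEXT NOT NULL                          -- ISO 8601 timestamp
);
CREATE INDEX idx_history_port ON build_history(port_id, finished);
""")

# One row per port, one row per finished build (made-up sample data).
conn.execute("INSERT INTO port (id, name) VALUES (1, 'vim')")
conn.executemany(
    "INSERT INTO build_history (port_id, builder, status, finished) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "builder-a", "success", "2018-03-25T10:00:00"),
        (1, "builder-a", "failure", "2018-03-26T10:00:00"),
    ],
)
conn.commit()


def history_for(conn, port_name):
    """Newest-first build history of one port, found through the foreign key."""
    rows = conn.execute(
        "SELECT h.finished, h.status FROM build_history AS h "
        "JOIN port AS p ON p.id = h.port_id "
        "WHERE p.name = ? ORDER BY h.finished DESC",
        (port_name,),
    )
    return rows.fetchall()


def insert_orphan(conn):
    """Try to record a build for a port that does not exist."""
    try:
        conn.execute(
            "INSERT INTO build_history (port_id, builder, status, finished) "
            "VALUES (999, 'builder-a', 'success', '2018-03-26T11:00:00')"
        )
    except sqlite3.IntegrityError:
        return False    # rejected by the foreign key constraint
    return True


print(history_for(conn, "vim"))
print(insert_orphan(conn))
```

The foreign key ties every history row to an existing port, so the database itself refuses inconsistent data, and the index lets the history of one port be read without scanning the builds of all the others. A plain text file offers neither guarantee.
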
>
> While you can certainly hack something quickly with plain text files,
> the solution will not scale well.
>
> I would suggest that you include the database layout in your proposal:
> how will you store the information you need for the described
> functionality of the website.
>
> > And regarding the syntax: I was thinking of storing the build history in
> > JSON format, one line per build, in the log.txt of a particular port.
> >
> > Also, Mojca, can I know your working hours, or at least the time zone
> > where you live?
>
> I'm at UTC+2. Working hours are hard to define precisely, but I try to
> sleep at night :)
> Umesh is in the same time zone as you are and knows enough details to
> be able to help you (if he has time).
>
> Mojca
>