<div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"></div></div></div></div><div><div dir="ltr" class="gmail_attr">On Thu, Mar 28, 2019 at 12:05 PM Mojca Miklavec &lt;<a href="mailto:mojca@macports.org" target="_blank">mojca@macports.org</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">What if there's a server outage?<br></blockquote><div> </div></div><div><div>Then the best approach is to use HttpStatusPush for instant updates and, so that no build is missed because of a server failure, also run our fetching script once per day. The script can easily check whether any build number present in the logs is absent from the database.</div></div><div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">(3) The database needs to be designed in such a way (and the software<br>
needs to be written in such a way) that frequent updates of the full<br>
portindex2json:<br>
(a) works correctly (ports missing from PortIndex are marked as<br>
gone, no duplicate entries of ports, all info up-to-date)<br>
(b) works super efficiently<br>
(c) works with minimal overhead<br>
If network speed is the bottleneck, make sure that you feed / update<br>
the database from the same machine where the database is running.<br>
Updating via git is super fast, you want to avoid transferring the<br>
full 20MB file over network over and over again. Even if the testing<br>
system is running at strange configurations, suggest the architecture<br>
of how it would ideally be implemented if you can design the system<br>
and architecture yourself.<br></blockquote></div><div><div>For keeping an up-to-date copy of portindex.json, this seems like a reasonable approach:</div><div><div><ul><li>Generate portindex.json alongside PortIndex, i.e. run portindex2json.tcl ourselves. [ this would also help in our discussion with repology ]</li><li>portindex.json can be stored in the same directory as PortIndex, and if we run our web-app on a different machine [ which is the most probable case ], we could keep the web-app's copy of portindex.json updated using rsync [ I think repology does the same, but I am not sure ].</li><li>Then, using os.stat on the web-app's copy of portindex.json, we can periodically check the file's 'last modified' time and hence detect whether it has changed.</li></ul><div>Now that we have an up-to-date copy of portindex.json, we go back to our build history, which constantly receives updates from the server [ without delay if everything is fine, and with some delay in case of a server outage ], detect which ports have been built recently, and update the database for those ports using portindex.json.</div><div>To make sure nothing drifts out of sync, we can schedule a weekly complete sync of the database with portindex.json.</div><b></b></div></div></div></div><div><div dir="ltr"><div dir="ltr"><div class="gmail_quote"><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">(4) Suggest a way to minimize the data transfer, so that it will only<br>
include the changes rather than the full data set. How to get such<br>
data? What would need to be changed / improved?</blockquote><div><br></div><div>rsync would do exactly this: its delta-transfer algorithm sends only the parts of a file that have changed rather than the whole file. </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
(5) You won't be getting port renames. What you do get is<br>
"replaced_by" information at best (say, perl5.26 could be replaced_by<br>
perl5.28). When a port is renamed, treat it as a different port, but<br>
the old port could be marked as "inactive" and "replaced_by &lt;which<br>
port&gt;" (if it's not deleted yet). This information is probably not in<br>
PortIndex, either portindex would need to be improved, or you need to<br>
find a different way.<br></blockquote><div> </div><div>Okay! So the rename problem can be handled. We can have a "replaced_by" column in our table: as long as it is empty/NULL, the port is active; otherwise it is inactive and has been replaced by the port named there.</div><div><br></div><div>Please let me know if these approaches look fine.</div><div><br></div><div>Thank You</div></div></div></div>
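<div><br></div><div>P.S. A rough sketch of the 'last modified' check on portindex.json described above. The function name has_changed and the idea of remembering the last seen timestamp between checks are my own assumptions for illustration, not existing code; only os.stat is from the standard library.</div>

```python
import os


def has_changed(path, last_seen_mtime_ns):
    """Compare the file's current mtime with the one we saw last time.

    Returns (changed, current_mtime_ns). Pass None as last_seen_mtime_ns
    on the first call, which always reports a change.
    """
    current = os.stat(path).st_mtime_ns
    return current != last_seen_mtime_ns, current
```

<div>The caller would store the returned timestamp and pass it back on the next check, re-importing portindex.json only when the first value is True.</div>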
</div>
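<div>A minimal sketch of the "replaced_by" column idea from point (5), using an in-memory SQLite database. The table name port, its columns and the helper names are hypothetical and only meant to show the active/inactive rule, not the final schema.</div>

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical 'port' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE port (name TEXT PRIMARY KEY, replaced_by TEXT)"
    )
    return conn


def add_port(conn, name):
    """Insert a port; replaced_by defaults to NULL, i.e. the port is active."""
    conn.execute("INSERT INTO port (name) VALUES (?)", (name,))


def mark_replaced(conn, old_name, new_name):
    """Mark old_name as inactive by recording the port that replaces it."""
    conn.execute(
        "UPDATE port SET replaced_by = ? WHERE name = ?", (new_name, old_name)
    )


def active_ports(conn):
    """A port is active while replaced_by is NULL or empty."""
    rows = conn.execute(
        "SELECT name FROM port "
        "WHERE replaced_by IS NULL OR replaced_by = '' ORDER BY name"
    )
    return [row[0] for row in rows]
```

<div>For example, after adding perl5.26 and perl5.28 and calling mark_replaced(conn, "perl5.26", "perl5.28"), only perl5.28 would be listed as active.</div>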
</div>