<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div>Hi Mojca,</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Apr 15, 2019 at 9:46 PM Mojca Miklavec <<a href="mailto:mojca@macports.org">mojca@macports.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Given the current state of the app with sufficient complexity, I<br>
believe that it would be wise to introduce some unit tests to be able<br>
to extensively test what happens with data you import, and to prevent<br>
/ detect any breakages in the future.<br></blockquote><div><br></div><div>Thank you. Since I am currently working on parsing maintainers, I started by testing that part. It helped me make significant improvements to the code that extracts the maintainers (added to the pull request: <a href="https://github.com/macports-gsoc/macports-gsoc-2019-webapp/pull/1">https://github.com/macports-gsoc/macports-gsoc-2019-webapp/pull/1</a>). [Update: this file has changed further since I updated the pull request; the logic remains the same, only the JSON object structure has changed.]</div><div><br></div><div><div>I ran the tests and got the desired results. I will share the final code and results in around 24 hours, once I am done with my viva voce and extra classes; below I describe the approach. Sorry if this is not the right way to do it or if the presentation is not good.</div></div><div><br></div><div>I created five ports:</div><div><ol><li>portA maintainers {@github gmail.com:test1}<br></li><li>portB maintainers {@github gmail.com:test2} SAME GITHUB, DIFFERENT EMAIL<br></li><li>portC maintainers {@newgithub gmail.com:test2} SAME EMAIL, DIFFERENT GITHUB</li><li>portD maintainers {gmail.com:test2} EMAIL REPEATED WITHOUT GITHUB<br></li><li>portE maintainers {@github} GITHUB REPEATED WITHOUT EMAIL</li></ol>I received 3 unique GitHub and email pairs (according to the logic in [1]), and I am treating each pair as a different maintainer.</div>[<br>{<br>"github": "github",<br>"name": "test1",<br>"domain": "gmail.com"<br>},<br>{<br>"github": "github",<br>"name": "test2",<br>"domain": "gmail.com"<br>},<br>{<br>"name": "test2",<br>"domain": "gmail.com",<br>"github": "newgithub"<br>}<br>]<div><br></div><div>Next, to each maintainer I added every port whose GitHub handle, email, or both match those of that maintainer.</div><div><br></div><pre 
style="font-family:Menlo;font-size:10.5pt"><span style="background-color:rgb(255,255,255)"><font color="#000000">[<br> {<br> "model": "ports.Maintainer",<br> "pk": 0,<br> "fields": {<br> "github": "github",<br> "name": "test1",<br> "domain": "gmail.com",<br> "ports": [<br> "portA",<br> "portB",<br> "portD"<br> ]<br> }<br> },<br> {<br> "model": "ports.Maintainer",<br> "pk": 1,<br> "fields": {<br> "github": "github",<br> "name": "test2",<br> "domain": "gmail.com",<br> "ports": [<br> "portA",<br> "portB",<br> "portD",<br> "portC",<br> "portE"<br> ]<br> }<br> },<br> {<br> "model": "ports.Maintainer",<br> "pk": 2,<br> "fields": {<br> "name": "test2",<br> "domain": "gmail.com",<br> "github": "newgithub",<br> "ports": [<br> "portE",<br> "portB",<br> "portC"<br> ]<br> }<br> }<br>]</font></span></pre><div><br></div><div>For querying, we can now use either the email or the GitHub handle and show all the ports of every maintainer that matches.</div><div><br></div><div>This should not break because of any inconsistency in the maintainer details. There is one disadvantage, though: on the port-detail page we will now show x maintainers if the same maintainer provided x different pairs of GitHub handle and email. 
However, this disadvantage might actually prove helpful in finding and getting rid of the inconsistencies.</div><div><br></div><div>Thank you</div><div><br></div><div>[1]</div>Currently I am using the following logic when adding a maintainer (comparing it with the maintainers already parsed):<br><ul><li>If neither the email nor the GitHub handle is repeated: CREATE NEW<br></li><li>If both the email and the GitHub handle are repeated: SKIP<br></li><li>If the email is repeated but the GitHub handle (provided) is not: CREATE NEW with inconsistency flag<br></li><li>If the GitHub handle is repeated but the email address (provided) is not: CREATE NEW with inconsistency flag<br></li><li>If the GitHub handle is repeated and no email is provided: SKIP<br></li><li>If the email address is repeated and no GitHub handle is provided: SKIP<br></li></ul></div></div></div></div></div></div></div></div></div></div></div></div>
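<div>The rules in [1] can be sketched roughly as follows. This is only an illustration, not the code from the pull request: the names <code>Maintainer</code>, <code>add_maintainer</code> and <code>ENTRIES</code> are hypothetical, emails are assumed to be normalised to <code>name@domain</code>, and "both repeated" is read literally (each value has been seen before, not necessarily as the same pair).</div>

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Maintainer:
    github: Optional[str]
    email: Optional[str]
    inconsistent: bool = False  # set when only one of the two identifiers was seen before


def add_maintainer(known: List[Maintainer],
                   github: Optional[str],
                   email: Optional[str]) -> str:
    """Apply the rules from [1] to one parsed entry and return the action taken."""
    if github is None and email is None:
        # Not covered by [1]; skipping is an assumption made for this sketch.
        return "skip"

    github_seen = github is not None and any(m.github == github for m in known)
    email_seen = email is not None and any(m.email == email for m in known)

    if not github_seen and not email_seen:
        known.append(Maintainer(github, email))
        return "create"
    if github_seen and email_seen:
        return "skip"
    # Exactly one identifier was seen before.
    if github is not None and email is not None:
        # The other identifier is provided but new: keep it and flag it.
        known.append(Maintainer(github, email, inconsistent=True))
        return "create-flagged"
    # The other identifier is simply missing: nothing new to record.
    return "skip"


# The five test ports described above, as (port, github, email).
ENTRIES = [
    ("portA", "github", "test1@gmail.com"),
    ("portB", "github", "test2@gmail.com"),
    ("portC", "newgithub", "test2@gmail.com"),
    ("portD", None, "test2@gmail.com"),
    ("portE", "github", None),
]

known: List[Maintainer] = []
actions = {port: add_maintainer(known, gh, email) for port, gh, email in ENTRIES}
```

<div>With these five entries, <code>known</code> ends up holding three maintainers, which agrees with the three unique pairs listed above; portD and portE add nothing new because each only repeats a value that is already known.</div>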