GSoC 2019 - trace mode improvements

Mojca Miklavec mojca at macports.org
Sun Apr 7 09:55:36 UTC 2019


Dear Davide,

On Sun, 7 Apr 2019 at 09:40, Davide Gerhard wrote:
>
> It seems that macports-dev@ has not received my proposal (sent
> yesterday); moreover, mailman does not answer my commands, or only
> after some hours.

This is strange and might need to be brought to the attention of our
infrastructure team. Can you send the logs / responses to
    macports-infra at lists.macports.org
(I don't know if that works, I hope it does. You may CC me.)

I'm now replying to the development mailing list, but you should try
to get subscribed nonetheless.

> Sorry, I am late, but I only saw the other day that MacPorts
> participates in the Google Summer of Code.

Maybe we should be doing a better job in marketing :(
Any advice on how the message could have reached you (provided that
you have contributed before) would be greatly appreciated.

Previous contributions to the project definitely bring you "extra
points", but the time left is very scarce, so you need to use it
well.

> Any comment will be appreciated.

I'm not familiar with the topic, so I cannot really advise you, but:

(1) Please get subscribed to the development mailing list.

(2) Please carefully read the archives at
    https://lists.macports.org/pipermail/macports-dev/
in particular those from March and April this year.

(2.a) There is quite a bit of extra information about trace mode on
the list already.

(2.b) The following reply to one of the candidate students applies
to this proposal in several points as well:
    https://lists.macports.org/pipermail/macports-dev/2019-March/040425.html

(3) Please submit the (draft) proposal quickly.

> * Personal Details

/removed/

> I have always been interested in Informatics, Security and Networking,
> in particular as developed by the BSD/free-software community.
> This passion has led me to study information security.
> I earned a Bachelor's degree in Security of Computer Systems and
> Networks and am now pursuing a Master's degree in Computer Science.
>
> In the last few months I contributed to macports-ports, mainly in the
> following areas:
>
> - SDR, mainly with @michaelld and @mf2k
> - mercurial and related, mainly with @mojca

This is definitely a good thing, and we definitely value previous
contributors :)
But you should ideally also demonstrate your ability to tackle
MacPorts base, which goes somewhat beyond writing portfiles.

> * Project Idea
>
> ** Goal
>
> The main goal of my GSoC will be to:
>
> - implement a function that auto-detects the build dependencies of a
>   new port and prints out which ones are needed; especially useful when
>   the developer is using their everyday installation.
>
> - speed up trace mode: improve the performance of the sandbox,
>   particularly during the build process, which involves many syscalls
>   and a lot of I/O.
>
> ** Abstract (with) Methodology
>
> The main focus of my GSoC will be understanding and improving the
> trace mode used in MacPorts and learning the related low-level
> functionality of macOS.  This will be done mainly in three steps.
> First, understand how trace mode works, which functionality it already
> implements, and how. Second, trace which files a configuration script
> reads (read(2)/stat(2)) or tries to access, for example during the
> automake/autoconf or cmake phase, and build a complete and minimal
> dependency tree that will be shown to the developer. At this stage,
> care should be taken to present only direct build dependencies and to
> leave out those that are not pertinent to the macOS configuration.
> Third, analyze trace mode with dynamic tracing tools, like dtrace and
> flame graphs, to identify which functions slow down the process;
> determine which ones matter most or are slowest and implement a
> solution. Iterate until the execution speed is acceptable for normal
> usage. At the end of the process I expect a dylib that can be used
> easily and quickly from many other places, for example to provide
> fakeroot or to phase out Xcode.
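
For readers following along, the second step above can be sketched
roughly as below. This is only an illustration and not how trace mode
is implemented: the file-to-port map, the list of system prefixes and
the function name candidate_dependencies are all assumptions made up
for the example, and the sketch does not yet separate direct from
indirect dependencies.

```python
# Hypothetical sketch: turn a list of traced file accesses into the set
# of ports that own those files. All names and paths are invented.

# Paths under these prefixes are assumed to come from the OS, not a port.
SYSTEM_PREFIXES = ("/usr/", "/System/", "/bin/", "/sbin/", "/Library/")


def candidate_dependencies(accessed_paths, file_owner):
    """Return the sorted names of ports owning any accessed file.

    accessed_paths: iterable of paths seen during the configure phase
    file_owner:     dict mapping an installed file path to its port name
    """
    ports = set()
    for path in accessed_paths:
        if path.startswith(SYSTEM_PREFIXES):
            continue  # provided by the system, never a port dependency
        owner = file_owner.get(path)
        if owner is not None:  # files nobody owns (e.g. in /tmp) are skipped
            ports.add(owner)
    return sorted(ports)


if __name__ == "__main__":
    owners = {
        "/opt/local/include/zlib.h": "zlib",
        "/opt/local/bin/pkg-config": "pkgconfig",
    }
    trace = [
        "/usr/include/stdio.h",
        "/opt/local/include/zlib.h",
        "/opt/local/bin/pkg-config",
        "/tmp/conftest.c",
    ]
    print(candidate_dependencies(trace, owners))
```

A set is used so that a header read hundreds of times during configure
still yields its port only once.
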
>
> * Technical Details
>
> ** Schedule
>
> At this stage I am not laying out a strict timetable, because some
> steps will take longer than others and I cannot yet estimate the
> effort of each one.

The proposal needs clearly defined milestones and a deliverable after
the first, second and third month. Your draft doesn't provide any of
that yet.

> Phase 1:
>
> - gain a deep understanding of macports-base and its architecture;
> - study the darwintracelib1.0 code and try hacking on some
>   functionality to better understand which sections I should change.

Getting a grasp of which sections to change should ideally happen
before writing the proposal; without it, it is hard to make the
proposal compelling.

> Phase 2:
>
> - implement the functions that trace the configuration phase and build
>   an approximate dependency tree;
> - try that code with many different ports and languages to detect
>   glitches or cases not considered at implementation time; verify
>   manually that the build dependency tree is complete and minimal;
> - review the implementation: make small changes where they suffice, or
>   try new approaches otherwise;
> - run many tests with real packages and verify that everything is
>   correct.
>
> Phase 3:
>
> - after understanding the tracelib, I can start to analyze the
>   performance of the injected library; as I already know dtrace and how
>   to plot the results as flame graphs, this will be my first way to
>   find out which function/call is slow. Considering that many low-level
>   details of macOS are unknown to me, I will be happy to learn new
>   ways to analyze problems like this;
> - determine which functions could be sped up and which kind of
>   algorithm/structure could be used to improve the performance;
>   generally these changes require caches or new approaches to the
>   problem;
> - implement one improvement at a time and create some edge-case tests
>   to verify that the new code works; after this, identify a few
>   packages that use that function a lot and verify the improvement with
>   the tool used during the analysis;
> - repeat the above step for every function that was a source of
>   bottlenecks until acceptable performance is reached; this goal must
>   be considered reached only with real ports and not with test cases;
> - test whether the changes work well with the goal of phase 2 and see
>   whether they need changes to generalize the usage, for example for
>   fakeroot or phasing out Xcode.
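
The remark about caches above can be illustrated with a small sketch.
It assumes, purely for the example, that the expensive part is a
per-path lookup and that the answer for a given path does not change
during a build. ALLOWED, LOOKUPS and is_allowed are hypothetical names,
and Python's functools.lru_cache stands in for whatever cache a C
implementation would use.

```python
from functools import lru_cache

# Counts how often the slow path is actually taken (for demonstration).
LOOKUPS = {"count": 0}

# Stand-in for an expensive query, e.g. asking a registry about a path.
ALLOWED = {"/opt/local/include/zlib.h"}


@lru_cache(maxsize=4096)
def is_allowed(path):
    """Pretend-expensive check whether the sandbox should expose `path`."""
    LOOKUPS["count"] += 1
    return path in ALLOWED


if __name__ == "__main__":
    # A build typically touches the same header many times.
    for _ in range(1000):
        is_allowed("/opt/local/include/zlib.h")
    print("slow lookups:", LOOKUPS["count"])
    print("cache hits:", is_allowed.cache_info().hits)
```

Whether such a cache helps in practice depends on how often the same
paths repeat, which is exactly what the profiling step is meant to
show.
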
>
>
> * Additional Questions
>
> ** What are your experiences with macOS so far? (How long have you
>    been using it, did you switch from Windows/Linux, etc.)
>
> For fifteen years I used only OpenBSD/FreeBSD/Linux (Gentoo, Archlinux
> and Debian).  I have now been using macOS on my main laptop for
> 9 months.
>
> ** How long have you been using MacPorts?
>
> 7 months
>
> ** Do you have experience with other package management systems?
>
> Yes
>
> ** How much experience do you have with Tcl and C?
>
> Many years of C but very little with Tcl
>
> ** Will you be available after the project ends?
>
> Yes
>
>
> * Availability
>
> ** Do you plan to go on vacations, have exams, internship or be
>    otherwise absent during the GSOC? If so, when?
>
> probably the second week of August
>
> ** To which other organisations did you send a GSOC proposal?
>
> None
>
>
>
> Thank you
> /davide

