<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi,<div><br></div><div>Thanks for the helpful response ^_^.</div><div><br></div><div>I have been through the code files of porttrace.tcl, tracelib & darwintrace. I understood their high level working.</div><div>I will go through the function __darwintrace_get_filemap() to understand more about compare & swap & will look for more lock free primitives. ^_^. </div><div><br></div><div>Regards,</div></div></div></div></div></div><div dir="ltr">Mihir<br><div><div><div><div><div><br></div></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Mar 26, 2019 at 1:19 AM Clemens Lang <<a href="mailto:cal@macports.org" target="_blank">cal@macports.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi,<br>

<br>

On Wed, Mar 20, 2019 at 09:06:11PM +0530, Mihir Luthra wrote:<br>

> I am not a master in dealing with low-level system stuff, but as I<br>

> have worked pretty much with unix shell scripting and C language, I<br>

> can connect to points.<br>

> <br>

> I wanted a few tips from you regarding the project:<br>

>      1) What way should I proceed?<br>

>          Should I start with understanding the trace code?<br>

>          If yes, then can you give me a some idea on which files I<br>

>          have to work with and in what order should I start<br>

>          understanding them?<br>

<br>

That's definitely required for this task, yes. The most relevant parts<br>

are the client-side in src/darwintracelib1.0, most notably darwintrace.c<br>

[1]. Check __darwintrace_setup() and __darwintrace_get_filemap() (most<br>

notably the use of compare-and-swap in that function) and<br>

dependency_check(). You should understand why the compare-and-swap in<br>

__darwintrace_get_filemap() is required and look at the different<br>

scenarios that can happen when two threads call this function at the<br>

same time. The same principle of non-blocking synchronization using<br>

compare-and-swap would need to be applied for this project.<br>

<br>

Server-side, look at pextlib1.0/tracelib.c [2]. The main server is<br>

handled in TracelibRunCmd which accepts new connections and processes<br>

pending requests using kqueue(2)/kevent(2) to ensure non-blocking<br>

processing. dep_check() does the query of the registry using SQLite.<br>

<br>

Additionally, port1.0/porttrace.tcl is the somewhat higher-level setup<br>

of the tracelib.c server component from Tcl which eventually calls<br>

'tracelib run' in a separate thread to start the C code.<br>

<br>

>      2) Are there any books or papers or anything which I can refer<br>

>      to?<br>

<br>

I'm not in a position to recommend current papers on this, but a good<br>

starting point might be research on lock-free map-like data structures,<br>

e.g. a Ctrie [4] or related data structures [5].<br>

<br>

>      3) Shall I continue contacting you on mail or should I use some<br>

>      other way?<br>

<br>

The usual way is sending me an email but Cc'ing the macports-dev list,<br>

so that others can also contribute. I've added the list to Cc now.<br>

<br>

<br>

> *******************************************************************************************************************<br>

> A short Report on what I understood and worked upon<br>

> *******************************************************************************************************************<br>

> <br>

> [...]<br>

><br>

> I got some high level understanding on the way code is working<br>

> <br>

> 1)That DYLD_INSERT_LIBRARIES dynamically links the tracelib during build so<br>

> that all file related operations are under our control.<br>

> 2)Now during run time( most probably the desertroot phase) when check is<br>

> being made , in the 3rd possibility, as data is always being fetched<br>

> from server and not cached, we need to implement an efficient way for<br>

> cache storage which as u told by mmap(2) and forming a shared memory<br>

> for all processes and it is spoc for server interaction. Some data<br>

> structure needs to be implemented to reach to all processes fast. but<br>

> we can’t point to the this shared memory as we only get offset info<br>

> from mmap 2. (This point maybe a bit ambiguous I need to research more<br>

> to understand this completely)<br>

<br>

This does not only apply to the destroot phase - we need to limit file<br>

accesses during both the configure and build phases, too, so that the<br>

build systems to not inadvertently find things on the filesystem they<br>

shouldn't see.<br>

<br>

As for the offset portion, the problem is as follows:<br>

<br>

 - You mmap() a file descriptor into memory (where that file descriptor<br>

   comes from remains to be seen, but it can also just be a file on disk<br>

   for all intents and purposes for now). That file descriptor will<br>

   point to the same file in all processes.<br>

 - We do not control the memory layout of the processes our library is<br>

   injected into – hence we cannot assume a fixed address for our cache<br>

   data structure, but must let mmap(2) pick a free address. This means<br>

   that different processes will have the same cache data structure at<br>

   different addresses.<br>

 - Pointers only work if all processes map the chunk of memory at the<br>

   same address. Hence, we can not use pointers in our data structure,<br>

   but instead we can use offsets from the beginning of our shared<br>

   memory block.<br>

 - What was previously *pointer now becomes *(base_address + offset).<br>

   The code that deals with the cache data structure needs to be aware<br>

   of that.<br>

<br>

<br>

> On Wed, Mar 20, 2019 at 3:04 AM Clemens Lang <<a href="mailto:cal@macports.org" target="_blank">cal@macports.org</a>> wrote:<br>

> <br>

> > Hi,<br>

> ><br>

> > On Sun, Mar 17, 2019 at 01:03:00PM +0530, Mihir Luthra wrote:<br>

> > > I wanted to contact you regarding the “Speed Up trace mode”<br>

> > > project. I gave a quick read to the documentation to get the basic<br>

> > > understanding of MacPorts.<br>

> > ><br>

> > > I am giving a small description on what I understood,<br>

> > ><br>

> > > Basically, user may have installed packages or files in past which<br>

> > > reside in certain locations.(e.g. /usr/local) Now when MacPorts is<br>

> > > installing a new package, the name of files may conflict with<br>

> > > those pre installed(by the user) and create unpredictable<br>

> > > dependencies unnecessarily increasing build time or errors. The<br>

> > > function of trace mode is to scan all locations that the package<br>

> > > to be installed will access and hide files which are not needed.<br>

> > > But enabling trace mode makes the build process really long. So<br>

> > > what we have to do is optimise the trace mode library by using<br>

> > > better data structures or improvising it.<br>

> ><br>

> > That's mostly correct. On the techical details, the scanning of the<br>

> > files that are being access happens during the build. We inject a<br>

> > library into the processes started during compilation that overrides<br>

> > all libc functions that deal with files (e.g., open(2), stat(2),<br>

> > readdir(3), etc.) and then decide at runtime whether a file should<br>

> > be shown or hidden. If it should be shown, we call the normal<br>

> > implementation of the function, if it should be hidden we return an<br>

> > error code and set errno. The method we use to inject our code into<br>

> > the build is called DYLD_INSERT_LIBRARIES. On Linux, the same<br>

> > technique is called LD_PRELOAD (which may be easier to google).<br>

> > There are subtle differences between Linux and macOS here, but they<br>

> > don't matter for now.<br>

> ><br>

> > The workflow to check whether a file is shown or hidden is as<br>

> > follows:<br>

> ><br>

> > 1. When the first file is checked, the library connects to a unix<br>

> > socket and obtains a list of access rules.<br>

> > 2. The path of the file is checked against the access rules. This<br>

> > can lead to three results:<br>

> ><br>

> >  a) Access is granted. We do this for files in /usr/include, for<br>

> >  example, since we don't want to hide them.<br>

> >  b) Access is denied. We do this for paths that should never affect<br>

> >  our build, e.g. /usr/local.<br>

> >  c) The path is sent via the unix socket to our server component,<br>

> >  which checks the SQLite database of installed files and ports (the<br>

> >  so-called "registry") to find out which port provides the file. It<br>

> >  then checks whether the port that provides the file is a dependency<br>

> >  of the current build. It sends this information back to the<br>

> >  injected library via the socket, which then decides to allow or<br>

> >  deny access.<br>

> ><br>

> > This last part is what's slow and could benefit from optimization.<br>

> > Note that we do not do any caching of this data, e.g. if a process<br>

> > opens a file in the 2c category 20 times, we ask the server 20 times<br>

> > to do the lookup. Additionally, each new process that gets spawned<br>

> > needs to check for the same files again, because there is no shared<br>

> > data between those processes.<br>

> ><br>

> > In my mind, a good idea to optimize trace mode would be to mmap(2) a<br>

> > piece of shared memory into each of the processes our library gets<br>

> > injected to and build a fast path-to-allow/deny data structure in<br>

> > this shared memory area so that the 2c-type query would only have to<br>

> > be sent to the server once for the entire build and would otherwise<br>

> > be cached.<br>

> ><br>

> > The problem here is that we do not know where in the address space<br>

> > of a process this piece of memory would be, so we cannot simply use<br>

> > pointers but would have to use offsets from the start of the memory<br>

> > region. Additionally, we need to think on how to grow the memory<br>

> > region if it runs out of space (while keeping in mind that multiple<br>

> > processes might attempt to do so at the same time). Also, because we<br>

> > do not know in which state and configuration the processes into<br>

> > which our library gets injected are, we must assume that using locks<br>

> > (e.g. mutexes) may end up causing a deadlock and fail the build.<br>

> > Consequently, this shared data structure must use lock-free<br>

> > primitives, such as compare-and-swap.<br>

> ><br>

> > There may be already implementations of such a data structure out<br>

> > there that we could re-use (or there may be implementations that<br>

> > could be turned into such a data structure with relative ease, e.g.<br>

> > if they are already lock-free but use pointers rather than offsets,<br>

> > we may be able to modify them to use offsets).<br>

> ><br>

> > Note that this projects requires a fair bit of low-level system<br>

> > knowledge.<br>

> ><br>

> > If you have any further questions, feel free to ask.<br>

<br>

[1] <a href="https://github.com/macports/macports-base/blob/master/src/darwintracelib1.0/darwintrace.c" rel="noreferrer" target="_blank">https://github.com/macports/macports-base/blob/master/src/darwintracelib1.0/darwintrace.c</a><br>

[2] <a href="https://github.com/macports/macports-base/blob/master/src/pextlib1.0/tracelib.c" rel="noreferrer" target="_blank">https://github.com/macports/macports-base/blob/master/src/pextlib1.0/tracelib.c</a><br>

[3] <a href="https://github.com/macports/macports-base/blob/master/src/port1.0/porttrace.tcl" rel="noreferrer" target="_blank">https://github.com/macports/macports-base/blob/master/src/port1.0/porttrace.tcl</a><br>

[4] <a href="https://en.wikipedia.org/wiki/Ctrie" rel="noreferrer" target="_blank">https://en.wikipedia.org/wiki/Ctrie</a><br>

[5] <a href="https://arxiv.org/abs/1709.06056?context=cs" rel="noreferrer" target="_blank">https://arxiv.org/abs/1709.06056?context=cs</a><br>

<br>

HTH,<br>

Clemens<br>

</blockquote></div>

</div>