Speed up trace mode (GSoC Project)

Clemens Lang cal at macports.org
Mon Mar 25 20:14:05 UTC 2019


Hi,

On Sun, Mar 24, 2019 at 08:40:30PM +0530, Mihir Luthra wrote:
> I had a few questions regarding Darwin trace library.
> 
> Darwintrace library being injected, most I/O operations get
> reimplemented. If a “single” process is working on files, it more or
> less should call most of these functions again and again.
> Like that particular process may call open, rename, rmdir etc. So lets
> say a process was allowed to open file “X”. (closed it too) The same
> process has to do rmdir now. So it will again call
> __darwintrace_is_in_sandbox().
> If a file has already been checked once, that file path can be stored
> at a place so that all these checks don’t need to be re-performed and
> next time if the same process is calling any of these functions, it
> doesn’t need to call the time taking __darwintrace_is_in_sandbox() and
> should check that list of already checked files first. This should
> optimise the trace mode to some extent.
> Is that right or am I mistaken somewhere?

That's right. The relevant question here is how often this happens,
i.e. how often common programs used during our builds open the same
files multiple times.
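
To make the idea concrete, here is a minimal sketch of such a per-process
cache. It is not the darwintrace implementation: slow_sandbox_check() is a
hypothetical stand-in for the expensive check (its "/usr/" rule is made up
for illustration), and the table size, path limit and hash function are
assumptions. The sketch also ignores thread safety and cache invalidation,
which a real version would have to address.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS    256   /* assumed table size */
#define CACHE_PATH_MAX 1024  /* assumed longest path we bother caching */

struct cache_entry {
    bool used;
    bool allowed;
    char path[CACHE_PATH_MAX];
};

static struct cache_entry cache[CACHE_SLOTS];
static unsigned long slow_checks; /* counts how often the slow path ran */

/* FNV-1a string hash, used to pick a slot in the table. */
static uint32_t hash_path(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical stand-in for the expensive sandbox check. */
static bool slow_sandbox_check(const char *path) {
    slow_checks++;
    return strncmp(path, "/usr/", 5) == 0;
}

/* Return the cached verdict for path, running the slow check on a miss. */
static bool cached_sandbox_check(const char *path) {
    if (strlen(path) >= CACHE_PATH_MAX) {
        return slow_sandbox_check(path); /* too long to cache */
    }
    struct cache_entry *e = &cache[hash_path(path) % CACHE_SLOTS];
    if (e->used && strcmp(e->path, path) == 0) {
        return e->allowed; /* hit: no slow check needed */
    }
    bool allowed = slow_sandbox_check(path);
    e->used = true; /* miss: overwrite whatever was in this slot */
    e->allowed = allowed;
    strcpy(e->path, path);
    return allowed;
}
```

Each path maps to exactly one slot and a colliding path simply evicts the
previous entry, so a lookup costs one hash and one strcmp(). Whether that
pays off depends entirely on how often the same path is checked again.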

System calls (especially those dealing with I/O) are usually a couple of
orders of magnitude slower than memory accesses, but even so, I wouldn't
expect a huge difference with this approach unless a process opens the
same set of 50 files more than 10 times each during its lifetime.

You could get a rough estimate of whether this happens using dtruss(1)
or opensnoop(1) by running them on a compiler invocation of a
representative sample of C code and counting how often the compiler
opens the same files.
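
Once the opened paths have been extracted from the tool's output, the
counting itself is simple. The following is a rough sketch that assumes the
paths are already available as an array of strings, one entry per open; the
struct and function names are made up for illustration, and parsing the
dtruss(1) or opensnoop(1) output is deliberately left out.

```c
#include <stdlib.h>
#include <string.h>

struct open_stats {
    size_t total;       /* number of opens observed */
    size_t distinct;    /* number of different paths */
    size_t max_repeats; /* opens of the most frequently opened path */
};

/* qsort() comparator for an array of C strings. */
static int cmp_str(const void *a, const void *b) {
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort the paths in place, then count runs of identical entries. */
static struct open_stats count_opens(const char **paths, size_t n) {
    struct open_stats st = { n, 0, 0 };
    size_t run = 0;

    qsort(paths, n, sizeof *paths, cmp_str);
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || strcmp(paths[i], paths[i - 1]) != 0) {
            st.distinct++; /* first occurrence of a new path */
            run = 1;
        } else {
            run++;         /* same path as the previous entry */
        }
        if (run > st.max_repeats) {
            st.max_repeats = run;
        }
    }
    return st;
}
```

The difference total - distinct is an upper bound on the number of checks a
cache could save for that process. If it is small compared to total, the
optimisation is unlikely to be noticeable.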

-- 
Clemens

