Speed up trace mode (GSoC Project)

Clemens Lang cal at macports.org
Sat Apr 6 17:50:38 UTC 2019


Hi,

On Sat, Apr 06, 2019 at 01:05:08AM +0530, Mihir Luthra wrote:
> > You have the right ideas to solve the problem. Do keep in mind
> > though that CAS will only work up to a word size or a double word
> > size at most, i.e. swapping more than 64 bit with CAS atomically is
> > probably not going to work.
> 
> Got it. I may swap one by one instead of swapping the complete
> structure, since it's just 4 variables.

That's unfortunately not the same, since while you're swapping the
second value, a different thread could swap the first for a different
value again.
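
One possible way around this, sketched here under assumptions that are
not part of the proposal: if two of the fields fit into 32 bits each,
they can be packed into a single 64-bit word so that one CAS updates both
together. The names `pack` and `try_reserve` and the field layout are
hypothetical, and the 32-bit fields cap the block at 4 GiB.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: `size` in the high 32 bits, `used` in the low 32. */
static inline uint64_t pack(uint32_t size, uint32_t used) {
    return ((uint64_t)size << 32) | used;
}
static inline uint32_t packed_size(uint64_t v) { return (uint32_t)(v >> 32); }
static inline uint32_t packed_used(uint64_t v) { return (uint32_t)v; }

/*
 * Try to reserve `request` bytes. On success, store the start offset of
 * the reservation in *offset and return true. Return false if the request
 * does not fit into the current size. Assumes used <= size always holds.
 */
static bool try_reserve(_Atomic uint64_t *state, uint32_t request,
                        uint32_t *offset) {
    uint64_t old = atomic_load(state);
    for (;;) {
        uint32_t size = packed_size(old);
        uint32_t used = packed_used(old);
        if (request > size - used)
            return false; /* does not fit, caller would have to grow */
        uint64_t desired = pack(size, used + request);
        /* Both fields change in one step, or not at all. */
        if (atomic_compare_exchange_weak(state, &old, desired)) {
            *offset = used;
            return true;
        }
        /* CAS failed: `old` now holds the current value, so retry. */
    }
}
```

Because size and used live in the same word, no other thread can change
one of them between our read and our swap without the CAS failing.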

> > I'm not sure whether we will actually need a separate process to
> > increase the allocation size. Consider this proposal:
> >
> >   struct shared_block_mgmt {
> >     size_t last_known_size;
> >     size_t refcnt;
> >     int blockfd;
> >     struct shared_block* block;
> >   };
> >
> >   struct shared_block_mgmt block_mgmt;
> >
> >   struct shared_block {
> >     size_t size;
> >     size_t used;
> >     char memory[];
> >   };
> >
> > This struct would mean that the first `used` bytes of `memory` are
> > in-use by our data structure and everything after that up to `size`
> > bytes is free[1].
> >
> > Any allocation request where used + request_size <= size would succeed
> > and we could change `used` using CAS to ensure that this allocation
> > operation is atomic.
> >
> > Any allocation request where used + request_size > size would trigger
> > growing the memory area. Growing would calculate the size we want to
> > grow to, let's call this `target_size`. Then, it would attempt to grow
> > atomically using:
> >
> >   size_t local_size;
> >   size_t local_used;
> >   size_t target_used;
> >   size_t target_size;
> >   bool needs_resize;
> >   do {
> >     local_used = block_mgmt.block->used;
> >     local_size = block_mgmt.block->size;
> >     target_used = local_used + request_size;
> >     needs_resize = target_used > local_size;
> >     if (needs_resize) {
> >       // growing required
> >       target_size = local_size + BLOCK_GROWTH;
> >       ftruncate(block_mgmt.blockfd, target_size);
> >
> 
> What I was doing was keeping a file created beforehand and appending it
> to the existing file whenever needed.
> ftruncate() is a much better and cleaner approach, I guess.
> 
> 
> >     }
> >   } while (
> >       (needs_resize &&
> >         !(CAS(&block_mgmt.block->size, local_size, target_size) &&
> >           CAS(&block_mgmt.block->used, local_used, target_used))) ||
> >       (!needs_resize &&
> >         !CAS(&block_mgmt.block->used, local_used, target_used)));
> >
> >   // At this point, either resizing was not needed and block->used is
> >   // what we expect it to be (i.e. allocation succeeded), or resizing
> >   // was needed, and we did successfully resize, and after resizing we
> >   // did update block->used (i.e. allocation also succeeded).
> >
> > This will opportunistically call ftruncate with the bigger size on the
> > mmap'd file descriptor, but calling ftruncate in multiple processes at
> > the same time should not be a problem unless the truncation would ever
> > shrink the file (which we cannot completely avoid, but make very
> > unlikely by making BLOCK_GROWTH
> 
> 
> Maybe we can call ftruncate outside the loop and just set the size first.
> I don't know if it is possible, but if it is, we can modify compare and
> swap to behave in the following way:
> If the process detects that used + request_size > size and the area
> needs to be grown, then instead of swapping the size with its new value,
> we can just increment the old value by ‘requested_size’ without caring
> what the old value is. So if 2 or more processes do this simultaneously,
> they just end up incrementing the memory correctly. And to keep a margin
> we can add some extra_growth to each.

That should be possible using atomic fetch-and-add [1]. This may end up
increasing the storage by more than we need, but that's not a problem.
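
A minimal sketch of that increment with C11 atomics, assuming the desired
size lives in a shared `_Atomic size_t`. The names `request_growth` and
`EXTRA_GROWTH` and the margin value are made up for illustration:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical safety margin added on top of every request. */
#define EXTRA_GROWTH 4096

/*
 * Atomically add request_size plus the margin to the shared desired size.
 * The old value does not need to be known in advance. Returns the desired
 * size as seen by this caller right after its own increment.
 */
static size_t request_growth(_Atomic size_t *wanted_size,
                             size_t request_size) {
    size_t increment = request_size + EXTRA_GROWTH;
    /* atomic_fetch_add returns the value held before the addition. */
    size_t previous = atomic_fetch_add(wanted_size, increment);
    return previous + increment;
}
```

Concurrent callers each add their own increment, so no update is lost.
This only records how large the block should become; it does not call
ftruncate or remap anything.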

The problem I see is that we actually need to do the truncation and
remapping at some point. We should not end up in a situation where the
size is actually larger than the file/memory block.

> Although I'm not sure if atomic operations allow this, a better
> approach may be to just share a variable called maxFileMemory which can
> only be swapped if the new value being provided is greater. Before
> mmapping for write or read we may just check whether the current file
> size matches maxFileMemory. This would maybe prevent shrinking.
> 
> Or maybe we can somehow block shrinking by implementing our own
> ftruncate or in some other way.

Unfortunately ftruncate(2) is all that we have, and without writing
kernel code I don't see a way to do this.
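
The "only swap if greater" variable on its own can be built from a plain
CAS loop, though. Here is a rough sketch with C11 atomics; the function
name `raise_max` is hypothetical. It keeps the shared variable from ever
decreasing, but it does nothing to stop a concurrent ftruncate(2) call
from shrinking the file itself.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Raise *max_size to `candidate` if `candidate` is larger than the
 * current value. Returns true if this call raised the value, false if
 * the stored value was already at least as large.
 */
static bool raise_max(_Atomic size_t *max_size, size_t candidate) {
    size_t current = atomic_load(max_size);
    while (candidate > current) {
        /* On failure, `current` is reloaded with the latest value. */
        if (atomic_compare_exchange_weak(max_size, &current, candidate))
            return true;
    }
    return false;
}
```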


On Sat, Apr 06, 2019 at 05:50:40PM +0530, Mihir Luthra wrote:
> I was trying to test the code by making changes but I am stuck with
> one issue. If I install from git and set it up I always receive this
> error message. (Please find the screenshots in attachment)
> 
> 1) I tried `make` on the base code taken as is from git, without checking
> out the latest version, and it showed this error.
> 2) I tried `make` on v2.5.4, and it still shows the same error.

[...]

> In case of git install, it seems to be some error of that macro
> HAVE_DECL_RL_USERNAME_COMPLETION_FUNCTION.
> Because only if it is false, it should move to next macro and choose
> to use username_completion_function.
> But in make errors, it says rl_username_completion_function is
> declared somewhere and exists. So I guess
> HAVE_DECL_RL_USERNAME_COMPLETION_FUNCTION should have been defined.

This happens when you have your old installation in $PATH while
configuring MacPorts. Use
  export PATH=/bin:/sbin:/usr/bin:/usr/sbin
before running the configure script.

> 3) I tried source install, here the installation completes
> successfully and mostly everything works fine except trace mode. (I
> didn’t make any changes to source code) I tried installing 7-8 ports
> in this case and whenever I used trace mode, it said unable to fetch
> archive.

Can you pastebin a main.log of a build that fails like this?


[1] https://en.wikipedia.org/wiki/Fetch-and-add

-- 
Clemens

