A friend of mine is currently doing a class assignment—one I too attempted years ago—and in the process of talking about it with him I’ve been going back down memory lane.
When I tried it my partner and I ended up failing miserably because of a few things, least of which being a complete lapse in judgement and deciding not to use source control. We also tried doing some cleverness with segmented best-fit lists, and it basically just ended in tears.
But, I’m older (hah) and wiser (hah hah) and so I’ve got a few more interesting things to say about it now. I might even go and throw up some code snippets on Github and relive my traumatic past. Anyways, onwards.
Basic of memory allocation in ‘nix
So, assume that we’ve got a process running with some heap. In this process, we decide that we want to allocate additional memory on the heap (maybe because we don’t have TCO or something—we don’t want to just extend the stack indefinitely).
Anyways, there’s an ending to the heap segment, and any attempts to access memory past this point are going to result in a segfault. To extend this segment, we use the
sbrk call (here is the wiki article on these). Note that on Windows you’ll be using the
HeapAlloc() and related functions, and never have to actually deal with this sort of nonsense (see here). Note also that in modern ‘nix you’ll probably be using
mmap() instead. For learning, though, let’s ignore those other more reasonable functions.
sbrk and family is not thread-safe, hence the use of other functions. So, unless we’re writing a custom allocator for Ruby or Node (lol), we won’t want to use it.
Malloc, version 0.0
The most brain-dead version of
free() would look like this (ignoring some of the additional functions posted here):
1 2 3 4 5 6 7 8 9
The obvious problem with this, though, is that it will merrily leak memory and eventually fail. It’s also slow, because it has to hit a system call boundary.
This is also not great, because we want to actually track allocations, and this won’t let us do anything. So, where can we stash that information?
Malloc, version 0.5
So, let’s go and update our malloc to do a bit of extra tracking. This is a good way of showing a trick we’ll be making a lot of use of—writing hidden headers for our bookkeeping while still returning a pointer that is useful to the users. This same style of trick, incidentally, is used by Redis!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Ideally you’d want to check for weird edge cases, like passing in a NULL ptr to
free(), or a length to
malloc() so large that it causes the arithmetic on
sbrk() to overflow. In our case, though, we’ll be ignoring those practical issues so we can focus on the allocation theory.
Now, we still have a useless
free(), and you might wonder why we don’t attempt to use
sbrk() with a negative length in order to shrink the heap back down. We can do this, but we’d need to only free allocations at the end of the heap for that to work properly—if you freed an allocation in the middle somewhere naively, you’d end up shrinking the heap into actively used memory.
In order to do the correct thing we’d need to keep a list of deallocated blocks, only deallocate blocks located at the end of the heap, and continue shrinking only as long as the list of deallocated blocks contains a block adjacent to the end of the heap. Such a list, incidentally, we’ll call a free list from here on out.
Anyways, it’d be a pain in the ass, and wouldn’t help us.
Next time, I’ll talk about implementing a free list and basically creating a memoized version of
sbrk(). I’ll also talk about some other ways of improving our allocation library to be faster and more interesting.
Dynamic Storage Allocation: A Survey and Critical Review has a really thorough review of different allocation methods and bookkeeping techniques.