Pooled Allocators For The STL
One of the main complaints about the STL is the perceived lack of memory efficiency. In this article I’ll present a simple pooled allocator for use with the STL containers that allocate single elements at a time, such as list, set or map.
The Core of An STL Allocator
The core functions of an STL allocator are allocate and deallocate. These functions only deal with allocation, not with initialisation. Memory from these functions is then used in placement new operations within the container. This is to allow containers such as vector to allocate large chunks of memory, then do placement new in a loop for optimal efficiency. The declaration for these member functions is as follows:
// allocate a contiguous block of memory pointer allocate(size_type count, std::allocator::const_pointer hint = 0) const; // deallocate a previously allocated block of memory void deallocate(pointer block, size_type count) const throw();
The arguments are described as follows:
count: the number of elements (not bytes) to allocate. The same number that was passed to a call toallocateis passed to the corresponding call todeallocate.hint: the location of a previously allocated (and not yet deallocated) block of memory from this allocator, or null. This pointer can be used to improve data locality if possible.block: an allocated block of memory due for destruction. This will never be null.allocatereturn value: a block of memory with at least enough space forcountelements. If this memory cannot be allocated then throw the appropriate exception (such asstd::bad_alloc).
Rebinding and Preserving State
An allocator must be able to bind itself to an alternate type. This is done by creating an allocator specifically for the desired type, then calling a constructor with the previous allocator as the argument.
// the type of an allocator for type U template<typename U> struct rebind { typedef MyAllocator<U> other; }; // how to construct this allocator from an allocator for type U template<typename U> MyAllocator(MyAllocator<U> const& arg);
If your allocator has state (such as a pool location to allocate from), then the constructor in the code fragment above must do the necessary state management. Unfortunately there is no concept of a template friend class so any member variables must be made public, as will be shown in our pooled allocator implementation.
To test if memory allocated using one allocator can be freed by another, the standard also requires that we implement two operators:
// returns true if the allocators are interchangeable template<typename T, typename U> bool operator==(MyAllocator<T> const& left, MyAllocator<U> const& right); // returns true if the allocators are not interchangeable template<typename T, typename U> bool operator!=(MyAllocator<T> const& left, MyAllocator<U> const& right);
In our pooled allocator implementation this simply compares the pool values: if they are equal (including both null), then memory from one allocator can be deallocated by the other.
Simple Pooling
Here is the interface to the pool class I’ll be using in the example:
// simple pool class that allows single allocations only class Pool { public: Pool(size_t granularity, size_t size); ~Pool(); // allocation/deallocation only (no construction) void* Allocate(); void Deallocate(void* block); // construction/destruction included in the allocation template<typename T> T* Construct(); template<typename T> void Destroy(T* instance); // pool status size_t GetGranularity() const; size_t GetSize() const; size_t GetUsed() const; size_t GetOverflow() const; // gets an allocator that uses this pool PooledAllocator<void> GetAllocator(); };
The functions are used as follows:
Allocate/Deallocate: allocates/deallocates single elements from the pool. The return value fromAllocatepoints to enough memory to storegranularitybytes of data.Deallocatemust not be passed a null pointer.Construct/Destruct: constructs/destructs objects from the pool. Behaves just like operator new/delete except will try to use pooled storage.GetAllocator: returns an STL-compliant pooled allocator that allocates from this pool.
Example Allocator Usage for std::list
Here’s a very short example C++ program that uses the pooled allocator:
#include <iostream> #include "pooled_alloc.h" #include <boost/scoped_ptr.hpp> int main() { // create a pool of 256 32-byte elements boost::scoped_ptr<Pool> pool(new Pool(32, 256)); // use this pool with a std::list of floats PooledList<float>::Type test(pool->GetAllocator()); test.push_back(42.0f); }
Use The Source
Here’s the source to the pool class, the pooled allocator and some helpful structs that act as template typedefs (an annoying ommission from the standard). This code works with GCC 3.1 and Visual Studio 2003, so I’m pretty sure you could get it working with other compilers with minimal (if any) changes.
Obviously I won’t claim that the code is idiot-proof, but I’ve been using it my hobby projects so far without error. In the test cases I’ve tried, using a pooled allocator gains back up to 35% of the insertion/deletion cost compared with the default allocator.
Hey Simon,
Thanks a bunch for this article. I am developing a multithreaded application on a Sun Niagara II box (64 OS “visible” cpus) and I was getting extremely bad performance with the default std::allocator.
(note: your test code will assert on a 64-bit machine).
I have adapted your test to compare the performance on a multi-threaded environment.
Below are the results:
$ uname -a
SunOS local 5.10 Generic_127111-09 sun4v sparc SUNW,SPARC-Enterprise-T5220
$ CC -V
CC: Sun C++ 5.9 SunOS_sparc 2007/05/03
$ ./simple
192:main: 1 threads
201:main: Avg. Factor: 1.07
192:main: 2 threads
201:main: Avg. Factor: 1.03
192:main: 4 threads
201:main: Avg. Factor: 0.92
192:main: 8 threads
201:main: Avg. Factor: 0.48
192:main: 16 threads
201:main: Avg. Factor: 0.17
192:main: 32 threads
201:main: Avg. Factor: 0.23
Contact me if you want the updated code.
Cheers,
-Ipporatis.
Ippokratis
28 Nov 08 at 8:04 pm
Hi,
how do you choose the granularity factor when creating a pool (for example, 32 above)? Is it the supposed node size of the container? So, for example, should I choose 20 for PooledSet (3 pointers + 1 int + 1 bool + padding)
Tx
Tombox
8 Feb 09 at 1:21 pm