Over more than ten years we have written a dozen memory allocators, culminating in SmartAlloc. SmartAlloc brings a new level of performance to dynamic object allocation and memory management.

Speed

SmartAlloc is optimized to minimize the number of instructions required to allocate an object in the typical case. Its speed is comparable to the best ad-hoc techniques that many applications use to accelerate object allocation. Unlike those techniques, it also minimizes memory fragmentation. Benchmark results show an average 27% speed improvement relative to the Berkeley (BSD) memory allocator, and an average 51% improvement relative to the standard Solaris memory allocator.
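For readers unfamiliar with the ad-hoc techniques mentioned above, the following is a minimal sketch of one of them: a fixed-size free list whose typical-case allocation is a handful of instructions. It is not SmartAlloc's implementation. The names `pool`, `pool_init`, `pool_alloc`, and `pool_free` are hypothetical, and the slow path simply falls back to the system `malloc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size pool illustrating a free-list fast path.
   This is an illustration only, not SmartAlloc's actual design. */
typedef struct free_node { struct free_node *next; } free_node;

typedef struct {
    free_node *head;     /* singly linked list of freed objects */
    size_t     obj_size; /* size of every object served by this pool */
} pool;

static void pool_init(pool *p, size_t obj_size) {
    /* A freed object must be large enough to hold the link pointer. */
    p->obj_size = obj_size < sizeof(free_node) ? sizeof(free_node) : obj_size;
    p->head = NULL;
}

static void *pool_alloc(pool *p) {
    free_node *n = p->head;
    if (n != NULL) {            /* fast path: pop the head of the list */
        p->head = n->next;
        return n;
    }
    return malloc(p->obj_size); /* slow path: ask the system allocator */
}

static void pool_free(pool *p, void *obj) {
    free_node *n = obj;
    assert(n != NULL);
    n->next = p->head;          /* push the object back onto the list */
    p->head = n;
}
```

A pool like this is fast, but it never returns memory to other size classes, which is one way such ad-hoc schemes can fragment memory.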

Memory efficiency

SmartAlloc is designed to make efficient use of storage. By carefully managing fragmentation with efficient algorithms, SmartAlloc packs more objects into fewer pages of memory. This allows applications to delay the onset of swapping and can result in substantial performance improvements by reducing swapping I/O.
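The effect of packing on page count can be illustrated with simple arithmetic. The sketch below is not taken from SmartAlloc. The function name `pages_needed`, the 4096-byte page size, and the object sizes are assumptions chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Pages needed to hold `count` objects when each object occupies
   `slot_size` bytes (payload plus any header or padding) and objects
   do not straddle page boundaries. All parameters are illustrative. */
static size_t pages_needed(size_t count, size_t slot_size, size_t page_size) {
    assert(slot_size > 0 && slot_size <= page_size);
    size_t per_page = page_size / slot_size;  /* whole objects per page */
    return (count + per_page - 1) / per_page; /* round up */
}
```

Under these assumptions, 10,000 objects of 24 bytes fit in 59 pages when packed tightly, but need 99 pages if each object carries 16 bytes of header or padding.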

Cache efficiency

SmartAlloc improves cache performance by ensuring that related objects fit in fewer cache lines and that objects are evenly distributed throughout the cache. In addition, swapping performance is improved by ensuring that related objects fit in fewer pages of memory.
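To make "fit in fewer cache lines" concrete, the sketch below counts how many cache lines a byte range touches. It is an illustration, not part of SmartAlloc. The name `lines_touched` is hypothetical, and the 64-byte line size used in the note that follows is an assumption that varies by processor.

```c
#include <assert.h>
#include <stddef.h>

/* Number of cache lines touched by the byte range [start, start + len).
   `line_size` is an assumed parameter; it depends on the hardware. */
static size_t lines_touched(size_t start, size_t len, size_t line_size) {
    assert(len > 0 && line_size > 0);
    size_t first = start / line_size;
    size_t last  = (start + len - 1) / line_size;
    return last - first + 1;
}
```

With 64-byte lines, two related 24-byte objects placed back to back at offset 0 touch one line, while a single 24-byte object placed at offset 56 straddles two. Passing a page size instead of a line size gives the same count for pages.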

Multi-thread safe

SmartAlloc is fully multi-thread safe. Special care has been taken to minimize contention for thread synchronization semaphores. Most allocations can be completed with a single uncontended lock of a semaphore.
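The idea of completing an allocation under a single lock can be sketched as follows. This is not SmartAlloc's code. It assumes POSIX threads and uses a `pthread_mutex_t` in place of the semaphore described above, and the names `locked_pool`, `locked_pool_init`, `locked_pool_alloc`, and `locked_pool_free` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct node { struct node *next; } node;

/* Hypothetical pool guarded by one mutex (an assumption; the text
   above speaks of semaphores). */
typedef struct {
    pthread_mutex_t lock;
    node           *head;
    size_t          obj_size;
} locked_pool;

static void locked_pool_init(locked_pool *p, size_t obj_size) {
    int rc = pthread_mutex_init(&p->lock, NULL);
    assert(rc == 0);
    (void)rc;
    p->head = NULL;
    p->obj_size = obj_size < sizeof(node) ? sizeof(node) : obj_size;
}

static void *locked_pool_alloc(locked_pool *p) {
    pthread_mutex_lock(&p->lock);   /* the only lock taken on this path */
    node *n = p->head;
    if (n != NULL)
        p->head = n->next;
    pthread_mutex_unlock(&p->lock);
    /* Fall back to the system allocator outside the critical section. */
    return n != NULL ? (void *)n : malloc(p->obj_size);
}

static void locked_pool_free(locked_pool *p, void *obj) {
    node *n = obj;
    pthread_mutex_lock(&p->lock);
    n->next = p->head;
    p->head = n;
    pthread_mutex_unlock(&p->lock);
}
```

Keeping the critical section to a few pointer operations is one common way to keep the lock uncontended. Giving each size class or thread its own pool would reduce contention further.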

Instrumented for analysis tools

SmartAlloc includes a version built to generate information on memory usage in a program. This information can then be analyzed to display the program's memory-usage patterns.
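As a rough illustration of the kind of data an instrumented allocator can collect, the sketch below wraps the system allocator with counters. It does not reflect SmartAlloc's actual instrumentation or output format. The names `alloc_stats`, `traced_malloc`, `traced_free`, and `live_objects` are hypothetical, and the counters are not thread safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical usage counters; a real tool would record much more
   (call sites, sizes over time, object lifetimes). */
typedef struct {
    size_t alloc_calls;
    size_t free_calls;
    size_t bytes_requested;
} alloc_stats;

static alloc_stats g_stats;

static void *traced_malloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL) {                 /* count only successful allocations */
        g_stats.alloc_calls++;
        g_stats.bytes_requested += size;
    }
    return p;
}

static void traced_free(void *p) {
    if (p != NULL) {
        g_stats.free_calls++;
        free(p);
    }
}

/* Objects allocated through the wrapper and not yet freed. */
static size_t live_objects(void) {
    assert(g_stats.alloc_calls >= g_stats.free_calls);
    return g_stats.alloc_calls - g_stats.free_calls;
}
```

A program built against such a wrapper could dump `g_stats` at exit, giving an analysis tool a starting point for spotting leaks or unexpectedly heavy allocation.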



                       Benchmark A                Benchmark B
                   Time (ms)  Memory (KB)    Time (s)  Memory (KB)
SmartAlloc            175         500            73       51,776
Solaris malloc        330         600           103       57,496
BSD malloc            252         875           119       69,879
SVR4 malloc(3X)       419         536         Failed      Failed

All benchmarks were run on a single-processor SPARCstation 10 with 64 MB of main memory and 128 MB of swap space on a 1 GB disk, running the Solaris 2.3 operating system. Benchmark B did not complete with the SVR4 malloc(3X) memory allocator.

The benchmark times are solely the time spent in the allocator and do not reflect any improved performance an application may see due to better object layout.

CodeGen, Inc.
PO Box 3357
Oakland, CA 94609