> An example of such an optimization is the arena allocator[1][2] employed by protobuf.

I work on the protobuf team at Google, so I'm aware of this.

Two things about that:

1. The underlying blocks for the arena allocator still come from the system allocator.

2. Because an arena hands out memory from blocks it manages itself, it hides individual allocations from standard malloc-debugging tools like ASAN and Valgrind. To mitigate this, the protobuf arena allocator includes special ASAN-aware code:

https://github.com/google/protobuf/blob/d64a2d9941c36a7bc2a7...

However, that code is ASAN-specific. It won't help other tools like Valgrind. So yes, different allocators are sometimes warranted for specific patterns like arenas. But if all you want is plain malloc()/free(), you should call malloc()/free().
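To make the two points above concrete, here is a minimal sketch of an ASAN-aware arena. It is not protobuf's actual code: `toy_arena`, its functions, and the `ARENA_*` macros are hypothetical names, and the arena is simplified to a single fixed-size block. The only real API it relies on is `ASAN_POISON_MEMORY_REGION` / `ASAN_UNPOISON_MEMORY_REGION` from `<sanitizer/asan_interface.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Detect ASAN under both Clang (__has_feature) and GCC (__SANITIZE_ADDRESS__). */
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define ARENA_ASAN 1
#  endif
#endif
#if defined(__SANITIZE_ADDRESS__)
#  define ARENA_ASAN 1
#endif

#ifdef ARENA_ASAN
#  include <sanitizer/asan_interface.h>
#  define ARENA_POISON(p, n)   ASAN_POISON_MEMORY_REGION((p), (n))
#  define ARENA_UNPOISON(p, n) ASAN_UNPOISON_MEMORY_REGION((p), (n))
#else
/* Without ASAN the hooks compile to nothing. */
#  define ARENA_POISON(p, n)   ((void)(p), (void)(n))
#  define ARENA_UNPOISON(p, n) ((void)(p), (void)(n))
#endif

/* Hypothetical single-block bump arena (illustration only). */
typedef struct {
    char  *base;  /* block obtained from the system allocator */
    size_t size;  /* total bytes in the block */
    size_t used;  /* bytes handed out so far */
} toy_arena;

/* Returns 1 on success, 0 if malloc() fails. */
static int toy_arena_init(toy_arena *a, size_t size) {
    assert(a != NULL);
    a->base = malloc(size);  /* point 1: the block still comes from malloc() */
    if (a->base == NULL) return 0;
    a->size = size;
    a->used = 0;
    ARENA_POISON(a->base, size);  /* nothing handed out yet, so any access is a bug */
    return 1;
}

/* Bump-allocates n bytes (rounded up to a multiple of 8), or returns NULL. */
static void *toy_arena_alloc(toy_arena *a, size_t n) {
    assert(a != NULL);
    if (n > SIZE_MAX - 7) return NULL;      /* rounding would overflow */
    n = (n + 7) & ~(size_t)7;
    if (n > a->size - a->used) return NULL; /* block exhausted */
    void *p = a->base + a->used;
    a->used += n;
    ARENA_UNPOISON(p, n);  /* point 2: only handed-out bytes become accessible */
    return p;
}

/* Releases the whole block at once; individual objects are never freed. */
static void toy_arena_destroy(toy_arena *a) {
    assert(a != NULL);
    ARENA_UNPOISON(a->base, a->size);  /* unpoison before returning it to free() */
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

Built with `-fsanitize=address`, a read past the handed-out bytes of the block should be reported by ASAN. Under Valgrind these macros do nothing, which is exactly the limitation described above.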

If you're writing a library, letting the user specify their own allocation callback is also great, since it lets the user do whatever custom bookkeeping/pooling/etc. they want to do. But by default just call malloc()/free() (IMHO).
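As a rough sketch of that library pattern (all `mylib_*` and `counting_*` names are hypothetical, not from any real library): the library takes an optional allocator struct and falls back to plain malloc()/free() when the caller passes NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator interface a library might expose. */
typedef struct {
    void *(*alloc)(void *ctx, size_t size);
    void  (*dealloc)(void *ctx, void *ptr);
    void  *ctx;  /* opaque user data passed back to both callbacks */
} mylib_allocator;

/* Default: just call malloc()/free(), so ASAN, Valgrind, etc. keep working. */
static void *mylib_default_alloc(void *ctx, size_t size) {
    (void)ctx;
    return malloc(size);
}

static void mylib_default_dealloc(void *ctx, void *ptr) {
    (void)ctx;
    free(ptr);
}

static const mylib_allocator mylib_default_allocator = {
    mylib_default_alloc, mylib_default_dealloc, NULL
};

/* Example library function: duplicates s using the given allocator.
   Passing NULL for a selects the malloc()-based default. */
static char *mylib_strdup(const mylib_allocator *a, const char *s) {
    assert(s != NULL);
    if (a == NULL) a = &mylib_default_allocator;
    size_t n = strlen(s) + 1;
    char *copy = a->alloc(a->ctx, n);
    if (copy != NULL) memcpy(copy, s, n);
    return copy;
}

/* Frees memory with the same allocator that produced it. */
static void mylib_free(const mylib_allocator *a, void *ptr) {
    if (a == NULL) a = &mylib_default_allocator;
    a->dealloc(a->ctx, ptr);
}

/* Example of user-side bookkeeping: count live allocations via ctx. */
static void *counting_alloc(void *ctx, size_t size) {
    void *p = malloc(size);
    if (p != NULL) ++*(size_t *)ctx;
    return p;
}

static void counting_dealloc(void *ctx, void *ptr) {
    if (ptr != NULL) --*(size_t *)ctx;
    free(ptr);
}
```

A caller who wants bookkeeping would build `mylib_allocator counting = { counting_alloc, counting_dealloc, &live };` with `live` being a `size_t` counter, and pass `&counting` to each call. Everyone else passes NULL and gets the system allocator.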