The only problem is that he is wrong on almost every detail. He has apparently never heard of mmap, which is awkward because mmap is how memory is obtained nowadays.
And if you want, you know, performance, the allocator needs to know something about threads: pre-allocating chunks to assign to threads, and batching free ops per thread.
Most of what you can find about writing allocators is written by people who don't actually know how; and actually useful allocators are made by people who don't write tutorials about it.
What is he wrong about? He's upfront about limiting the scope to single-threaded arena allocators, but still mentions mmap and threads in passing. For talks that are mostly sales pitches, I think it's a pretty informative introduction to how to select and compare allocators.
mmap is not only for mapping a file's contents into a process's address space; it can also be used to request memory from the OS with the MAP_ANONYMOUS flag. Early on, a malloc-like allocator would use brk/sbrk to get a chunk of "raw" memory and manage it for the user, providing access through malloc/free. Nowadays the same "raw" memory can be obtained with an anonymous mapping. A raw pool, when unused, can be returned to the OS by unmapping it.
Using an anonymous mapping provides address space randomization by default (for the raw pool location), whereas brk/sbrk just extend a limit and are therefore predictable.
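To make that concrete, here is a minimal sketch of getting and returning a raw pool with an anonymous mapping. The names pool_acquire and pool_release are made up for illustration; only mmap, munmap, and the flags are the real POSIX/Linux API. It assumes a system where MAP_ANONYMOUS is available (Linux, the BSDs, macOS).

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: ask the OS for `size` bytes of zero-filled memory
   that is not backed by any file. Returns NULL on failure. */
void *pool_acquire(size_t size)
{
    assert(size > 0);
    void *p = mmap(NULL,                        /* let the kernel pick the address */
                   size,
                   PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, /* no file behind the mapping */
                   -1, 0);                      /* fd and offset are unused here */
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: hand the whole pool back to the OS.
   Returns 0 on success, -1 on failure (same as munmap). */
int pool_release(void *pool, size_t size)
{
    return munmap(pool, size);
}
```

A real allocator would carve this pool into blocks and keep its own bookkeeping; the sketch only shows where the "raw" memory comes from and how it goes back.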
Thread-based allocators are a bit overrated. In apps it’s quite common for objects to migrate across threads, especially from workers to the main thread.
Then the freed object goes to the freeing thread's memory pool. Thread-aware mallocs know this and have clever balancing mechanisms for situations like producer-consumer, where one thread allocs a lot and another frees a lot.
Edit: wrong, I re-checked. The solution to this is that marking an object as free is lock-free. The freed objects do not immediately change thread pools.
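A rough sketch of what "marking as free is lock-free" can look like, using C11 atomics. This is one common pattern, not the code of any particular malloc; struct thread_pool, remote_free_push, and remote_free_drain are invented names. The freeing thread pushes the block onto a pending list that belongs to the owning thread's pool, so the block never changes pools; the owner picks the list up later.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* A freed block is reused to hold the link to the next freed block. */
struct free_node {
    struct free_node *next;
};

/* Hypothetical per-thread pool, reduced to the one field needed here. */
struct thread_pool {
    _Atomic(struct free_node *) remote_free; /* blocks freed by other threads */
};

/* Called by any thread: push a block onto the owner's pending list
   with a compare-and-swap loop, no lock taken. */
void remote_free_push(struct thread_pool *owner, void *block)
{
    assert(block != NULL);
    struct free_node *node = block;
    struct free_node *head =
        atomic_load_explicit(&owner->remote_free, memory_order_relaxed);
    do {
        node->next = head; /* on CAS failure, head holds the fresh value */
    } while (!atomic_compare_exchange_weak_explicit(
        &owner->remote_free, &head, node,
        memory_order_release, memory_order_relaxed));
}

/* Called only by the owning thread: detach the whole pending list
   in one atomic exchange and return it for reuse. */
struct free_node *remote_free_drain(struct thread_pool *owner)
{
    return atomic_exchange_explicit(&owner->remote_free, NULL,
                                    memory_order_acquire);
}
```

Because other threads only ever push and the owner only ever takes the entire list, there is no single-node pop that could run into the ABA problem. The owner can drain in batches, for example when its local free list runs empty.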