>many functions allocate memory and what are they supposed to do when there is n...

kllrnohj · 2025-05-07T02:22:24 1746584544

> Rust unfortunately picked the wrong default here for the sake of convenience, along with the default of assuming a global allocator. [...] Zig for example does it right by having explicit allocators from the start

Rust picked the right default for applications that run in an OS whereas Zig picked the right default for embedded. Both are good for their respective domains, neither is good at both domains. Zig's choice is verbose and useless on a typical desktop OS, especially with overcommit, whereas Rust's choice is problematic for embedded where things just work differently.

Arnavion · 2025-05-07T03:12:25 1746587545

Various kind of "desktop" applications like databases and video games use custom non-global allocators - per-thread, per arena, etc - because they have specific memory allocation and usage patterns that a generic allocator does not handle as well as targeted ones can.

My current $dayjob involves a "server" application that needs to run in a strict memory limit. We had to write our own allocator and collections because the default ones' insistence on using GlobalAlloc infallibly doesn't work for us.

Thinking that only "embedded" cares about custom allocators is just naive.

kllrnohj · 2025-05-07T12:49:07 1746622147

> Thinking that only "embedded" cares about custom allocators is just naive.

I said absolutely no such thing? In my $dayjob working on graphics I, too, have used custom allocators for various things, primarily in C++ though, not Rust. But that in no way makes the default of a global allocator wrong, and often those custom allocators have specialized constraints that you can exploit with custom containers, too, so it's not like you'd be reaching for the stdlib versions probably anyway.

simonask · 2025-05-07T07:56:11 1746604571

I don't see why you would have to write your own - there are plenty of options in the crate ecosystem, but perhaps you found them insufficient?

As a video game developer, I've found the case for custom general-purpose allocators pretty weak in practice. It's exceedingly rare that you really want complicated nonlinear data structures, such as hash maps, to use a bump-allocator. One rehash and your fixed size arena blows up completely.

95% of use cases are covered by reusing flat data structures (`Vec`, `BinaryHeap`, etc.) between frames.

monkeyelite · 2025-05-07T09:27:45 1746610065

> there are plenty of options in the crate ecosystem

Who writes the crates?

simonask · 2025-05-08T07:02:37 1746687757

That's public information. It's up to you to make the choice whether to trust someone, but the least you can do is look at the code and see if it matches what you would have done.

Arnavion · 2025-05-07T08:30:22 1746606622

The allocator we wrote for $dayjob is essentially a buffer pool with a configurable number of "tiers" of buffers. "Static tiers" have N pre-allocated buffers of S bytes each, where N and S are provided by configuration for each tier. The "dynamic" tier malloc's on demand and can provide up to S bytes; it tracks how many bytes it has currently allocated.

Requests are matched against the smallest tier that can satisfy them (static tiers before dynamic). If no tier can satisfy it (static tiers are too small or empty, dynamic tier's "remaining" count is too low), then that's an allocation failure and handled by the caller accordingly. Eg if the request was for the initial buffer for accepting a client connection, the client is disconnected.

When a buffer is returned to the allocator it's matched up to the tier it came from - if it came from a static tier it's placed back in that tier's list, if it came from the dynamic tier it's free()d and the tier's used counter is decremented.

Buffers have a simple API similar to the bytes crate - "owned buffers" allow &mut access, "shared buffers" provide only & access and cloning them just increments a refcount, owned buffers can be split into smaller owned buffers or frozen into shared buffers, etc.

The allocator also has an API to query its usage as an aggregate percentage, which can be used to do things like proactively perform backpressure on new connections (reject them and let them retry later or connect to a different server) when the pool is above a threshold while continuing to service existing connections without a threshold.

The allocator can also be configured to allocate using `mmap(tempfile)` instead of malloc, because some parts of the server store small, infrequently-used data, so they can take the hit of storing their data "on disk", ie paged out of RAM, to leave RAM available for everything else. (We can't rely on the presence of a swapfile so there's no guarantee that regular memory will be able to be paged out.)

As for crates.io, there is no option. We need local allocators because different parts of the server use different instances of the above allocator with different tier configs. Stable Rust only supports replacing GlobalAlloc; everything to do with local allocators is unstable, and we don't intend to switch to nightly just for this. Also FWIW our allocator has both a sync and async API for allocation (some of the allocator instances are expected to run at capacity most of the time, so async allocation with a timeout provides some slack and backpressure as opposed to rejecting requests synchronously and causing churn), so it won't completely line up with std::alloc::Allocator even if/when that does get stabilized. (But the async allocation is used in a localized part of the server so we might consider having both an Allocator impl and the async direct API.)

And so because we need local allocators, we had to write our own replacements of Vec, Queue, Box, Arc, etc because the API for using custom A with them is also unstable.

michalsustr · 2025-05-07T06:54:15 1746600855

Did you publish these by any chance?

Arnavion · 2025-05-07T08:49:49 1746607789

Sorry, the code is closed source.

johnisgood · 2025-05-06T23:27:43 1746574063

> Zig for example does it right by having explicit allocators from the start

Odin has them, too, optionally (and usually).

smj-edison · 2025-05-06T20:25:24 1746563124

> Rust unfortunately picked the wrong default here

I partially disagree with this. Using Zig style allocators doesn't really fit with Rust ergonomics, as it would require pretty extensive lifetime annotations. With no_std, you absolutely can roll your own allocation styles, at the price of more manual lifetime annotations.

I do hope though that some library comes along that allows for Zig style collections, with the associated lifetimes... (It's been a bit painful rolling my own local allocator for audio processing).

Arnavion · 2025-05-06T21:25:15 1746566715

Explicit allocators do work with Rust, as evidenced by them already working for libstd's types, as I said. The mistake was to not have them from day one which has caused most code to assume GlobalAlloc.

As long as the type is generic on the allocator, the lifetimes of the allocator don't appear in the type. So eg if your allocator is using a stack array in main then your allocator happens to be backed by `&'a [MaybeUninit<u8>]`, but things like Vec<T, A> instantiated with A = YourAllocator<'a> don't need to be concerned with 'a themselves.

Eg: https://play.rust-lang.org/?version=nightly&mode=debug&editi... do_something_with doesn't need to have any lifetimes from the allocator.

If by Zig-style allocators you specifically mean type-erased allocators, as a way to not have to parameterize everything on A:Allocator, then yes the equivalent in Rust would be a &'a dyn Allocator that has an infectious 'a lifetime parameter instead. Given the choice between an infectious type parameter and infectious lifetime parameter I'd take the former.

smj-edison · 2025-05-06T21:55:47 1746568547

Ah, my bad, I guess I've been misunderstanding how the Allocator proposal works all along (I thought it was only for 'static allocators, this actually makes a lot more sense!).

I guess all that to say, I agree then that this should've been in std from day one.

steveklabnik · 2025-05-07T00:00:48 1746576048

The problem is, everything should have been there since day 1. It’s still unclear which API Rust should end up with, even today, which is why it isn’t stable yet.

JBits · 2025-05-07T01:34:02 1746581642

Looking forward to the API when it's stabilised. Have there been any updates on the progress of allocators of this general area of Rust over the past year?

steveklabnik · 2025-05-07T01:41:26 1746582086

I haven’t paid that close of attention, but there have been two major APIs that people seem to be deciding between. We’ll see.

imtringued · 2025-05-07T09:39:12 1746610752

>Return an AllocationError. Rust unfortunately picked the wrong default here for the sake of convenience, along with the default of assuming a global allocator. It's now trying to add in explicit allocators and allocation failure handling

Going from panic to panic free in Rust is as simple as choosing 'function' vs 'try_function'. The actual mistakes in Rust were the ones where the non-try version should have produced a panic by default. Adding Box::try_new next to Box::new is easy.

There are only two major applications of panic free code in Rust: critical sections inside mutexes and unsafe code (because panic safety is harder to write than panic free code). In almost every other case it is far more fruitful to use fuzzing and model checking to explicitly look for panics.

estebank · 2025-05-07T14:54:03 1746629643

In order to have true ergonomic no_panic code in Rust you'd need to be able to have parametricity on the panic behavior: have a single Box::new that can be context determined to be panicky or Result based. It has to be context determined and not explicitly code determined so that the top most request for the no_panic version to be propagated all the way down to stdlib through the entire stack. If you squint just a bit, you can see this is the same as maybe async, and maybe const, and maybe allocate, and maybe wrapping/overflowing math, etc. So there's an option to just add try_ methods on the entire stdlib, which all the code between your API and the underlying API need to use/expose, or push for a generic language level mechanism for this. Which then complicates the language, compiler and library code further. Or do both.