
> Awesome! Reminds me of the good old days of QuickBasic and SCREEN 13, when you could write very small programs with fullscreen graphics.

That's a very inefficient approach nowadays. Modern hardware uses accelerated graphics throughout, including for the sort of simple 2D rendering you would've written in QuickBasic back in the day. Even more complex 2D (where the 3D-render pipeline doesn't help as much) is generally best achieved by resorting to GPU-side compute, as seen e.g. in the Linebender Vello project. This is especially relevant at higher resolutions, color depths and frame rates, where the approach of pushing pixels via the CPU becomes even more clearly unworkable.


CPUs are also no slower than they were in the days of software-rendered Quake, so you can still render things in software if you don't want to add a whole bunch of complexity to your software stack.

Yeah, a 4K display is about 8 megapixels so at 60fps you need to write 480M pixels per second. That's feasible with a single CPU depending on the complexity of the rendering. Multi-core can get more work done per pixel. You'd still be writing highly optimized code to render fairly simple things if they require a full screen update.
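
(Concretely: 3840 × 2160 ≈ 8.3M pixels, times 60 Hz ≈ 500M pixels/s; at 4 bytes per pixel that's about 2 GB/s of writes - well within a single core's memory bandwidth, but only a few nanoseconds of CPU time per pixel.)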

That's assuming you rewrite the whole screen every frame. Most productivity apps don't. Windows (and X11) had this whole infrastructure about managing dirty regions.

Computers with GPUs are now fast enough to blast through recomputing the entire screen's contents every frame, but perhaps for efficiency reasons we still shouldn't?


> You lowering your tax rate and giving that money to charity isn't magicking more money into the world, it is just a different allocation.

This is only ever true if you assume that government tax spending is 100% efficient, with nary a fraction of a cent being wasted. I don't think that's a safe assumption.


No. The assumption is that charity and government have roughly equivalent efficiency. Both government and charities have (wildly varying) overhead and government agencies may enjoy economies of scale that charities do not. Yet another area of the world that contains a surprising amount of detail.

> The problem with leaving everything to private charity is that the wealthy people doing the donating dictate what counts as "public good" without you and I having any say over it.

The thing about public goods is that people tend to agree pretty closely about what they are. The wealthiest person in the world benefits from, e.g. clean air just as much as you do. You should be a lot more worried about wealthy folks who don't donate to charity and just spend the money on big luxury yachts and the like, because these folks are essentially free-riding on everyone else.


> The thing about public goods is that people tend to agree pretty closely about what they are.

Is there some data that shows that?

> The wealthiest person in the world benefits from, e.g. clean air just as much as you do.

We can find public goods in common for many groups, but that's actually a bad example. Wealthy people care about clean air in their neighborhood; pollution is therefore concentrated in poor areas. They don't site the new incinerator (or drug treatment facility) on the Upper East Side of Manhattan.

Many needs are specific to poverty. For example, wealthy people are not subject to malaria; they are not illiterate; they don't need toilets or labor rights; they can afford college for their kids regardless of tuition; they have unlimited access to safe, fresh, healthy food. They don't need more available and less expensive health care, so they donate to cancer research and high-tech therapy and not to the medical clinic in the poor neighborhood.


Given (at least the USA's) increasingly polarized population, I don't think it's at all true that people agree closely about what should be funded, and I'll admit that fact makes my argument weaker: The danger of a particular wealthy person "donating to evil" is similar to the danger that the majority of the country votes to "fund evil."

I also agree that wealthy folks spending their wealth on luxury yachts while the public suffers is also something to worry about. Who knew? Gargantuan wealth inequalities are mostly downside for everyone but the wealthy!


Shouldn't we be a lot more worried about how political polarization might impact government choices, compared to private sector ones? Private actors who spend their own money have to pay for their own choices and are accountable to themselves in a way that political operatives fundamentally don't. I see a lot more potential for 'evil' on the political/state actor side.

> There's a reason they're marked as secondary beneficiaries on all my accounts.

Strictly speaking, the foundation discourages individuals from donating directly to them, mostly because the tax treatment of giving that way isn't necessarily favorable. They've set up Gates Philanthropy Partners as a 501(c)(3) charity which is aligned to the same philanthropic goals.

(Of course there's also many other worthwhile players in the broader EA space.)


> It seems like most "effective altruists" want to do things that help "humanity" but don't help "people" -- so developing technology to explore the stars is on the table, but fighting poverty is not.

You seem to have very weird ideas about how EA funding works in practice. Long-termism is flashy and peculiar so it gets a lot of excess visibility, but "fighting poverty" tends to get the bulk of EA money, and the most controversial cause that still gets real sizeable funding seems to be animal welfare.


Well, all we in HN-adjacent spaces hear about is the EA people getting rich so they can build a Roko's-Basilisk-countering superweapon or something.

That's the e/acc folks actually. Different acronym.

Isn't this essentially the generic typestate pattern in Rust? In my view there is a pretty obvious connection between that particular pattern and how other languages implement OO inheritance, though in all fairness I don't think that connection is generally acknowledged.

(For one thing, it's quite obvious to see that the pattern itself is rather anti-modular, and the ways generic typestate is used are also quite divergent from the usual style of inheritance-heavy OO design.)


Rust actually allows one to express "family tree" object inheritance quite cleanly via the generic typestate pattern. It isn't "garbage", it totally has its uses. It is however quite antithetical to modularity: the "inheritance hierarchy" can only really be understood as a unit, and "extensibility" for such a hierarchy is not really well defined. Hence why in practice it mostly gets used in cases where the improved static checking made possible by the "typestate" pattern can be helpful, which has remarkably little to do with "OOP" design as generally understood.
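
A minimal sketch of the pattern, with made-up names:

    use std::marker::PhantomData;

    struct Closed;
    struct Open;

    // The current "state" lives in a type parameter, so methods that
    // are invalid in a given state simply don't exist on it.
    struct Door<State> {
        _state: PhantomData<State>,
    }

    impl Door<Closed> {
        fn open(self) -> Door<Open> {
            Door { _state: PhantomData }
        }
    }

    impl Door<Open> {
        fn close(self) -> Door<Closed> {
            Door { _state: PhantomData }
        }
        fn walk_through(&self) {}
    }

    fn main() {
        let door = Door::<Closed> { _state: PhantomData };
        let door = door.open();
        door.walk_through(); // compiles; on a Door<Closed> it would not
    }

Note how the whole "hierarchy" of states has to be visible in one place for this to typecheck, which is exactly the anti-modularity I mean.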

That logo went away when they stopped supporting proper S3 sleep.

People who say "Rust compiling is so slow" have never experienced what building large projects was like in the mid-1990s or so. It's totally fine. Besides, there's also https://xkcd.com/303/

Or maybe they have experienced what it was like and they don't want to go back.

Not really relevant. The benchmark is how other language toolchains perform today, not what they failed to do 30 years ago. I don't think we'd find it acceptable to go back to mid-'90s build times in other languages, so why should we be ok with it with something like Rust?

> The one thing that sold me on Rust (going from C++) was that there is a single way errors are propagated: the Result type. No need to bother with exceptions

This isn't really true since Rust has panics. It would be nice to have out-of-the-box support for a "no panics" subset of Rust, which would also make it easier to properly support linear (no auto-drop) types.


I wish more people (and crate authors) would treat panic!() as it really should be treated: only for absolutely unrecoverable errors that indicate that some sort of state is corrupted and that continuing wouldn't be safe from a data- or program-integrity perspective.

Even then, though, I do see a need to catch panics in some situations: if I'm writing some sort of API or web service, and there's some inconsistency in a particular request (even if it's because of a bug I've written), I probably really would prefer only that request to abort, not for the entire process to be torn down, terminating any other in-flight requests that might be just fine.
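
Something like this, as a self-contained sketch (not any particular framework's API):

    use std::panic::{self, AssertUnwindSafe};

    fn main() {
        // Imagine the closure is the handler for one request.
        // Requires panic=unwind (the default), not panic=abort.
        let result = panic::catch_unwind(AssertUnwindSafe(|| {
            let v: Vec<i32> = Vec::new();
            v[0] // a bug: out-of-bounds indexing panics
        }));
        match result {
            Ok(n) => println!("response: {n}"),
            Err(_) => println!("500 for this request; process stays up"),
        }
    }

You still have to convince yourself that no shared state was left inconsistent, of course.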

But otherwise, you really should just not be catching panics at all.


> I probably really would prefer only that request to abort, not for the entire process to be torn down,

This is a sign you are writing an operating system instead of using one. Your web server should be handling requests from a pool of processes - so that you get real memory isolation and can crash when there is a problem.


Even if you used a pool of processes, that's still not one process per request, and you still don't want one request crashing to tear down unrelated requests.

I question both things. I would first of all handle each request in its own process.

If there were a special case where that would not work, then the design dictates that requests are not independent, and there must be a risk of interference (they are in the same process!).

What I definitely do not want is a bug-ridden "crashable async sub-task" system built into my web program.


This is simply a wrong idea about how to write web servers. You're giving up scalability massively, only to gain a minor amount of safety - one that is virtually irrelevant in a memory safe language, which you should anyway use. The overhead of process-per-request, or even thread-per-request, is absurd if you're already using a memory safe language.

> You're giving up scalability massively

You're vastly overestimating the overhead of processes and the number of simultaneous web connections.

> only to gain a minor amount of safety

What you're telling me is that performance (memory?) is such a high priority that you're willing to make correctness and security tradeoffs.

And I'm saying that's OK, but one of those tradeoffs is that a crash might bring down more than one request.

> one that is virtually irrelevant in a memory safe language

Your memory safe language uses C libraries in its process.

Memory safe languages have bugs all the time. The attack surface is every line of your program and runtime.

Memory is only one kind of resource and privilege. Process isolation is key for managing resource access - for example file descriptors.

Chrome is a case study of these principles. Everybody thought isolating JS and HTML pages should be easy - nobody could get it right, and Chrome instead wrapped each page in a process.


Please find one web server being actively developed using one process per request.

Handling thousands of concurrent requests is table stakes for a simple web server. Handling thousands of concurrent processes is beyond most OSs. The context switching overhead alone would consume much of the CPU of the system. Even hundreds of processes will mean a good fraction of the CPU being spent solely on context switching - which is a terrible place to be.


> Handling thousands of concurrent processes is beyond most OS

It works fine on Linux - the operating system for the internet. Have you tried it?

> good fraction of the CPU being spent solely on context switching

I was waiting for this one. Threads and processes do the same amount of context switching. The overhead of a process switch is a little higher. The main cost is memory.


> Threads and processes do the same amount of context switching.

Yes, therefore real webservers use a limited number of threads/processes (in the same ballpark as the number of CPU cores). The modern approach is to use green threads, which are really cheap to switch between: it's basically store registers, read registers, and jmp.

> The main cost is memory.

The main cost is scheduling, not switching per se. Preemptive multitasking needs to deal with priorities to not waste time, and the algorithms that do it are mostly O(N). All these O(N) calculations need to be completed multiple times per second; the higher the frequency of switching, the more work there is to do. When you have thousands of processes it is the main cost. If you have tens of thousands it starts to bite hard.


> The main cost is scheduling, not switching per se. Preemptive multitasking needs to deal with priorities to not waste time, and algorithms that do it

The person I am having a conversation with is advocating for threads instead of processes. How do you think threads work?

> Modern approach is to use green threads which are really cheap to switch, it is like store registers, read registers and jmp.

That’s certainly the popular approach. As I said at the beginning this approach is making a mini operating system with more bugs and less security rather than leveraging the capabilities of your operating system.

Once again, I'm waiting to hear about your experience of maxing out processes and after that having to switch to green threads.


> The person I am having a conversation with is advocating for threads instead of processes. How do you think threads work?

I was certainly not; I explicitly said that thread-per-request is as bad as process-per-request. I could even agree that it's the worst of both worlds to some extent - none of the isolation, almost all of the overhead (except if you're using a language with a heavy runtime, like Java, where spawning a new JVM has a huge cost compared to a new thread in an existing JVM).

Modern operating systems provide many mechanisms for doing async IO specifically to prevent the need for spawning and switching between thousands of processes. Linux in particular has invested heavily in this, from select, to poll, to epoll, and now io_uring.

OS process schedulers are really a poor tool for doing massively parallel IO. They are a general-purpose algorithm that has to keep in mind many possible types of heterogeneous processes, and has no insight into the plausible behaviors of those. For a constrained problem like parallel IO, it's a much better idea to use a purpose-built algorithm and tool. And schedulers have simply not been optimized with this kind of scale in mind, because it's a much more important and common use case to run quickly for a small number of processes than it is to scale up to thousands. There's a reason typical ulimit configurations are limited to around 1000 threads/processes per system for all common distros.
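
For a sense of scale, here's the shape of it sketched with tokio (an async runtime; one cheap task per connection, multiplexed over a few OS threads):

    use tokio::net::TcpListener;

    #[tokio::main]
    async fn main() -> std::io::Result<()> {
        let listener = TcpListener::bind("127.0.0.1:8080").await?;
        loop {
            let (mut sock, _) = listener.accept().await?;
            // Tens of thousands of these tasks are fine; they are not
            // OS threads or processes.
            tokio::spawn(async move {
                let (mut r, mut w) = sock.split();
                let _ = tokio::io::copy(&mut r, &mut w).await; // echo
            });
        }
    }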


> Linux in particular has invested heavily in this, from select, to poll, to epoll, and now unto io_uring.

Correction. People who wanted to do async IO went and added additional support for it. The primary driver is node.js.

> And they have simply not been optimized with this kind of scale in mind,

Yes, processes do not sacrifice security and reliability. That's the difference.

The fallacy here is assuming that a process is just worse for hand-wavy reasons and that your language feature has a secret sauce.

If it’s not context switching then that means you have other scheduling problems because you cannot be pre-empted.

> There's a reason typical ulimit configurations are limited to around 1000 threads/processes per system

STILL waiting to hear about your experience of maxing out Linux processes on a web server - and then fixing it with green threads.

I suspect it hasn’t happened.


> The person I am having a conversation with is advocating for threads instead of processes. How do you think threads work?

Are they? I looked back and I've found this quote of them: "The overhead of process-per-request, or even thread-per-request, is absurd if you're already using a memory safe language." Doesn't seem as an advocacy for thread-per-request to me.

> As I said at the beginning this approach is making a mini operating system with more bugs and less security rather than leveraging the capabilities of your operating system.

Let's look at Apache for example. It starts a few processes and/or threads, but then each thread deals with a lot of connections. The threads Apache starts are for spreading work over several CPUs and maybe to overcome some limits of select/poll/epoll. The main approach is to track the state of a connection, and when something happens on a socket, Apache finds the state of the connection and deals with the events on the socket. Then it stores the new state and moves on to deal with other sockets in the same manner.

It is like green threads but without green threads. Green threads streamline all this state keeping by allowing each connection to have its own stack. And I'd say that is easier to do right than writing a finite automaton for HTTP/HTTPS.

> Once again, I'm waiting to hear about your experience of maxing out processes and after that having to switch to green threads.

Oh, I didn't. A long, long time ago I was reading stuff on networking. All of it was of one opinion: 10k kernel tasks may be a tolerable solution, but 100k is bad. IIRC Apache had a document describing its internal architecture and explaining why it is the way it is.

So I wouldn't even try to start thousands of threads. I mean, I tried to start thousands of processes when I was young and learned about fork bombs, and that experience confirmed for me that thousands of processes is not a really good idea.

Moreover, I completely agree with them: if you use a memory-safe language, then it is strange to pay the costs of preemptive multitasking just to have separate virtual address spaces. I mean, it would be better to get a virtual machine with a JIT compiler, and run code for different connections on different instances of the virtual machine. O(1) complexity of cooperative switching will beat O(N) complexity of preemptive switching. To my mind, hardware memory management is overrated.


> Lets look at Apache for example

Apache has years of engineering work behind it - and almost weekly patches to fix issues related to security. Many of these security issues would go away if they were not using special techniques to optimize performance.

But the best part of the web is that it's modular. So now your application doesn't need to do that. It can leverage those benefits without a complexity cascade.

For example, Apache can manage more connections than your application needs running processes for.

> I was reading stuff on networking….

That’s exactly my point. Too many people are repeating advice from Google or Facebook and not actually thinking about real problems they face.

Can you serve more requests using specialized task management? Yes. You can make a mini-OS with fewer features to squeeze out more scheduling performance and that’s what some big companies did.

But you will pay for that with reduced security and reliability. To bring it back to my original complaint - you must accept that a crash can bring down multiple requests.

And it’s an insane default to design Rust around. It’s especially confusing to make all these arguments about how “unsafe” languages are, but then ignore OS safety in hopes of squeezing out a little more perf.

> So I wouldn't even try to start thousands of threads.

Please try it before arguing it doesn’t work. Fork bombing is recursive and unrelated.

> if you use a memory-safe language, then it is strange to pay costs for preemptive multitasking just to have separate virtual address spaces

Then why do these "memory-safe" languages need constant security patches? Why does Chrome need to wrap each page's JS in its own process?

In theory you’re right. If they are actually memory-safe then you don’t need to consider address spaces. But in practice the attack surface is massive and processes give you stronger invariants.


We did that at Dropbox in Python for a while. Though they switched to async after I left.

> You're vastly overestimating the overhead of processes and the number of simultaneous web connections.

It's less the actual overhead of the process and more the savings you get from sharing: you can reuse database connections, have in-memory caches, in-memory rate limits, and various other things. You can use shared memory (which is very difficult to manage) or an additional common process, but either way you are effectively back to square one with regard to shared state that can be corrupted.


You certainly can get savings. I question how often you need that.

I just said that one of the costs of those savings is that a crash may bring down multiple requests - and you should design with that tradeoff in mind.


> only for absolutely unrecoverable errors

Unfortunately even the Rust core language doesn't treat them this way.

I think it's arguably the single biggest design mistake in the Rust language. It prevents a ton of useful stuff like temporarily moving out of mutable references.

They've done a shockingly good job with the language overall, but this is definitely a wart.


Using a Rust lib from Swift on macOS, I definitely want to catch panics - to access security-scoped resources in Rust I need the Rust code to execute in-process (I believe), but I'd also like it not to crash the entire app.

would you consider panics acceptable when you think it cannot panic in practice? e.g. unwrapping/expecting a value for a key in a map when you inserted that value before and know it hasn't been removed?

you could have a panic though, if you wrongly make assumptions


Obviously yes. For the same reason it's acceptable that myvec[i] panics (it will panic if i is out of bounds - but you already figured out that i is in bounds), and that a / b panics for integers a and b (it will panic if b is zero, but if your code is not buggy you already tested whether b is zero prior to dividing, right?).

Panic is absolutely fine for bugs, and it's indeed what should happen when code is buggy. That's because buggy code can make absolutely no guarantees on whether it is okay to continue (arbitrary data structures may be corrupted for instance)

Indeed it's hard to "treat an error" when the error means code is buggy. Because you can rarely do anything meaningful about that.

This is of course a problem for code that can't be interrupted... which includes the Linux kernel (they note the bug, but continue anyway) and embedded systems.

Note that if panic=unwind you have the opportunity to catch the panic. This is usually done by systems that process multiple unrelated requests in the same program: in this case it's okay if only one such request will be aborted (in HTTP, it would return a 5xx error), provided you manually verify that no data structure shared by requests would possibly get corrupted. If you do one thread per request, Rust does this automatically; if you have a smaller threadpool with an async runtime, then the runtime need to catch panics for this to work.


> Note that if panic=unwind you have the opportunity to catch the panic.

And now your language has exceptions - which break control flow and make reasoning about a program very difficult - and hard to optimize for a compiler.


Yeah, but this isn't the only bad thing about unwinding. Much worse than just catching panics is the fact that a panic in a thread takes down only that thread (except if it is the main thread). If your program is multithreaded, panic=unwind makes it much harder to understand how it reacts to errors, unless you take measures to shut down the program if any thread panics (which, again, requires catch_unwind if you have unwinding). Also: that's why locks in Rust have poisoning; it exists so that panics propagate between threads: if a thread panics while holding a lock, any other thread attempting to acquire the lock will panic too (which is better than a deadlock, for sure).

And that's why my programs get compiled with panic=abort, that makes panics just quit the program, with no ability to catch them, and no programs in zombie states where some threads panicked and others keep going on.

But see, catch_unwind is an escape hatch. It's not meant to be used as a general error handling mechanism, and even when doing FFI, Rust code typically converts exceptions in other languages into Results (at a performance cost, but who cares). But Rust needs an escape hatch; it is a low-level language.

And there is at least one case where the catch_unwind is fully warranted: when you have an async web server with multiple concurrent requests and you need panics to take down only a single request, and not the whole server (that would be a DoS vector). If that weren't possible, then async Rust couldn't have feature parity with sync Rust (which uses a thread-per-request model, and where panics kill the thread corresponding to the request)


> when you have an async web server with multiple concurrent requests and you need panics to take down only a single request

Addressed in sibling thread - it’s a poor default to design Rust around.


Not the same person, but I first try and figure out an API that allows me to not panic in the first place.

Panics are a runtime memory safe way to encode an invariant, but I will generally prefer a compile time invariant if possible and not too cumbersome.

However, yes I will panic if I'm not already using unsafe and I can clearly prove the invariant I'm working with.


I don't speak for anyone else but I'm not using `unwrap` and `expect`. I understand the scenario you outlined but I've accepted it as a compromise and will `match` on a map's fetching function and will have an `Err` branch.

I will fight against program aborts as hard as I possibly can. I don't mind boilerplate to be the price paid and will provide detailed error messages even in such obscure error branches.

Again, speaking only for myself. My philosophy is: the program is no good for me dead.
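
Concretely, instead of indexing or unwrapping, it looks something like this (a sketch):

    use std::collections::HashMap;

    fn lookup(map: &HashMap<String, u32>, key: &str) -> Result<u32, String> {
        match map.get(key) {
            Some(v) => Ok(*v),
            // "Can't happen" - but if it does, I get a descriptive
            // error to propagate instead of a dead process.
            None => Err(format!("internal invariant broken: missing key {key:?}")),
        }
    }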


> the program is no good for me dead

That may be true, but the program may actually be bad for you if it does something unexpected due to an unforeseen state.


Agreed, that's why I don't catch panics either -- if we get to that point I'm viewing the program as corrupted. I'm simply saying that I do my utmost to never use potentially panicking Rust API and prefer to add boilerplate for `Err` branching.

So what do you do in the error branch if something like out-of-bounds index happens? Wrap and propagate the error to the caller?

Usually yes. But I lean much more to writing library-like code, I admit.

When I have to make a decision on an app-level, it becomes a different game though. I don't have a clear-and-cut answer for that.


This implies that every function in your library that ever has to do anything that might error out - e.g. integer arithmetic or array indexing - has to be declared as returning the corresponding Result to propagate the error. Which means that you are now imposing this requirement (to check for internal logic bugs in library code) onto the user of your library.

Well, I don't write code as huge as this, though, nor does it have as many layers.

Usually I just use the `?` and `.map_err` (or `anyhow` / `thiserror`) to delegate and move on with life.

I have a few places where I do pattern-matches to avoid exactly what you described: imposing the extra internal complexity to users. Which is indeed a bad thing and I am trying to fight it. Not always succeeding.


Honestly, I don't think libraries should ever panic. Just return an UnspecifiedError with some sort of string. I work daily with Rust, but I wish no_std and an arbitrary no_panic had better support.

Example docs for `foo() -> Result<(), UnspecifiedError>`:

    # Errors

    `foo` returns an error called `UnspecifiedError`, but this only
    happens when an anticipated bug in the implementation occurs. Since
    there are no known such bugs, this API never returns an error. If
    an error is ever returned, then that is proof that there is a bug
    in the implementation. This error should be rendered differently
    to end users to make it clear they've hit a bug and not just a
    normal error condition.

Imagine if I designed `regex`'s API like this. What a shit show that would be.

If you want a less flippant take down of this idea and a more complete description of my position, please see: https://burntsushi.net/unwrap/

> Honestly, I don't think libraries should ever panic. Just return an UnspecifiedError with some sort of string.

The latter is not a solution to the former. The latter is a solution to libraries having panicking branches. But panics or other logically incorrect behavior can still occur as a result of bugs.


Funny, because as a user of this library, I would just unwrap this, and it results in the same outcome as if the library panicked.

Yes. A panic is the right thing to do, and it's just fine if the library does it for you.

My main issue with panics is poor interop across FFI boundaries.

This is like saying, "my main issue with bugs is that they result in undesirable behavior."

Panicking should always be treated as a bug. They are assertions.


This is already a thing, I do this right now. You configure the linter to forbid panics, unwraps, and even arithmetic side effects at compile time.

You can configure your lints in your workspace-level Cargo.toml (the folder of crates):

    [workspace.lints.clippy]
    pedantic = { level = "warn", priority = -1 }
    # arithmetic_side_effects = "deny"
    unwrap_used = "deny"
    expect_used = "deny"
    panic = "deny"

then in your crate Cargo.toml:

    [lints]
    workspace = true

Then you can't even compile the code without proper error handling. Combine that with thiserror or anyhow with the backtrace feature, and you can yeet errors with "?" operators, or match on 'em, map_err, map_or_else, ignore them, etc.

[1] https://rust-lang.github.io/rust-clippy/master/index.html#un...
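
For example, with anyhow (a sketch; the function and file name are made up):

    use anyhow::{Context, Result};

    fn read_config() -> Result<String> {
        let s = std::fs::read_to_string("config.toml")
            .context("failed to read config.toml")?; // yeet with ?
        Ok(s)
    }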


The issue with this in practice is that there are always cases where panics are absolutely the correct course of action. When program state is bad enough that you can't safely continue, you need to panic (and core dump in dev). Otherwise you are likely just creating an integrity minefield to debug later.

Not saying there aren't applications where using these lints is alright (web servers, maybe), but at least in my experience (mostly doing CLI, graphics, and embedded stuff), trying to keep the program alive leads to more problems than it solves.


The comment you're replying to specifically wanted a "no panics" version of Rust.

It's totally normal practice for a library to have this as a standard.


Indent by 4 spaces to get code blocks on HN.

    Like
    this


You only need 2. https://news.ycombinator.com/formatdoc

> Text after a blank line that is indented by two or more spaces is reproduced verbatim. (This is intended for code.)


  Thank
  you

But can you deny the use of all operations that might panic, like indexing an array?

There's a lint for indexing an array, but not for all maybe-panicking operations. For example, the `copy_from_slice` method on slices (https://doc.rust-lang.org/std/primitive.slice.html#method.co...) doesn't have a clippy lint for it, even though it will panic if given the wrong length.

Yes, it looks like you can - try indexing_slicing:

https://rust-lang.github.io/rust-clippy/master/#indexing_sli...


It's pretty difficult to have no panics, because many functions allocate memory and what are they supposed to do when there is no memory left? Also many functions use addition and what is one supposed to do in case of overflow?

>many functions allocate memory and what are they supposed to do when there is no memory left?

Return an AllocationError. Rust unfortunately picked the wrong default here for the sake of convenience, along with the default of assuming a global allocator. It's now trying to add in explicit allocators and allocation failure handling (A:Allocator type param) at the cost of splitting the ecosystem (all third-party code, including parts of libstd itself like std::io::Read::read_to_end, only work with A=GlobalAlloc).

Zig for example does it right by having explicit allocators from the start, plus good support for having the allocator outside the type (ArrayList vs ArrayListUnmanaged) so that multiple values within a composite type can all use the same allocator.

>Also many functions use addition and what is one supposed to do in case of overflow?

Return an error ( https://doc.rust-lang.org/stable/std/primitive.i64.html#meth... ) or a signal that overflow occurred ( https://doc.rust-lang.org/stable/std/primitive.i64.html#meth... ). Or use wrapping addition ( https://doc.rust-lang.org/stable/std/primitive.i64.html#meth... ) if that was intended.

Note that for the checked case, it is possible to have a newtype wrapper that impls std::ops::Add etc, so that you can continue using the compact `+` etc instead of the cumbersome `.checked_add(...)` etc. For the wrapping case libstd already has such a newtype: std::num::Wrapping.

Also, there is a clippy lint for disallowing `+` etc ( https://rust-lang.github.io/rust-clippy/master/index.html#ar... ), though I assume only the most masochistic people enable it. I actually tried to enable it once for some parsing code where I wanted to enforce checked arithmetic, but it pointlessly triggered on my Checked wrapper (as described in the previous paragraph) so I ended up disabling it.
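
For reference, the shape of that wrapper (a simplified sketch, not my exact code): overflow poisons the value with None instead of panicking, while keeping the compact `+`:

    use std::ops::Add;

    #[derive(Clone, Copy, Debug)]
    struct Checked(Option<i64>);

    impl Add for Checked {
        type Output = Checked;
        fn add(self, rhs: Checked) -> Checked {
            Checked(match (self.0, rhs.0) {
                (Some(a), Some(b)) => a.checked_add(b), // None on overflow
                _ => None,
            })
        }
    }

    fn main() {
        let sum = Checked(Some(i64::MAX)) + Checked(Some(1));
        assert_eq!(sum.0, None); // overflow caught, no panic
    }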


> Rust unfortunately picked the wrong default here for the sake of convenience, along with the default of assuming a global allocator. [...] Zig for example does it right by having explicit allocators from the start

Rust picked the right default for applications that run in an OS whereas Zig picked the right default for embedded. Both are good for their respective domains, neither is good at both domains. Zig's choice is verbose and useless on a typical desktop OS, especially with overcommit, whereas Rust's choice is problematic for embedded where things just work differently.


Various kinds of "desktop" applications like databases and video games use custom non-global allocators - per-thread, per-arena, etc - because they have specific memory allocation and usage patterns that a generic allocator does not handle as well as targeted ones can.

My current $dayjob involves a "server" application that needs to run in a strict memory limit. We had to write our own allocator and collections because the default ones' insistence on using GlobalAlloc infallibly doesn't work for us.

Thinking that only "embedded" cares about custom allocators is just naive.


> Thinking that only "embedded" cares about custom allocators is just naive.

I said absolutely no such thing? In my $dayjob working on graphics I, too, have used custom allocators for various things, primarily in C++ though, not Rust. But that in no way makes the default of a global allocator wrong, and often those custom allocators have specialized constraints that you can exploit with custom containers, too, so it's not like you'd be reaching for the stdlib versions probably anyway.


I don't see why you would have to write your own - there are plenty of options in the crate ecosystem, but perhaps you found them insufficient?

As a video game developer, I've found the case for custom general-purpose allocators pretty weak in practice. It's exceedingly rare that you really want complicated nonlinear data structures, such as hash maps, to use a bump-allocator. One rehash and your fixed size arena blows up completely.

95% of use cases are covered by reusing flat data structures (`Vec`, `BinaryHeap`, etc.) between frames.


> there are plenty of options in the crate ecosystem

Who writes the crates?


That's public information. It's up to you to make the choice whether to trust someone, but the least you can do is look at the code and see if it matches what you would have done.

The allocator we wrote for $dayjob is essentially a buffer pool with a configurable number of "tiers" of buffers. "Static tiers" have N pre-allocated buffers of S bytes each, where N and S are provided by configuration for each tier. The "dynamic" tier malloc's on demand and can provide up to S bytes; it tracks how many bytes it has currently allocated.

Requests are matched against the smallest tier that can satisfy them (static tiers before dynamic). If no tier can satisfy it (static tiers are too small or empty, dynamic tier's "remaining" count is too low), then that's an allocation failure and handled by the caller accordingly. Eg if the request was for the initial buffer for accepting a client connection, the client is disconnected.

When a buffer is returned to the allocator it's matched up to the tier it came from - if it came from a static tier it's placed back in that tier's list, if it came from the dynamic tier it's free()d and the tier's used counter is decremented.
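
In rough pseudo-Rust, the tier matching described above has this shape (a heavily simplified sketch, not our real code):

    struct Tier {
        buf_size: usize,
        free: Vec<Box<[u8]>>,
    }

    struct Pool {
        static_tiers: Vec<Tier>, // sorted by buf_size, ascending
        dynamic_max: usize,
        dynamic_used: usize,
    }

    impl Pool {
        fn alloc(&mut self, n: usize) -> Option<Box<[u8]>> {
            // Smallest static tier that can satisfy the request wins.
            for tier in &mut self.static_tiers {
                if tier.buf_size >= n {
                    if let Some(buf) = tier.free.pop() {
                        return Some(buf);
                    }
                }
            }
            // Fall back to the dynamic tier, within its budget.
            if self.dynamic_used + n <= self.dynamic_max {
                self.dynamic_used += n;
                return Some(vec![0u8; n].into_boxed_slice());
            }
            None // allocation failure; the caller decides what to do
        }
    }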

Buffers have a simple API similar to the bytes crate - "owned buffers" allow &mut access, "shared buffers" provide only & access and cloning them just increments a refcount, owned buffers can be split into smaller owned buffers or frozen into shared buffers, etc.

The allocator also has an API to query its usage as an aggregate percentage, which can be used to do things like proactively perform backpressure on new connections (reject them and let them retry later or connect to a different server) when the pool is above a threshold while continuing to service existing connections without a threshold.

The allocator can also be configured to allocate using `mmap(tempfile)` instead of malloc, because some parts of the server store small, infrequently-used data, so they can take the hit of storing their data "on disk", ie paged out of RAM, to leave RAM available for everything else. (We can't rely on the presence of a swapfile so there's no guarantee that regular memory will be able to be paged out.)

As for crates.io, there is no option. We need local allocators because different parts of the server use different instances of the above allocator with different tier configs. Stable Rust only supports replacing GlobalAlloc; everything to do with local allocators is unstable, and we don't intend to switch to nightly just for this. Also FWIW our allocator has both a sync and async API for allocation (some of the allocator instances are expected to run at capacity most of the time, so async allocation with a timeout provides some slack and backpressure as opposed to rejecting requests synchronously and causing churn), so it won't completely line up with std::alloc::Allocator even if/when that does get stabilized. (But the async allocation is used in a localized part of the server so we might consider having both an Allocator impl and the async direct API.)

And so because we need local allocators, we had to write our own replacements of Vec, Queue, Box, Arc, etc because the API for using custom A with them is also unstable.


Did you publish these by any chance?

Sorry, the code is closed source.

> Zig for example does it right by having explicit allocators from the start

Odin has them, too, optionally (and usually).


> Rust unfortunately picked the wrong default here

I partially disagree with this. Using Zig style allocators doesn't really fit with Rust ergonomics, as it would require pretty extensive lifetime annotations. With no_std, you absolutely can roll your own allocation styles, at the price of more manual lifetime annotations.

I do hope though that some library comes along that allows for Zig style collections, with the associated lifetimes... (It's been a bit painful rolling my own local allocator for audio processing).


Explicit allocators do work with Rust, as evidenced by them already working for libstd's types, as I said. The mistake was to not have them from day one which has caused most code to assume GlobalAlloc.

As long as the type is generic on the allocator, the lifetimes of the allocator don't appear in the type. So eg if your allocator is using a stack array in main then your allocator happens to be backed by `&'a [MaybeUninit<u8>]`, but things like Vec<T, A> instantiated with A = YourAllocator<'a> don't need to be concerned with 'a themselves.

Eg: https://play.rust-lang.org/?version=nightly&mode=debug&editi... do_something_with doesn't need to have any lifetimes from the allocator.

If by Zig-style allocators you specifically mean type-erased allocators, as a way to not have to parameterize everything on A:Allocator, then yes the equivalent in Rust would be a &'a dyn Allocator that has an infectious 'a lifetime parameter instead. Given the choice between an infectious type parameter and infectious lifetime parameter I'd take the former.
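
i.e., roughly this (nightly, with the unstable allocator_api feature; it mirrors the playground link above):

    #![feature(allocator_api)]

    use std::alloc::Allocator;

    // No lifetime parameter here: if A happens to be backed by a stack
    // buffer with some lifetime 'a, that's A's business, not ours.
    fn do_something_with<A: Allocator>(v: &mut Vec<u8, A>) {
        v.push(42);
    }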


Ah, my bad, I guess I've been misunderstanding how the Allocator proposal works all along (I thought it was only for 'static allocators, this actually makes a lot more sense!).

I guess all that to say, I agree then that this should've been in std from day one.


The problem is, everything should have been there since day 1. It’s still unclear which API Rust should end up with, even today, which is why it isn’t stable yet.

Looking forward to the API when it's stabilised. Have there been any updates on the progress of allocators of this general area of Rust over the past year?

I haven’t paid that close of attention, but there have been two major APIs that people seem to be deciding between. We’ll see.

>Return an AllocationError. Rust unfortunately picked the wrong default here for the sake of convenience, along with the default of assuming a global allocator. It's now trying to add in explicit allocators and allocation failure handling

Going from panicking to panic-free in Rust is as simple as choosing 'function' vs 'try_function'. The actual mistakes in Rust were the ones where the non-try version should have produced a panic by default. Adding Box::try_new next to Box::new is easy.
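
E.g., on nightly (under the allocator_api feature):

    #![feature(allocator_api)]

    use std::alloc::AllocError;

    fn make_buffer() -> Result<Box<[u8; 4096]>, AllocError> {
        Box::try_new([0u8; 4096]) // Err(AllocError) instead of aborting
    }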

There are only two major applications of panic free code in Rust: critical sections inside mutexes and unsafe code (because panic safety is harder to write than panic free code). In almost every other case it is far more fruitful to use fuzzing and model checking to explicitly look for panics.


In order to have truly ergonomic no_panic code in Rust, you'd need parametricity over the panic behavior: a single Box::new that can be contextually determined to be panicky or Result-based. It has to be context-determined rather than explicitly code-determined, so that the topmost request for the no_panic version propagates all the way down to stdlib through the entire stack. If you squint a bit, you can see this is the same as maybe-async, maybe-const, maybe-allocate, maybe-wrapping/overflowing math, etc. So the options are to just add try_ methods across the entire stdlib, which all the code between your API and the underlying API then needs to use/expose, or to push for a generic language-level mechanism, which complicates the language, compiler, and library code further. Or do both.

>what are they supposed to do when there is no memory left

Well on Linux they are apparently supposed to return memory anyway and at some point in the future possibly SEGV your process when you happen to dereference some unrelated pointer.


You can tell Linux that you don't want overcommit. You will probably discover that you're now even more miserable and change it back, but it's an option.

Whenever I switch off overcommitting, every program on my system (that I'm using) dies, one by one, over the course of 2–5 seconds, followed by Xorg. It's quite pretty.

I did that and even with enormous amounts of free memory, Chrome and other Chromium browsers just die.

They require overcommit just to open an empty window.


Don't know about your parent poster but I didn't take it 100% literally. Obviously if there's no memory left then you crash; the kernel would likely murder your program half a second later anyway.

But for arithmetic, Rust has non-aborting, checked APIs, if my memory serves.

And that's what I'm trying hard to do in my Rust code f.ex. don't frivolously use `unwrap` or `expect`, ever. And just generally try hard to never use an API that can crash. You can write a few error branches that might never get triggered. It's not the end of the world.


Dealing with integer overflow is much more burdensome than dealing with allocation failure, IME. Relatively speaking, allocation failure is closer to file descriptor limits in terms of how it affects code structure. But then I mostly use C when I'm not using a scripting language. In languages like Rust and C++ there's a lot of hidden allocation in the high-level libraries that seem to be popular, perhaps because the notion that "there's nothing you can do" has infected too many minds.

Of course, just like with opening files or integer arithmetic, if you don't pay any attention to handling the errors up front when writing your code, it can be an onerous if not impossible task to refactor things after the fact.


Oh I agree, don't get me wrong. Both are pretty gnarly.

I was approaching these problems strictly from the point of view of what Rust can do today, nothing else. To me, having checked, non-panicking APIs for integer overflow / underflow at least gives you some agency.

If you don't have memory, well, usually you are cooked. Though one area where Rust could become even better is giving us some API to reserve more memory upfront, maybe? Or, I don't know, maybe adopt some of the memory-arena crates into stdlib.

But yeah, agreed. Not the types of problems I want to have anymore (because I did have them in the past).


In C I simply use -fsanitize=signed-integer-overflow if I expect no overflow and checked arithmetic when I need to handle overflow. I do not think this is worse than in any other languages and seems less annoying than Rust. If I am lazy, I let allocation failure trap on null pointer dereference which is also safe, out-of-bounds accesses are avoided by -fsanitize=bounds (I avoid pointer arithmetic and unsafe casts where I can and essentially treat it like Rust's "unsafe").

Rust provides a default integer of each common size and signedness, for which overflow is prohibited [but this prohibition may not be enforced in release compiled binaries depending on your chosen settings for the compiler, in this case what happens is not promised but today it will wrap - it's wrong to write code which does this on purpose - see the wrapping types below if you want that - but it won't cause UB if you do it anyway]

Rust also provides Wrapping and Saturating wrapper types for these integers, which wrap (255 + 1 == 0) or saturate (255 + 1 == 255). Depending on your CPU either or both of these might just be "how the computer works anyway" and will accordingly be very fast. Neither of them is how humans normally think about arithmetic.

Furthermore, Rust also provides operations which do all of the above, as well as the more fundamental "with carry" type operations where you get two results from the operation and must write your algorithms accordingly, and explicitly fallible operations where if you would overflow your operation reports that it did not succeed.
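
For example (u8 for brevity):

    use std::num::{Saturating, Wrapping};

    fn main() {
        let w = Wrapping(255u8) + Wrapping(1);
        let s = Saturating(255u8) + Saturating(1);
        println!("{} {}", w.0, s.0); // prints "0 255"
    }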


Additions are easy. By default they panic on overflow in debug builds and wrap in release builds, and you can make the checked behavior explicit with the checked_ methods.

Assuming that you are not using much recursion, you can eliminate most of the heap-related memory panics by adding limited reservation checks for dynamic data that is allocated based on user input/external data. You should also use statically sized types whenever possible. They are also faster.


Wrapping on overflow is wrong because this is not the math we expect. As a result, errors and vulnerabilities occur (look at Linux kernel for example).

It depends on the context. Of course the result may cause vulnerabilities if the program logic depends on it in the wrong context. But yeah, generally I would agree.

> what are they supposed to do when there is no memory left?

You abandon the current activity and bubble up the error to a stage where that effort can be tossed out or retried sometime later. i.e. Use the same error handling approach you would have to use for any other unreliable operation like networking.


> Also many functions use addition and what is one supposed to do in case of overflow?

Honestly this is where you'd throw an exception. It's a shame Rust refuses to have them, they are absolutely perfect for things like this...


I'm confused by this, because a panic is essentially an exception. They can be thrown and caught (although it's extremely discouraged to do so).

The only place where it would be different is if you explicitly set panics to abort instead of unwind, but that's not default behavior.


`panic` isn’t really an error that you have to (or can) handle, it’s for unrecoverable errors. Sort of like C++ assertions.

Also there is the no_panic crate, which uses macros to require the compiler to prove that a given function cannot panic.


You can handle panics. It’s for unrecoverable errors, but internally it does stack unwinding by default like exceptions in C++.

You see this whenever you use cargo test. If a single test panics, it doesn’t abort the whole program. The panic is “caught”. It still runs all the other tests and reports the failure.


> but internally it does stack unwinding by default

Although as a library vendor, you kind of have to assume your library could be compiled into an app configured with panic=abort, in which case it will not do that.


Well, kinda. It's more similar to RuntimeException in Java, in that there are times where you do actually want to catch and recover from them.

But on those places, you better know exactly what you are doing.


I would say that Segmentation Fault is a better comparison with C++ :-D

that's kind of a thing with https://docs.rs/no-panic/latest/no_panic/ or no_std and custom panic handlers.

not sure what the latest is in the space; if I recall there are some subtleties
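
usage looks roughly like this (the function is made up):

    use no_panic::no_panic;

    #[no_panic] // linking fails if a panicking path is reachable
    fn add(a: u32, b: u32) -> u32 {
        a.wrapping_add(b) // a plain `a + b` could trip it in debug builds
    }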


That's a neat hack, but it would be a lot nicer to have explicit support as part of the language.

That's going to be difficult because the language itself requires panic support to properly implement indexing, slicing, and integer division. There are checked methods that can be used instead, but to truly eliminate panics, the ordinary operators would have to be banned when used with non-const arguments, and this restriction would have to propagate to all dependencies as well.

Yes, that's right. The feature really wants compiler support for that reason. The simplest version wouldn't be too hard to implement: every function just exports a flag on whether it (or any callee) can panic. Then we have a nopanic keyword which emits a compiler error if the function (or any callee) panics.

It would be annoying to use - as you say, you couldn’t even add regular numbers together or index into an array in nopanic code. But there are ways to work around it (like the wrapping types).

One problem is that implicit nopanic would add a new way to break semver compatibility in APIs. Eg, imagine a public API that just happens to not be able to panic. If the code is changed subtly, it could easily start panicking again. That could break callers, so it would have to be a major version bump. You'd probably have to require explicit nopanic at API boundaries (else assume all public functions from other crates can panic). And because of that, public APIs like std would need to be plastered with nopanic markers everywhere. It's also not clear how that works through trait impls.


Yeah, this is how it works with no_std.

No? https://godbolt.org/z/jEc36vP3P

As far as I can tell, no_std doesn't change anything with regard to either the usability of panicking operators like integer division, slice indexing, etc. (they're still usable) nor on whether they panic on invalid input (they still do).


The problem is false positives. Even if you can clearly see that some function will never panic (but it uses some feature which may panic), the compiler might not always see that. If the compiler says there are no panics, then there are no panics; but is it worth adding as part of the language if using it means mostly avoiding features that might panic?

I do not want a library to panic though, I want to handle the error myself.

Let's say the library panics because there was an out-of-bounds array access on some internal (to that library) array due to a bug in their code. How will you handle this error yourself, and how is the library supposed to propagate it to you in the first place without unwinding?

Ensure all bounds and invariants are checked, and return Result<T, E> or a custom error or something. As I said, I do not want a library to panic. It should be up to the user of the library. Imagine using some library where they use assert() or panic() instead of returning an error for you to handle; that would frustrate me.
