
> The lifetime of an Arc isn’t unknowable, it’s determined by where and how you hold it.

In the same sense that the lifetime of an object in a GC'd system has a lower bound of "as long as it's referenced", sure. But that's nearly the opposite of what the borrow checker tries to do by statically bounding object lifetimes at compile time.

> maybe the disconnect in this article is that the author is coming at Rust and trying to force their previous mental models on to it

The opposite actually! I spent about a decade doing systems programming in C, C++, and Rust before writing a bunch of Haskell at my current job. The degree to which a big language runtime and GC weren't a boogeyman for some problem spaces was really eye-opening.



> But that's nearly the opposite of what the borrow checker tries to do by statically bounding object lifetimes at compile time.

Arc isn't an end-run around the borrow checker. If you need mutable references to the data inside of Arc, you still need to use something like a Mutex or Atomic types as appropriate.
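
For illustration, a minimal sketch (not from the parent comment) of the usual pattern: Arc gives shared ownership, but any mutation still has to go through a Mutex, because the borrow checker won't hand out a `&mut` through an Arc:

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // Shared ownership via Arc; interior mutability via Mutex.
        let counter = Arc::new(Mutex::new(0));
        let handles: Vec<_> = (0..4)
            .map(|_| {
                let counter = Arc::clone(&counter);
                thread::spawn(move || {
                    // Mutation goes through the lock, not through the Arc itself.
                    *counter.lock().unwrap() += 1;
                })
            })
            .collect();
        for h in handles {
            h.join().unwrap();
        }
        println!("{}", *counter.lock().unwrap()); // prints 4
    }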

> The degree to which a big language runtime and GC weren't a boogeyman for some problem spaces was really eye-opening.

I have the opposite experience, actually. I was an early adopter of Go and championed garbage collection for a long time. Then, as our Go platforms scaled, we spent increasing amounts of our time playing games to appease the garbage collector: minimizing allocations and otherwise shaping the code to be kind to it.

The Go GC situation has improved continuously over the years, but it's still common to see libraries compete to reduce allocations and add complexity like pools specifically to minimize GC burden.

It was great when we were small, but as the GC became a bigger part of our performance story, it started to feel like a burden to constantly structure things to appease the garbage collector. With Rust it's nice to be able to handle things more explicitly and, importantly, without having to explain to newcomers to the codebase why we made a lot of GC-appeasing decisions that appear unnecessarily complex at first glance.
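
As a rough sketch of the "explicit" style being described (names here are illustrative, not from any particular codebase): in Rust the preallocate-and-reuse pattern is just ordinary code, with no pool machinery needed to keep a collector happy:

    // Preallocate once, reuse the same backing storage on every iteration.
    fn process_all(lines: &[&str]) {
        let mut buf = String::with_capacity(4096); // one allocation up front
        for line in lines {
            buf.clear(); // keeps capacity, drops contents
            buf.push_str(line);
            buf.push('\n');
            // ... do something with `buf` ...
        }
    } // `buf` is freed exactly here, deterministically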


There's a good chance this is a Go issue rather than a GC one. People get fooled by Go's pretense of being a high-level C replacement; at best, it is highly inadequate in that role.

The reason is that the compiler quality, the design tradeoffs, and the throughput of Go's GC implementation are simply not there for it to ever be a good general-purpose, systems-programming-oriented language.

Go receives undeserved hype for use cases that C# and Java are much better at, due to their superior GC implementations and codegen quality (with C# offering better lower-level features like structs + generics and first-class C interop).


Java GC has non-trivial overhead. I've moved workloads from Java to Rust and gotten a 30x improvement from the lack of GC. Likewise, I've gotten a 10x improvement in Java by preallocating objects and reusing them to avoid GC. (Fucking Google and the cult of immutable objects.) Guess what: lots of things that "make it harder to introduce bugs" also make your shit run a lot slower.


This is not an improvement from lack of GC per se, but first and foremost from zero-cost abstractions (everything is monomorphized, no sins such as type erasure), and, yes, from deterministic memory management. Java is the worst language if you need to push performance to the limit, since it does not offer convenient lower-level language constructs to do so (unlike C#), but at reaching the 80th percentile of performance it is by far the best one.

But yes, GC is very much not free and is an explicit tradeoff vs compile time + manual memory management.


As an ops guy for decades, it makes me laugh to hear claims about Java GC superiority. Please go back in time and fix all the crashes and OOMs caused by enterprise JVMs, as opposed to the near-zero problems with the Go deployments.

Making strong statements without hard facts to back them up is a sign of zealotry...


I assure you that if that code were ported to Go 1:1, Go's GC would simply crawl to a halt. Write code badly enough and, no matter how good the hardware and software are, it won't be able to cope at some point. Even a good tool will give out if you beat it hard enough.

For example, you may be interested in this read: https://blog.twitch.tv/en/2019/04/10/go-memory-ballast-how-i...

Issues like these simply don't happen with GCs in modern JVM implementations or .NET (not saying they are perfect or don't have other shortcomings, but the sheer amount of developer hours invested in tuning and optimizing them far outstrips Go).


> it makes me laugh to hear claims about Java GC superiority. Please go back in time and fix all the crashes and OOMs caused by enterprise JVMs,

I don’t see how running into an OOM problem is necessarily a problem with the GC. That said, Java is a memory-intensive language; it’s a trade-off that Java is pretty up-front about.

I don’t have a horse in this race, but I would be quite surprised if Go’s GC implementation could even hold a candle to the ones found in C# and Java. They have spent literally decades of research and development, and god knows how much money (likely north of $1b), optimizing and refining their GC implementations. Go simply lacks the maturity and investment those languages have.


Java has billions spent on marketing and lobbying.

Since the advent of Java in the mid-90s I have heard about the superiority of its VM, yet my observations from the ops PoV say otherwise. So I suspect a huge hoax...

Hey, btw, you're saying "Java is _memory intensive_" as if that would magically explain everything. Let's get into that more deeply. Why is it so, dear Watson? Have you compared the memory consumption of the same algorithm and pretty much the same data structures between languages? Why does Java have to be such a memory hog? Why is its class loading so slow? Are these the qualities of a superior VM design and zillions of man-hours invested? Huh?

By the way, if the code implementing functionality X needs N times more memory than in another language with GC, then however advanced that GC may be (need to find a proof for that, btw), it won't catch up speed-wise, because it simply needs to move more data around. So simple.


> Java has billions spent on marketing and lobbying.

Marketing is not a silver bullet for success and the tech industry is full of examples of exactly that. The truth is that Sun was able to promote Java so heavily because it was found to be useful.

> Since the advent of Java in the mid-90s I have heard about the superiority of its VM, yet my observations from the ops PoV say otherwise.

The landscape of the 90s certainly made a VM language appealing. And compared to the options of that day it's hardly any wonder.

> So I suspect a huge hoax...

It's you versus a plurality, if not a majority, of the entire enterprise software market. Of course that's not to say that Java doesn't have problems or that the JVM is perfect, but is it so hard to believe that Java got something right? Is it honestly more believable that everyone else is caught up in a collective delusion?

> Hey, btw, you're saying "Java is _memory intensive_" as if that would magically explain everything.

It's not that Java is necessarily memory intensive, but that a lot of Java performance tuning is focused towards optimizing throughput performance, not memory utilization. Cleaning out a large heap occasionally is in general better than cleaning out a smaller one more frequently.

> By the way, if the code implementing functionality X needs N times more memory than in another language with GC, then however advanced that GC may be (need to find a proof for that, btw), it won't catch up speed-wise, because it simply needs to move more data around. So simple.

It's not so simple. First of all, a large heap is not mandated by Java; it's a trade-off that developers are making. Second of all, GC performance issues only manifest when code is generating a lot of garbage, and believe it or not, Java can be written to vastly minimize the garbage produced. And last of all, Java GCs like ZGC target a max GC pause time of less than 1ms for heaps up to 16TiB.

Anyway, at the end of the day no one is going to take Go away from you. Personally, I don't have a horse in this race. That said, the fact is that Java GCs are far more configurable, sophisticated, and advanced than anything Go has (and likely ever will have). IMO, Go came at a point in time where there was a niche to exploit, but that niche is shrinking.


I would like to answer your points more deeply, but I don't have much time for it now.

But I think you are avoiding a direct answer to the question of why Java needs so much memory in the first place. You talk about the "developer's choice of a big heap"; first of all, I don't think it is their choice, but rather a consequence of the fact that such a big heap is needed at all, for typical code. Why?

Let's code a basic HTTPS endpoint using a typical popular framework, returning some simple JSON data. Usual stuff. Why will it consume 5x-10x more memory in Java? And if one says it's just an unrealistic microbenchmark, things get worse when coding more realistic stuff.

Btw, having more knobs for a GC is not necessarily a good thing if it means there are no fire-and-forget good defaults. If an engineer continuously needs to get their head around these knobs just to have a non-crashing app, then we have a problem. Or rather, ops have a problem, and some programmers are, unfortunately, disconnected from the ops realm. Have you been working together with ops guys? On prod, ofc?


Honestly, the biggest stumbling block for Rust and async is the notion of memory pinning.

Rust will do a lot of invisible memory relocations under the covers, which can work great in single-threaded contexts. However, once you start talking about threading, those invisible memory moves are a hazard. The moment shared memory comes into play, everything just gets a whole lot harder with the Rust async story.

Contrast that with a language like Java or Go. It's true that the compiler won't catch you when 2 threads access the same shared memory, but at the same time the mental burden around "Where is this in memory, how do I make sure it deallocates correctly, etc." just evaporates. A whole host of complex types are erased and the language simply cleans up stuff when nothing references it.

To me, it seems like GCs simply make a language better for concurrency. They generally solve a complex problem.


> Rust will do a lot of invisible memory relocations under the covers.

I don't think it's quite accurate to point to "invisible memory relocations" as the problem that pinning solves. In most cases, memory relocations in Rust are very explicit, by moving an owned value when it has no live references (if it has any references, the borrow checker will stop you), or calling mem::replace() or mem::swap(), or something along those lines.

Instead, the primary purpose of pinning is to mark these explicit relocations as unsafe for certain objects (that are referenced elsewhere by raw pointer), so that external users must promise not to relocate certain objects on pain of causing UB with your interface. In C/C++, or indeed in unsafe Rust, the same idea can be more trivially indicated by a comment such as /* Don't mess with this object until such-and-such other code is done using it! */. All pinning does is to enforce this rule at compile time for all safe code.
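
A minimal sketch of that enforcement, using the standard-library Pin pattern (the type and field names are illustrative): once a value that opts out of Unpin is pinned, safe code can no longer move it, so a raw pointer into the value stays valid:

    use std::marker::PhantomPinned;
    use std::pin::Pin;

    struct SelfRef {
        data: String,
        ptr: *const String,  // intended to point at `data` in this same struct
        _pin: PhantomPinned, // opting out of Unpin makes unpinning unsafe
    }

    fn main() {
        let mut boxed: Pin<Box<SelfRef>> = Box::pin(SelfRef {
            data: "hello".to_string(),
            ptr: std::ptr::null(),
            _pin: PhantomPinned,
        });

        // Establishing the self-reference requires unsafe: we promise never
        // to move the value again, and Pin enforces that for all safe code.
        unsafe {
            let inner = Pin::get_unchecked_mut(boxed.as_mut());
            inner.ptr = &inner.data;
        }

        // Safe code can no longer move the SelfRef out of the Box
        // (e.g. `let moved = *boxed;` fails to compile), so this is OK:
        unsafe { println!("{}", *boxed.ptr) };
    }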


Memory pinning in Rust is not a problem that has to do with concurrency because the compiler will never relocate memory when something is referencing it. The problem is however with how stackless coroutines in general (even single-threaded ones, like generators) work. They are inherently self-referential structures, and Rust's memory model likes to pretend such structures don't exist, so you need library workarounds like `Pin` to work with them from safe code (and the discussion on whether they are actually sound is still open!)


> (and the discussion on whether they are actually sound is still open!)

Do you have a reference for this? Frankly, maybe I shouldn't ask since I still don't even understand why stackless coroutines are necessarily self-referential, but I am quite curious!


See for example https://github.com/rust-lang/rust/issues/63818 and https://github.com/rust-lang/rfcs/pull/3467

Basically the problem is that async blocks/fns/generators need to create a struct that holds all the local variables within them at any suspension/await/yield point. But local variables can contain references to other local variables, so there are parts of this struct that reference other parts of this struct (a minimal sketch follows the list below). This creates two problems:

- once you create such self-references you can no longer move this struct. But moving a struct is safe, so you need some unsafe code that "promises" you this won't happen. `Pin` is a witness of such promise.

- in the memory model having an `&mut` reference to this struct means that it is the only way to access it. But this is no longer true for self referential structs, since there are other ways to access its contents, namely the fields corresponding to those local variables that reference other local variables. This is the problem that's still open.
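
As a minimal sketch of how such a self-reference arises (`some_io` is a hypothetical stand-in for any async operation, and driving `example()` would need an async runtime):

    // Hypothetical stand-in for any async operation.
    async fn some_io() {}

    // The future for this async fn must store `buf` and, across the .await,
    // a reference into `buf` -- so the generated state struct is
    // self-referential.
    async fn example() {
        let buf = [0u8; 32];
        let first = &buf[0]; // borrows one local...
        some_io().await;     // ...and the borrow lives across a suspension point
        println!("{}", first);
    }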


> I still don't even understand why stackless coroutines are necessarily self-referential, but I am quite curious!

Because when stackless coroutines run, they don't have access to the stack that existed when they were created. Everything that used to be on the stack needs to get packaged up in a struct (this is what `async fn` does). However, everything that used to point to something else on the stack (which Rust understands and is fine with) now points to something else within the "impl Future" struct. Hence you have self-referential structs.
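
Roughly the shape of what gets generated (illustrative only, not the compiler's real desugaring):

    // One enum variant per suspension point; locals live in the variant
    // that is active while the coroutine is parked there.
    enum ExampleFuture {
        Start,
        SuspendedAtAwait {
            buf: [u8; 32],
            first: *const u8, // was `&buf[0]` on the stack; now points
                              // into this very enum -- self-referential
        },
        Done,
    }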


Interestingly, the newest Java memory feature (Panama FFI/M) actually can catch you if threads race on a memory allocation. They have done a lot of rather complex and little appreciated work to make this work in a very efficient way.

The new api lets you allocate "memory segments", which are byte arrays/C style structs. Such segments can be passed to native code easily or just used directly, deallocated with or without GC, bounds errors are blocked, use-after-free bugs are blocked, and segments can also be confined to a thread so races are also blocked (all at runtime though).

Unfortunately it only becomes available as a finalized non-preview API in Java 22, which is the release after the next one. In Java 21 it's available but behind a flag.

https://openjdk.org/jeps/8310626


> In the same sense that the lifetime of an object in a GC'd system has a lower bound of "as long as it's referenced", sure.

These are not the same.

The problem with GC'd systems is that you don't know when the GC will run and eat up your CPU cycles. It is impossible to determine when the memory will actually be freed in such systems. With ARC, you know exactly when you will release your last reference, and that's when the resource is freed up.

In terms of performance, ARC offers massive benefits because the memory that's being dereferenced is already in the cache. It's hard to overstate how big of a deal this is. There's a reason people like ARC and stay away from GC when performance actually begins to matter. :)
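
In Rust terms, the deterministic-release claim looks like this (a minimal sketch):

    use std::sync::Arc;

    struct Resource;

    impl Drop for Resource {
        fn drop(&mut self) {
            println!("freed"); // runs the instant the refcount hits zero
        }
    }

    fn main() {
        let a = Arc::new(Resource);
        let b = Arc::clone(&a);
        drop(a); // count 2 -> 1: nothing freed yet
        drop(b); // count 1 -> 0: "freed" prints here, deterministically
    }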


> With ARC, you know exactly when you will release your last reference, and that's when the resource is freed up.

It's more like "you notice when it happens". You don't know in advance when the last reference will be released (if you did, there would be no point in using reference counting).

> In terms of performance, ARC offers massive benefits because the memory that's being dereferenced is already in the cache.

It all depends on your access patterns. When ARC adjusts the reference counter, the object is invalidated in all other threads' caches. If this happens with high frequency, the cache misses absolutely demolish performance. GC simply does not have this problem.

> There's a reason people like ARC and stay away from GC when performance actually begins to matter.

If you're using a language without GC built in, you usually don't have a choice. When performance really begins to matter, people reach for things like hazard pointers.


> It's more like "you notice when it happens". You don't know in advance when the last reference will be released

A barista knows when a customer will pay for coffee (after they have placed their order). A barista does not know when that customer will walk in through the door.

> (if you did, there would be no point in using reference counting).

There’s a difference between being able to deduce when the last reference is dropped (for example, by profiling code) and not being able to tell anything about when something will happen.

A particular developer may not know when the last reference to an object is dropped, but they can find out. Nobody can guess when GC will come and take your cycles away.

> The cache misses absolutely demolish performance

With safe Rust, you shouldn’t be able to access memory that has been freed up. So cache misses on memory that has been released is not a problem in a language that prevents use-after-free bugs :)

> If you’re using a language without GC built in, you usually don’t have a choice.

I’m pretty sure the choice of using Rust was made precisely because GC isn’t a thing (in all places that love and use Rust, that is).


> A barista knows when a customer will pay for coffee (after they have placed their order). A barista does not know when that customer will walk in through the door.

Sorry, no chance of deciphering that.

> There’s a difference between being able to deduce when the last reference is dropped (for example, by profiling code) and not being able to tell anything about when something will happen.

> A particular developer may not know when the last reference to an object is dropped, but they can find out.

The developer can figure out when the last reference to the object is dropped in that particular execution of the program, but not in the general sense, not anymore than they can in a GC'd language.

The only instance where they can point to a place in the code and with certainty say "the reference counted object that was created over there is always destroyed at this line" is in cases where reference counting was not needed in the first place.

> With safe Rust, you shouldn’t be able to access memory that has been freed up. So cache misses on memory that has been released is not a problem in a language that prevents use-after-free bugs :)

I'm not sure why you're talking about freed memory.

Say that thread A is looking at a reference-counted object. Thread B looks at the same object, and modifies the object's reference counter as part of doing this (to ensure that the object stays alive). By doing so, thread B has invalidated thread A's cache. Thread A has to spend time reloading its cache line the next time it accesses the object.

This is a performance issue that's inherent to reference counting.
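
For concreteness, a minimal sketch of that contention pattern in Rust (iteration counts are arbitrary): each clone/drop pair is an atomic read-modify-write on the shared counter, which bounces the cache line between the two cores:

    use std::sync::Arc;
    use std::thread;

    fn main() {
        let shared = Arc::new(vec![1u64; 1024]);
        let handles: Vec<_> = (0..2)
            .map(|_| {
                let outer = Arc::clone(&shared);
                thread::spawn(move || {
                    for _ in 0..1_000_000 {
                        // Atomic increment on the shared refcount...
                        let inner = Arc::clone(&outer);
                        std::hint::black_box(&inner);
                        // ...and an atomic decrement when `inner` drops here.
                    }
                })
            })
            .collect();
        for h in handles {
            h.join().unwrap();
        }
    }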

> I’m pretty sure the choice of using Rust was made precisely because GC isn’t a thing (in all places that love and use Rust, that is).

Wanting to avoid "GC everywhere", yes. But Rust/C++ programs can have parts that would be better served by (tracing) garbage collection, but where they have to make do with reference counting, because garbage collection is not available.


GC generally optimises for throughput over latency. But there is also another cost: high-throughput GC usually uses more memory (sometimes 2-3x as much!). Arc keeps your memory usage low and can keep your latency more consistent, but it will often sacrifice throughput compared to a GC tuned for it. (Of course, stack allocation, where possible, beats them all, which is why Rust and C++ tend to win out over Java in throughput even when the GC has an advantage over reference counting: Java has to GC a lot more than other languages because it has no explicit stack allocation.)


> In terms of performance, ARC offers massive benefits

but it also has a big disadvantage: it delegates to an actual malloc for memory management, which is usually much less performant than GC for various reasons.


> which is usually much less performant than GC for various reasons.

Can you elaborate?

I've seen a couple of malloc implementations, and in all of them, free() is a cheap operation. It usually involves setting a bit somewhere and potentially merging with an adjacent free block if available/appropriate.

malloc() is the expensive call, but I don't see how a GC system can get around the same costs for similar reasons.

What am I missing?


- Like others have said, both malloc()/free() touch a lot of global state, so you either have contention between threads, or do as jemalloc does and keep thread-local pools that you occasionally reconcile.

- A moving (and ideally, generational) GC means that you can recompact the heap, making malloc() little more than a pointer bump (see the sketch below).

- This also suggests subsequent allocations will have good locality, helping cache performance.

Manual memory management isn't magically pause-free; you just get to express some opinion about where you take the pauses. And I'll contend that (A) most programmers aren't especially good at choosing when that should be, and (B) lots (most?) of software cares about overall throughput, so long as max latency stays under some sane bound.
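
To make the pointer-bump point concrete, here is a toy sketch of nursery-style allocation (not a real allocator: alignment must be a power of two, and individual objects are never freed; a real GC would reclaim space by evacuating live data and resetting `next`):

    // Toy bump allocator: allocation is an add and a bounds check.
    struct Bump {
        buf: Vec<u8>,
        next: usize,
    }

    impl Bump {
        fn new(size: usize) -> Self {
            Bump { buf: vec![0; size], next: 0 }
        }

        // `align` must be a power of two.
        fn alloc(&mut self, size: usize, align: usize) -> Option<*mut u8> {
            let start = (self.next + align - 1) & !(align - 1); // round up
            let end = start.checked_add(size)?;
            if end > self.buf.len() {
                return None; // "nursery" full: a GC would collect here
            }
            self.next = end;
            Some(unsafe { self.buf.as_mut_ptr().add(start) })
        }
    }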


> Can you elaborate?

I've seen some benchmarks, but can't find them now, so maybe I am wrong about this.

> free() is a cheap operation. It usually involves setting a bit somewhere and potentially merging with an adjacent free block if available/appropriate.

there is some tree-like structure somewhere that allows malloc() to later locate that block; this structure has to be modified in parallel by many concurrent threads, which likely needs some locks, meaning the program operates outside of the CPU cache.

In the JVM, for example, GC is integrated into the threading model, so it can have per-thread allocation areas, and "free()" happens asynchronously, so it doesn't block the calling code. Additionally, malloc approaches usually suffer from memory fragmentation, while the JVM's GC is doing compaction in the background all the time, tracks memory-block generations, and applies many other optimizations.





