
Right, but GC encourages you to not think about memory at all until the program starts tipping over, and by then fixing the underlying cause of the leak requires an architecture change, because the "we hold onto everything" assumption got baked into the structure in two places that you know about and five that you don't.

I don't miss the rote parts of manual memory management, but it had the enormously beneficial side effect of making people consider object lifetimes upfront (to keep the retain graph acyclic) and cultivate occasional familiarity with leak tracking tools. Problematic patterns like the undo queue or query correlator that accidentally leak everything tended to become obvious when writing the code, rather than while running it. These days, I keep seeing those same memory management anti-patterns show up when I ask interviewees to tell a debugging war story. Sometimes I even see otherwise capable devs shooting in the dark and missing when it comes to the "what's eating RAM" problem.
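
To make the undo-queue version of that concrete, here's a hypothetical Java sketch (the names and structure are made up for illustration, not from any real codebase):

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical sketch: an unbounded undo stack that keeps a full snapshot
    // per edit. Nothing "leaks" in the malloc/free sense, but the retain graph
    // keeps every historical buffer reachable for the life of the editor.
    class Editor {
        private final Deque<byte[]> undo = new ArrayDeque<>();
        private byte[] document = new byte[0];

        void apply(byte[] newContents) {
            undo.push(document);    // old snapshot stays reachable forever
            document = newContents;
        }
        // Bounding the deque, or storing diffs, would cap the retained memory.
    }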

I feel like GC in long-form program development trades a small problem for a bigger one later. Short-form programming can get away with just leaking everything, which is effectively what GC does there anyway, so I'm not sure there's much benefit there either.

tl;dr: get off my lawn.



GC will not fix trashy programming. The problem is that many GC'd languages have adopted a style guide that commits to a lot of unnecessary allocations. For example, in Java, you can't parse an integer out of the middle of a string without allocating in-between. Ditto with lots of other common operations. Java has oodles of trashy choices. With auto-boxing, allocations are hidden. Without reified (let's say, type-specialized) generics, all the collection classes carry extra overhead for boxing values.
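
For instance, the common route to parsing a number out of the middle of a string goes through an intermediate allocation. A minimal Java sketch (the method names here are made up):

    class ParseSlice {
        // The idiomatic path allocates a temporary String just to parse a slice:
        static int idiomatic(String s, int start, int end) {
            return Integer.parseInt(s.substring(start, end));  // substring allocates
        }

        // A hand-rolled, allocation-free parse of the same slice
        // (digits only; no sign or overflow handling, for brevity):
        static int handRolled(String s, int start, int end) {
            int value = 0;
            for (int i = start; i < end; i++) {
                value = value * 10 + (s.charAt(i) - '0');
            }
            return value;
        }
    }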

I write almost all of my code in Virgil these days. It is fully garbage-collected but nothing forces you into a trashy style. E.g. I use (and reuse) StringBuilders, DataReaders, and TextReaders that don't create unnecessary intermediate garbage. It makes a big difference.

Sometimes avoiding allocation means reusing a data structure and "resetting" or clearing its internal state to be empty. This works if you are careful about it. It's a nightmare if you are not careful about it.
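
In Java terms, that reuse-and-reset pattern might look something like this (a sketch, not Virgil code):

    // Sketch of reuse-and-reset: one builder, reused across calls, cleared
    // instead of reallocated. Fine if you're careful that nothing retains a
    // reference into the builder between calls; a nightmare if you're not.
    class RowRenderer {
        private final StringBuilder buf = new StringBuilder(256);

        String renderRow(String[] fields) {
            buf.setLength(0);              // "reset" the internal state
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) buf.append(',');
                buf.append(fields[i]);
            }
            return buf.toString();         // one allocation for the result
        }
    }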

I'm not going back to manual memory management, and I don't want to think about ownership. So GC.

edit: The Java ecosystem also strongly discourages reimplementing common JDK functionality, but I've found that building a customized data structure that fits my needs exactly (e.g. an intrusive doubly-linked list) can work wonders for performance.
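
A rough sketch of what "intrusive" means here, since it surprises people: the links live in the element itself, so linking and unlinking allocate nothing (unlike java.util.LinkedList, which wraps every element in a node object). The names are made up:

    class Connection {
        Connection prev, next;              // intrusive links, stored in the element

        void unlink() {                     // O(1) removal, no iterator, no allocation
            if (prev != null) prev.next = next;
            if (next != null) next.prev = prev;
            prev = next = null;
        }

        static void insertAfter(Connection anchor, Connection node) {
            node.prev = anchor;
            node.next = anchor.next;
            if (anchor.next != null) anchor.next.prev = node;
            anchor.next = node;
        }
    }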


> many GC'd languages have adopted a style guide that commits to a lot of unnecessary allocations.

Oh, that too. I forgot to rant about that.

> Virgil

Unfortunately I'd rather live with a crummy language that has a strong ecosystem, tooling, and developer availability, so I'll never really know. It does sound nice, though.


Yeah, but that was one of Java's 1.0 mistakes, one that Go, .NET, D, and Swift, among others, thankfully did not make.

Now let's see if Valhalla actually happens.


> Right, but GC encourages you to not think about memory at all

I’ve recently come to an obvious-in-hindsight realisation about this sort of thing: if you care about some metric, make a test for it early and run it often.

If you care about correctness, grow unit tests and run them at least every commit.

If you care about performance, write a benchmark and run it often. You’ll start noticing what makes performance improve and regress, which over time improves your instincts. And you’ll start finding it upsetting when a small change drops performance by a few percent.

If you care about memory usage, do the same thing: make a standard test suite and measure it regularly, ideally writing the test as early as possible in the development process. Doing things in a sloppy way starts to feel upsetting once it makes the metric worse.
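
As a very rough sketch of what that might look like in Java (the workload and the budget number are made up, and a real measurement needs warmup, repeated GCs, and averaging):

    import java.util.ArrayList;
    import java.util.List;

    class MemoryBudgetTest {
        static long usedHeapBytes() {
            Runtime rt = Runtime.getRuntime();
            System.gc();                                // best-effort settle
            return rt.totalMemory() - rt.freeMemory();
        }

        public static void main(String[] args) {
            long before = usedHeapBytes();

            // Stand-in workload; a real test would build representative state.
            List<int[]> workload = new ArrayList<>();
            for (int i = 0; i < 1000; i++) workload.add(new int[1024]);

            long delta = usedHeapBytes() - before;
            System.out.println("retained ~" + delta + " bytes for " + workload.size() + " rows");
            if (delta > 16L * 1024 * 1024)              // 16 MB budget, made up
                throw new AssertionError("memory budget exceeded: " + delta + " bytes");
        }
    }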

I find that when I have a clear metric, it always feels great to make the numbers improve. And that in turn makes it really effortless to bring my attention to performance work.


Not so much. Here we have an example of a memory pressure problem that's evident only under high load in realistic environments. This is a classic problem with performance engineering: it's usually difficult to do realistic automated load testing. Instead, you end up running lab experiments, which are time-consuming to set up.

The whole post is essentially about how tricky it was to surface the problems their customers were seeing in the field. I'd resist the urge to respond to that with a platitude about automated testing.


Yes, it can be difficult to do realistic automated load testing. But I suppose I see this as more evidence that if you're going to do load testing, you should do it right! In complex systems you often need real-world usage data, or your metrics won't predict reality.

I've been running into this a lot writing software for collaborative editing. Randomly generated editing traces work fine for correctness testing. But doing performance testing with random traces is unrepresentative. The way people move their cursors around a text box while editing is idiosyncratic. Lots of optimizations make performance worse with random editing histories, but improve performance for real world data sets.


Plenty of C programs do the equivalent of ioutil.ReadAll; it's not a GC thing.


"Leak everything because we can get away with it here" is a fine memory management strategy. "Why does my program keep getting killed?" isn't.


This has nothing to do with leaking (nothing "leaked"; it's a garbage-collected runtime). It's about memory pressure, which, I promise you, is a very real perf problem in C programs, and is why we memory-profile them. The difference between incremental and one-shot reads is not a GC vs. non-GC thing.
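
To spell out that last distinction in Java terms (a sketch; ioutil.ReadAll is Go's one-shot form):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    // The one-shot form's peak memory grows with the payload; the incremental
    // form's peak is the buffer size, no matter how large the input. The same
    // tradeoff exists with or without a GC.
    class Reads {
        static byte[] oneShot(InputStream in) throws IOException {
            return in.readAllBytes();          // whole payload resident at once
        }

        static void incremental(InputStream in, OutputStream out) throws IOException {
            byte[] buf = new byte[8192];       // fixed, reusable buffer
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);          // handle one chunk at a time
            }
        }
    }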



