
Sorry to nit, but this is important. Parent was saying that reads are cheap, which is true. Writes can be expensive even if uncontended, because they invalidate cache lines. I guess you could say they contend with unrelated data but that would stretch the definition a bit.

So what does this mean in practice? In my view, the way to think about it is that atomic writes have non-local side effects. But since atomics are necessary for synchronization, which involves both reads and writes, we should compartmentalize and minimize synchronization as much as possible, to keep these gnarly issues from creeping in and tanking real-world performance.

Arc<T> (and its relatives in other languages) constitute textbook violations of this rule. In Rust they are everywhere in non-trivial code, including in the async runtimes themselves. Of course, they also violate (or evade, if you’re generous) the ownership principles of idiomatic Rust (or “hello world” Rust, if you will). I think we as an industry need to take a hard look at ref counting as a silver-bullet escape hatch to shared data.
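To make the "Arc is a shared write" point concrete, here is a minimal sketch. The helper name `refcount_after_clones` is made up for illustration; `Arc::clone` and `Arc::strong_count` are the standard library calls. Every clone bumps the one reference count stored next to the data, so all clones write to the same memory location no matter which thread holds them.

```rust
use std::sync::Arc;

/// Hypothetical helper: clones one `Arc` `n` times and reports the
/// reference count that all of those clones share.
fn refcount_after_clones(n: usize) -> usize {
    let shared = Arc::new(0u8);

    // Each `Arc::clone` is an atomic increment of the same counter,
    // which lives in the same heap allocation as the value itself.
    let _clones: Vec<Arc<u8>> = (0..n).map(|_| Arc::clone(&shared)).collect();

    // The original handle plus `n` clones are all alive at this point.
    Arc::strong_count(&shared)
}
```

Dropping a clone is the mirror image: an atomic decrement of that same counter.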




> Writes can be expensive even if uncontended, because they invalidate cache lines.

This isn't expensive if cache lines are uncontended, though.

> I guess you could say they contend with unrelated data but that would stretch the definition a bit.

I think you might be talking about "false sharing." This is real contention on the cache line due to co-location of apparently unrelated variables.
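A rough sketch of what false sharing and the usual fix look like in Rust. The type names (`Packed`, `Padded`, `Spaced`) and the 64-byte line size are my assumptions for illustration; line size varies by CPU.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Two logically unrelated counters laid out back to back (16 bytes total),
/// so they will usually land on the same cache line.
#[allow(dead_code)]
struct Packed {
    a: AtomicU64,
    b: AtomicU64,
}

/// Assumption: 64-byte cache lines. Aligning to 64 also rounds the size up
/// to 64, so two `Padded` values never share a line of that size.
#[repr(align(64))]
struct Padded(AtomicU64);

/// The same two counters, each on its own (assumed) cache line.
struct Spaced {
    a: Padded,
    b: Padded,
}

/// Increments `counter` `n` times.
fn bump(counter: &AtomicU64, n: u64) {
    for _ in 0..n {
        counter.fetch_add(1, Ordering::Relaxed);
    }
}

/// Two threads, each writing only to its own counter.
fn run_spaced(n: u64) -> (u64, u64) {
    let s = Spaced {
        a: Padded(AtomicU64::new(0)),
        b: Padded(AtomicU64::new(0)),
    };
    thread::scope(|scope| {
        scope.spawn(|| bump(&s.a.0, n));
        scope.spawn(|| bump(&s.b.0, n));
    });
    (s.a.0.load(Ordering::Relaxed), s.b.0.load(Ordering::Relaxed))
}
```

With the `Packed` layout the two threads would never touch the same variable, yet each write could still invalidate the other core's copy of the line. Whether the padding pays off is something to measure on the target hardware, not assume.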

> Arc<T> (and it’s relatives in other languages) constitute textbook violations of this rule.

Definitely!

> In Rust they are everywhere in non-trivial code

Ehh.. only the hot ones matter. Most are not actually contended much, and the article's solution (unshared clone) is a very reasonable approach to scale these without an API change.
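The article's code isn't quoted in this thread, so take this as one possible reading of "unshared clone" rather than what the author actually did: give each worker its own `Arc` around its own copy of the data, so refcount updates from different workers hit different allocations. The helper name `unshared_clones` is hypothetical.

```rust
use std::sync::Arc;

/// Hypothetical helper: returns one independent `Arc<T>` per worker.
/// Each wraps a fresh copy of the data with its own reference count,
/// so cloning or dropping one worker's handle never writes to a counter
/// that another worker uses.
fn unshared_clones<T: Clone>(shared: &Arc<T>, workers: usize) -> Vec<Arc<T>> {
    (0..workers)
        // `T::clone` copies the inner value (via deref), unlike `Arc::clone`,
        // which would only bump the shared reference count.
        .map(|_| Arc::new(T::clone(shared)))
        .collect()
}
```

The signature of whatever consumes the handle stays `Arc<T>`, which is the "without an API change" part. The trade-off is memory and copy cost per worker, and it only works when the data is effectively read-only or the copies are allowed to diverge.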


> I think you might be talking about "false sharing." This is real contention on the cache line due to co-location of apparently unrelated variables.

You’re right. And cache lines are quite small, so this is probably less common. Yet, it’s another potential source of perf regressions in concurrent code, as if it wasn’t incredibly complex already.

> Ehh.. only the hot ones matter.

Well.. first, atomics have even more non-local effects, such as barriers on instruction reordering. So Arcs that are cloned willy-nilly can still be significant, even with no contention.

But let’s ignore that and focus on the contended case: when you hear “uncontended X are basically free” it (subjectively, imo) downplays the issue, as if contention were some special case that you can compartmentalize and only worry about when you consciously decide to write contended code. The blog post demonstrates exactly how easy it is for contention to creep in: you have to be superhumanly vigilant and paranoid to spot these issues upfront. It's extremely easy to miss in, e.g., code review.

I think both compile-time and runtime tooling could help here, at least partly. I’d also give Rust some credit for making clone explicit instead of hiding it.


It's a good point: it's easy to cause contention without realizing it, because of cache lines.



