While I am only a Rust novice, it seems to me that "2.2 Item 11: Implement the Drop trait for RAII patterns" could use some mention of Drop leaks. I learned about them at https://doc.rust-lang.org/nightly/nomicon/leaking.html
- You can't export a reference to the thing you are dropping. You can do that in C++. This prevents "re-animation", where something destroyed comes back to life or is accessed beyond death. Microsoft Managed C++ (early 2000s) supported re-animation and gave it workable semantics. Bad idea, now dead.
- This is part of why Rust destructors cannot run more than once. Less than once is possible, as mentioned above.
- There's an obscure situation with Arc and destructors. When an Arc's strong count reaches 0, the destructor runs. Exactly once. However, the Arc countdown and the destructor run are not one atomic operation. It is possible for two threads to each see an Arc with strong_count == 1 just before the count drops. Never check strong_count to decide whether you are "the last owner"; that creates a race condition.[1] I've seen that twice now. I found race conditions that took a day of running to hit. Use strong_count only for debug printing.
- A pattern that comes up in GUI libraries and game programming involves objects that are both in some kind of index and owned by Arcs. On drop, the object should be removed from the index. This is a touchy operation. The index should use weak refs, and you have to be prepared to get an un-upgradable Weak from the index.
- Even worse is the case where dropping an object starts a deletion of something else.
If the second deletion can't be completed from within the destructor, perhaps because it requires a network transaction, it's very easy to introduce race conditions.
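A minimal sketch of the weak-ref index pattern described above. The names (`Registry`, `Widget`) are hypothetical, not from any real library; the point is that the index holds `Weak` handles and every lookup must tolerate a failed upgrade, since an object can be mid-drop while its entry is still in the map:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};

struct Widget {
    id: u32,
}

#[derive(Default)]
struct Registry {
    // The index holds Weak, not Arc, so it never keeps objects alive.
    index: Mutex<HashMap<u32, Weak<Widget>>>,
}

impl Registry {
    fn insert(&self, w: &Arc<Widget>) {
        self.index.lock().unwrap().insert(w.id, Arc::downgrade(w));
    }

    // Callers must be prepared for upgrade() to return None: the entry
    // may still be present while the Widget's strong count is already 0.
    fn get(&self, id: u32) -> Option<Arc<Widget>> {
        self.index.lock().unwrap().get(&id)?.upgrade()
    }
}

fn main() {
    let reg = Registry::default();
    let w = Arc::new(Widget { id: 7 });
    reg.insert(&w);
    assert!(reg.get(7).is_some());

    drop(w);
    // The stale entry is still in the map, but it no longer upgrades.
    assert!(reg.get(7).is_none());
}
```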
> - You can't export a reference to the thing you are dropping. You can do that in C++. This prevents "re-animation", where something destroyed comes back to life or is accessed beyond death. Microsoft Managed C++ (early 2000s) supported re-animation and gave it workable semantics. Bad idea, now dead.
>
> - This is part of why Rust destructors cannot run more than once. ...
This is a very backwards way to describe this, I think. Managed C++ only supported re-animation for garbage collected objects, where it is still today a fairly normal thing for a language to support. This is why these "destructors" typically go by a different name, "finalizers." Some languages allow finalizers to run more than once, even concurrently, but this is again due to their GC design and not a natural thing to expect of a "destructor."
The design of Drop and unmanaged C++ destructors is that they are (by default) deterministically executed before the object is deallocated. Often this deallocation is not by `delete` or `free`, which could perhaps in principle be cancelled, but by a function return popping a stack frame, or some larger object being freed, which it simply does not make sense to cancel.
> Never check strong_count to see if you are "the last owner".
This made me think of the `im` library[0] which provides some immutable/copy on write collections. The docs make it seem like they do some optimizations when they determine there is only one owner:
> Most crucially, if you never clone the data structure, the data inside it is also never cloned, and in this case it acts just like a mutable data structure, with minimal performance differences (but still non-zero, as we still have to check for shared nodes).
I hope this isn't prone to a similar race condition!
The way to do this while avoiding race conditions seems to be `Arc::into_inner` or `Arc::get_mut`; for instance, the docs for `Arc::try_unwrap` mention a possible race condition, and recommend using `Arc::into_inner` to avoid it: https://doc.rust-lang.org/std/sync/struct.Arc.html#method.tr...
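For what it's worth, the `Arc::into_inner` behavior can be sketched like this: two threads race to unwrap the same value, and exactly one wins. Checking `strong_count == 1` first (or using `Arc::try_unwrap` naively) could instead let both threads conclude they are "the last owner":

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    let a = Arc::new(String::from("payload"));
    let b = Arc::clone(&a);

    // Both threads consume their Arc at roughly the same time.
    let handles: Vec<_> = [a, b]
        .into_iter()
        .map(|arc| {
            thread::spawn(move || {
                // Arc::into_inner is guaranteed to return Some(value) on
                // exactly one of the racing Arcs, and None on the other.
                Arc::into_inner(arc)
            })
        })
        .collect();

    let winners: Vec<String> = handles
        .into_iter()
        .filter_map(|h| h.join().unwrap())
        .collect();

    // Exactly one thread got the value; no double-drop, no lost value.
    assert_eq!(winners, vec![String::from("payload")]);
}
```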
Managed C++ is pretty much still around, in a way: it was replaced by C++/CLI in .NET 2.0, which is still used by many of us instead of dealing with P/Invoke annotations, is required by the WPF infrastructure, and is currently at C++20 support level.
The important note here is that you can't rely on Drop running in order to satisfy the SAFETY comment of an unsafe block. In practice, in safe Rust, this knowledge shouldn't really change how you write your code.
It’s not that surprising when you consider that “unsafe” only concerns itself with memory safety. mem::forget is not unsafe from that perspective.
> In the past mem::forget was marked as unsafe as a sort of lint against using it, since failing to call a destructor is generally not a well-behaved thing to do (though useful for some special unsafe code). However this was generally determined to be an untenable stance to take: there are many ways to fail to call a destructor in safe code. The most famous example is creating a cycle of reference-counted pointers using interior mutability.
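The reference-counted-cycle example mentioned in that quote is easy to reproduce in entirely safe Rust. A small sketch (the `Node` type and the drop counter are illustrative):

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many Node destructors actually ran.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

fn leak_a_cycle() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    // a and b now keep each other alive. When the local handles go out
    // of scope here, each strong count falls from 2 to 1, never to 0.
}

fn main() {
    leak_a_cycle();
    // Neither destructor ran: the cycle leaked, in 100% safe Rust.
    assert_eq!(DROPS.load(Ordering::SeqCst), 0);
}
```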
Leaking memory is unsafe. It was considered unsafe for decades: a prime example of the sort of problem you get in C or C++ that you avoid with automatic memory management. Lots of real crashes, stability issues and performance issues have been caused by memory leaks over the years.
Rust initially advertised itself as preventing leaks, which makes sense as it is supposed to have the power of automatic memory management but without the runtime overhead.
Unfortunately, shortly before Rust's release it was discovered that there were some APIs that could cause memory corruption in the presence of memory leaks. The decision was made that memory leaks would be too complicated to fix before 1.0: it would have had to have been delayed. So the API in question was taken out and Rust people quietly memory-holed the idea that leak freedom has ever been considered part of memory safety.
I think that's a retcon. Rust people did not "decide that leaking is safe" all of a sudden, that's cart-before-horse. Rust's memory model was still in its early stages back then and there was a belief (in hindsight, a mistaken belief) that destructors could be used as a means to guarantee memory safety. This turned out to be poorly reasoned and so, to preserve a consistent model of safety for other code, it was decided that having safety rely on the invocation of destructors was unsound. It's not possible to do this without also having leaks be safe, so that's the world as it is.
If "is leaking memory safe?" is an issue of contention for you, I'd suggest that it's a good idea to do some reading on what memory safety is (I mean that in all sincerity, not as a dunk). Memory safety, at least by the specific and highly useful definition used by compiler developers, is intimately entangled with undefined behaviour, but memory leaking sits entirely outside this sphere. This is as true in C and C++ as it is in Rust.
Another example of how your parent isn't really being accurate, memory leaks are also possible in garbage collected languages, yet they have been considered memory safe since well before Rust even existed.
It's not as if Rust invented the term "memory safety" or gets to define it.
Memory leaks are not possible in garbage collected languages unless you retain references to data, but by definition that isn't a memory leak, that is exactly the behaviour that you want.
Memory leaks are situations where memory is unrecovered despite there being no path to it from any active thread.
This is the same definition game you’re accusing Rust of making. Sometimes, you retain references you do not want, and therefore, leak. It’s something that comes down to programmer intent.
That retains 1GiB of memory allocated without any ownership path due to implementation details of std::shared_ptr. Is that a memory leak? There’s no active thread that has a path and yet all of the memory is tracked - if you destroy the weak_ptr the 1GiB of memory gets reclaimed.
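(The C++ snippet being referred to isn't shown here, but the same effect can be sketched in Rust with `Arc`/`Weak`: when the last strong reference goes away, the value is dropped in place, yet the allocation itself, including any storage stored inline in it, is only deallocated once the last `Weak` is gone. The 64 KiB array below stands in for the 1GiB buffer.)

```rust
use std::sync::{Arc, Weak};

fn main() {
    // The array is stored inline in the Arc's heap allocation,
    // analogous to make_shared placing the object next to the
    // control block.
    let big: Arc<[u8; 1 << 16]> = Arc::new([0u8; 1 << 16]);
    let weak: Weak<[u8; 1 << 16]> = Arc::downgrade(&big);

    drop(big); // strong count hits 0: the value is dropped in place...
    assert!(weak.upgrade().is_none());

    // ...but the allocation (including the 64 KiB of inline storage)
    // is only freed once the last Weak is dropped.
    drop(weak); // now the memory is actually reclaimed
}
```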
Reference counting is a form of GC / automatic memory management [1], but it’s ok, it’s a common mistake to make. What’s less ok is the absolute intransigence in insisting that memory leaks aren’t possible in tracing GCs — a claim that only holds when playing the same definitional games you accuse Rust of, by limiting the kinds of things you count as leaks. For example, if I implement a cache as Map<String, Object>, that’s a memory leak under the definition "retaining memory longer than you actually need it": either because the cache doesn't use weak references when the goal is just to keep one live instance per key, or because you forgot to delete/evict from it. Bad software design can result in memory leaks, and defining those away because a live reference to the object still exists somewhere is just playing the definitions game [2]
You have misunderstood both the concept of a memory leak and the concept of automatic memory management. Good job!
No, reference counting is not garbage collection. I am fully aware of the ridiculous claim that it is, promoted by people like you. I fundamentally disagree. It has none of the same properties and doesn't work anything like GC.
Multiple very talented and very knowledgeable people have tried to help you understand, and these are people with firsthand knowledge of the discussion at hand (I’m not counting myself because Steve and the others know language design and Rust better than I do). You insist on doubling down on your position instead of considering the possibility you’re wrong. Not much more I can do here. You can only lead a horse to water.
I consider whether I am wrong often. It happens to be that I am not. It is quite haughty and rude of you to assume that I haven't considered it here just because I disagree with you.
There isn't much more you can do here because you are completely wrong. Instead of facing reality (that Rust, useful as it may be, only prevents a narrow class of correctness issues of varying importance) you double down on its marketing spin that all the things it fixes just happen to be all the important safety-related ones.
The difference is that leaking is not UB, the worst case is an OOM situation, which at worst causes a crash, not a security exploit. Crashing is also considered to be safe in rust, panicking is common for example when something bad happens.
Undefined behaviour is behaviour not defined by the language. So obviously Rust can define or undefine whatever it likes. It is not a sensible argument to say that something is safe because its behaviour is defined, or unsafe because it is undefined, when the whole point is that Rust's chosen definition of safety is just marketing.
no, undefined behavior is not just behavior that is not covered by the language definition. undefined behavior is a term of art largely taken from C/C++, basically meaning that correct programs are assumed not to have these behaviors. for example, see https://en.cppreference.com/w/c/language/behavior. the definition of ub is not "just marketing". many major security vulnerabilities stem from ub (out of bounds access, use after free). the point of rust is pretty much that you have to try hard to have ub, whereas in c/c++, it's basically impossible not to have ub.
To add onto this, Rust actually does have UB; it’s just impossible to reach without unsafe. One “sharper” edge it has is that its UB is much easier to trigger in unsafe code than one might expect, so writing unsafe Rust actually requires more skill than C++, which is why you should be very, very careful when reaching for it.
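A sketch of one such sharp edge (the UB line is left commented out so it never actually executes): for types with invalid bit patterns (bool, references, enums, the NonZero types), merely *producing* an uninitialized or invalid value is immediate UB, even if it is never read — stricter than what many C++ programmers expect:

```rust
use std::mem::MaybeUninit;

fn main() {
    // Safe use: fully initialize before calling assume_init.
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(42);
    let value = unsafe { slot.assume_init() }; // OK: slot was written
    assert_eq!(value, 42);

    // The sharp edge: this is instant UB, even though `b` is never used,
    // because an uninitialized bool is an invalid value:
    //
    //     let b: bool = unsafe { MaybeUninit::uninit().assume_init() }; // UB!
}
```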
Was Box::leak ever considered unsafe? std::mem::forget is very similar to that.
Crashes, stability, and performance issues are still not safety issues, since there are so many ways to cause those beyond memory leaks. I don’t know the discussion that was ongoing in the community, but I definitely appreciate them taking a pragmatic approach, cutting scope, and going for something achievable.
>Crashes, stability, and performance issues are still not safety issues since there’s so many ways to cause those beyond memory leaks.
They aren't safety issues according to Rust's definition, but Rust's definition of "unsafe" is basically just "whatever Rust prevents". But that is just begging the question: they don't stop being serious safety issues just because Rust can't prevent them.
If Rust said it dealt with most safety issues, or the most serious safety issues, or similar, that would be fine. Instead the situation is that they define data races as unsafe (because Rust prevents data races) but race conditions as safe (because Rust does not prevent them in general) even though obviously race conditions are a serious safety issue.
For example you cannot get memory leaks in a language without mutation, and therefore without cyclic data structures. And in fact Rust has no cyclic data structures naturally, as far as I am aware: all cyclic data structures require some "unsafe" somewhere, even if it is inside RefCell/Rc in most cases. So truly safe Rust (Rust without any unsafe at all) is leak-free, I think?
> Rust's definition of "unsafe" is basically just "whatever Rust prevents".
It's not that circular.
Rust defines data races as unsafe because they can lead to reads that produce corrupt values, outside the set of possibilities defined by their type. It defines memory leaks as safe because they cannot lead to this situation.
That is the yardstick for what makes something safe or unsafe. It is the same yardstick used by other memory-safe languages- for instance, despite your claims to the contrary, garbage collectors do not and cannot guarantee a total lack of garbage. They have a lot of slack to let the garbage build up and then collect it all at once, or in some situations never collect it at all.
There are plenty of undesirable behaviors that fall outside of this definition of unsafety. Memory leaks are simply one example.
>It defines memory leaks as safe because they cannot lead to this situation.
They can't now. They could up to and almost including 1.0. At that point the consensus was that memory leaks were unsafe and so unsafe code could rely on them not happening. That code was not incorrect! It just had assumptions that were false. One solution was to make those assumptions true by outlawing memory leaks. The original memory leak hack to trigger memory corruption was fairly fiendish in combination with scoped threads (IIRC).
>There are plenty of undesirable behaviors that fall outside of this definition of unsafety. Memory leaks are simply one example.
That is my whole point. It is a useless definition cherry-picked by Rust because it is what Rust, in theory, prevents. It does not precede Rust. Rust precedes it.
>It is the same yardstick used by other memory-safe languages- for instance, despite your claims to the contrary, garbage collectors do not and cannot guarantee a total lack of garbage. They have a lot of slack to let the garbage build up and then collect it all at once, or in some situations never collect it at all.
If it will eventually be collected then it isn't a memory leak.
Most actual safe languages don't let you write integer overflow.
> They can't now. They could up to and almost including 1.0. At that point the consensus was that memory leaks were unsafe and so unsafe code could rely on them not happening. That code was not incorrect!
This is not how it worked, no. It was never memory leaks per se that led to unsoundness there. It was skipping destructors. You could have the exact same unsoundness if you freed the object without running the rest of its destructor first.
That part was the design choice Rust made- make destructors optional and change the scoped threads API, or make destructors required and keep the scoped threads API.
There is an underlying definition of memory safety (or more generally "soundness") that precedes Rust. It is of course defined in terms of a language's "abstract machine," but that doesn't mean Rust has complete freedom to declare any behavior as safe. Memory safety is a particular type of consistency within that abstract machine.
This is why the exact set of undesirable-but-safe operations varies between memory-safe languages. Data races are unsafe in Rust, but they are safe in Java, because Java's abstract machine is defined in such a way that data races cannot lead to values that don't match their types.
Sure, safety is a relative moving target. There’s no way to prevent race conditions unless you have proofs. And then there’s no way to enforce that your proof is written correctly. It’s turtles all the way down. Rust is a Pareto frontier of safety for AOT high-performance languages. Even for race conditions I suspect the tools Rust has for managing concurrency-related issues make it less prone to such issues than other languages.
The problem is you’re creating a hypothetical gold standard that doesn’t exist (indeed I believe it can’t exist) and then judging Rust on that faux standard and complaining that Rust chooses a different standard. That’s the thing though - every language can define whatever metrics they want and languages like C/C++ struggle to define any metrics that they win vs Rust.
> For example you cannot get memory leaks in a language without mutation, and therefore without cyclic data structures
This does not follow. Without any mutation of any kind, you can’t even allocate memory in the first place (how do you think a memory allocator works?). And you can totally get memory leaks without mutation however you narrowly define it because nothing prevents you from having a long-lived reference that you don’t release as soon as possible. That’s why memory leaks are still a thing in Java because there’s technically a live reference to the memory. No cycles or mutations needed.
> So truly safe Rust (Rust without any unsafe at all) is leakfree, I think?
Again, Box::leak is 100% safe and requires no unsafe at all. Same with std::mem::forget. But even if you exclude APIs like that that intentionally just forget about the value, again nothing stops you from retaining a reference forever in some global to keep it alive.
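Concretely, all three safe leaking routes mentioned here look like this (the `STASH` global is just an illustrative name):

```rust
use std::sync::OnceLock;

static STASH: OnceLock<Vec<u8>> = OnceLock::new();

fn main() {
    // Box::leak consumes the Box and hands back a &'static mut,
    // deliberately never freeing the allocation. No `unsafe` in sight.
    let s: &'static mut String = Box::leak(Box::new(String::from("leaked")));
    s.push_str(" forever");
    assert_eq!(s, "leaked forever");

    // std::mem::forget takes ownership and skips the destructor, so the
    // Vec's heap buffer is never freed. Also 100% safe.
    let v = vec![1, 2, 3];
    std::mem::forget(v);

    // The non-API route: park a value in a global and never remove it.
    // The reference stays live, so some definitions won't call it a
    // "leak", but the memory is retained for the rest of the program
    // either way.
    STASH.set(vec![0u8; 1024]).unwrap();
    assert_eq!(STASH.get().map(|v| v.len()), Some(1024));
}
```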
What is a type system except a bunch of proofs? You can encode some program correctness properties into types. Elevating the ones you happen to be able to encode and calling them "safety" and the rest "correctness" is just marketing.
I am not creating a gold standard because as far as I am concerned, it is all just correctness. There aren't morally more and less important correctness properties for general programs: different properties matter more or less for different programs.
>Without any mutation of any kind, you can’t even allocate memory in the first place (how do you think a memory allocator works?).
data L t = E | C t (L t)
data N = Z | S N
nums Z = E
nums (S n) = C (S n) (nums n)
You cannot express a reference cycle in a pure functional language but they still have allocation.
However I don't know why I brought this up, because you can also eliminate all memory leaks by just using garbage collection - you don't need to have immutable and acyclic data structures.
>Again, Box::leak is 100% safe and requires no unsafe at all. Same with std::mem::forget.
They are implemented using unsafe. There is no way to implement Box without unsafe.
If you retain a reference in a global then it is NOT a memory leak! The variable is still accessible from the program. You can't just forget about the value: its name is right there, accessible. That is not a memory leak, except by complete abuse of terminology. The concept of "inaccessible and uncollectable memory, which cannot be used or reclaimed" is a useful one. Your definition of a memory leak seems to be... any memory usage at all?
The unsafety is because of the lifetime laundering, not because the operation itself is unsafe. The compiler doesn’t know that the lifetime of the underlying memory becomes 'static and decoupled from the lifetime of the consumed Box.
And while we’re at it, please explain to me how this hypothetical language that allocates on the heap without mutable state exists without under the hood calling out to the real mutable allocator somewhere.
> If you retain a reference in a global then it is NOT a memory leak!
> Your definition of a memory leak seems to be... any memory usage at all?
It’s just that you’re choosing to define it as not a memory leak. Another definition of memory leak might be “memory that is retained longer than it needs to be to accomplish the intended goal”. That’s because users are indifferent to whether the user code is retaining the reference and forgetting about it or the user code lost the reference and the language did too.
So from that perspective tracing GC systems even regularly leak memory and then go on a hunt trying to reclaim it when they’ve leaked too much.
More importantly as has been pointed out numerous times to you, memory safety is a technical term of art in the field (unlike memory leaks) that specifically is defined as the issues safe Rust prevents and memory leaks very clearly do not fall under that very specific definition.
>the unsafety is because of the lifetime laundering not because the operation is unsafe.
You have missed the point. I said you can't leak memory in safe Rust. That is true. Box::leak isn't safe Rust: it uses the unsafe keyword. This is half the problem with the stupid keyword: it confuses people. I am saying that it requires the trustme keyword and you are saying it isn't inherently incorrect. Rust uses "unsafe" to mean both. But in context it is quite clear what I meant when talking about Box::leak, which you falsely claimed could be written in safe Rust.
>And while we’re at it, please explain to me how this hypothetical language that allocates on the heap without mutable state exists without under the hood calling out to the real mutable allocator somewhere.
What does the implementation have to do with anything? We are talking about languages not implementations. This isn't a difficult concept.
>It’s just that you’re choosing to define it as not a memory leak. Another definition of memory leak might be “memory that is retained longer than it needs to be to accomplish the intended goal”.
That isn't the definition. I am using the only definition of the term that any serious person has ever used.
>That’s because users are indifferent to whether the user code is retaining the reference and forgetting about it or the user code lost the reference and the language did too.
Users are completely irrelevant. It is logically impossible to ever prevent "leaks" that are just the storage of information. That isn't a leak, it is the intentional storage of information by the programmer. So it is a completely useless concept if that is what you want to use. It might be a useful concept in application user experience design or something but we are talking about programming languages.
On the other hand, "memory leaks" is a very useful concept if you use the actual definition because it is almost difficult to even conceive of a memory management strategy that isn't concerned with preventing memory leaks (proper). The "short lived program; free nothing" strategy is the only one I can think of, a degenerate case.
>More importantly as has been pointed out numerous times to you, memory safety is a technical term of art in the field (unlike memory leaks) that specifically is defined as the issues safe Rust prevents
No, it isn't! That is the definition that Rust people choose to use, which nobody used before 2015ish and is only widely used because Rust captured mindshare. It isn't some definition that predated Rust and which Rust magically fell right into.
Go back and look at mailing list threads, forum posts, papers, anything before Rust tried to steal the term "safety". It referred (and properly still refers) to programs. When people complained about manual memory management, the big complaint was that big C++ GUI programs (in particular) leaked memory like sieves. Nobody was particularly concerned about data races except the people implementing concurrency primitives in standard libraries etc. C++ didn't even have a defined memory model or standard atomics. Everyone was relying on x86's strong memory model in code all over the place. The big concern was avoiding manual memory management, memory leaks, and data corruption.
"Safe" didn't mean "has no data races but might have race conditions, has no use after free but might have memory leaks, and might have overflow bugs and SQL injections and improper HTML sanitisation". That would be a truly stupid definition. It meant "correct". The fanatical Rust community came along and tried to redefine "safe" to mean "the things we prevent". Rust's definition makes sense for Rust but it is Rust-specific because it is downstream of what Rust is capable of enforcing. Nobody would a priori come up with the particular subset of correctness properties that Rust happens to enforce and call them "safety". It is transparently a posteriori.
I believe the optimizer will do optimizations in response to the NonZero invariant which can trigger UB if the value does contain a 0 — a traditional safety issue for Rust, which is supposed to have no UB in safe code. But even the value merely being corrupt (i.e. a NonZero returning 0) can cause memory safety issues. But yes, Rust also uses unsafe to mark APIs that bypass enforced invariants, which std::mem::forget isn’t.
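For illustration, a sketch of the two NonZero constructors (the UB call is left commented out): the safe one checks the invariant and returns an Option, and the compiler exploits the "0 is impossible" invariant for niche optimization, which is exactly why violating it via the unchecked constructor is immediate UB:

```rust
use std::num::NonZeroU32;

fn main() {
    // The safe constructor checks the invariant and returns an Option.
    assert!(NonZeroU32::new(0).is_none());
    let n = NonZeroU32::new(5).unwrap();
    assert_eq!(n.get(), 5);

    // Option<NonZeroU32> is the same size as u32: the compiler uses the
    // "impossible" 0 bit pattern to represent None.
    assert_eq!(
        std::mem::size_of::<Option<NonZeroU32>>(),
        std::mem::size_of::<u32>()
    );

    // The unsafe constructor skips the check. Calling it with 0 is
    // immediate UB, precisely because the optimizer is allowed to
    // assume the value is never 0:
    //
    //     let bad = unsafe { NonZeroU32::new_unchecked(0) }; // UB!
}
```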
You were responding to my comment, which had scope broader than just leaking memory. So, to suggest it is only about leaking memory is not really responsive.