Works very well in cases where there are a lot of parameters that default to 0.
Keep in mind that you still need to know how structs work and you lose compile-time error detection.
Set your pointers to null when you free them! Set them to null when you transfer ownership! Stop leaving dangling pointers everywhere!
Some people say they like dangling pointers because they want their program to crash if something is freed when they don't expect it to be. Good! Do this:
assert(ptr);
There are also many more tricks you can do once you start nulling pointers. You can use const to mark pointers that you don't own and thus can't free. You can check that buffers are all zero before you free them to catch memory leaks (this requires zeroing out other fields too of course).
Please, null out your pointers and stop writing (most) use-after-free bugs!
Semi-experienced C user here; I believe the anonymous block is perfectly adequate here. No idea why they’re wrapping it in a single-iteration do/while loop, unless they’re unaware of block scoping or I’m unaware of some UB here.
There might be a better way of doing it though. Also, __typeof__() obviously isn't standard C.
Edit to add: I've honestly been moving away from using a macro and just putting both statements on one line like in the OP. For something so simple, using a macro seems like overkill.
Taking a pointer-to-pointer is intentional to make it clear that the pointer will be modified. That's actually the most important difference from nn3's version IMHO.
I tried making it a plain function at one point but ran into some weirdness around using void ** with certain arguments (const buffers?). You don't want to accept plain void * because it's too easy to pass a pointer instead of a pointer to a pointer. Using a macro is (ironically) more type safe.
Maybe someone else could figure out how to do it properly, since I'd definitely prefer a function.
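For reference, a minimal sketch of the kind of macro under discussion (FREE_AND_NULL is my name for it, not from the original code):

#include <stdlib.h>

/* Frees *pp and nulls it out; the call site passes &ptr, making the
   modification explicit. do/while(0) makes it behave like a statement. */
#define FREE_AND_NULL(pp) do { \
    free(*(pp));               \
    *(pp) = NULL;              \
} while (0)

int main(void) {
    char *buf = malloc(16);
    FREE_AND_NULL(&buf);   /* buf is now NULL */
    free(buf);             /* harmless: free(NULL) is a no-op */
}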
Your approach requires extra checks, though, which are easy to forget. Also, NULL is not guaranteed to be stored as zeros, plus padding is going to make your life annoying.
Well, dangling pointers are also easy to forget... Yes, it requires some discipline. Good code requires discipline, doesn't it?
The trick of checking that buffers are zeroed is purely a debugging tool, so it's okay if it doesn't work on some platforms. And if you allocate with calloc(), the padding will be zeroed for you. It's actually very rare that you will have to call memset() with this technique.
This is like the most clichéd way of saying “my code has security vulnerabilities” that there is. I have yet to see code that has remained secure solely on the “discipline” of programmers remembering to check things.
> The trick of checking that buffers are zeroed is purely a debugging tool, so it's okay if it doesn't work on some platforms.
Fair.
> And if you allocate with calloc(), the padding will be zeroed for you.
It might get unzeroed if you work with the memory.
All code is full of vulnerabilities. If you say your code isn't, then I'm sure it is. I just do the best I can to keep the error rate as low as possible. But it's a rate, and it's never zero.
Also, it's not just about vulns in security-critical code. It's also about ordinary bugs. Why not be a little more careful? It won't hurt.
> It might get unzeroed if you work with the memory.
Maybe, but it isn't very common. I'm not sure when the C standard allows changing padding bytes, but in practice the compilers I've used don't seem to do it. And again, it's just a debugging aid, if it causes too much trouble on some platform, just turn it off.
It’s better to have automatic checks than rely on programmers being careful enough to remember to add them. For padding: this probably happens more on architectures that don’t do unaligned accesses very well.
Help me out here, because I'm really trying to understand. Are you saying that dangling pointers that blow up if you double-free them is an "automatic check"? If not, what kind of automatic check are you talking about?
If the extra code is really that bothersome, just use a macro or wrapper function.
It's a much better situation than NULLing them out, because that hides bugs and makes tools like Address Sanitizer useless. A dangling pointer, when freed, will often trip an assert in your allocator; here's how this looks on my computer:
$ clang -x c -
#include <stdlib.h>
int main(int argc, char **argv) {
    char *foo = malloc(10);
    free(foo);
    free(foo);
}
$ ./a.out
a.out(14391,0x1024dfd40) malloc: *** error for object 0x11fe06a30: pointer being freed was not allocated
a.out(14391,0x1024dfd40) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap
As you turn up your (automatic) checking this will be caught more and more often. Setting the pointer to NULL will silently hide the error as free(NULL) is a no-op and nothing will catch it. Thus, the suggestion here was
1. advocating adding additional code, which has historically been hard to actually do in practice, and
2. providing a suggestion that is generally worse.
I can see an argument for wrapping it in a macro so you can turn off nulling in debug builds (ASan might even have hooks so you can automate this, I know Valgrind does). But use-after-free is worse than just double-frees, and if you read a dangling pointer in production there's no real way to catch it AFAIK. Last I heard (admittedly been a few years since I checked), you're not supposed to deploy ASan builds because they actually increase the attack surface.
So, your program's memory is full of these dangling pointers, and at some point you will have a bug you didn't catch and use one. And you can't even write an assertion to check that it's valid. What do you propose?
And again to clarify, I'm not trying to advocate for hiding bugs. I want to catch them early (e.g. with assertions), but I also want to avoid reading garbage at runtime at all costs, because that's how programs get pwn'd.
> But use-after-free is worse than just double-frees
From an exploitability point of view they are largely equivalent.
As for the rest of your comment: my point of view is largely "you should catch these with Address Sanitizer in debug", so I don't usually write code like "I should assert if I marked this as freed by NULLing it out". If I actually need to check this for program logic, then of course I'll add something like this.
The macro you suggest would alleviate my concerns, I suppose, and it wouldn't really be fair for me to shoot that solution down solely because I personally don't like these kinds of assertions in production. So it's not a bad option by any means, other than my top-level comment of this requiring extra code. I know some libraries like to take a pointer-to-a-pointer so they can NULL it out for you, so that is an option for your wrapper. And a double-free that doesn't crash can sometimes open up exploitable bugs too since it messes with program invariants that you didn't expect. But these are much rarer than the typical "attacker controlled uninitialized memory ended up where it shouldn't" so it's not a big deal.
I’m not sure what else it could have meant, after they said that programmers should have discipline when I mentioned that their approach required extra checks to work.
The “discipline” in this case (see the whole thread) is “have programmers remember to insert checks”, which has historically been a good way to have security holes crop up. So I’m not sure what was dishonest about it?
They argued that discipline is necessary, not sufficient, to produce good code. You represented the argument as: "discipline is sufficient for secure (good) code"
You took the original argument, changed it to be fallacious, and used it as a strawman. That's what was dishonest about it.
I don't think that's fair in this case because nulling out pointers isn't the first line of defense. If you forget to do it once, it's not going to cause a bug in and of itself. You can easily grep the code periodically to find any cases you missed.
I think that's the misunderstanding, then, because to me it seemed to be a defensive coding practice (I think it was certainly presented as such in the top comment). My "you need extra checks" claim was mostly aimed at the additional things you add on to your code assuming that you are now zeroing out freed pointers, which I think can lead to dangerous situations where you may come to rely on this being done consistently when it's a manual process that is easy to forget.
Left unsaid, because I was out doing groceries this morning when I posted that, is that I don't think this is even a very good practice in general, as I explained in more detail in other comments here.
Indeed, it shouldn't be a first line of defense (nulling + an assert seems reasonable, fwiw), and accessing a nulled out pointer is just as UB as any other UB. It's probably more likely to crash immediately in practice, but it's also easier for an optimizer to "see through", so you may get surprising optimizations if you get it wrong.
Honestly, unless you really cannot afford it time-budget wise, I would just ship everything with ASAN, UBSAN, etc. and deal with the crash reports.
Shipping code with Address Sanitizer enabled is generally not advisable; it has fairly high overhead. You should absolutely use it during testing, though!
>> NULL is not guaranteed to be stored as zeros
> Is that a real issue, though?
Of course it's not, but that's one of those factoids that everyone learns at some point and then feels the need to rub into everyone else's face, assuming those poor schmucks are as oblivious to it as they once were. The circle of life and all that.
Forgive me for encouraging the adoption of portable, compliant code to those who may not otherwise be aware of it. If you want to assume all the world’s an x86 that’s great but you should at least know what part of your code is going to be wrong elsewhere.
Please don’t get me wrong, but these precautions sound like you are sweeping problems under the carpet that will come back out one day. It sounds like you have ownership issues in the design and are trying to hide ‘possible future bugs’.
Do you use sanitizers for use-after-free bugs? I see many people still don’t use them, even though sanitizers have become very good in the last 5-6 years.
It's defensive coding. Do you think defensive driving is 'sweeping problems under the carpet'? (It is, but it's still useful...)
I use every tool at my disposal. Sanitizers, static analyzers... and also not leaving dangling pointers in the first place. Why would I do anything less? It doesn't cost anything except a little effort.
Take a look at this recent HN link: https://www.radsix.com/dashboard1/ . Look at all those use-after-free bugs. Even if it only happens 1% or 0.01% of the time... It's a huge class of bugs in C code. Why not take such a simple step?
If it works for you, then it is okay. It is not ‘a little effort’ for me to worry that someone else might use this pointer mistakenly; I'd need to think about that all the time. It shifts my focus from problem solving to preventing future undefined-behavior bugs. As for the bugs in that link: I don't know C++, and it is a big language which does a lot of things automatically, so it is already scary for me :) Maybe that is it. I mostly write C server-side code (a database) with very well defined ownership rules, so things are a bit more straightforward than in any C++ project, I believe. I just checked again: we don't have any use-after-free bugs in the bug history, probably because of a 100% branch coverage test suite + fuzzing + sanitizers. So I'd rather add another test to the suite than do defensive programming. It is a personal choice, I guess.
Generally, it is considered preferable to find problems as early as possible. If a program fails to compile or quickly crashes (because of a failed assertion), then I consider that better than having to unit test and fuzz test your code to find that particular problem.
As an added benefit the code also becomes more robust in the production environment, if there are use cases you failed to consider -- 100% branch coverage does not guarantee that there are none!
> Generally, it is considered preferable to find problems as early as possible.
Whole heartedly agree.
> If a program fails to compile or quickly crashes (because of a failed assertion), then I consider that better than having to unit test and fuzz test your code to find that particular problem.
This confuses me. My typical order would be:
fails to compile > unit test > quick crash at runtime > slow crash at runtime (fuzzing)
Every problem can be solved in many different ways. If you think you've already got use-after-free bugs under control, then more power to you! You absolutely have to concentrate your effort on whatever your biggest problems are.
But I'll also say that if you don't have any use-after-free bugs in the history of a large C codebase... you might not even be on the lookout for them? I still have them sometimes, mainly when it comes to multiple ownership. And those are just the ones I found eventually.
So yes, different strokes for different folks, but if you make the effort to incorporate tricks like this into your "unconscious" coding style, the ongoing effort is pretty minimal. Even if you decide this trick isn't worth it, there are countless others that you might find worthwhile. I'm always on the lookout for better ways of doing things.
I meant no use-after-free bugs in production; otherwise we find a lot in development with daily tests etc., but it looks like we catch them pretty effectively. It works well for us, but that doesn't mean it'll work for all other projects, so yeah, I can imagine myself applying such tricks to a project some time. Especially when you jump to another project which has messy code, you become paranoid and start to ‘fix’ possible crash scenarios proactively :)
Big reason for defensive coding like nulling pointers is to make the code fail hard when someone messes up when they make a change. One can imagine the sort of hell unleashed if later the code is changed to make use of a dangling pointer. That's often the type of bug that slips through testing and ends up causing rare unexplained crashes/corruption in shipped code. Worse it can take multiple iterations of changes to finally expose the bug.
This makes UAF easier to detect but double-free impossible to detect. I would consider that to be worse than not doing anything at all, especially since it isn't amenable to modern tooling that is much better at catching these issues than hand-rolled defensive coding.
The set of things that _Static_assert accepts is substantially more limited than what this construct handles: it can only take an "integral constant expression", which in practice means basic integer arithmetic and nothing else. This construct works with more complicated things that are nonetheless known at compile time, such as "asdf"[4] (which should be 0).
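A sketch of the kind of construct presumably meant, the classic negative-array-size trick (BUILD_ASSERT is an invented name):

/* If the compiler folds the expression (gcc and clang do for "asdf"[4]),
   a false condition makes the array size negative and the build fails. */
#define BUILD_ASSERT(cond) ((void)sizeof(char[1 - 2 * !(cond)]))

void checks(void) {
    BUILD_ASSERT("asdf"[4] == 0);   /* the string's terminating NUL */
}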
Careful if the struct contains big arrays, though: this can make the executable size explode, because a copy of the struct's contents is placed in the executable.
X Macros! Mostly because it's one of the more understandable and funky things you can do with the preprocessor. You can do some crazy stuff with them :)
Oh, that has a name. In C, often used with an include file instead of a body macro, and often (?) used where code wants multiple internal representations of some table of data.
I agree wholeheartedly. I may be called a hater and I may be raining on everyone's parade, but I opened this thread expecting to find horrors and I did (plenty of metaprogramming).
The very notion of there being tricks and that knowing them makes you better is something I hate about C and C++. Most tricks I read here are bandaids over usability issues the languages have. Yes, they alleviate an issue but may introduce unexpected consequences and distance your dialect from the rest of the community.
I am so very thankful that C and C++ are no longer the only options for low level, non garbage collected programming.
True, but I also consider interesting combinations of standard features, especially C99+ features, useful "tricks", because the C99 features which make life so much easier are usually little known in predominantly C++ circles, since C++ only supports an outdated, non-standard subset of C.
E.g. this 'named, optional arguments' Easter egg is a useful trick which also improves readability:
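Presumably something along these lines, where a compound literal plus designated initializers gives named, optional arguments (all names here are illustrative):

typedef struct {
    int x, y, w, h;
    unsigned color;
} rect_params;

void draw_rect_impl(rect_params p);

#define draw_rect(...) draw_rect_impl((rect_params){ __VA_ARGS__ })

/* Call with any subset of "arguments"; unnamed fields default to 0: */
draw_rect(.w = 100, .h = 50, .color = 0xFF0000);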
Like magic. In other words: how to pretend that you have syntax-level OOP in C... In retrospect, the macro could reduce readability; writing the arguments explicitly may be better.
I think what GP is referring to is C++ not supporting C's designated initializers, restrict qualifiers, or flexible array members features. These are roughly the only C features not in C++ that are worth supporting (the STL in C++ works better than VLAs, and templates work better than type-generic macros). All of the other new C features are either backported C++ features with different spellings, or new functionality (mostly library) that C++ adopts.
There's a surprising amount of perfectly valid C code that's not valid C++ (not even taking regrettable design warts like VLAs into account). The "common subset" of C and C++ is both a subset of C, and a subset of C++, e.g. C++ has forked C and turned its C subset into a non-standard dialect.
It's interesting that the other C descendant Objective-C has decided to "respect" its C subset instead of messing with it, with the result that new C standards are automatically supported in ObjC.
C++20 designated initializers must be specified in definition order, whereas in C99 they need not be.
In your example,
Point point = {.z = 2.0, .x = 1.0};
produces a compilation error. This is annoying, but workable. And at least it's a compile-time error, rather than a bug that perniciously sneaks into production.
A friend of mine showed me this in code he used for programming an educational operating system. If you have a pointer to a member of a struct, with the macro container_of you can retrieve a pointer to the enclosing struct.
/* Return the offset of 'member' relative to the beginning of a struct type */
#define offsetof(type, member) ((size_t) (&((type*)0)->member))
#define container_of(ptr, type, member) \
((type *)((char *)(ptr) - offsetof(type, member)))
I was about to ask if it was legal to dereference a null pointer and then take the address of it... I presume it is not, but I'm surprised compilers don't complain at compilation time.
Dereferencing a null pointer and then immediately "undoing" it by taking its address is actually legal, I believe. I think the undefined behavior here is the member access instead of the magic sequence &* which is supposed to cancel out.
This is used by the Ganesha project (userspace NFS server). Look for the symbol "container_of" and usages of it in https://github.com/nfs-ganesha/nfs-ganesha/ (disclaimer: I'm a minor contributor).
The way it's used is that Ganesha supports defining alternate filesystem backends and serving them as NFS shares. Handles to objects (e.g. files) exist as pointers to the generic data embedded inside the backend's own handle struct, i.e.:
struct my_file_data {
    struct ganesha_file_data base;   /* generic data */
    /* data specific to my module */
};
The "my" module would take pointers to ganesha_file_data when the NFS core code calls it. The "my" module then uses container_of to convert ganesha_file_data ptr to my_file_data ptr.
Depends on what’s in your asserts. If, for example, you check the validity of your custom data structure in an assert, even a single assert can cost a lot more than nothing.
(Whether you should use assert for the kind of checks that should probably only run in tests is debatable, but it sometimes happens.)
The string literal is a (non-null) pointer, so it is always true and never changes the condition. And then you get a nice explanatory message when the assertion fails. I wish all asserts gave optional explanation messages.
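For example (condition and message invented for illustration):

assert(idx < len && "index out of range");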
For anyone following along: don't do this if the code base is already hot shit. It will only make things worse, not prevent errors, and just cause more outages.
This basically allows you to use std::vector<T>-like vectors in C, but with the added benefit that you can subscript the vector like arr[3] rather than using unwieldy functions like vector_get(arr, 3) or vector_put(arr, 3, value).
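A minimal sketch of the usual way this is done (the names are mine, not from any particular library): a length/capacity header is stashed just before the pointer handed to the caller, so plain arr[i] indexing works.

#include <stdlib.h>

typedef struct { size_t len, cap; } vec_hdr_t;

#define vec_hdr(a)  ((vec_hdr_t *)(a) - 1)
#define vec_len(a)  ((a) ? vec_hdr(a)->len : 0)
#define vec_push(a, v) \
    ((a) = vec_grow((a), sizeof *(a)), (a)[vec_hdr(a)->len++] = (v))

static void *vec_grow(void *a, size_t elem_size) {
    size_t len = a ? vec_hdr(a)->len : 0;
    size_t cap = a ? vec_hdr(a)->cap : 0;
    if (len < cap)
        return a;                        /* room left: nothing to do */
    size_t new_cap = cap ? 2 * cap : 8;  /* grow geometrically */
    vec_hdr_t *h = realloc(a ? (void *)vec_hdr(a) : NULL,
                           sizeof *h + new_cap * elem_size);
    h->len = len;                        /* error handling omitted */
    h->cap = new_cap;
    return h + 1;                        /* hand back the data, past the header */
}

/* Usage:
       int *arr = NULL;
       vec_push(arr, 42);    // arr[0] == 42, vec_len(arr) == 1 */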
It's been a while since I've done any pure C, and I'm sure I'll be outshone by others, but I've always liked RAII in C: https://vilimpoc.org/research/raii-in-c/
Foreach macros. Nice when you have a list of constants that you need for declaring a lot of tables or enumerations. Here's an example with ISO-639 language codes.
Example with 3 "values"; this is the base definition from which all the tables and enums are produced:
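A minimal reconstruction of what such a base definition might look like (the three entries are illustrative):

#define LANG_CODES \
    X(EN, "en", "English") \
    X(FR, "fr", "French")  \
    X(DE, "de", "German")

/* Expand once into an enum... */
#define X(id, code, name) LANG_##id,
enum lang { LANG_CODES LANG_COUNT };
#undef X

/* ...and again into a parallel string table. */
#define X(id, code, name) code,
static const char *lang_codes[] = { LANG_CODES };
#undef X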
In general, pointer math. When I learned that myArray[10] would produce the exact same result as 10[myArray], it forced me to dig deeper into the whole C pointer model and respect the architecture even more.
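It works because indexing is defined in terms of pointer addition, which commutes:

#include <assert.h>

int main(void) {
    int myArray[16] = {0};
    myArray[10] = 7;
    /* a[i] is *(a + i), so 10[myArray] is *(10 + myArray) */
    assert(10[myArray] == myArray[10]);
}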
When reworking a large amount of code, this can be used to tag places that may need another pass or a review. The debug version will build fine, but the release won't until all review tags are removed.
This also allows adding free form comments if needed:
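A guess at the construct in question (REVIEW is an invented name): in debug builds the macro pastes two slashes so that, on some preprocessors, the rest of the line becomes a comment; release builds define it to nothing, so the free-form text breaks the build until the tag is removed.

#ifndef NDEBUG
#define REVIEW / ## /   /* relies on non-conforming preprocessor behavior */
#else
#define REVIEW
#endif

REVIEW double-check the locking here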
Interesting -- which compilers accept this? gcc and clang reject it in C and C++ modes with a message like "error: pasting formed '//', an invalid preprocessing token".
Zero-length arrays, used to implement a variable-sized structure with header: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html. Although apparently it’s a GCC extension, which I didn’t realize until now.
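The trick itself, roughly as the GCC manual presents it (make_line is my wrapper):

#include <stdlib.h>

struct line {
    int length;
    char contents[0];   /* zero-length array: GCC extension; C99 spells it [] */
};

struct line *make_line(int length) {
    struct line *l = malloc(sizeof *l + length);
    l->length = length;   /* l->contents now has room for length bytes */
    return l;
}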
Unfortunately the type cannot be used in a function definition, but that is not where it is interesting. It is neat for function pointers, especially those that would otherwise require casting.
typedef bool foo_fn(int, char);   /* the function type itself */
bool foo1(int, char);
void function_taking_foo(int, foo_fn *);
function_taking_foo(1, foo1);   /* no cast necessary: the types are identical */
instead of
function_taking_foo(1, (bool(*)(int,char))foo1);
When you have a lot of function pointers it is far more readable than the usual syntax.
My favorite C trick is not related to programming - but to debugging and disassembly. Unlike C++, there is no name mangling, so stack traces are a breeze to read, especially with -fno-inline-functions and -O0 or -O1. There is no implicit action at all (i.e. no exception handling or destructors) so there is a simple mapping between the assembly and source code.
My favorite "programming" trick only applies if I'm not sharing my code. I just forego header files entirely, and just #include the .c source files. Also, put `#pragma once` in all the .c files to avoid double-inclusion without the hassle of #if ... etc. This requires a bit more diligence since you can't have mutual recursion between source files.
That's (usually) only true for the very first '->' in a chain and as you said, depends on the compiler figuring out if the pointer indirection can be resolved at compile time.
A chain of '.' on the other hand is always guaranteed to be resolved into a single offset at compile time.
> Why does C even have a two member selection operators?
Because using `(*ptr).member` everywhere is annoying. There's plenty of times you want or need to have direct access to a member rather than always dereferencing a pointer.
It's too bad C's pointer-deref operator is prefix instead of postfix. In Pascal it's ^ so you write ptr^.member and there's no special -> operator. Even better, declarations and expressions would read intuitively left-to-right instead of spiraling out through stars on the left and brackets on the right.
C was practically a portable assembler when it was designed, and it was likely helpful for performance reasoning that all indirections were clearly visible.
Unions are always better than casts. It would be better if casts in C looked like union member selection, because parentheses are ugly. There really should be an infix casting operator.
An example of where this matters is an AST forest, like this:
struct node { int tag; }; // Generic node
struct infixnode { int tag; struct node *l, *r; };
struct intnode { int tag; int val; };
But it's really ugly to use. If you have an expression represented in this AST like 'a(b(c+d))' and you want to access 'd', you need to do this:
struct node *n;
int d = ((struct intnode *)((struct infixnode *)((struct infixnode *)((struct infixnode *)n)->r)->r)->r)->val;
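With a union, the same walk is far cleaner. A sketch of what the union's definition would have to look like for the snippet below to typecheck (note the child pointers become union node *):

union node;
struct infix  { int tag; union node *l, *r; };
struct intlit { int tag; int val; };
union node {
    int tag;               /* every variant begins with the tag */
    struct infix  infix;
    struct intlit intval;
};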
union node *n;
int val = n->infix.r->infix.r->infix.r->intval.val;
The same holds for C++. In C++ you could make a class hierarchy for your AST, but you still have the casting to convert to the derived types, which is just as ugly. But you can instead add inline accessor functions to the base class whose sole purpose is to do this casting; you end up with something like this:
node *n;
int val = n->infix()->r->infix()->r->infix()->r->intval()->val;
Pack your structs, use the TCP/IP host-to-network and network-to-host functions to marshal data in and out, save state as ASCII, never trust a float, don’t get creative, and it’ll “just work”.
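For instance, the byte-order half of that advice might look like this on a POSIX system (the function names are invented):

#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Write a 32-bit field in network byte order... */
void put_u32(unsigned char *buf, uint32_t v) {
    uint32_t n = htonl(v);    /* host -> network */
    memcpy(buf, &n, sizeof n);
}

/* ...and read it back correctly on any host. */
uint32_t get_u32(const unsigned char *buf) {
    uint32_t n;
    memcpy(&n, buf, sizeof n);
    return ntohl(n);          /* network -> host */
}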
With modern compilers (and hardware, for that matter), Duff's device is probably a deoptimization rather than an optimization. Duff's device creates an unstructured control-flow graph (a loop with multiple entry points), which is going to cause several optimizations to bail out, and will absolutely prevent any loop optimizations (such as vectorization) from kicking in for that loop. In hardware terms, for tight loops, the entire loop written the ordinary way is probably going to sit in the hardware µop loop cache, and the loop branch predictor will probably predict the loop exit condition with 100% accuracy.
I don’t get it, and I tried to look it up without success. The only way I get it is if it also reverses the logic, so a 0 start value becomes true. But how is this different from !x?
I was excited then annoyed. I hate the SO moderators. It makes no difference to them how many people seek the answer, if they deem it off topic all those people are out of luck.
As the original asker of the question, I agree. This question was from 12 years ago, back when Stackoverflow was a lot more fun. I asked the question because while I knew a few neat little things, I knew other people also knew cool things that I didn't know, and I wanted to find out what they might be.
I don't get that exclusionist attitude either. If they think it's subjective, why not just tag it accordingly, or move it into some sub-forum, instead of outright killing the discussion?
Because they'd have to exclude it from search, since it'd affect the results.
People might like it or not, but SO is a tool that provides fast access to answers to technical questions, because people want to solve their problems (and learn); it's Q&A.
If you want to discuss stuff then there are forums, reddits, discords or even HN.
How many times have I googled a question only to arrive at a closed SO question? Too many times. It's truly annoying. Plainly people do want to discuss things on SO.
It's not meant to be a discussion forum; it's more a database of questions and answers, ideally written to Wikipedia-type quality. It gets a lot of hate for its quite strong moderation; however, it's ended up as one of the best resources for finding answers to programming questions, so they are doing something right.
This is undefined behaviour except when using msvc. For GCC and clang, you should use [] (empty square brackets) to denote a variable length array at the end of a struct. For msvc, [1] is used all over win32 and is explicitly supported for that purpose.
Edit: it's UB because you're accessing past the end of an array with size given at compile time.
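For reference, the variants being compared look roughly like this (struct names invented):

#include <stddef.h>

struct a { size_t n; char *b; };    /* pointer: storage lives elsewhere */
struct b { size_t n; char b[1]; };  /* the [1] hack: UB in ISO C, everywhere in win32 */
struct c { size_t n; char b[]; };   /* C99 flexible array member: well defined */

/* allocate the flexible version: struct c *p = malloc(sizeof *p + len); */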
Right, yeah, the middle one with char b[1]. I was too busy fighting with the asterisks being deleted to notice the mistype. This method is used a lot in OS programming.