Mazda cars used to have a bug where they used printf(str) instead of printf("%s", str) and their media system would crash if you tried to play the "99% Invisible" podcast in them. All because the "% In" was parsed as a "%n" with some extra modifiers. https://99percentinvisible.org/episode/the-roman-mars-mazda-...
I don't like a lot of things in C++, but one thing worth praising in particular is std::format
std::format specifically only works for constant† format strings. Not because they can't make it work with a dynamic format, std::vformat is exactly that, but most of the time you don't want and shouldn't use a dynamic format and the choice to refuse dynamic formats in std::format means fewer people are going to end up shooting themselves in the foot.
Because it requires constant formats, std::format also gets to guarantee compile time errors. Too many or not enough arguments? Program won't build. Wrong types? Program won't build. This shifts some nasty errors hard left.
† Not necessarily a literal, any constant expression, so it just needs to have some concrete value when it's compiled.
The flag is -Wformat-nonliteral or -Wformat=2. -Wformat-security only includes a weaker variant that will warn if you pass a variable and no arguments to printf.
One reason is locale-dependent format strings which are loaded from resource files.
Also, in personal projects, I almost always used custom wrapper functions for printf/fprintf/sprintf for various reasons, so that default wouldn’t be of much use, unless maybe I could enable it for the custom functions.
I've never seen locale-dependent format strings work well. The translators will change the formatting codes, and you can't change the order of the formatted arguments. You are much better off with some other mechanism for this.
(I have no recommendations. When I've seen this stuff done properly, on the occasions I've managed not to avoid doing it, it's always been using some in-house system.)
It is specified by POSIX, but not by ISO C (or C++). So most Unix(-like) systems support it. But the printf in Microsoft's C runtime doesn't. However, Microsoft does define an alternative printf function which does, printf_p, so `#define printf printf_p` will get past that.
I think the real reason you rarely see it, is it is only used with internationalisation–the idea being if you translate the format string, the translator may need to reorder the parameters for a natural translation, given differences in word order in different languages. However, a lot of software isn't internationalised, or if it is, the internationalisation is in end-user facing text, which nowadays usually ends up in a GUI or web UI, so printf has less to do with it. And the kind of lower-level tools/components for which people still often use C are less likely to be internationalised, since they are targeted at a technical audience who are expected to be able to read some level of English.
Which is fine for English, but doesn't work at all for other languages. The problem is not just that the plural ending is something other than `s` – if it was just that, it wouldn't be too hard. The problem is that the `count != 1` bit only works for English. For example, while 0 is plural in English, in French it is singular. Many other languages are much more complex. The GNU gettext manual has a chapter which goes into this in great detail – https://www.gnu.org/software/gettext/manual/html_node/Plural...
printf() has zero hope of coping with this complexity. gettext provides a special function to handle this, ngettext(), which is passed the number as a separate argument, so it can select which plural form to use. And then the translated message files contain a header defining how many plural forms that language has, and the rules to choose which one to use. And for some languages it is crazy complex. Arabic is the most extreme, for which the manual gives this plural rule:
That might be possible if the "resource file" was processed at compile time. I've never seen a C toolchain that did it that way, though - I've only seen them read in at runtime. And at that point, the preprocessor can't save you.
Since the locale is only determined at runtime, and might even change at runtime, the format strings are usually dynamically loaded from text files, and are not in the form of string literals seen by the compiler.
In principle, they are not enabled by default because a C compiler must be able to compile standard C by default.
One practical reason I can think of is because not everyone compiles their own code.
You must most definitely look for and enable such flags as they become available in your own projects. (eg I was rooting for -Wlifetime but it did not land for various reasons)
But when you compile other people's code, your breaking your local build doesn't help anyone. Best you can do is to submit a bug report, which may or may not be ignored.
It's a standard library function meaning the compiler can assume that it follows the standard. Specifically for GCC [0]:
> The ISO C90 functions abort, abs, acos, asin, atan2, atan, calloc, ceil, cosh, cos, exit, exp, fabs, floor, fmod, fprintf, fputs, free, frexp, fscanf, isalnum, isalpha, iscntrl, isdigit, isgraph, islower, isprint, ispunct, isspace, isupper, isxdigit, tolower, toupper, labs, ldexp, log10, log, malloc, memchr, memcmp, memcpy, memset, modf, pow, printf, putchar, puts, realloc, scanf, sinh, sin, snprintf, sprintf, sqrt, sscanf, strcat, strchr, strcmp, strcpy, strcspn, strlen, strncat, strncmp, strncpy, strpbrk, strrchr, strspn, strstr, tanh, tan, vfprintf, vprintf and vsprintf are all recognized as built-in functions unless -fno-builtin is specified (or -fno-builtin-function is specified for an individual function).
Builtin here doesn't mean that GCC won't ever emit calls to library functions, only that it reserves not to and allows itselfs to make assumptions about how the functions work, including diagnosing misuse.
The library functions themselves might also be marked with __attribute__(format(...)) as the sibling comment notes but that is not necessarily required for GCC to check the format strings.
The %n functionality also makes printf accidentally Turing-complete even with a well-formed set of arguments. A game of tic-tac-toe written in the format string is a winner of the 27th IOCCC.
- sez wiki.
A not so fun fact:
Because the %n format is inherently insecure, it's disabled by default.
>The %n functionality also makes printf accidentally Turing-complete
No it doesn't. Printf has no way to loop so it's not Turing complete. Even if you did what the IOCCC entry did with putting it into a loop it still wouldn't be Turing complete as it would not have an infinite memory.
Nothing real is "Turing complete" if it requires infinite memory. That's a property only abstract machines can have. In common parlance, something is Turing complete if it can compute arbitrary programs that can be computed with M bits of memory, where M is arbitrary but finite.
I'd say they're Turing complete (in common parlance) if they can reasonably viewed as a Turing-complete system that's been hobbled with an arbitrary memory limitation. FSAs generally can't be viewed this way as you can't just "add more memory" to an FSA. By way of contrast, consider a pushdown automaton with two stacks. While any physically real implementation of such a device will necessarily have some kind of limit on the size of the stacks, you can easily see how the device would behave if this limit were somehow removed.
It's definitely a bit fuzzy. I'm sure lots of philosophy papers have been written on when exactly it is or isn't appropriate to consider a finite computational system as a finite approximation to a Turing-complete system. In realistic everyday cases, however, it's usually clear enough what should and shouldn't count as such.
Yes, but it also changes the state transition logic. You can't just 'add 100 more states' to an FSA in the same way that you can 'add 100 more stack slots' to a bounded pushdown automaton.
As I said previously, these are somewhat fuzzy distinctions, and I'm not saying that they're easy to make mathematically precise. They do however seem clear enough in most cases of practical interest. There are many real-world computing systems that would be Turing-complete if they had unbounded memory. There are others that are not Turing-complete for more fundamental reasons than memory limitations. Again, I acknowledge that 'more fundamental' is not a mathematically precise concept.
Well, I'm pretty sure all existing digital computers are finite state automata, so they are not, strictly speaking, Turing complete. But that doesn't make any sense.
Strictly speaking there are essentially no programming languages that are even theoretically Turing-complete because they can only address a bounded amount of memory. For example, in C `sizeof(void*)` must be a well-defined, finite integer. But that definition is not useful in practical use.
Even if that was true, any definition of Turing completeness that includes JavaScript and excludes C is worse than useless in practice. It's useless for communication, it's useless for education, it's useless for reasoning about capabilities. There's simply no place for such a definition in a civilized society.
Turing machines themselves are a useless concept in our society. Since C is lower level and tied more to the physical hardware it makes sense that it is not Turing complete because Turing machines are not applicable to the real world. Computers do not work anything like an infinite tape. I've never seen a practical program have to implement a Turing machine.
This is one of those annoying little problems that is easily picked up by the vet command (https://pkg.go.dev/cmd/vet) when writing Go code. There are, of course, many linters that do the same thing in C, but it's nice to have an authoritative one built in as part of the official Go toolchain, so everyone's code undergoes the same basic checks.
Mazda cars used to have a bug where they used printf(str) instead of printf("%s", str) and their media system would crash if you tried to play the "99% Invisible" podcast in them. All because the "% In" was parsed as a "%n" with some extra modifiers. https://99percentinvisible.org/episode/the-roman-mars-mazda-...