C23: A Slightly Better C (lemire.me)
216 points by mfiguiere on Jan 21, 2024 | hide | past | favorite | 169 comments


A small improvement in C23 - but probably my favourite - is that it finally allows writing:

   struct bla_t bla = {};
...instead of

   struct bla_t bla = {0};
This was one of those pointless differences between C and C++ which caused a lot of grief when writing code in the common C/C++ subset.

PS: also, func() now actually is the same as func(void), and unnamed parameters are allowed.


  The eighth line uses the static_assert keyword, which is a feature of C++11
C11 already had _Static_assert() and static_assert() in assert.h


That's already noted at the end:

> The idea behind static_assert is great. You run a check that has no impact on the performance of the software, and may even help it. It is cheap and it can catch nasty bugs. It is not new to C, but adopting the C++ syntax is a good idea.


More accurately: C23 makes `static_assert` a keyword and removes `#define static_assert _Static_assert` from <assert.h>, so `#include <assert.h>` is no longer needed for `static_assert`. The syntax has been supported since C11, but it's now more convenient to use.


From the article: "It is not new to C, but adopting the C++ syntax is a good idea."


The keyword is new though


Did that work at compile time?


What would it mean for _Static_assert to not work at compile time?


The auto keyword seems like a strange addition given it's already a C keyword with a different meaning and this change won't help developers that much. It's more useful in C++.


It's more useful in C++ with things like generic and voldemort types, but I still would not discount it in C, as the lack of namespacing impacts type names, and it could reduce the need for convenience typedefs (those which exist only to avoid the `struct` and `enum` prefixes). It might also help transitioning to fixed integer types, which are rather verbose.


Hah. Voldemort types. The ones that can't be named (in practice because too long to fit on a line or remember)?


> The ones that can't be named (in practice because too long to fit on a line or remember)?

Nah those are easy to name, just annoying, it's types which literally don't have an external / public name, like lambdas, or locally defined types.


It's actually a struct that only has a name in the scope of the function which returns `auto`, and thus cannot be named outside of it. Like this:

    #include <iostream>

    auto createVoldemortType(int value) {
        struct Voldemort {
            int value;
        };
        return Voldemort{value};
    }

    int main() {
        auto voldemort = createVoldemortType(7);
        std::cout << voldemort.value << std::endl; // output: 7
    }


I wonder how much this complicates parsing C++. Because of this, you can't discard/free struct and class definitions as soon as you leave the scope, like you can in C, because the definition can still escape the scope by being returned from a function with the "auto" keyword.


The entities and their destructors still have names that the compiler and linker understand. Programs just can't name them.


I mean that it would complicate just the parser.

For many compilers, as soon as the parser sees a left curly brace, it pushes a symbol table onto a stack, and when it sees the corresponding right curly brace, it pops the symbol table off the stack, and "forgets" any declarations that were made inside that scope. That is so things like this work as expected.

  {
      int x = 0;
      {
          int x = 1;
          printf("%d\n", x); // prints 1
      }
      printf("%d\n", x); // prints 0
  }
But in C++, declarations can escape their scope through a function returning auto. I'll change the C++ code that OP wrote. The C++ compiler has to correctly resolve cases like this, which means it can't just forget all the declarations within the scope of the function after the definition is done.

    #include <iostream>

    auto createVoldemortType(int value) {
        struct Voldemort {
            int value;
        };
        return Voldemort{value};
    }

    struct Voldemort {
        std::string value;
    };

    int main() {
        auto voldemort = createVoldemortType(7);
        std::cout << voldemort.value << std::endl; // output: 7
    }


During semantic analysis a parser usually attaches symbol info (of some kind) to the already existing abstract syntax tree, or creates a new tree entirely. Whenever it needs to know about a type, it just walks the tree to the node with the type definition. That way there's really never any data that's forgotten.

At least that's how the parsers I'm familiar with work, as far as I know.


Yeah, it depends on the compiler.

I've read a book about a BLISS compiler [1] that does this, but still uses a stack like I described [2]. It implements a hash table that uses linked list nodes for collisions. A new declaration adds a new name to the table, and it attaches the node to uses of the name in expressions of the syntax tree.

When a scope is exited, the declarations from that scope are removed from the symbol table, but because they're still attached to the syntax tree, they can't just be freed. They're added to a linked list of "purged" nodes, so that the information they contain can be used later during code generation, and then freed.

One-pass compilers don't have this problem; they really can just free the memory for reuse, because after they exit a scope, they've already generated the assembly or machine code from the high-level language.

However, I don't know what LLVM or GCC, or any other remotely modern compiler, does. I haven't read the code much.

[1]: https://en.wikipedia.org/wiki/The_Design_of_an_Optimizing_Co...

[2]: Actually, it intertwines the stack and the symbol table in a complicated way, so there's only one hash table, and multiple stacks within it. It's explained by a diagram they include on page 13. You can find a PDF of it here: https://kilthub.cmu.edu/articles/journal_contribution/The_de...


How does the argument 7 get passed to the value field in the Voldemort struct?

I don't see any code there that does that. Is it implicitly passed?

I don't know C++, though I did know C somewhat well earlier.


Oops, I missed the line:

return Voldemort{value};

I guess that does it.


This way of coding is a good reason to make auto as taboo as goto.


Also because they might be a) an undocumented implementation detail (the result of std::bind for example); b) utterly unutterable like the type of a lambda expression.


>Hah. Voldemort types. The ones that can't be named

Ha ha, that reminds me of that phrase of yore, "the quality without a name (qwan)" (google it), which was heavily bandied about years ago, during the heyday of C++ and the software patterns movement (which continued a lot in the days of Java, of course). James Coplien (IIRC) and others of that time come to mind.

https://en.m.wikipedia.org/wiki/The_Timeless_Way_of_Building

https://en.m.wikipedia.org/wiki/Pattern_language

https://en.m.wikipedia.org/wiki/Jim_Coplien

Though I read a fair amount about that stuff, a lot of it went over my head, but later, I did understand some of the patterns, after reading the design patterns book, and trying out some of them.

The template method pattern is my favourite pattern, because I understand it better than many of the others :), and also because it is the basis of software frameworks (inversion of control, aka the Hollywood principle - "don't call me, I'll call you"). Other patterns that I like and understand are the command pattern, the interpreter pattern, the chain of responsibility pattern, and the flyweight pattern, to name a few. Builder and Factory, not so much. Singleton is straightforward, or is it really? impls matter :)

And I have written a few toy frameworks, which is fun to do and use.


C has generics as well. This permits the return type of generic functions to be preserved.


The only generic feature C has is the _Generic expression, which isn't a function, but closer to a switch(typeof(x)) expression.


_Generic can be used to merge a collection of functions.


It should have been called _Overload or something similar, since it's not really a generic.


That's not the same as being able to write a data structure that takes any type.


It's still generic. You don't have to have feature parity with C++ to meet that bar.


That would be like calling function overloading generics.


Are there any publicly-viewable C codebases that make use of pre-C23 auto? When I learned C (2003-ish) auto vs. register was at best a footnote that came up when discussing static.

I agree that you don't need auto pointers in C the way you do in C++. C++ type names can get so cumbersome...much easier to let the compiler figure it out for you.


It could help in generic macros, like this:

  #define SWAP(var1, var2) do { \
      auto tmp = var1; \
      var1 = var2; \
      var2 = tmp; \
  } while (0)
Previously, you would have needed a third macro argument for the type, or you would have needed to do a byte-by-byte swap to be generic.


True, and now we also get typeof() so there is more than one way!

  const typeof(var1) tmp = var1;


Why’s that inside do/while?


If you simply put braces around it, you'd generate syntactically invalid code when you don't use braces for if-else statements.

Consider this:

  if (e)
      SWAP(a, b);
  else
     something_else();
Currently, that expands to this, which is still valid code:

  if (e)
      do {
          // ...macro...
      } while(0);
  else
      something_else();
If it were just wrapped in braces, the code would be parsed like this:

  // One-armed if-statement
  if (e) {
      // ...macro...
  }

  // Empty statement
  ;

  // Another statement
  else something_else();
It's incorrect syntax to start a statement with else, so the compiler will say something like "Unexpected 'else' at line N."


I would not have come up with that on my own. Thanks for the explanation!


No problem! I didn't come up with it on my own, either. I simply read it somewhere, and now I'm repeating it.


https://www.c-faq.com/ question 10.4

"What's the best way to write a multi-statement macro?"


Forces the use of a semicolon:

SWAP(…);


To scope the temporaries?


No, a block would be enough for that. It's to make ; handling natural when the macro is used.


Definitely my least favorite of the new additions. The rest of them are good but I have a feeling auto and typeof on the left hand of an assignment are both going to be considered no-nos in real codebases.

I think they obfuscate things unnecessarily. In C++ it's understandable because with templates you have a tendency to have massively complicated type names.

Everything else looks great to me though.


It standardizes existing common practice. Everyone already has `__auto_type` macros, but nearly nobody was using `auto` as a synonym for `int`.


based on https://en.cppreference.com/w/c/compiler_support/23

even the newest clang does not support many of the c23 features.

gcc13 is much better, only a very few c23 are yet to be supported.

this is a long standing problem with clang/clang++: they're used in many linters and intellisense but they're lagging behind by a lot compared to gcc.


>they're used in many linters and intellisense but they're lagging behind by a lot compared to gcc.

Pure speculation, but I wouldn't be surprised if these two were directly related to each other. Clang/LLVM is more modular, which makes it easy to write new things that integrate it (like a linter), but this can slow down adding new things that change the data model and external interfaces. GCC is the opposite: one big blob that makes it easier to add things since the surface area is lower.


Actually it's a mix of factors: companies enjoy their downstream forks without contributing to upstream, thanks to the license.

Almost all major compiler vendors have migrated to clang forks, and when they do contribute upstream, it's on the backend side, for their platforms, not frontend changes.

Big contributors like Apple and Google have decided to refocus on their own languages, and current language support is good enough for their LLVM use cases.


All of the features described in the article are supported by clang except maybe for the constexpr keyword. By your own list, neither one supports all the features. Also, gcc only supports about 4 or 5 features more than clang, and clang supports a few gcc doesn't. Hardly "much better".


Pelles C for Windows actually supports all (perhaps missing one or two) C23 features, even `#embed`.


No mention of typed enums and nullptr. Small things, but they make the dev experience better.


Interesting to see they are removing some of the quirkier things...

- Remove Trigraphs.

- Remove K&R function definitions/declarations (with no information about the function arguments)


>Remove K&R function definitions/declarations (with no information about the function arguments)

Not used C for many years, but used it a lot earlier.

And had read both the first and second editions of the k&r c book (ed. 1 pre-ansi, ed. 2 ansi).

based on that, iirc, this:

>Remove K&R function definitions / declarations

should actually be:

function definitions / declarations as in the first edition of the k&R C book.


The second edition of K&R distinguishes between them by calling the pre-ANSI declarations old-style functions, and the ANSI declarations new-style functions. Although, since it's not 1989 anymore, they're not exactly new to C either. ANSI C has been around for 35 years at this point, whereas in 1989, C was only 17 years old.


Yes, I knew that, though I did not mention it.

In my previous comment, I was going to say (from memory), that in the second edition, i.e. in ANSI C, both types of declarations are allowed, old style and new style.

Also, if you just wrote the declaration (return value, then function name followed by types with arguments in parentheses), followed by a semicolon, without a function body in braces, it was called a function prototype.

MS C (as in, some version of Visual Studio's command line C compiler), had a flag to generate the prototypes from the function definitions. I had used it some. /Zg, possibly.


Sorry, I wasn't trying to imply that you didn't know that. I meant to suggest that it would be the most unambiguous way to differentiate them, but then realized that the names weren't accurate because C89 declarations aren't new anymore. So, I shouldn't have written the comment anyway.


Not a problem anyway, but thanks for replying :) I see what you mean.


What do you mean about the second one? In K&R functions are explained as

    name (argument list, if any)
    argument declarations, if any
    {
          declarations
          statements
    }
Is this disallowed in C23?


Yes. And `int name() { ... }` is now the same as `int name(void) { ... }`.


Wait, no trigraphs anymore???

This is horrible, I loved using `and` and `not` and `or` in C boolean expressions and to pretend I'm writing Python code. It's fun!


Those aren't trigraphs; those are macros defined in the <iso646.h> header, like this:

  #define and &&
  #define not !
  /* ... and so on */
In C++, they're actually keywords that are built into the language, so you don't need a header then.

When people talk about trigraphs in C, they're talking about the trigraphs listed on this page: https://en.cppreference.com/w/c/language/operator_alternativ...


I find those keywords much more readable, and faster to type (qwerty). I also wish Rust had those instead of the weird symbols. Oh well, c'est la vie.


In Rust you can just use the method call `expr.not()`


While it's true that in Rust most operators (eg. all arithmetic operators, as well as !, &, |, ^) are just syntax sugar over some trait methods that you could call directly, this is not the case for boolean "and" and "or": && and || have to be special due to their short-circuiting properties (in "false && b()" b never gets called, same in "true || b()").


Those are not trigraphs, of course.


Really? The Wikipedia article on trigraphs seems to disagree. Why do you think so?


Looks like that article is a general one on two- and three-character sequences.

In C, "trigraph" specifically refers to nine special sequences beginning with "??": https://en.wikipedia.org/wiki/Digraphs_and_trigraphs#C

They were used to provide an alternate way of typing punctuation characters for keyboards that don't have them (e.g. "??(" is "[").


Why don't they start to support units instead of these ugly header files to finally stop duplicating code? I see this might break stuff but you could implement it side by side with the preprocessor so old code would continue to work.


Walter Bright says he was able to implement modules in C with 10 lines of code. That probably works with C but it'd make C incompatible with C++. Most people think that's reason enough not to do it. I think breaking compatibility with C++ would be like cutting the ropes to a sinking ship to save your own.


> That probably works with C but it'd make C incompatible with C++.

C is not a subset of C++. The simplest example:

> int new = 1;

Perfectly valid C, but not valid C++.

Now, modules would obviously be a bigger break than a few keyword incompatibilities, but at that point you'd also massively increase complexity and basically start creating C++ again, which is probably the actual reason it hasn't happened yet.


Modula-2 had modules in 1978, and it was still a pretty simple language. They don't necessarily imply recreating C++.


If you listen to the C++ people it's obvious they think the incompatible changes are a terrible mistake. They can't undo them but they're adamant no other breaking changes be allowed. They don't want things to get worse for them. Meaning better for everyone else.

And no, modules wouldn't make C more like C++. It'd just make C a bit saner and faster to compile, because modules are complete and don't depend on code compiled before the import. That means you can compile your modules exactly once. In well-written C code bases you could probably just change #include to #import and reap the benefits.


TBF, C doesn't suffer from the "headers cause slow builds" problem of C++, because C headers usually only contain a handful of declarations (e.g. the C stdlib headers are a couple of hundred lines of function prototypes and struct declarations, not tens of thousands of lines of gnarly template code like C++ stdlib headers).


When I've looked at what's going on with gcc, what I see is that most of the code being compiled is header files. It's probably 90% headers and 10% actual code. And then there are all the dependency files it's reading. Though to be fair, I think a lot of that is not so much about making C compile fast as about making C++ compile fast.


Might not be a popular opinion, but I actually like that C headers only contain the public API declaration of a "module" while the implementation and private declarations live in a separate file. Makes it easy to figure out the public API just by reading the header and without having to sift through unimportant implementation code.

In other languages you need IDE support for extracting the public API of a module.


That's orthogonal to the module system. OCaml for example has `.ml` and `.mli` files for the same purpose, but `.mli` can be automatically generated from `.ml` if you want. And that's absolutely nothing to do with OCaml's great module system.


Does the .mli file preserve documentation comments? IMHO the header file should be written for human consumption first. Documenting the implementation isn't by far as important.


The main OCaml implementation doesn't, but I believe there are several third-party tools that automate this. And as you've mentioned, comments in `.ml` and `.mli` do tend to differ, so even the default auto-generation may be useful because it will do as many jobs as possible without a human intervention.


why not introduce an "import" keyword and keep the include keyword for old headers?


If you mean #import, it already exists as a keyword (though the meaning is different between GCC/Clang and MSVC).


I hoped for defer (e.g. "RAII" of sorts), is there any way?


According to https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p23..., it's an "unadopted feature being explored."

There's more information about the proposal at https://gustedt.wordpress.com/2022/01/15/a-defer-feature-usi...


There was an update on this just last week: https://github.com/ThePhD/future_cxx/issues/67#issuecomment-...


I remember back in the day everyone insisted C codebases had to be C89 so people could compile them in Visual Studio, e.g. the Python interpreter. Is this still the case?


Visual C now does C17, minus the C99 features that became optional in C11 like VLAs.


> If you have GCC 23 or LLVM (Clang) 16

gcc 23? That doesn't sound right


probably meant 13


I'm told the C standards committee is actually just the C++ one. Explains a lot.


Quick question for the tinkerers -- is there a language these days that nicely interfaces with C, i.e. able to use all C libraries through FFI, without loss of performance or extra fu?

the language being small enough that all basics can be grasped in a day.

the language being complete that it will remain the same in 10 years.

while at the same time being memory safe.

btw I don't think golang applies given the bad ffi story in go.

--- edit btw:: yeah this implies the use of a GC. though it must not have massive pauses or stop the world GC.


No there really isn't. The closest thing is "Rust after you've already gotten over the learning curve", but I realize that that learning curve is why you phrased your question the way you did.

Zig is aiming for what you're talking about, but it's not yet stable. They're interested in memory safety, but they don't want to add a borrow checker and they certainly don't want to add a GC, so my outsider guess is that they'll end up tolerating memory unsafety and trying to make up for that with debug modes and tooling. I don't really have a sense of what the end result will feel like, and the problems discussed in https://youtu.be/dEIsJPpCZYg make it sound like the basic semantics still have a ways to go.

Go could've been the language you're talking about if they'd given up on goroutines and just used the C stack, but goroutines are arguably the most important feature in the entire language, and it's not clear to me that there's a market for "Go but worse for network services and better for FFI". It would be hard to carve out a niche as a systems programming language that's great at making syscalls but can't realistically implement a syscall.


> No there really isn't. The closest thing is "Rust after you've already gotten over the learning curve", but I realize that that learning curve is why you phrased your question the way you did.

I vigorously deny this claim :-)

There's an easy-to-learn and powerful language implementation, which I mentioned above, that has seamless interop with C (to the point that you can include snippets of actual C code in the source code).


Edit: I'm not familiar with Nim at all, and that might be a better answer than anything I've said here?


I've heard good things about Zig. Interop with C libraries seems to be good, and there is some degree of memory safety (though not, as I understand it, anything like .NET/Java).


Zig's probably GP's best bet. Though its still developing rapidly, it sees production use already, so while it might be radically different in 10 years, it probably won't be gone.

There's also Rust, but that's a little harder.


Also not stable, will change a lot afaik


> quick qn ? for the tinkerers -- is there a language these days that nicely interfaces with c i.e able to use all c libraries through ffi.

Yes. https://ecl.common-lisp.dev/static/files/manual/current-manu...

> without loss of performance or extra fu.

Maybe a small loss, compared with fine-tuned C.

> the language being small enough that all basics can be grasped in a day.

The basics, certainly. You can learn the basics of Lisp syntax in about twenty minutes, the basics of looping, conditionals, datatype definitions, function definitions and all basic stuff in about a day.

> the language being complete that it will remain the same in 10 years.

Mostly, yes. ECL has mostly been the same for the previous 20 years. I see no reason that it would change substantially in the next 20.

> while at the same time being memory safe.

Caveats apply here, due to how deeply ECL can hook into C code. Even if you're doing weird things in the C code, it's unlikely you'd accidentally run into problems.

> btw I don't think golang applies given the bad ffi story in go.

What bad ffi story? I've never tried to use the FFI in go, but I haven't heard particularly bad things about it. Of course, that could be because most Go programmers aren't using the FFI anyway.


My experience with CGo is limited, but I think the reasons it has a bad rap are that 1) the performance overhead is costly in some cases, because you have to switch stacks(?), and 2) the way it's integrated with the language and the build is super hacky, which makes complicated cases more complicated.


> while at the same time being memory safe.

lua? luajit's got great ffi.

If you meant a compiled language then you can write code that is memory safe in C. You can even run tools against your code to measure this in various ways.

All "memory safe" languages that compile to lowest level ISA code are just memory safe "by default" and all of them necessarily offer escape hatches that turn it all off.

No "memory safe" compiled languages offers memory protections beyond what the operating system provides. If it's within the memory space of the process you can access it without limitation. "Memory safety" can reduce your exploit surface but it can't eliminate it out of an incorrectly designed program.


There isn't a binary "memory safety" choice for a given language, but C is way off on the wrong end of realistically promoting the writing of memory safe code.


Yea, but at the end of the day, your software either has memory safety bugs or it does not. Presuming that the choice of language has the largest impact on this outcome is modern folly, I suspect.


I doubt your second sentence. Considering the prevalence of memory safety bugs in C/C++ code, when both Firefox and Microsoft report a figure of ~70% of bugs involving memory safety, generous deployment of tests, fuzzing, sanitizers, and linters entails a mix of not being a silver bullet and being cumbersome to use. On the other hand, many programs written in Rust, Go, Kotlin, etc. won't need escape hatches at all and likely won't have any memory safety bugs.


It's highly debatable, I'll give you that.

The problem with your point is that it's single ended, because the scale and scope of deployed Rust or Go software has not matched that of C/C++ software. We also don't have particularly good data on how many memory safety bugs are in a project with good deployment controls versus ones that aren't.

We also don't know how well tested any of the vulnerable Microsoft code actually was and so I'd be wary of drawing any broad conclusions across languages from that simple statistic. It's also likely self reported and not likely to be rigorously gathered for this type of analysis.

The fact that there's such a large difference between C/C++ projects with respect to historically discovered vulnerabilities to me suggests that it can't be down to the language but how the project deployments are engineered.

You're one unnoticed checkin of an "unsafe" construct in any of these languages away from having the dreaded memory safety vulnerability introduce itself into your project. Even worse, you could have a crate that has an unsafe block you didn't previously call, but a new checkin now calls this extant and disregarded method. So, what do you do? The language hasn't done anything for you here. Use an analysis and/or fuzzing tool? So, how are we anywhere different because of the language?

Even for Go, a language I love quite a bit, if you forget to synchronize shared maps with simultaneous reads and writes you're in for a panic, and possibly real safety bugs. The GC and the fact that "unsafe.Pointer" are "slightly hard" to use isn't a huge attribute as it leaves entire classes of bugs on the floor with the tines pointed straight up.


I think there is a strong culture in the Rust community to use `unsafe` carefully, somehow fostered by the overt advertisement, features, and/or design philosophy of Rust. The infamous actix-web debacle suggests that Rust users tend to be overzealous about avoiding `unsafe`, even. I think the design philosophy plays a big part to develop the culture: `unsafe` is less seen as the thing that only super hacker wizards use, and more as a tool that should be used judiciously but still out in the open.

So I suppose it's not literally just the Rust language itself, but given the context of Rust's development, there seems to be an intertwined culture that was more likely to arise than not.


Nim [0][1], but will obviously evolve in 10 years. Even Java will likely deprecate some things in that time... Nim allows importing C/C++ OOTB.

[0] https://livebook.manning.com/book/nim-in-action/chapter-8/60

[1] https://github.com/nimterop/nimterop


You don't want nimterop, you want futhark (https://github.com/PMunch/futhark).

The C FFI Nim library lineage goes c2nim --> nimterop --> something i forgot --> futhark.


> is there a language these days that nicely interfaces with c i.e able to use all c libraries through ffi

> the language being small enough that all basics can be grasped in a day.

That would be Zig. You can directly import C headers and compile C code with the Zig compiler and also cross-compile C code without requiring a separate compiler toolchain.

Currently it's also possible to compile C++ and ObjC (but not import C++ or ObjC headers), but that functionality will probably be delegated to a separate Clang toolchain in the future.

> the language being complete that it will remain the same in 10 years.

...that will take a while (also depending on whether you consider the stdlib part of the language or not).

> while at the same time being memory safe.

...that wouldn't be Zig then ;) (TBF, Zig is much stricter than C or C++, which helps to avoid some typical memory corruption problems, but it's by far not as watertight as Rust when it comes to static memory safety. Zig does have a couple of runtime checks, like array bounds checks, but dangling pointers are still a problem and are only caught at runtime via a special allocator.)


Memory safe and easy implies a garbage collector to me. Unfortunately, garbage collection and C libraries easy to use is at odds.

In other words: Memory safe, easy to learn, easy C-FFI? Pick two.


They also wanted "without loss of performance" (compared to C, I think, based on the context), which also has tension with the other requirements. I don't think it's possible to make a memory-safe language that has easy C FFI with no overhead and doesn't require you to think about C stuff. The esoterica of C have to be addressed somewhere.


You can have fairly cheap (but not entirely free) runtime memory safety at interface boundaries in any language that supports arrays via "tagged index handles" (basically weak references which protect against dangling access). To be efficient this requires a specific module design philosophy though (you basically want to avoid converting between a handle and a pointer for each memory access, only at the interface boundary, and interfaces should be designed so that they avoid "high frequency functions" with handle parameters).

Interestingly, this approach is also somewhat popular in Rust to work around borrow checker restrictions.

For instance see:

https://floooh.github.io/2018/06/17/handles-vs-pointers.html


The instant that you have hassle-free FFI, it seems like you've given up on memory safety. Heck, just yesterday I got a segfault in a Python project because a library that I pulled in was just a wrapper for a faulty C module.

You can have all the memory safety in the world within the bounds of your own language, but it mostly gives you an illusion of security if the common pattern in the community is to just wrap C libraries. Having FFI be a bit of a hassle can actually go a long way towards shaping the community towards stronger memory safety.


D

[x] nicely interfaces with c

[x] the language being small enough that all basics can be grasped in a day

[x] the language being complete that it will remain the same in 10 years (the 1.0 release was 23 years ago)

[x] while at the same time being memory safe.


All points are subjective and many people would claim this is false for all bullet points.

> [x] nicely interfaces with c

Unfortunately, that's only if you can avoid D strings. Otherwise, you'll need to use toStringz, which copies each string to ensure it has a null terminator.

> [x] the language being small enough that all basics can be grasped in a day

I'm still learning new stuff after several weeks using D. The basics are indeed simple, but there's A LOT of stuff in D.

> [x] the language being complete that it will remain the same in 10 years (the 1.0 release was 23 years ago)

D is evolving, slowly, but evolving. With the new ideas about making parts of the stdlib GC-free and the borrowing concepts being slowly introduced, the language is changing... people are putting a lot of pressure on adding new "cool features" from other languages, like the recently accepted string interpolation proposal. It will not be the same in 10 years, but it's true that most of it will be unchanged.

> [x] while at the same time being memory safe.

D is not memory safe by default, you need to use `@safe`, which is annoying because currently a lot of the stdlib is not `@safe` (but it could be!). It's true it's much, much harder to mess up in D than in C, but compared to Rust, I think it's quite unsafe (which is why it's introducing borrowing, to catch more unsafety bugs).


> Unfortunately, that's only if you can avoid D Strings. Otherwise, you'll need to use toStringZ which makes copies of each string to ensure they have a null terminator.

There's nothing that would prevent them from using C strings. Since they want a safe language, I doubt this would be a reason they don't want to use D.

> I'm still learning new stuff after several weeks using D. The basics are indeed simple, but there's A LOT of stuff in D.

This is a bad criterion, because there really is no language in 2024 whose every feature you can learn in a day.

> D is evolving slowly but evolving.

Again, a bad criterion. Even C is evolving. The main complaint about D is that it isn't evolving fast enough, with too much emphasis on avoiding breaking changes. If they implement editions, it actually would work as OP wants, because code that compiles today will compile forever in the future.


C strings are just...bad, bad, bad. C++, D, Zig, Go, Rust have ALL decided to not use C strings as their primary string type. But converting is as simple as a function call. (And literals are null terminated)

What features of D are deep?

A feature like string interpolation is modest syntax sugar. The kind of change that takes 40 seconds to understand.

> Rust

Bwhahahaha, if D is not small or simple, IDK what to call Rust.


while at the same time being memory safe.

memory safety doesn't mean just one thing, but probably it requires either a lot of rust-like features, a tracing garbage collector, or automatic reference counting.

the language being small enough that all basics can be grasped in a day

that disqualifies taking the rust-like path.

able to use all c libraries through ffi. without loss of performance or extra fu.

that disqualifies most (all?) advanced tracing gc strategies

it must not have massive pauses or stop the world GC.

that disqualifies simpler tracing gc strategies

depending on what precisely you're looking for, it's possible it might be a pipe dream. but it's also possible you'll find what you want in one of D, Nim or Swift. Swift is probably the closest to what you want on technical merit, but obviously extremely tied to Apple. D and Nim saddle you with their particular flavor of tracing gc, which may or may not be suited to your needs.


If you're willing to tolerate a GC, Gambit Scheme may be the closest. You can learn the basics of Scheme in a short time (but any programming language will take time to master). The C FFI is also pretty straightforward to use, but developing complete Scheme interfaces to large C libraries can get tedious.

Gambit is also reputed as the second fastest Scheme compiler out there; only Chez Scheme produces faster code.


> while at the same time being memory safe.

Whoa! Well, maybe... Ada?

https://learn.adacore.com/courses/intro-to-ada/chapters/inte...

Edit: Also not yet mentioned: Julia: https://docs.julialang.org/en/v1/manual/calling-c-and-fortra...

Or guile: https://www.gnu.org/software/guile/manual/html_node/Dynamic-...

... Or ruby! (But by now we're solidly in the land of "all languages connect with C"):

https://github.com/ffi/ffi


No!!! [German in the original: "Nein!!!"]


Zig


LuaJIT


Try V https://vlang.io/

One caveat is that it's still being heavily developed.


Maybe rust if you ignore all the stuff you think looks complicated.


Vala?


> typeof

What was wrong with "decltype"?

Seems utterly bizarre given that the rest is verbatim copypasta from C++, even the attribute syntax that sticks out like a sore thumb in C, and also given that auto and decl.. I mean, typeof, serve no useful purpose in C.



Because it has different behavior than decltype. (Isn't that why they decided to call it decltype in C++ in the first place? To show that it has different behavior than the GCC extension typeof?)


IIRC it has slightly different behavior, and most compilers already support typeof in C.


Apart from having different behaviour than decltype, typeof is also a much better name IMHO (and GCC had typeof as language extension probably even before C++11 added decltype).


ckd_add(), ckd_sub() and ckd_mul() are what I'm looking forward to using.


Why no std::string?


Because std::string is pretty much the worst way one could implement a string type. Sometimes it's better to not have a feature than to have a broken feature.


Could you point me to implementations that you believe are good? Much appreciated.


Memory allocation, interoperability with existing libraries. Would require a huge new API and tricky design choices.


"slightly better" - nah, just as mediocre. Let me tell you, nobody stuck with C is going to bat an eyelid at this standard. They're going to continue using C89, or maybe C99 if they're lucky, for whatever reasons that justify it.

People who casually write C don't really care for C23 since it fixes all the wrong things. Nobody really wanted C+ (i.e. something slightly closer to C++ than before) which is basically all this standard achieves.


I write a lot of C code (https://github.com/floooh/) and I'm actually looking forward to some of the changes in C23 (depending on when MSVC will implement them, because most of the changes I'm looking forward to already exist as non-standard language extension in GCC and Clang, but not in MSVC).


Given how long it took MSVC to support anything above the bare-minimum C needed for C++ interoperability, I wouldn't hold my breath.


That's because MSVC had basically abandoned C in favour of C++. In recent years they reverted that stance again though.


Huh? I use the cleanup attribute, for instance, which is a real game changer.

Nobody wants that? Everybody who knows it exists wants that.


That's not a C23 feature. It's a non-portable compiler extension which has existed for a _long_ time.


Ha? Maybe I just woke up in 2023 and got confused :/

Thank you for the correction


C++ is already a better C.

So much so that new revisions of C are just backporting features at this point.


C++ is a better C in the same way that a Cybertruck is a better bicycle.


Nah. More like C is a bunch of hand tools, and C++ is those same hand tools, plus a bunch of power tools. Sure, you could just use the hand tools, and hey, maybe they even give you a better sense of what you're building at a low level, but it's exhausting, and it turns out the power tools are really useful.


Some C++ features are great, as recently discussed on the Linux kernel mailing list. However, for many of C's use cases they are not appropriate, like exceptions and RAII in embedded. Once you start disabling major features, those 'power tools' become far less attractive. Also, to extend the metaphor, hand tools are usually obvious and easy to use while power tools can have very long manuals. See for example the 275-page book on just initialization in C++ [0].

[0] https://www.cppstories.com/2022/cpp-init-book/


How is RAII a problem on embedded platforms? It is basically about scoped cleanup, something you would otherwise have to do manually.


Not sure what was meant, but what comes to mind for me, having dabbled in embedded, is that in C it's painfully clear how long "objects" live and when and how you tear them down. In C++ stuff can easily "happen" at an inopportune moment.


Same in C++: happens at the end of the scope. You precisely control where that is.


I guess some people are happy with a macro Assembler with a better syntax than MASM/TASM/NASM/yasm high level macros.


funny, but false.

the only sense in which the relationship between C & C++ is like that of a cybertruck to a bicycle is the one in which the latter are "transportation devices" and the former are "programming languages". there is no particular feature of a bicycle represented by a cybertruck other than "it gets you somewhere".


Yes, that's literally the point I was making.


Sure, but it's wrong. There's a ton of features of C present in C++. You might not like the C++ context they are present in, but it's nothing at all like the cybertruck/bicycle (non)relationship.


C++ and C are no longer used for the same types of projects so C++ is not a better C.


I think you must be from the 90s -- if so can you share your time machine?

All kinds of major projects switched to C++ already, for example GCC. All the major new projects, for example LLVM, are also in C++ from the get-go.

Even the Linux kernel is considering switching to C++ now.


A weird way to spell "Rust".


I giggled -- thank you.


> Even the Linux kernel is considering switching to C++ now.

considering doesn't mean it will happen in the near future, because the toolchains/ecosystem are not there; the same applies to many other projects.


I'm not aware of any toolchain limitations; making the kernel compile as C++ is doable with a few trivial patches, and the kernel already is heavily GCC-specific, which supports both C and C++ to the same level.


software is complex, and you can't be sure until you run it in mission critical production.


With the same argument you couldn't upgrade the compiler at all.

Thankfully the Linux kernel development relies on testing instead.


I have my doubts about C++ ever making it into the mainline kernel while Torvalds is the benevolent dictator, especially since there is Rust. I agree with the rest of what you said though.


> Even the Linux kernel is considering switching to C++ now.

Source?



I wonder if Linus has changed his views on C++ too in the mean time.

http://harmful.cat-v.org/software/c++/linus



No they’re not


Until C fixes its string and array story, C++ keeps being a better C.


I just use bstrlib... It's solid and works perfectly.


It is not part of ISO C and that is a big difference.


'just backporting [good] features' doesn't sound that bad when c++ is widely considered the most bloated mixture of the good, the bad and the ugly that there is.

Maybe the c++ killer will turn out to be just some future, modern version of C...


C++ is a different animal of a language. It is about compile-time meta-abstractions that generate a ton of invisible code under your feet. It has complex expression semantics and a crowded syntax space.


but this post is about C23 and probably should stay on that topic? Everyone who uses C knows you can do it in C++, but either doesn't want to introduce the extra complexity, works with older code bases, prefers the "simplicity" of C, or has an organizational requirement. Personally, on embedded I like C++, especially RAII, smart pointers, containers, and classes, but I completely avoid RTTI, exceptions, and complex template metaprogramming (by me). That subset is easy to keep in my head and in my projects. However, some clients want C, and that's what I give them.




