I wrote about a similar thing a while ago [1], and the conclusion I came to then is that you cannot compete with what is, in most languages, a single-character token "+" that does the wrong thing. (I justify my claim that it does the wrong thing in the linked essay.)

There are, as any number of other comments on this topic are pointing out, a huge number of "correct" ways to do it that exist today. None of them are as simple as

    html = "<p class=\"" + params["post_type"] + "\">"

and they can't be, because the only thing that beats a one-token string append is a zero-token string operation... which, to the extent programming languages have one, is also always an append.

(Before you say your way is simpler than that... you need to include the setup code. That has no setup code; it is built into the language. If you so much as type "import " followed by anything you've already exceeded that in complexity, let alone if you add documentation you have to read for so much as 15 seconds or anything like that. It's not just about character count at the payload location.)
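To make the comparison concrete, here is a minimal Python sketch of the concatenation above next to one of the "correct" versions. The hostile value in params is made up for illustration, and html.escape from the standard library stands in for whatever escaping your stack actually uses. Note that the safe version already needs an import and a function you have to know about.

```python
import html

# Hypothetical hostile input: closes the attribute and injects a script tag.
params = {"post_type": '"><script>alert(1)</script><p class="'}

# The one-token version: the value lands in the markup unescaped.
unsafe = "<p class=\"" + params["post_type"] + "\">"

# One "correct" version: escape the value (including quotes) before appending.
safe = "<p class=\"" + html.escape(params["post_type"], quote=True) + "\">"

print(unsafe)
print(safe)
```

The first line printed contains a live script tag; the second contains only entity-escaped text inside the class attribute.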

To fix this problem, first "room" has to be made to make the right thing easier than the wrong thing, and that involves the counter-intuitive requirement of not building the broken operator right into the language. Simply jamming two strings together must become more complicated than it is.

It also suggests that there ought to be something about the process of concatenation that raises alerts to a human reader, something like the minimum concatenation operation being

    format_string("%RAW;%RAW;", str1, str2)

or perhaps "%UNSAFE;" or something.
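As a rough sketch of what that could look like, here is a toy format_string in Python. Everything here is hypothetical: the %RAW; and %UNSAFE; directives follow the spelling above, while the %HTML; directive and the error handling are my own assumptions, added only to show the raw path sitting next to an escaping one.

```python
import html
import re

# Hypothetical directive table. %RAW; and %UNSAFE; insert the argument as-is;
# %HTML; (an assumption for this sketch) escapes it for HTML, quotes included.
_DIRECTIVES = {
    "RAW": lambda s: s,
    "UNSAFE": lambda s: s,
    "HTML": lambda s: html.escape(s, quote=True),
}

_PATTERN = re.compile(r"%([A-Z]+);")
_MISSING = object()


def format_string(template, *args):
    """Replace each %NAME; directive in template with the next argument."""
    remaining = iter(args)

    def substitute(match):
        name = match.group(1)
        if name not in _DIRECTIVES:
            raise ValueError(f"unknown directive %{name};")
        value = next(remaining, _MISSING)
        if value is _MISSING:
            raise ValueError("more directives than arguments")
        return _DIRECTIVES[name](value)

    result = _PATTERN.sub(substitute, template)
    if next(remaining, _MISSING) is not _MISSING:
        raise ValueError("more arguments than directives")
    return result


# Plain concatenation now has to announce itself as raw.
joined = format_string("%RAW;%RAW;", "foo", "bar")

# The escaping directive is no longer than the raw one.
tag = format_string('<p class="%HTML;">', 'a"b')
```

The point of the design is that the raw and escaped directives cost the same to type, so the unsafe choice is at least visible to a reader instead of being the default.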

But I don't expect to see that happen anytime soon. Should a new language even try it, it would simply become the number one complaint people have about using the language ("omigosh this language is so broken you can't even concatenate strings" "wow thanks for that now I know never to try it out"), and a newborn language can't afford that.

We need to come up with something, though. Historically, the #1 security issue has been memory safety. It may still be (such things are hard to measure), but if it is, it's hanging on by a thread. String handling is rapidly taking over the #1 security slot, and it's probably going to be a similarly multi-decade adventure for the programming community to figure out what to do about it. I do know that just as we eventually realized "git gud" was not a solution to memory safety, we're going to realize "git gud" is not a solution to this problem either. I am pretty gud, have a deep understanding of the issue, and dealing with it is still like walking through a minefield for me.

I have no idea how to fix this, though, even in theory, without "making room" of some sort and making string concatenation fundamentally a harder operation. (And honestly, most programmers would simply bash together a "string_append" function anyhow, call it "sa" or something, and just bypass the attempt at security.)

[1]: https://www.jerf.org/iri/post/2942/


