Sir, Please Step Away from the ASR-33 (2010) (acm.org)
148 points by goranmoomin on June 27, 2021 | hide | past | favorite | 278 comments


> And need I remind anybody that you cannot buy a monochrome screen anymore? Syntax-coloring editors are the default. Why not make color part of the syntax? Why not tell the compiler about protected code regions by putting them on a framed light gray background? Or provide hints about likely and unlikely code paths with a green or red background tint?

I'm colorblind, please never do that. Syntax highlighting is fine as an additional aid, but giving color semantic weight, so that pink code and violet code run differently, would be hell for me.

Edit: something else: color looks different depending on the computer it's displayed on. Consider how many complaints I already see here about people designing UIs for extra-wide monitors while most people are on 15-inch screens; this would be terrible. Don't even get me started on arguments about "is this blue, blueish green, or green?". Colors are far more subjective than people seem to think.

> For some reason computer people are so conservative that we still find it more uncompromisingly important for our source code to be compatible with a Teletype ASR-33 terminal and its 1963-vintage ASCII table than it is for us to be able to express our intentions clearly.

And no new letters have been added to English or French lately; they seem to be doing just fine. Typing new symbols would be hell on keyboards not designed for them.


I can and have resurrected busted systems with a serial interface, sed, and ed. This is not fun as it is. It would suck much more if each line of code depended on out-of-band context, so that its meaning wasn't obvious from the text alone.

I see negative value in representing code as anything other than text. Every time I've seen an entity try to do this, I've seen programmers come up with a text-based alternative that compiles to whatever janky format the other party thinks is clever this month.


Lots of e-ink screens are monochrome. There was recently a thread here about a big one that seemed intended for desktops. I often use my phone in monochrome for various silly reasons.

I like having my code coloured, but if the colour was part of the code I'd lose the control I currently have. I'm not against that in principle, but if it was done in a way I didn't like it would really put me off.

The best uses, imho, like rainbow brackets, are applied after the fact. It would be a nightmare to have to match red brackets to red etc.


Be glad you don't code in one of the new age async languages. They love coloring their functions.


Yeah, I'm a blind coder and having color matter like this does not sound like fun.


blue int end blue red main end red green open parentheses …


Colorforth not only does this but solved the colorblind problem by switching fonts.


I don't think colorforth "solved the colorblind problem", considering it has almost no adoption. Does anyone except Chuck Moore use it? It has a fallback for colorblind people, which is better than nothing but is still terrible.

The current way, plain text with everyone free to use syntax highlighters that change the color, font, or anything else about the text, is fine and works very well. I've heard about using an AST as the source of truth, and I think that could prevent syntax mistakes while still letting people edit plain text as they wish (it could also simplify visual programming tools, and maybe unlock a smooth transition from low-code tools to regular code).


If you call something terrible, presumably you have a better solution in mind for comparison.


> Syntax highlighting is fine since it's another way to help, but color having some kind of importance so that pink code and violet code run differently would be hell for me.

It's not a colorblind thing; that's just a terrible idea even if you nominally can distinguish the colors. (Citation: personal experience.)


"And no new letters have been added to English or French lately."

Counterpoint: €


That's a symbol that refers to "euro" or "euros", the currency. "euro" or "euros" are still written with our lovely 26 letters.


OK, a symbol, not a letter. But it is still commonly used in everyday work, much like $ or £. People who do accounting and the like would be very unhappy if they could not produce it on a computer keyboard.

So, in practice, you need to support it and it is a relatively new addition. Most other symbols are 200+ years old.


Further to this point: even setting aside the subjective warts, the composability of colors would cause a lot of confusion. Blue-green, or green-blue?


No new symbols in English? How about emoji?


I think emojis are mostly a passing trend, much as people at one point loved to decorate the first letter of a book, article or paragraph. They solve a problem (showing the emotion on your face when all you have is written text), but they are only one of many possible solutions and will probably be replaced some day.


The Unison language stores code as a syntax tree and they're planning to support multiple alternative syntaxes for the language.

https://www.unisonweb.org

Unison’s core idea is that code is immutable and identified by its content. This lets us reimagine many aspects of how a programming language works. We simplify codebase management — Unison has no builds, no dependency conflicts, and renaming things is trivial.

Some blog posts by the author of the language describing a bit of background

https://pchiusano.github.io/2013-05-22/future-of-software.ht...

https://pchiusano.github.io/2013-09-10/type-systems-and-ux-e...


This is fascinating. Thank you for linking to this.


Identified by its content how?


Hashing, like in Git

I guess built-in things get fixed names / hashes, and the hashes of everything else follow from that.

They obviously need a mapping from content hashes to human-friendly names. I'm not sure they gained anything here, but they seem to think so, so it might be worth checking out.


Sounds like C++ name mangling.


No, because the mangled names still depend on the names ...


Not sure I get the distinction. Hashes "depend" on the name as well. You perhaps can't (easily) unmangle, but it's still a derivation.


From what I recall from looking at Unison not long ago, the hashes do not depend on names. `f x = x * x` and `g y = y * y` both have the exact same hash and are considered to be the exact same function. Any call to `f` or `g` will be stored as a call to their hash, so `f2 x = f (x + 1)` and `g2 x = g (x + 1)` would also have identical hashes.

This does appear to pose a small issue for types, since isomorphic types are identical regardless of their names, but I believe they have a way to attach a unique id to enforce a distinction.
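A rough sketch in Python of that kind of name-independent content addressing (this illustrates the idea only, not Unison's actual hashing scheme; `normalize` and `content_hash` are made-up names):

```python
import ast
import hashlib

def normalize(expr_src: str) -> str:
    """Rename variables to positional placeholders so that
    alpha-equivalent expressions normalize to the same string."""
    tree = ast.parse(expr_src, mode="eval")
    names: dict[str, str] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            node.id = names.setdefault(node.id, f"_{len(names)}")
        elif isinstance(node, ast.arg):
            node.arg = names.setdefault(node.arg, f"_{len(names)}")
    return ast.dump(tree)

def content_hash(expr_src: str) -> str:
    """Identify an expression by a hash of its normalized form."""
    return hashlib.sha256(normalize(expr_src).encode()).hexdigest()[:12]

# `f x = x * x` and `g y = y * y` are the "same" function here:
assert content_hash("lambda x: x * x") == content_hash("lambda y: y * y")
# ...while a genuinely different body hashes differently:
assert content_hash("lambda x: x * x") != content_hash("lambda x: x + x")
```

In the real system, calls are stored as references to these hashes, which is why renaming never changes any hash.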


Ah, that makes it more clear.


I am not really familiar with Unison, but I presume it works like this: Only the code itself is hashed. The then-current name is then added as an annotation. When you refer to a name, the language compiler looks up the hash currently associated with that name and stores the hash instead of the name. When code is rendered, the name associated with the hash at render-time is looked up and displayed. That way if the name is changed, all of the renderings generated after that automagically use the new name.

It's actually quite a clever idea.
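A toy Python model of that split between immutable, hash-addressed definitions and a mutable name table (all names here are invented for illustration; Unison's real storage is more involved):

```python
import hashlib

def h(code: str) -> str:
    """Content address: hash of the definition itself."""
    return hashlib.sha256(code.encode()).hexdigest()[:8]

codebase: dict[str, str] = {}   # hash -> immutable definition
names: dict[str, str] = {}      # human name -> hash (mutable metadata)

body = "x -> x * x"
codebase[h(body)] = body
names["square"] = h(body)

# Renaming only touches the name table; the definition (and any
# stored call site, which refers to the hash) is untouched.
names["sq"] = names.pop("square")

assert codebase[names["sq"]] == body
```

Rendering then just substitutes whatever name the hash currently maps to, which is why every rendering after a rename automatically shows the new name.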


You presumed right, that's how it works ;)


I don't know why we keep reinventing this distribution model and failing at it, then trying again.

Bugfixing and identity matter.


Uh, could you explain more? Your comment doesn't really give any clues about what you are complaining about.


>> For some reason computer people are so conservative [...]

Well, one of the underlying reasons for the lack of imagination might be... keyboards. If keyboard keys were small e-ink displays, easily configurable and accessible by programs, programmers would have come up with a lot of interesting stuff already. We do it with function icons in regular interfaces. If we could integrate with keyboards, we'd definitely take advantage of it.

Now, there might be many more reasons. The article also mentions subroutines displayed horizontally and other stuff. That could definitely be done too, but... while we aren't there yet, many interfaces definitely make good use of horizontal screen space.

The main problem is that to do any of these, you kinda require coordination beyond the scope of solving a single technical problem. Unless the right hardware is available to enough people, custom symbols and keys and whatever would only work experimentally. And it would be a worthy experiment, but developing a language is already enough work to also have to add a custom revolutionary IDE to the mix, in the context of experimentation. In the current economic system, when the path to market is long and unclear most good ideas die anonymously.


> If we could integrate with keyboards, we'd definitely take advantage of it.

From my own personal experience this is extremely true. A while back I made myself a custom keyboard [0] which can enter lots of characters, mostly for linguistic tasks. I didn’t intend to start using it for things outside linguistics, but before long I was using it everywhere — and my inventory of available characters expanded correspondingly. I started to use curvy quotes and em/en-dashes in all my writings (even in this post!), and — more topically — I started using Unicode symbols in my programming, when possible. I don’t use them too much, mostly because there’s little need for them, but in scientific tasks it’s really useful to be able to type e.g. ‘λ’ instead of ‘wavelength’. I predict that as Unicode symbols become easier to input, programming languages will indeed start to utilise them more — the limiting factor is keyboard layout. (Indeed we can already see the start of this process in Julia, Raku and Haskell.)

[0] https://github.com/bradrn/Conkey


See also the Compose key: https://en.wikipedia.org/wiki/Compose_key.

I started using a Compose key under Linux five or six years ago and have progressively accumulated fairly extensive customisation in my ~/.XCompose (e.g. Compose+;+; = ‘, Compose+"+" = ”, Compose+"+` = ″, Compose+z+Space = ZWSP, Compose+Space+' = NARROW NO-BREAK SPACE, Compose+++1 = THUMBS UP SIGN, Compose+-+-+= = −, Compose+l+* = λ). Some are of my own devising, and some (like the Greek letters) are copied from Vim's digraphs, which I used regularly before setting up a Compose key. I consistently type exactly what I mean. (If I type a straight quote, I meant a straight quote.)
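For anyone curious, ~/.XCompose entries look roughly like this (the sequences below are illustrative examples, not necessarily mine):

```
# ~/.XCompose
include "%L"                            # keep the system default sequences

<Multi_key> <l> <asterisk>            : "λ"
<Multi_key> <minus> <minus> <equal>   : "−"   # U+2212 MINUS SIGN
<Multi_key> <quotedbl> <quotedbl>     : "”"
```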

My last laptop ended up being a Surface Book; had WinCompose not existed, I wouldn’t have been willing to shift to Windows.


Check out https://docs.raku.org/language/unicode_ascii for an overview how Raku supports Unicode. And an article about its implementation from 2015: https://6guts.wordpress.com/2015/04/12/this-week-unicode-nor...


For anyone on Vim, you can insert lots of special characters by hitting Ctrl+K: =3 is ≡, -> is →, l* is λ, M- is —, etc. You can also run :dig! to see a list of all digraphs, and :help digraph for information on how to define your own.


And for those on Emacs, you can change the input method. The one I use most is the TeX one, so I can type \sum and the like and have it translated to Unicode immediately.


You can also use ^VuXXXX to insert characters by the hex Unicode value. I find that handy sometimes.


And you can put %B in your status line to show the Unicode value of what's under your cursor.


If I see a character I can't enter in someone's code, I think I'll run away. It's cool and all for your code, but sooner or later someone else is going to have to deal with it.


I think it matters who your audience is.

If your audience is mostly programmers then just use `*`

If your audience is physicists or mathematicians then `×` or `·` may be a better fit.

When turning a math expression into code, it is often handy if the code looks a lot like the expression. Even better if you can just copy and paste it. It's hard to translate an expression wrong if you aren't translating it at all.


I agree, which is why I avoid Unicode for anything someone else will need to type in, or at least give ASCII alternatives when possible. I use it mainly for things like local variable and parameter names.


> If keyboard keys were small e-ink displays, easily configurable and accessible by programs, programmers would have come up with a lot of interesting stuff already.

I have a QMK based keyboard. The most interesting thing I did was to have an APL layer on my keyboard. I wouldn't say it's "a lot of interesting stuff"


> Well, one of the underlying reasons for the lack of imagination might be... keyboards.

That and the fact that it would probably take me longer to search for whatever obscure mathematical symbol is supposed to represent <whatever> than it would take to just type out its name.

And ASCII is just easier to deal with.


Another problem with Unicode is that far too many of its characters look too similar, or even identical.

Then there are the Unicode symbols that convey semantic information rather than just the appearance. These are an abomination and should never have been put in Unicode, but here we are anyway.


Don't forget different Unicode code points for the same character, like U+4B/U+212A(/U+39A/U+41A/...).
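This is easy to demonstrate with Python's standard unicodedata module (a small probe, nothing more):

```python
import unicodedata

# U+004B LATIN CAPITAL LETTER K and U+212A KELVIN SIGN render identically
# but are distinct code points; NFKC normalization folds the Kelvin sign
# back into the plain letter.
k, kelvin = "\u004b", "\u212a"
assert k != kelvin
assert unicodedata.name(kelvin) == "KELVIN SIGN"
assert unicodedata.normalize("NFKC", kelvin) == k

# Cyrillic К (U+041A) and Greek Κ (U+039A) look the same too, but stay
# distinct even after NFKC, so identifiers using them never unify:
assert unicodedata.normalize("NFKC", "\u041a") != k
assert unicodedata.normalize("NFKC", "\u039a") != k
```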


If keyboards were only usable like that, we would have a problem of accessibility of the technology even worse than what it currently is.


> If keyboard keys were small e-ink displays, easily configurable and accessible by programs, programmers would have come up with a lot of interesting stuff already.

The touchbar on my M1 Macbook Pro does this. It makes using Emojis a lot easier. I incorporate emojis into my languages more and more nowadays.



The best thing about using only ASCII for the main syntax is that everyone can type it with their keyboard. I think the recent fad of supporting Unicode identifiers is misguided. Of course, Unicode should be permitted in string literals, but not in the code itself.

*Also I don't think Go is better than modern C++, though it might be better than C++98, which was the current standard when the article was written.


Having to do math- and physics-heavy work, I very much support the "fad" of Unicode identifiers. "omega_0" is much worse than "ω_0", especially if you have tons of these variables.

If you don't do math it's difficult to appreciate, but try expressing the same concepts without the alphabet. Possible, but clunky.


I have a degree in math and I disagree with this. ω_0 definitely looks better than omega_0, but it is much harder to enter unless you have a special keyboard setup.

Suppose you write a function that uses a variable ω_0, and somebody else wants to change it. Unless they have the same keyboard setup as you, they will have to copy-paste it everywhere. And what if you have several variables with special names?


Okay, so we need better keyboards then. If a character is annoying to type, that is a keyboard problem, not a language or charset problem. So let's improve the keyboard. Currently we are stuck with the garbage qwerty keyboard (and various slightly better permutations of it) which doesn't even let you type all of ascii. Computers should just do what humans want. If you see a character, you should just be able to type it. It's absurd that cpus can do billions of ops per second but you can't just type an omega. Absolutely shameful, really.

You should dream bigger. A better world is possible but only if people really want to throw out all the shitty designs from the ancient past


> If you see a character, you should just be able to type it.

How should we do this?

I am imagining that we have a "char-bank" that lives on a little touch display at the left of my keyboard. I can add chars to my bank by highlighting them and pressing a 'yank' key. I can scroll up and down through my charbank and type the chars with a tap or rearrange them with touch-and-hold similarly to apps on a phone's homepage.

If I want to map a char to my keyboard, I can press and hold a "map" key, tap the char, and press the key I want to map it to. If I want to map the char directly from the screen I highlight it and press the map and yank keys together.


Some other people in this thread suggested having a keyboard with little e-ink displays on the keycaps. Then the character map could be completely user-defined. Then you could just have a big bank of keys off to the side to select your desired set of characters, i.e. greek, math, emoji etc. You can already sort of do this in software, but you have to manually remember the location of every character in each extra mapping. But with displays on the keycaps, you could have as many different mappings as you want. And that's how you would be able to type in a sane world


Well damn. I don't have a dog in this fight but boy oh boy do i want that keyboard now. The possibilities of every key being changed to match context sounds awesome. Especially since i love modal editors with visible prompts.

Vim/Kakoune users could have their help menu _be_ the keyboard. Every action changes all of your keys to whatever is possible. Sort of mind blowing.

Come to think of it, i bet someone has done this with Vim & LED Keycaps. Huh


I remember 14 years ago salivating at the Optimus Maximus keyboard from Art. Lebedev studios: https://www.artlebedev.com/optimus/maximus/


>Vim/Kakoune users could have their help menu _be_ the keyboard. Every action changes all of your keys to whatever is possible.

Wow, this is an amazing idea. Think of all the other unimagined possibilities if we had real innovation in human-computer interface tech


You can see IMEs used for Chinese/Japanese. Type in ascii keyboard, then transform to own language characters.


To get there, though, you need everything upgraded. Keyboards, sure, but also compilers and editors and IDEs and static analysis tools and grep and...


Keep in mind that Unicode contains several symbols that look alike. It is not just the size of the character set that matters, but how many such look-alikes it contains.

Maybe instead of arguing for Unicode, the community needs to come up with an unambiguous smaller set of symbols that languages should support.


And ASCII has the reverse problem: one symbol gets extremely overloaded semantically (syntax wars?!).

Tell me what does ":" mean?

Now if I tell you the language is Typescript, what does "?" mean?

I personally believe semantic overloading causes a lot of problems, but we are so steeped in the existing limitations that we don't recognise it is a problem.


Odd choice. Punctuation, by itself, is virtually meaningless.

Language overloads things. Pretty much period. I doubt programming will get away from that.


The implied topic of my comment was programming languages, where your punctuation characters really do matter. So much so in some languages that a single misplaced " " can really ruin your day.

We are so short of semantic operators that languages have started to use single letters like "s", "q", "u" as operators or modifiers, which I find ugly (although the best compromise).

I have experienced how the wrong Unicode character can ruin my day - but that is what tooling and editors and best practice are there to help with.

Keyboards are the main problem now I think (in the past it was your OS, your editor, and your tooling). I regularly type Unicode characters from my handheld devices, but hardly ever from my qwerty entry devices (I can't easily find ¡, ∆, ↑ or ç as examples).


Taken at face value, punctuation is merely another symbol. A variable named by the symbol "index" will have context-dependent meaning.

So...I don't see how this changes my point. The list of symbols that is a program will have many of those symbols in need of context to understand. Do you really think programming can escape that?

Edit: I am sympathetic. I like lisp for having fewer signifiers than other languages.


I dread the day when I edit some code that has both \Sigma and \sum in the same scope.


You mean ∑ and Σ? Yeah, that'd be terrible. It doesn't seem like it should be Unicode's job to assign meaning to characters, so I don't know why ∑ ("U+2211 N-Ary Summation") exists.
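For the record, the two characters really are distinct, and normalization does not merge them (quick check with the standard unicodedata module):

```python
import unicodedata

# Two visually near-identical characters with different identities:
assert unicodedata.name("∑") == "N-ARY SUMMATION"             # U+2211, operator
assert unicodedata.name("Σ") == "GREEK CAPITAL LETTER SIGMA"  # U+03A3, letter

# Even NFKC keeps them apart, so both could appear in one scope:
assert unicodedata.normalize("NFKC", "∑") != "Σ"
```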


> I don't know why [it] exists.

Ill-considered backward 'compatibility' with block-drawing character sets for mathematical typesetting. (Fucked if I know which character set, but presumably the same one they found U+23B2 '⎲' and U+23B3 '⎳' in.)


Pretty sure those are necessary with the extenders when you want to sum larger things like integrals. Can't remember the details, though.


Yeah. I was offering up an example.


You don’t need a “special keyboard setup”. Linux with X11 can do this out of the box. Add a line to your xmodmap file to put the dead Greek key where you want, and you’re done. I type ω by tapping a modifier key and then w. It’s even easier than typing W, because I don’t even have to hold down the modifier.


That... sounds like a special keyboard setup to me.


OK, I guess that’s a reasonable use of the word “special”. Here is what I was trying to get at:

I can not effectively use my computer for anything without customizing the keyboard a little. I need to make the capslock key into an extra control key. I need to set up a compose key so I can type accents when writing in Spanish (even if I didn’t do that, what about writing people's names?). Even sticking with English, I need to type curly quotes and apostrophes, and em- and en- dashes, plus, now and then, symbols like ™ or ©. And I need to be able to type Greek letters when talking about physics or math. So, since my keyboard is already set up to handle all this, and it is easy to do so, for me it is delightful that I can use ω in my Julia code. The math looks more like math, equations take more familiar forms (which makes it easier to spot errors), and the code can be made more expressive. So, for me (but clearly not for everyone) this is not a “special” setup.


Right, but the whole point of language design is to be used by many people. I’ve programmed for 30 years and have never customized my keyboard setup. If you want to create a language for both of us, sticking with standard keyboard characters seems like the best choice.


What do you mean by standard keyboard characters? Pretty sure characters on keyboards are often country and language dependent.


And again, this helps other people who have to edit your code how?


Well, if my Julia code ever gets to the point where other people want to edit it, the community has embraced Unicode, so there won’t be a problem there. More generally, I’ll turn my comment into a question: how are people using computers without already having them set up to easily type these characters?


> how are people using computers without already having them set up to easily type these characters?

Current solution for most people is google + clipboard

For example, to type the word "cliché" here, I googled it, then copied it, then pasted it.


That’s not what I would describe as “easily type”.


This thread is making the point that the issue isn’t the character set, but rather the keyboards. We’re probably locked in, there is so much inertia around the standard QWERTY with a few arrows and Esc. Maybe an integrated system builder with large scale like Apple could shift things slightly.


Indeed; Mac OS has supported easy entry of alternate (non-ASCII) characters on "self-inserting" keys with Option (which can be further modified with Shift) since the 80s. Granted it's a limited set, but they're useful. Here's a sampling “” … ™ Ω ç ß ∂ ∑ † π «» ¬ ˚ ∆ ƒ ∂


I normally use the Unicode keyboard so I can write out propositions, predicates, and set algebra. How many of us would readily adjust to the keyboarding required to break away from ASCII?


CJK users are pretty successful at entering large numbers of Unicode characters with a QWERTY keyboard, and there's more of them than you.


Good point. Just lack of determination to make use of available techniques?


It was the default for me.


That it was your default doesn't mean it was everyone's. OP's argument is that programmers should make it easy for someone else to change their code. It doesn't matter what your defaults were, it matters what the defaults of the next maintainer are.


Glad you understand why there's no need to restrict everyone to US-ASCII.


It really doesn't have to be. For instance, in Julia IDEs \omega completes to ω, which takes 2 extra characters to enter (\ and Tab). It's not hard to imagine this concept applied at the OS level...


I respectfully disagree with this. I do a lot of maths-heavy work, but for me, in languages (not Julia!), being able to quickly write the LaTeX names for symbols is more than enough. It's also much quicker, and easier to search for when reading someone else's code.

I personally also find it helpful to keep the distinction between the maths and its implementation, though I accept that others would vehemently disagree with me.


It makes it easy to type, but does it make it easy to read/maintain?

As they say, it's already way easier to write code than to read it. A programmer should make every effort to make code more readable.

An editor should make it relatively easy to enter special symbols (especially if you can specify a limited set); it is a totally solvable problem.

An editor can only help you so far with reading code...


For me, the biggest problem with the whole maths ---> code mapping is that of nested brackets. For a random quick example, I mean things like

    (a*(1+exp(1i*theta[0,:]))+foo(x))/((b-cosh(bar(x))+(b+sinh(y[:,-1])). 
The sort of thing where it just _looks_ far nicer on a page with like _real fractions_ -- where missing a bracket or changing the order of two brackets can _totally_ bugger you. Yes, changing the form of the expression a bit can make it "far nicer" or simpler -- for example, by defining intermediate terms -- but sometimes there's something to be said for <expression x> matches <equation y> in the paper.

Another example where I think ASCII actually _is_ limiting is for entries of matrices directly. I mean, we try, but I'm not convinced we succeed. For an example (picked at random), hands up if you think this is a nice Euler angle transformation, where cphi/sphi etc are the cosine and sine of phi? No matter how you write it, it's going to be ugly.

   alpha = [cpsi*cphi-ctheta*sphi*spsi   cpsi*sphi+ctheta*cphi*spsi  spsi*stheta;
            -spsi*cphi-ctheta*sphi*cpsi  -spsi*sphi+ctheta*cphi*cpsi cpsi*stheta;
            stheta*sphi                  -stheta*cphi                ctheta];


> a nice Euler angle transformation

I'd probably go with something like:

  double cf = cos(phi),sf = sin(phi);
  double cp = cos(psi),sp = sin(psi);
  double ct = cos(theta),st = sin(theta);
  alpha = [ cp*cf-ct*sf*sp   cp*sf+ct*cf*sp   sp*st;
           -sp*cf-ct*sf*cp  -sp*sf+ct*cf*cp   cp*st;
            st*sf           -st*cf            ct];
but I agree it would look better with:

  alpha = [ cψ*cϕ-cθ*sϕ*sψ   cψ*sϕ+cθ*cϕ*sψ   sψ*sθ;
           -sψ*cϕ-cθ*sϕ*cψ  -sψ*sϕ+cθ*cϕ*cψ   cψ*sθ;
            sθ*sϕ           -sθ*cϕ            cθ];
Either way, I don't think Euler angles are ever going to be "nice".


In Raku you can also use × (U+D7) for multiplication

    # this assumes that ϕ, ψ, and θ have already been set

    my \cϕ = ϕ.cos; my \sϕ = ϕ.sin;
    my \cψ = ψ.cos; my \sψ = ψ.sin;
    my \cθ = θ.cos; my \sθ = θ.sin;

    my \alpha = [ cψ×cϕ−cθ×sϕ×sψ,   cψ×sϕ+cθ×cϕ×sψ,   sψ×sθ;
                 −sψ×cϕ−cθ×sϕ×cψ,  −sψ×sϕ+cθ×cϕ×cψ,   cψ×sθ;
                  sθ×sϕ,           −sθ×cϕ,            cθ];
I'm not sure if using × helps or hurts in this case since I'm not really experienced in this area.

These all work because Unicode defines ϕψθ as "Letter lowercase"

    say "ϕψθ".uniprops;
    # (Ll Ll Ll)
---

I would like to note that I used the Unicode "Minus Sign" "−" U+2212 so that it wouldn't complain about not being able to find a routine named "cϕ-cθ". (A space next to the "-" would have also sufficed.)


> × (U+D7) for multiplication

Blech; looks like a letter and normalizes cross products. Better to use "·" (U+B7)[0]:

  alpha = [ cψ·cϕ−cθ·sϕ·sψ,  cψ·sϕ+cθ·cϕ·sψ,  sψ·sθ;
           −sψ·cϕ−cθ·sϕ·cψ, −sψ·sϕ+cθ·cϕ·cψ,  cψ·sθ;
            sθ·sϕ,          −sθ·cϕ,           cθ];
Minus sign is a nice-to-have, though.

0: Also "∧" (U+2227, wedge), the real other vector product[1], but that doesn't matter for scalar multiplication.

1: http://en.wikipedia.org/wiki/Wedge_product


It would be easy to just make `·` an alias. Then that code would work.

  my &infix:< · > = &infix:< × >;
(After all `×` itself is just an alias of `*` in the source for Rakudo.)

If you need more control you can write it out

  sub infix:< · > (+@vals)
    is equiv(&[×])       # uses the same precedence level etc.
    is assoc<chaining>   # may not be necessary given previous line
  {
    [×] @vals            # reduction using infix operator
  }
I made it chaining for the same reason `+` and `×` are chaining.

---

I don't know enough about the topic to know how to properly write `∧`.

It looks like it may be useful to write it using multis.

  # I don't know what precedence level it is supposed to be
  proto infix:< ∧ > (|) is tighter(&[×]) {*}

  multi infix:< ∧ > (
    Numeric $l,
    Numeric $r,
  ) {
    $l × $r
  }

  multi infix:< ∧ > (
    Vector $l, # need to define this somewhere, or use List/Array
    Vector $r,
  ) {
    …
  }
If it was as simple as just a normal cross product, that would have been easy.

  [[1,2,3],[4,5,6]] »×« [[10,20,30],[40,50,60]]
  # [[10,40,90],[160,250,360]]

  # generate a synthetic `»×«` operator, and give it an alias
  my &infix:< ∧ > = &infix:< »×« >;

  [[1,2,3],[4,5,6]] ∧ [[10,20,30],[40,50,60]]
  # [[10,40,90],[160,250,360]]
Of course, I'm fairly confident that is wrong.


I believe it is most definitely easier to read and maintain, restricted to that small domain-specific function. Some Greek letters have very specific meanings; replacing them with long-ass names straight up worsens readability. Writing out density, velocityX, etc. will quickly drown out the core logic of that function.


I don't think that there is a single example of a greek letter which has been universally adopted by all of the different engineering and mathematics communities to have a single non-overloaded meaning.

The use of greek letters in academic writing has a single purpose: Compact notation. It has nothing to do with abstract readability.


> The use of greek letters in academic writing has a single purpose: Compact notation. It has nothing to do with abstract readability.

The compactness of the notation is a major factor in overall readability. Long variable names make it harder to see the overall structure of an expression. When operators, numerical constants and parentheses are mostly represented with one or two characters each but variable names are all much longer, the only thing you can tell about a long line of code at first glance is which variables it uses as inputs. Mathematical pretty-printing can help with some operators (e.g. fractions) and make it more visually apparent what is grouped together by parentheses, but even then long variable names will detract from the ability to recognize the structure of an expression.


Greek letter abuse rarely happens in CS but there are a few cases. Try to guess what "lambda calculus" and "pi calculus" are without looking them up.


Maybe there could be a middle ground. For math-heavy applications, the editor could implement ligatures same as Fira-Code and the latex name would render as the symbol until you put a cursor on it.


Why not go all the way, and write ω₀ :-)

I don't often write complicated formulae, but I do find it easier to use the original symbols if possible. It's one less thing to think about when comparing the source code with a mathematical description of the algorithm.


Believe it or not, the ₀ (U+2080) is not allowed in Python; it's not for a lack of trying :)

for some reason the entire Unicode "No" (Number, other) category is not included in the admissible character list for identifiers
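This is easy to verify from Python itself: `str.isidentifier` applies the language's identifier grammar, and `unicodedata.category` reports the character class that gets excluded:

```python
import unicodedata

# ω (U+03C9) is a letter, so it is fine in identifiers
print("ω".isidentifier())               # True

# ₀ (U+2080) has category "No" (Number, other), which the
# identifier grammar excludes, so ω₀ is rejected
print(unicodedata.category("\u2080"))   # No
print("ω₀".isidentifier())              # False
```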


I disagree. Instead of writing one letter variable names, one could do the hard work of naming things properly, which software developers often do. Instead of going with one letter or the name of that letter as variable name, which tells your fellow non-mathematicians exactly nothing about what it contains, one could put in the effort and make the code readable.

Using these one letter variables forces the reader of the code to either be already familiar with the concept that letter refers to, or keep the whole calculation in their head until the final result, hoping that they can then make sense of it. It is implicitly requiring outside knowledge. It is shutting out many potential readers of the code.

It might still be saved/helped though, if good prose comments are given that help the reader understand the meaning of the one letter symbols. Whenever I have seen this one letter code salad, I have seen few if any comments, as if the author of the code expects me to magically know what each symbol stands for.

Let's say omega was a vector of weights. Why not name it "weights"? That is much better naming than "omega" or that character, which most cannot easily input and need to copy-paste from somewhere.


> Using these one letter variables forces the reader of the code to either be already familiar with the concept that letter refers to, or keep the whole calculation in their head until the final result, hoping that they can then make sense of it. It is implicitly requiring outside knowledge. It is shutting out many potential readers of the code.

When you're writing code for physics, then yes - absolutely: It is intended for other physicists, and the prerequisite is precisely that you have outside knowledge. When they use a symbol like ℏ in a journal article, it is expected that the reader knows it is Planck's constant[1]. Why should they have a different expectation when writing code? To a physicist, -ℏ^2/(2 * m) is a lot more recognizable than -planck^2/(2 * mass).

To be clear: The chances that a person who doesn't recognize these symbols will need to read and understand your code are virtually nil.

Within a context, single character symbols are very useful. To take a different example, what if I insisted that people should write:

2 add 3 add 7

instead of:

2 + 3 + 7

Would anyone reasonably argue that the former is more readable? We all accept that it's OK that a reader is familiar with the '+' sign. While ℏ is not readable to most, it is likely readable to anyone who is expected to read the code.

The thing that really is annoying when writing physics code is the need to explicitly write * for multiplication, and not being able to write fractions the way you would on paper.

[1] divided by 2π


I like the rule of thumb that an identifier in code should usually be longer and more specific as scope increases. The main exception is that if your problem domain has well-established terminology or notation for some concept, reflecting that as closely as possible in the code is often best. So, I personally have no problem with someone using short identifiers and concise notation if it’s either used very locally or idiomatic.

For example, suppose we have a function whose job is to calculate two vectors of weights and do something unless they are equal but opposite. Personally, I would rather read something like

    v = someCalculation()
    w = someOtherCalculation()
    if (v ≠ -w) {
        doSomething()
    }
where we use short names for the local variables and vector-aware inequality and negation operators, than read something like

    weights1 = someCalculation()
    weights2 = someOtherCalculation()
    if (not(vector.equals(weights1, vector.negate(weights2)))) {
        doSomething()
    }
The longer but still arbitrary names for the vectors add no value here and the longhand function names for manipulating them are horrible compared to the short and immediately recognisable vector notation.
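A rough NumPy sketch of the short version (the someCalculation placeholders and the values are mine, and `np.array_equal` stands in for the vector-aware equality operator):

```python
import numpy as np

def some_calculation():
    # hypothetical placeholder for the real weight computation
    return np.array([1.0, -2.0, 3.0])

def some_other_calculation():
    return np.array([-1.0, 2.0, -3.0])

v = some_calculation()
w = some_other_calculation()

# the pseudocode's "v ≠ -w"
if not np.array_equal(v, -w):
    print("doSomething()")
else:
    print("vectors are equal but opposite")
```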

In general, I think there could be advantages to using a broader but still selective symbol set for coding, as long as we also had good tool and font support to type and display all of the required symbols. I would be hesitant about including Greek letters in that symbol set, but that’s because several letters in the Greek alphabet look similar to other common letters or symbols, not because I think calling a value ω_0 is a bad idea if that’s how practitioners in the field would all write it mathematically.


Mathematicians have been thinking about nomenclature for a long time - certain fields have very specific usage of symbols. I see no evil in using these well-known one-letter variables in these rare/small parts of functions.


Sorta. Examples abound of notation that had changed or that is adapted for some fields. Complex numbers using j instead of i, for example.


Formulae simply aren't understood like prose, or more precisely, formulae aren't understood as isolated bits of logic. For plumbing code where the logic is important, descriptive variable names are useful in aiding comprehension of the code. For formulaic code, where the task is verifying the equality of two written expressions (one in code and one in the derivation, possibly on a piece of paper), compactness--so-called simplification of equations--is far more important. A formula cannot be understood or verified in isolation by itself: as you say, it can only be truly understood by someone familiar with the content. It is, by definition, arcane. But the use of arcane variables aids the practitioner since it leads to more compact expressions, which makes it easier to check against another source. You cannot eliminate this arcane aspect from formulae simply by spelling out the names of things.

  G_{ab} = 8 \pi T_{ab}
does not benefit from being converted in code to,

  einstein_tensor[index1][index2] = 8 * PI * stress_energy_tensor[index1][index2]
For a practitioner of physics, this is just unhelpful and takes more effort to comprehend. In the spirit of The Humble Programmer, we have unwisely spent the mental energy of the reader. If you actually expand out the Einstein tensor in terms of the Levi-Civita connection and spell it out instead of using the canonical symbol \Gamma and respectively g for the metric tensor then you will just make an incomprehensible wall of text--especially if you inline the summation. What is added by expanding the variable names? A practitioner has gained nothing and lost familiarity and compactness while a layperson has learned nothing about general relativity except the names of the objects. Don't apply best practices uncritically: they are always premised on context.

That being said, formulae would benefit greatly from editors that allow the formula to be visualized next to the code. Sort of like compiling latex; if IntelliJ or some other IDE would actually render the code as a formula in some pane next to the code, that would be the greatest benefit to comprehension of a formula.


In the example, you are not really naming your variable usefully. You are merely transcribing from a one letter variable to its abstract name.

You can do that in a general function, where the general concept is expressed, where it would still help me to understand your code and search for concepts and terms online, in case I do not understand what is going on.

However, in many contexts it won't be merely an "einstein_tensor", but something that has a meaning in the specific context. Ask yourself what you are doing with that einstein_tensor. What is it used for? Is there a real-world equivalent to the thing you are looking at in the code? Those are the names you should choose in a non-general context, and that is why naming things is hard.


Descriptive names will never help you understand the implementation of the business domain of math code. From the Domain-Driven design perspective, the symbols are the ubiquitous language which you should stick to.

Here is a more appropriate example, spot the bug in this Tolman-Oppenheimer-Volkoff equation:

    let radial_derivative_of_pressure = - (pressure + energy_density) * (4. * PI * pressure * radius_squared + mass_potential) / (radius_squared - 2. * mass_potential * radius)
vs (with the same bug)

    let dp_dr = - (p + rho) * (4. * PI * p * r2 + m) / (r2 - 2. * m * r);
Of course you have to look up the TOV equation to do so, even most domain experts would need to compare to a formula. One of these is much easier to compare and it isn't the spelled out version.

Your questions are unhelpful. It is merely the radial derivative of pressure. It is going to be passed to a general-purpose ODE integrator which just needs the radial derivative of pressure to integrate pressure for some range of radii. There is a real world equivalent: dp/dr; it is a known entity with exactly that symbol. Naming is not at all hard in this case: dp_dr or dpdr, or even dp_by_dr. Any other choice and you are just creating problems due to uncritical application of the belief that variable names should be descriptive, while ignoring the fact that there is a well known ubiquitous language to describe these entities.

The closest thing in other business domains is common acronyms. Nobody is going to spell out GDPR in the implementation of their cookie banner. Nobody spells out HTTP, or JSON, or XML. Spelling them out won't help anyone trying to read the code, it just creates line noise.
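For what it's worth, the two spellings above are the same expression character for character; a quick Python check with made-up numbers (no claim about the physics, and whatever bug is in one is in both) confirms they compute identical values:

```python
from math import pi as PI

def radial_derivative_of_pressure(pressure, energy_density, mass_potential, radius):
    # spelled-out version of the TOV right-hand side from the comment above
    radius_squared = radius * radius
    return (- (pressure + energy_density)
            * (4. * PI * pressure * radius_squared + mass_potential)
            / (radius_squared - 2. * mass_potential * radius))

def dp_dr(p, rho, m, r):
    # same expression with conventional physics symbols
    r2 = r * r
    return - (p + rho) * (4. * PI * p * r2 + m) / (r2 - 2. * m * r)

# arbitrary sample values, chosen only to keep the denominator nonzero
print(radial_derivative_of_pressure(1.0, 2.0, 0.1, 3.0) == dp_dr(1.0, 2.0, 0.1, 3.0))  # True
```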


Well yes, clarity and good practices can definitely help but to be fair, software developers might often do this:

(_ < 5)

Instead of this:

(e: Int) => (e < 5)

It’s just a trivial example so I understand if someone wants to have the same comfort. I am not a mathematician so I am not sure what is best here so I’d cut them some slack.


A variable named ω is no better than a list named l. Both are bad. Be explicit. Terse, but explicit.


Part of the problem here is that a variable in mathematics is a different concept from a variable in programming.

Mathematicians are used to using one letter variables. Using a set of unwritten naming rules, mathematicians almost always know what concepts the variables are referring to (in a mathematical context).

So, I'd say if the code is only to be maintained by mathematicians, one letter variables for mathematical concepts only, would be an advantage.

The danger comes when you start using them for everything.


<disclaimer>Not a mathematician, but took enough Ph.D. level classes to be dangerous with bad proofs.</disclaimer>

> Using a set of unwritten naming rules

Simplest being things like using upper case Greek for certain sets and lower case Greek for its elements. Or, in Statistics, using Greek letters for parameters and English equivalents for their estimates.

So, if I were programming something to do something in that realm, I would write:

    for my $ω (@Ω) {

      # ...
    
    }
where what Ω represents would be clear from the subfield of Math that is relevant to the program.

Then, of course, you get people like one of my former professors who liked to invent notation on the fly and would quickly end up cycling through Greek (π, Π), Hebrew (פ), Blackboard (ℙ), and reach out for Sanskrit (?). (Symbols used here for illustration, this was a long time ago and we didn't even know if he was correct as we had no way of checking with no intarwebs back then.)


I'm also under the impression that mathematicians generally work much more deeply with a smaller set of variables, so you actually don't need to name that many individual concepts.

A typical program contains thousands and thousands of "named things", so you're naturally going to see a proliferation of names. That just doesn't seem to be that necessary in math once you're working in a particular context (e.g. statistics).


Translation: i learned my domain’s variable names, so you should have to as well.

Currently learning some dsp. One of the biggest barriers to entry has been the inscrutable variable names inherited from the field’s math connections.


Feel free to reason about a complex statistics formula with full-blown variable names. There are reasons for short variable names. Give a specific function a descriptive name for what it does, but do use the mathematically "accepted" writing mode in the implementation of a well-known function.


> There are reasons for short variable names

Which are? I suspect the reasons are a combination of the price of paper and ink, and the history of teaching on chalkboards. None of those mean we can’t make the canonical version of an equation the expanded representation.

Why do we all know e=mc^2 and not energy = mass * lightspeed^2?

The broader the base of people that have to interact with a formula, the less mathy the terminology tends to be. Think of trigonometry. Instead of greek letters the sides of a triangle and the functions get real world names. Sine(angle) = opposite / hypotenuse. Sure when you write that out you use sin ϴ = O/H but that’s just a compressed representation of human-readable variable names.


How about the Gaussian distribution's density function? Or a numerically stable version of a simpler formula? Also, you forgot the usual case of implementing a whitepaper with mathematical notation. It is much easier to proofread when your program largely mirrors the paper. And if you pass in the readable energy, mass, etc. variables, inside the function you can use the domain-specific mathematical notation. That way it will be readable both outside and inside.


I think we might be arguing different points. I'm not saying use different variable names from the canonical math version - I'm saying the canonical math version should use variable names instead of symbols.

So per your point - I'm not qualified to re-define mathematical notation so I won't go very far with it, but taking the formula you mentioned, Gaussian distribution density function (not focusing on the fact that I had to re-write it in pseudocode to represent it in a text format):

    f(x) = 1/(σ * sqrt(2*pi))*e^(-1/2*((x-μ)/σ)^2)
I would suggest a couple changes:

- change σ to `standard_deviation`, or if you don't like snake_case I could handle `stddev`

- change μ to `mean`.

- change x to `input` - there may be a better name than that, and x is pretty widely used in math so I'm not married to this one.

- e should probably stay the same - e means e no matter the mathematical context, while μ means different things in different fields of math.

    f(input) = 1/(standard_deviation * sqrt(2*pi))*e^(-1/2*((input-mean)/standard_deviation)^2)
This doesn't make it more or less readable, but it does mean I don't have to _just know_ or look up that μ is `mean` to parse it.
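As a sketch, here is what that spelled-out version might look like as Python (the function name and `input_value` are my own choices; `input` itself shadows a Python builtin, which incidentally is another argument over the bare `x`):

```python
from math import sqrt, exp, pi

def gaussian_density(input_value, mean, standard_deviation):
    # f(x) = 1/(σ·sqrt(2π)) · e^(-1/2·((x-μ)/σ)²), with names spelled out
    z = (input_value - mean) / standard_deviation
    return 1.0 / (standard_deviation * sqrt(2.0 * pi)) * exp(-0.5 * z * z)

# at the mean of a standard normal the density is 1/sqrt(2π) ≈ 0.3989
print(round(gaussian_density(0.0, 0.0, 1.0), 4))  # 0.3989
```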


I accept your position, and perhaps my example of a difficult formula was not difficult enough — my point is more that the underlying structure of a formula becomes more easily visible with shorter symbols — and for syntactic manipulation one can more easily see patterns emerge.


Depending on the context, ω may be as explicit as it gets.


But "everyone" doesn't have the same keyboard nor does everyone speak the same language. ASCII is not a universal character set and treating it as such is nothing short of cultural imperialism: "If it's good enough for us, it's good enough for everyone".

Artificial limits on language and character sets might sound simple to you but they introduce a lot of complexity for others. Unicode solves that problem with not too much overhead.

Also, just to point out, this assumption is exactly why diversity in tech matters so much.


English is not my first language and my native language doesn't use Latin script; still, all the keyboards that I ever saw supported entering ASCII characters. As far as I know, all the keyboards in active use support an English layout.

We shouldn't dismiss this as "cultural imperialism". Instead we should use this to our advantage. Currently source code written in China or in Russia can be read by developers in India or in the US. This is amazing. Let's not forsake it!


Agreed. Growing up in a country and at a time with personally accessible computers and very little literature, I was able to learn stuff that withstood the test of time and did not subject me to switching costs.

Already, differences in menus and keyboard shortcuts make the skills of trained office workers less relevant when they immigrate.

We do not need more barriers to trade.

Some related thoughts:

https://www.nu42.com/2013/04/translation-of-programming-term...

and

https://www.nu42.com/2014/08/replacing-hash-keys-with-values...

The proponents of translation tend to believe that what they themselves or other translators produce will be clear to people just learning the concepts. That is most often not true.

> I am willing to bet the sentence “Çözümü SCALAR bağlamda veri döndüren scalar() fonksiyonunu kullanmaktır” does not make any more sense to a Turkish speaker who speaks no English than “The solution is to use the scalar() function that will create SCALAR context for its parameter.”

> Translation and hash-lookup are different things. If you want to convey meaning, you have to have a command of both languages, and the subject matter. Without that, you are only going to add to the word soup. Translating “big event” as “büyük okazyon” helps no one.


The availability of alternative commercial keyboards is possibly not the best metric to use to judge actual need/desire. There are a large number of non-ASCII layouts that can be mapped on top of a standard QWERTY[0] which show that ASCII is not a valid assumption for a modern general purpose programming language.

The idea that we should all be grateful for the unification that using one language brings is a sentiment that reinforces the idea that this is cultural imperialism. There are definitely benefits to conformity, no doubt, but imposing it unnecessarily is a design decision that should be questioned. Why take away options from people who might value that freedom?

[0] - https://en.wikipedia.org/wiki/Keyboard_layout#Keyboard_layou...

[1] - https://en.wikipedia.org/wiki/Cultural_imperialism


> There are a large number of non-ASCII layouts that can be mapped on top of a standard QWERTY[0] which show that ASCII is not a valid assumption for a modern general purpose programming language.

I have seen multiple "foreign" keyboards. They all have ASCII labeled on each key. Using the QWERTY layout demonstrates the dominance of ASCII, which is an assumption that modern programming languages can reliably make.


Please look up a Russian keyboard and a Japanese keyboard in Image Search. What do they have in common? Right: besides the respective alphabets, they have the QWERTY Latin layout on them. Basically all keyboards can be used to input ASCII symbols out of the box.

I don't really care whether you call this cultural imperialism or not. What matters is that a whole industry can use the same standard worldwide. It's the same as the metric system. We already have to deal with the imperial system; imagine how much more painful it would be if every country used its own measurement system or its own calendar.


Agreed. My first language didn't use latin script--it didn't even read left-to-right. But I'm not offended that programming languages all do use latin script and a constrained character set.


>> The best thing about using only ASCII for the main syntax is that everyone can type it with their keyboard.

Well, some keyboard layouts make it harder though. I have spent a considerable amount of time trying to teach programming constructs over Zoom to budding developers in Japan over the past two years. The placement, access mode, and actual typing of characters such as `{`, `$`, `~`, `|`, `:` etc. have been the biggest stumbling block during these sessions.

So even within ASCII, the subset that is equally easy to type on all keyboards is smaller than the full range of non-control characters.

> But "everyone" doesn't have the same keyboard nor does everyone speak the same language.

I like that Vim's digraph feature lets me solve the problem in my editor without having to rely on the keyboard layout or OS level preferences. So, typing these lines:

    my $μ = "İstanbul'da hava çok güzelmiş";
    say uc($μ);
takes the exact same keystrokes regardless of the OS/environment I am in:

    m y CTRL-K m * = "CTRL-K \ I s t a
On my own machines, this has the advantage of not having to switch languages in the act of typing (although Win+SPACE is pretty easy on Windows, cycling through the five I have installed is not trivial). And, do I really remember where ø is on the Danish keyboard as opposed to where ö is on the Turkish keyboard?


> ASCII is not a universal character set and treating it as such is nothing short of cultural imperialism: "If it's good enough for us, it's good enough for everyone".

SI units like meters and kilograms are also cultural imperialism and we should return to diversity of units, ideally different one for each town. /s


Yeah, go start using metric in the US.


That's the inverse. What we're talking about with keyboards is the opposite, some are advocating for more division. That's like starting a movement for scientists worldwide (who unflinchingly use metric even in the US, as far as I'm aware) to ditch metric and start using the standard units that were historically common in their culture.

And for what it's worth, I've never talked to a person who doesn't agree that the US using metric would be great, the problem is that getting over the inertia of the existing casual measurements is extremely difficult because even when metric is on the label, imperial is the emphasized one (the entire traffic sign infrastructure, the vast majority of cooking implements, decades of cookbooks, most food packaging, medical records systems, drivers licenses, advertising campaigns, the personal fitness ecosystem, entire product names and trademarks). 60 years ago maybe it could be doable, but at this point it's a bit of a lost cause without very much gain.


And you don't see how "inertia" actually applies to both of those directions?

The nuances of whether the States could or should switch are irrelevant, but the response demonstrated the point perfectly.


Railway track gauges came to mind.


I remember reading an article that the width of railway track was set by the width of chariots.

I don't remember much about the article except it ended with some quip about the width of a car was determined by a couple of horses asses.


Not sure why you're downvoted; it's simply true that ASCII is not universal. The "A" in ASCII stands for American.

I am a monolingual English speaker but I think I should be able to write ÷, ≤ and ≥ in my Go code, and I think I should be able to do maths using π.


Given that the go community does not seem to hold this as a majority opinion, you might consider whether an editor translation layer could allow you to tailor your environment to your preferences while maintaining source-level compatibility with the broader go community.

If compatibility is not as important to you, privately forking the language is also an option, but that seems fraught with peril of your code dying with you.


GoLand actually does this.

But the point is not my preference to use the actual, proper symbols, but rather the technically unnecessary decision forced upon me that says I cannot.


> The "A" in ASCII stands for American.

Just call it ISO 646-IRV instead. ;)


Raku

    say π ≤ 3+⅘ ≤ τ
    # True

    say π² ÷ 4
    # 2.4674011002723395


Fine, do maths using π; this is about programming.


This is just the English/Esperanto problem again. Do you do the thing that's the most fair or do you do the thing that's unfair but has the most benefits for everyone involved? Yes, it's unfair to limit people to ASCII but it's also the only way their code is going to see widespread engagement.


> it's also the only way their code is going to see widespread engagement

Wow, what a claim! I can see how this may have some anecdotal evidence behind it since programming has had such a strong English bias but it's conjecture to say that this is the only path. Not only does that claim lack evidence, it also lacks imagination. The future doesn't have to follow the same patterns of the past.

The technical constraints of ASCII have long been irrelevant and it's only the cultural imposition that remains. While this has been the case, the size and importance of computing has exploded everywhere (e.g. 4 billion people worldwide use the internet).

Can you really say with confidence that global "widespread engagement" is (and will always be) a desirable property for code projects? Is software produced in this way and at this global scale really likely to be better quality for everyone?


That's the brilliant thing about programming: you can do it however you want. If you want to write your code in such a way that an Anglophone spends half their time looking up ALT key combinations to type the characters, go ahead. I'm just saying that I'm not touching that code with a ten foot pole, nor is anyone I know, nor would any of my customers. There's too much talent in the world for me to put up with one iota more hassle than necessary.


What's the purpose of supporting other character sets in a language's syntax? Good luck finding contributors from all over the world for software written in Turkish characters. I mean for hobbyists or very special purpose cases where such a thing is really needed (writing code in Turkish or Chinese or Arabic or whatever), people can add language support for that (after all it's just a bunch of keywords). It needs some effort but is not impossible. People can also design programming languages with their own special character sets. It's wild to see someone objecting to writing software in ASCII because of diversity/cultural issues. Really wild!


Not sure about Turkish, but I know for a fact that when you have a team of Chinese devs work on a Chinese software project Mandarin starts to show up in the codebase. And while usually it's written in pinyin (the standard Chinese romanization system) since most programming languages assume ASCII, it's obvious that those people could benefit from being able to use Chinese characters in place of ASCII and English.

If your argument is "sure, then let them invent/fork $LANGUAGE to suit their needs", just take note that if you're a language designer/developer that wants to see the language widely adopted, it's detrimental to have the attitude that ASCII is enough for everybody. There's no reason to have (for example) "Arabic-python" just because somebody insists on ASCII instead of UTF-8 even though there's little technical reason not to support it.


Non-ASCII characters make sense in national application domains, e.g. tax software, where you don’t want be forced to transliterate (or worse: try to translate) the non-English terminology.


Remind me how well typewriters worked in Japan and China


Let’s be realistic: 99% of software, weighted by value, is in English. This is a good thing. There are strong network effects, and fracturing the software world into multiple competing linguo-spheres would destroy economies of scale.

Would you similarly oppose the requirement that all civil aviation globally be done in English on the grounds of “cultural imperialism”?


> fracturing the software world into multiple competing linguo-spheres would destroy economies of scale

There is a very visible pattern in most Open Source projects where a small group of core maintainers will do most of the work even if there is a much wider body of casual contributors (see Nadia Eghbal's _Working In Public_). So, what are the economies of scale? Are you perhaps referring to problems experienced in the bygone age of the early internet?

Economists largely agree that healthy competition is beneficial for improving quality, spurring innovation and reducing costs. I don't see the problem with multiple language/culture specific software projects that all solve the same problem. This could even look like different programmers working on different projects for different audiences.

The only downside I can see for the everyday programmers is the FOMO generated from inaccessible software in a foreign language. Isn't that ironic?


> Would you similarly oppose the requirement that all civil aviation globally be done in English on the grounds of “cultural imperialism”?

Do you think that every pilot on earth always speaks only English? No.

People should have the choice not to speak English and write their alphabets if they wish.

Enforcing ASCII on code is like enforcing ASCII on URIs, e-mail addresses. Unnecessarily restrictive to the rest of the world outside of the anglosphere. Of which there's a lot.


Yes. All civil aviation globally is conducted in English. Becoming a pilot or air traffic controller anywhere in the world requires proficiency.


It requires proficiency when you fly internationally or handle international traffic, but that doesn't make English the single language that could ever be used, e.g. domestically. There's no point in pretending there's some magic filter that would enforce it.


Diversity? Like what, you want source code to be written in Cyrillic or Chinese? Please elaborate.

ASCII is the standard, and programmers already learned to deal with it. What's the problem?


Why not? English is a convention only, not a law of computing.


The Tower of Babel. If the source code is proprietary or for education then it doesn't matter, but for the open source world it'd mean division into multiple different languages where no one can understand and learn from each other. No one is going to learn 5 different alphabets and 12 different languages just so they can understand the source code.


There is already open source software written in other languages.

Sometimes identifiers and comments are transliterated into ASCII, sometimes they remain in the original script.

Of course, they won't get many contributions from English speakers. They will get more contributions from speakers of that language. That decision is for the project to make, not others to impose.


> They will get more contributions from speakers of that language.

Will they?


Realistically, English will still be the lingua franca for software in the short-term future, and I don't think anyone is saying we should change this. The thing is, for many people "interacting with the worldwide programming community" is not a thing they need.

Imagine a computer class teacher in (for example) China teaching primary school children. Why should they need to learn English before starting to write "Hello world!" (or rather, 「你好世界」)?

In an alternate universe the lingua franca of programming languages is Chinese, would you still make the claim that it's better to take away choices and force ALL programmers into using the existing lingua franca regardless of their background?

I'm guessing you'd be crying cultural imperialism because you can't teach your 6 year old kid programming in your native language.


I already addressed education.

> If the source code is proprietary or for education then it doesn't matter

I'm not forcing you to speak and teach children in English. My country actually was under cultural imperialism and people back then weren't allowed to speak or learn in our native language. And you make it sound like a joke, but whatever.


That's also a good reason to standardize all open source code in C.


Overall, Unicode is a good idea. That said, languages should limit themselves to a subset of Unicode to avoid the use of glyphs that look too similar or that have too much variation (e.g. emoji).


Slight typo: It's C++98.


Thanks! I confused it with C99.


> The best thing about using only ASCII for the main syntax is that everyone can type it with their keyboard.

My keyboard doesn't have a key for NUL, BEL, VT, EOT. What kind of keyboard do you have which has buttons for these?


Let's not be so pedantic. I meant visible ASCII symbols, character codes 32-126 plus newline.
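That set is easy to pin down exactly; a quick, purely illustrative Python check:

```python
# Visible ASCII plus space: character codes 32 (' ') through 126 ('~').
printable = [chr(code) for code in range(32, 127)]

print(len(printable))               # 95 glyphs, the "95 glyphs of ASCII"
print(printable[0], printable[-1])  # first and last: space and '~'
```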


With vim at least, you can type them in as control-key sequences (preceded by Ctrl-V).


But then the statement is incorrect. I can write all of Unicode using keyboard sequences as well.

People from the US seem to have a strong fetish for ASCII.


ASCII is an abbreviation for American Standard Code for Information Interchange after all.


A programming language designed by a Mac user might make use of the symbols §, ±, ≤ and ≥, among others. I think the tyranny of ASCII is really the tyranny of the tragically poor support for entering characters beyond a small national-language set on the commonly-used OSes, especially Windows. (macOS is significantly better here but no utopia.) Windows-1252, MacRoman and so on may not be the standard character sets any more, but you wouldn't know it from what the OSes make easy to type!


I don't think there is any tyranny in any of the modern OSes. They all let you type in many languages. Maybe the feature is not enabled in some UIs, so newbies don't learn it. Linux users can use IBus or fcitx5. I often type in ಕನ್ನಡ and हिंदी when chatting with people. Typing emoticons and obscure math symbols is just as easy.


Character inputs will always differ between platforms and the most common denominator will always end up being popular for text input. Most of the world is on Windows, so most of the world will use Windows text input. From that input, a limited subset of characters will be used for common expressions, because many people simply don't know they can write the ¬ symbol. Does it make sense to use that instead of the exclamation mark for negation? I don't think so.

The US International keyboard has loads of other characters currently not used in programming languages either, whether it's the ¡¿, the ²³ powers, the euro character, guillemets («») or the negation character. If Apple would decide their next Swift project will use the characters only quickly accessible to Apple users, it'll be treated just like that, only accessible to Apple users.

With Apple shipping keyboards that have the £ character where the # character would otherwise be, I wouldn't be so sure that using all the available characters would go over well.

With ligature support being in every major IDE there's very little reason to step away from the old comparison operators in my opinion.

What OSes make easy to type depends on where you live. It can be quite a challenge to write the Spanish ¿? style operators repeatedly, but there's no reason not to use, for example, ¿expression? instead of parentheses for "if" "switch" statements, or to use ¡expression! as a shorthand for "return". Format strings could just as easily have been written as «string» or „string”, but the characters on the American English keyboard were chosen because they were available on most layouts or out of pure laziness.

I personally prefer US International over Apple's layout. I rarely need to use the paragraph key in normal text processing, and the short shift makes for a very awkward typing experience for me. The vertical enter also just seems like a waste of space to me. I don't really see how Apple's keyboard input is that much better than Windows', their special character set seems just as arbitrary as the rest.

Adding more special characters found in US English only makes the situation worse for people on, for example, Italian or Polish keyboards, where despite writing in a language based on the Latin alphabet, characters with additional diacritics and such are part of the main layout and deserve separate keys. Cyrillic keyboards are just as bad, lacking most programming characters already because of the larger Cyrillic character set, although they'll have to cope with inaccessibility because of the character set difference anyway.

In my opinion, the number of special characters used in a programming language should be reduced, not increased. Backticks are already impossible to find on some keyboards, but languages like JavaScript have gone and used them for format strings anyway. Driving people to learn a special "programmer's" keyboard layout just because the required characters aren't on their native layouts isn't a good thing. We want more compatibility, not less.


> Apple shipping keyboards that have the £ character where the # character would otherwise be

That's not really an Apple thing so much as a non-US thing. The standard PC UK layout also has a £ in that position.

> I personally prefer US International over Apple's layout.

Which Apple layout? Apple has its own US English, International English and British English layouts, as well as myriad other languages.


If I wanna use something like ™ I have to google how to enter it, or just google and copy the character. I don't want any extra glyphs in my code until they're just as easy to enter (at close to full speed) as the glyphs I already have access to.


Last year, when I suddenly had to teach symbol-heavy stuff over the internet, I put together a tool for this because my handwriting is just too bad with a mouse or even a tablet. I wanted to be able to "live" calculate with symbols while I talked over it with my students.

I called this tool √𝚎𝚍, the rich Unicode text editing suite (RUTED, pronounced ˈruːtɪd) and I use it in vim: https://gitlab.com/ruted/ruted-vim.

I first defined so called modes of ASCII sequences representing Unicode symbols, like "forall" for "∀", and "NN" for "ℕ", and ";o" for "∘". Then I created a very small vim plugin to enable modes and change modes. When in a mode, you can enter the ASCII sequences, and they are replaced by their Unicode equivalent.

It is a bit of a simple hack, but it worked great over the past year. Entering symbols has become very easy for me now.
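A minimal Python sketch of that mode idea (not the actual ruted-vim code; the mappings are the ones named above):

```python
# One "mode": ASCII sequences standing in for Unicode symbols.
MATH_MODE = {
    "forall": "∀",
    "NN": "ℕ",
    ";o": "∘",
}

def apply_mode(text, mode):
    # Replace longer sequences first so that a short sequence which
    # happens to be a prefix of a longer one cannot clobber it.
    for seq in sorted(mode, key=len, reverse=True):
        text = text.replace(seq, mode[seq])
    return text

print(apply_mode("forall n in NN", MATH_MODE))  # ∀ n in ℕ
```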


You could accomplish this globally with a Compose file on Linux, an AutoHotkey script on Windows, or custom abbreviations on Mac.


I'm surprised you didn't go the xcompose route! Was this an explicit choice or just didn't know about it?


Yes, I wanted to build something that my students using Windows or Mac OS also could easily use. And I wanted to build a web extension for Firefox and Chrome using the same format. And maybe integrate it in some other tools we're using here as well. Of course, that was too ambitious and the only thing I ended up with was an integration in vim. And that could have been done easier, as your sibling posters have pointed out.

Anyway, it was fun to build and very useful the past year.


vim has digraphs and it does the same thing.



I use the compose key a lot; when writing docs I tend to use → instead of ->, I use – instead of -, etc. I have an extensive XCompose set up for all this, and I find it very convenient.

But it's still three keystrokes instead of one. I wouldn't really look forward to using × instead of *, ÷ instead of /, etc. all the time; even though I can type them here with relative ease, the hassle increases the more you use them (I don't tend to bother with fancy “quotes”, for example).

I also don't think it really matters all that much. What's wrong with *? Sure, × looks nicer, but * is clearly the pragmatic choice.


BTW: there is another obstacle aside from typing issues: readability problems and similarities between operators and ordinary characters with computer fonts.

In the case of handwritten math equations (or LaTeX ones), operators are easily distinguishable from arguments. They have different sizes too.

For example, "result = axe" and "result = a×e" in many cases looks the same and, even in my browser, with font larger, than ones usually used by my colleagues, they are very hard to distinguish. Difference between "result a÷n" and "result a-b" can be spot easier, but it depends on two one-pixel dots.

OK, that was only the mumbling of a malcontent - in fact we all know that we all have sharp, young eyes and will never be tired or distracted, I'm pretty sure of that. ;)
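The lookalike problem is easy to demonstrate today, since many languages already accept Unicode identifiers; a small Python illustration (the variable names are made up for the demo):

```python
# Latin 'A' (U+0041) and Greek 'Α' (U+0391) are distinct identifiers,
# even though many fonts render them pixel-for-pixel identically.
A = 1   # LATIN CAPITAL LETTER A
Α = 2   # GREEK CAPITAL LETTER ALPHA

print(A == Α)   # False: two different variables that look the same
```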


I would imagine that tooling such as syntax highlighting and decent errors should alleviate that sufficiently to not be a serious practical problem.

But yeah, I read over the "axe" and "a×e" difference on my first read (and I browse HN at quite a large zoom by default, not because of vision issues, I just like larger text as a matter of personal preference).

Either way, I don't really see the significant advantages in the first place. In spite of the article and some of the strong words in this thread ("horrendous", "embarrassing", etc.), I don't see the problem with just sticking to ASCII. The only case where it would have been nice was when «T» was briefly considered for Go generics instead of <T> (later changed to [T]) to avoid overloading the existing meanings of <> and []. I actually would have liked that. But / vs ÷? shrug.


The thing is, not everyone has your keyboard, and not everyone speaks your language. Having languages support unicode doesn't mean your project must be written using non-ascii characters, it means other people's projects can be.


If you are on Mac you can use CTRL+CMD+Space to open the symbol picker that also allows searching for symbols. There is something similar on Windows iirc.


The shortcut on Windows is WIN+PERIOD. You can continue typing and it will search for matching emoji. For example, type WIN+PERIOD, smile, ENTER and it inserts a smiley face. It also supports mathematical symbols and Greek letters via the mouse, but the search doesn't find them. A shame, because it's nearly useful.


In Emacs, counsel-unicode-char brings up a menu where you type to narrow the list; for example, entering trademark gives you both ® and ™.

On my phone's keyboard, typing the words registered or trademark brings up those options as completions.

Word and LibreOffice have the menu under Insert.

I actually like the Emacs version the best. One could even rig up a deal with emacsclient to effectively use it outside of Emacs by popping up a floating window and then shoving the result into the clipboard.


Ah, cool function that I did not know about before, thank you! Any idea why it does not list all Unicode characters? For example, if I search for what charmap shows me as the description of "你", "you, second person pronoun", then the results in Emacs are empty.


Presumably the lookup is based on Unicode character names. The name of "你" is CJK UNIFIED IDEOGRAPH-4F60.

Charmap must be showing you information from some sort of dictionary entry or similar, but that's not part of the identity or name of the Unicode character, any more than the name of the Unicode character U+0049 "I" is "first person singular pronoun" (it's actually named LATIN CAPITAL LETTER I); that's just what it happens to mean in one particular language.
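For what it's worth, these formal names are queryable from Python's standard library:

```python
import unicodedata

print(unicodedata.name("I"))  # LATIN CAPITAL LETTER I
print(unicodedata.name("∀"))  # FOR ALL

# CJK ideographs historically had no per-character name entry in the
# Unicode character database, so depending on the Python/Unicode
# version this may return the default rather than a name:
print(unicodedata.name("你", "<no name available>"))
```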


That makes sense. A character might mean one thing in one language and another in another language.

Even when I enter "4F60" it does not show up. Seems some ranges are missing.


Here is something else neat

https://github.com/jeremija/unipicker

Can be run with --command="rofi -dmenu" to filter via rofi and piped to xdotool type to insert immediately


I haven't advertised this very much but years ago I wrote my own keyboard layout because things like this annoyed me. I came from an AZERTY background and wanted the m key moved in qwerty, as well as various other adjustments for common symbols and punctuation, especially for programming.

Anyway, a year or so ago I contributed it back to xkeyboard-config. On Linux you can find it among the English (US) variants, as Drix.

https://cgit.freedesktop.org/xkeyboard-config/tree/symbols/u...

The ™ symbol is AltGR+T. × is AltGR+x.


For what it's worth, it seems to be a default option in Linux. On Xubuntu, I went into Settings, Keyboard, Layout, unchecked Use System Defaults, and selected a compose key (I picked Right Alt).

I was able to type a ™ by holding shift and right alt, typing TM, and letting go of shift and right alt. ™

Punctuation and vowels result in accents: Û Ü Ä Ö Ô Þ Ŷ Ô ⸘ ⸘ Æ «» ¿¡ ¨ ¯--_ ⋄


Julia REPLs and editor plugins have adopted a nice feature where you can type the LaTeX name of the symbol you want to get it.

e.g. `\subseteq` followed by tab yields `⊆` or `\trademark` for `™`.

You can also do emoji with `\:eyes:` for ``

I think this is very practical and if you read code in Julia you can see that it's led to quite a lot of unicode and emoji in source code.
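The mechanism itself is tiny; a rough Python sketch of the idea (a hand-picked table, not Julia's actual completion code):

```python
# LaTeX-style names mapped to the symbols they expand to.
LATEX_SYMBOLS = {
    "\\subseteq": "⊆",
    "\\subset": "⊂",
    "\\alpha": "α",
}

def complete(prefix):
    """Candidates a REPL could offer when Tab is pressed after `prefix`."""
    return sorted(name for name in LATEX_SYMBOLS if name.startswith(prefix))

def expand(name):
    """The replacement performed once the name is complete."""
    return LATEX_SYMBOLS[name]

print(complete("\\subset"))   # ['\\subset', '\\subseteq']
print(expand("\\subseteq"))   # ⊆
```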


> we still find it more uncompromisingly important for our source code to be compatible with a Teletype ASR-33 terminal and its 1963-vintage ASCII table than it is for us to be able to express our intentions clearly.

D is a fully Unicode language - comments can be Unicode, the builtin documentation generator can be Unicode, and even identifiers can contain Unicode characters. I've been tempted to add Unicode characters for operators many times.

The trouble is the keyboard.

I've seen many ways to enter Unicode. I even invented a new way to do it for my text editor. All awkward and unsatisfactory. No way to touch type them, either.

What is needed is a dynamically remappable keyboard, with a configurable display on each keytop so you know which letter it is. Nobody is going to remember what the remapped letters are without it.


I've recently dipped my toe into APL, and I tend to agree.

APL only really advances from lines of ASCII text to lines of Unicode text. It's still very line-oriented. If Unicode expands toward TeX/LaTeX/&c it could look a lot more like written math. Subscripts are nice, but there's more to it.

Unicode entry is so poorly supported on macOS & Windows it's really not funny. Emacs (C-x 8 ENTER) is my best bet, or Xah Lee's website &c.

If one were to design a new APL, one would be tempted to just use the few characters Apple lets you type with an Option or Command key. (and of course I can't easily type those symbols so I refer to them by six letters each—that's likely not going to change)

All APL needs, though (EDIT: and really I mean any language, as long as we have common sequences), is leader key sequences. Rho is `r. To me, it always will be. The interface [1] has this down pat. We should be able to type anything in Unicode with just a few of the 104 keys, like how T9 allows 12 buttons to type 26, &c.

The world is never going to build a 12,000-key Unicode keyboard. We're going to have to use leader sequences. Just my (beginner's) opinion.

[1] even tryapl.org/ !


If it's not on the keyboard, nobody is going to type it.

Solutions:

a) only use characters in the intersection of the top N most popular keyboard layouts

b) issue programmers' keyboards with an agreed character set

c) issue programmers' keypads with supplementary characters

d) add on-screen supplementary keypads


I used to use Mathematica a lot and it was never an issue that the keyboard doesn’t have an integral sign or a lambda. You just type :int or :lambda and it displays as the proper character.

When working with mathematical modeling it’s honestly just so refreshing and every other symbolic math solution in other languages feels decades behind.

But alas Mathematica is proprietary and expensive enough that it’s always a management decision if you want to use it, so I never use it any more.


Hmm, Julia begs to differ, with most editors replacing \lambda with the lambda Unicode symbol, for example.


Using ligatures (of a sort) in presenting code doesn't mean the code itself is not ASCII. Plus, let's see mainstream languages adopt this; I doubt they will. Languages do stupid things all the time.


It means that in Julia, the editor and the REPL actually replace \lambda with the Unicode Greek letter.


This seems to be a solved problem, if we want it to be. We can easily enough teach our programming editors either that typing < followed by = should convert to ≤ automatically unless you press some sort of literal key first (say Esc), or that typing some sort of compose key (say Alt Gr) followed by < and then = should convert to ≤.


What we need is a programmer-oriented IME.


People should listen to this guy. He's got some perspective. The main reason we keep things the way they are is something like tradition, or some kind of psychological effect where we want to fit in. Whatever it is, it's not reason. All the rationalizations come after.

Definitely should be able to put functions in the horizontal space, use colors as part of the syntax, use Unicode symbols where it is warranted.

I would go farther and say that we should have at least some ability to edit things in a non-serialized way, like a WYSIWYG math formula for example.

I hope people will also explore structural program editing. https://en.m.wikipedia.org/wiki/Structure_editor

Almost forgot, one more crazy idea: switch to larger instantly reconfigurable touchscreen keyboards to allow more symbols to be entered easily.


> It was certainly a fair tradeoff—just think about how fast you type yourself—but the price for this temporal frugality was a whole new class of hard-to-spot bugs in C code.

> Niklaus Wirth tried to undo some of the damage in Pascal, and the bickering over begin and end would not } take.

"1970 - Niklaus Wirth creates Pascal, a procedural language. Critics immediately denounce Pascal because it uses "x := x + y" syntax instead of the more familiar C-like "x = x + y". This criticism happens in spite of the fact that C has not yet been invented." http://james-iry.blogspot.com/2009/05/brief-incomplete-and-m...


> How desperate the hunt for glyphs is in syntax design is exemplified by how Guido van Rossum did away with the canonical scope delimiters in Python, relying instead on indentation for this purpose. What could possibly be of such high value that a syntax designer would brave the controversy this caused? A high-value pair of matching glyphs, { and }, for other use in his syntax could. (This decision also made it impossible to write Fortran programs in Python, a laudable achievement in its own right.)

The irony here is that Python does have an open-scope delimiter. It is the colon. What it lacks is a close-scope delimiter. But you can hack one using the pass statement and Emacs auto-indent, and in my code I do this so that my Python code always auto-indents correctly. Without this you cannot reliably cut-and-paste Python code because you can't count on leading white space being correctly preserved.
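A sketch of what that looks like in practice (my guess at the style, not the commenter's actual code): a trailing pass pins down where each block ends, so an auto-indenter can reconstruct the nesting even if leading whitespace gets mangled in transit.

```python
# `pass` as an explicit end-of-scope marker: the last statement of each
# block is a no-op, so the following dedent is unambiguous to tools
# that re-indent code (e.g. Emacs python-mode).
def classify(x):
    if x > 5:
        result = "big"
        pass
    else:
        result = "small"
        pass
    return result

print(classify(10))  # big
```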


> Why do we still have to name variables OmegaZero when our computers now know how to render 0x03a9+0x2080 properly?

The majority of developers are not from Greece; they don't have the keys on their keyboard. Technically, modern C# supports Unicode just fine; here's an example.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class Program
    {
        static double Σ( this IEnumerable<double> elements ) =>
            elements.Sum();

        static void Main( string[] args )
        {
            double[] α = new double[ 3 ] { 1, 2, 3 };
            Console.WriteLine( α.Σ() );
        }
    }
> Why not make color part of the syntax?

Similar reason: input becomes more complicated. You're gonna need to either memorize hotkeys or reach for the mouse. Also, the copy-pasting UX becomes way too complicated.


> Why keep trying to cram an expressive syntax into the straitjacket of the 95 glyphs of ASCII when Unicode has been the new black for most of the past decade?

Because it must be possible to actually type syntax.

The problem is a physical one: keyboards are limited in space, we need alphabets, punctuation, a bunch of control keys, and finally we have a very small amount of space left to fit some arbitrary symbols - might as well be ASCII ones. The only way to truly break away from the ASCII table would be to start making giant non-standard keyboards... and now you have a huge inclusivity barrier, not to mention the impracticality of physically huge keyboards.

This all seems like a lot of effort and argument over something even more superficial than language syntax - they are only glyphs... how does using more and different glyphs substantially change anything?


Vim supports entering quite a lot of symbols that don't exist on most keyboards (like the integral symbol). I always have to look up how, though, and it forces people to use editors which support that.

>And, yes, me too: I wrote this in vi(1), which is why the article does not have all the fancy Unicode glyphs in the first place.

Maybe vi doesn't but as I said, I've used them in vim. Also, the title calls out terminals but I don't have any VTE software installed that doesn't have Unicode support.


Even if it's possible to type the syntax, there's also the speed of typing. How fast can I type "omega"? Pretty fast. How fast can I enter a unicode omega code point? Probably not as fast.

How fast can I read "omega"? Pretty fast. How fast can I read an actual Greek letter omega? I don't read Greek, and my physics classes were decades ago. Yeah, I can figure out that it's an omega. It might take a bit, though...


I wrote this, and I'm here if you have questions.


This isn't the kind of article people have questions about - everyone is ready to jump in feet-first with strongly-held opinions!

I thought the article was fascinating and thought-provoking, I hope all the strongly-worded opinions aren't causing you to regret it.

I was also wondering whether tokenizer rules deserve some flak for making it harder to build identifiers and operators? Why is it more important (in C-like languages) for programmers to write without spaces, than to be able to call a variable ready? or define a +++ operator?


Far from it; if anything, the opposite.

I mainly write this kind of stuff to make people think about what they would not otherwise have thought about, with the secondary goal of getting a chuckle or two along the way.

This particular one has been discussed on HN previously, and every time the "because ASCII == keyboard" argument has come up.

...and been slapped down by people from the rest of the world, who have two- or even three-keystroke sequences to reach [\]{|} etc., because more important national characters, like æøåÆØÅ, live on the "usual" keys.

If I managed to get people to think about that, I've done my job.

As for C:

I think the C language has been mismanaged: why did we need yet another threading API while we still don't have explicit struct packing, including per-struct or per-field endianness specification?

The ISO-C group is incredibly conservative, which is plain wrong: a finite amount of C code has already been written, but there is a potentially infinite amount yet to be written. They should focus on the larger oeuvre.


I'd love to use Unicode on a current project - but just the Unicode library is bigger than the memory of the small computer being used :(

Seriously, Unicode is appallingly messy and heavyweight. With all that code-point space to burn, a rational design would be very different from what we've got.


All you need to know about Unicode is that they thought they could "unify" all the Asian glyphs into one alphabet.

That said, given the origins of writing, I don't think there are any much cleaner solutions.

Their biggest mistake is probably not insisting on a mandatory vector representation which could be used to generate a "font of last resort".

My personal pet peeve is that Unicode has not yet assigned 128 code points and two modifiers to cover all possible combinations of 7-segment LEDs :-)


They'd probably just define each of the segments separately and require you to use a zero width joiner.


>>> world's second write-only programming language

I am not clever enough to comment on the meat of the article but I love that quote :-)


The article's title uses "ASR-33" instead of "Terminal". Actually I think most users called it a teletype.


This really strikes me as a "not even wrong" post. I'm not sure there is anything wrong with programming and if there is something wrong with it, the problem sure isn't "there are not enough operators."

My favorite languages don't even have the _idea_ of operators and in languages with custom operators, with all their crazy ass rules about precedence and content-free representations, I'm always re-translating the code to s-expressions in my mind.

The Scheme way seems fine to me - operator-like functions when the meaning is universally understood and then human-readable names for literally everything else.


This really made me smile:

> Programmers are a picky bunch when it comes to syntax, and it is a sobering thought that one of the most rapidly adopted programming languages of all time, Perl, barely had one for the longest time. The funny thing is, what syntax designers are really fighting about is not so much the proper and best syntax for the expression of ideas in a machine-understandable programming language as it is the proper and most efficient use of the ASCII table real estate.


Most people who think they see ASCII dependence are actually perceiving standard keyboard dependence. Modern keyboards all ape the Model-M and all modern programming languages are constrained by Model-M compatibility. ASCII is an implementation detail. Very few “radical ideas” for programming language formats pass the “easier to use than typing” test, let alone the “so much easier it might merit jumping off path dependence” test.


The author wonders,

“But programs are still decisively vertical, to the point of being horizontally challenged. Why can't we pull minor scopes and subroutines out in that right-hand space and thus make them supportive to the understanding of the main body of code?”

Which made me wonder if there might be some way to make an editor do something along these lines, without changing the programming language. Similarly with his speculations about using color.


Yeah, this couldn’t be more misguided. It would be the very definition of a boondoggle.

The key point here is chunking. Mentally you can process the + as a plus sign. It's a single concept. Its association with addition probably means the reader can make useful inferences about what it does.

Now consider the Greek letter ζ (zeta). For anyone unfamiliar with that it would be more than one chunk as you would probably try to remember the shape.

Worse, you may have no idea how to input it.

And to get any of this we’d have to deal with encodings. For what, exactly?

And sure, not everyone is familiar with the Latin script that dominates ASCII but you have to pick something and for better or for worse English is the lingua franca of programming.

And don’t even get me started on the insanity that is Unicode attempting to assign a code point to every symbol ever imagined and then extending that with compound code points and modifiers.


I think it's good to think outside the box. Maybe incorporating some tried and true Unicode would make sense.

However, I don't think the issue is ASCII but rather consistency. I'm making a C-competitor language and using "C-like syntax", except I'm enforcing consistency. () is always one or more statements, with the last expression returning the value. {} always represents concrete data like a struct, array or function args/result. []? That's a piece of a type name, allowing the developer to name something Foo[U32; Str]

fn foo: {a: U32, b: Str} -> U32 ( if a > 5 then b + "is greater than 5" else b + "lte 5" )


I agree with the conclusion of the article; we should move past the requirements of a long-gone era of computing, but the suggestions provided don't feel like improvements to me.

> Why keep trying to cram an expressive syntax into the straitjacket of the 95 glyphs of ASCII when Unicode has been the new black for most of the past decade?

Because I don't have 143,859 keys on my keyboard. Having to type λ would involve either changing my keyboard layout or using some key shortcut. In either case I'd miss the benefit of muscle memory to just type `func`.

I don't see ASCII as a hindrance, but as the lowest common denominator for being expressive in any language, computer or human (Romance ones at least).

> Why not make color part of the syntax? Why not tell the compiler about protected code regions by putting them on a framed light gray background? Or provide hints about likely and unlikely code paths with a green or red background tint?

Why should a compiler be concerned about how errors are displayed? You can already do these things with a program that reads the output and renders it as you wish. We could argue that the information should be available in a machine-readable format to avoid parsing text, but I don't see how a deeper integration of color would help.

The things that I would like to see in the next 20 years of programming are:

- Abandoning text files and filesystems as an abstraction. Most programming languages are fine with handling only "modules", so having to manage files and file paths just feels unnecessarily clunky.

Light Table[1] attempted something like this IIRC, where you only dealt with functions or modules directly, and it was a joy to use. Expanding this to a language would simplify things a lot.

- Speaking of which, code versioning systems should abandon that concept as well. It's 2021 and the state of the art is still diffing pieces of text and doing guesswork at best to try and complete an automatic merge.

Our versioning tools should be smarter, programming language aware and should minimize the amount of interaction they require. I can't imagine the amount of person-hours ~~wasted~~invested on trying to understand and make Git work.

- Maybe just a fantasy and slightly off-topic for this discussion, but I'd like to see more programmer-friendly operating systems. With the way Windows and macOS are going, Linux is becoming more like the last bastion for any general programming work. Using these systems often feels like the computer is controlling me rather than the reverse.

[1]: http://lighttable.com/


I can't agree with the conclusion when he does not even try to guess at the reasons for status quo, only the offhand remark "computer people are conservative".

Perhaps the text with limited alphabet and fixed-width display is actually optimum both for programmers' mental workload and for computer processing? Why the urge to abandon it?

Many many people tried to "move on" but there's always some catch. I like DRAKON graphical programming editor but it makes diffing harder and merging changes from outside impossible.


This will work once we have keycaps with e-ink or similar technology that is super cheap for everyone. Then we will all have keyboards with an infinite number of characters, and a new chapter of typing will begin.


Another idea: let the Windows-key (I have no use for it anyway) trigger the appearance of an onscreen palette of glyphs, which you can click with the mouse.


Windows + . (dot key) does that by default. Sure, most of the symbols are just weird cat faces or whatever, but the basic feature is there.


I'm on Ubuntu :)


Doesn't something like that already exist in Lenovo's Yoga Book C930? It had two screens, with one being an e-ink keyboard. Very useful tech.


Already done! Emojicode, an emoji-based programming language! https://www.emojicode.org/


Wow. The same thing I was thinking about. We built a fundamentally flawed system when it comes to ASCII, strings and human languages.

Computer languages need to abstract the representation of time and space. Solving that will improve the human-computer interfacing problem. Connect the two worlds with something they have in common; humans writing out instructions is not that.

The MOV instruction is the single source of truth. It does something with space and involves time. Start there and build upwards.


Go was designed by Ken Thompson, Robert Griesemer and Rob Pike. The article makes it sound as if Go was designed by a single person.


I use Unicode characters in Java identifiers as much as I can get away with; I wrote a code generator that embeds all kinds of funny brackets that are ‘meaningful’ as an emergent property.

For entry, I cut and paste symbols faster than most people type. The completion feature of the IDE also works just fine.
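For comparison (not this poster's Java setup): Python 3 also accepts Unicode identifiers (PEP 3131), and the parser NFKC-normalizes them, which is both a convenience and a subtle collision hazard. A small sketch with a hypothetical `file_count` name:

```python
import unicodedata

# Python 3 (PEP 3131) accepts Unicode identifiers and NFKC-normalizes them
# at parse time, so this ligature spelling collides with the ASCII name.
ns = {}
exec("\ufb01le_count = 3", ns)  # identifier begins with U+FB01, the "fi" ligature
assert unicodedata.normalize("NFKC", "\ufb01") == "fi"  # the fold the parser applies
print(ns["file_count"])  # 3 -- the ligature spelling became the plain ASCII name
```

So two visually different spellings can silently name the same variable; whether that is a feature or a bug depends on your tooling.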


Discussed at the time:

Programming languages have to break free from the tyranny of ASCII - https://news.ycombinator.com/item?id=1850938 - Oct 2010 (116 comments)


We literally have to: https://github.com/ASCII-Rightholders/the_characters

Unless, of course, we want to buy the not-too-generously-priced license.


I'm fairly certain that is a joke. Or a very badly done scam.

Either way, I doubt anyone can claim a copyright on ASCII.


Why not just use ligatures? For example, => turns into an arrow in some fonts. This could likely be extended. I think the major issue is accessibility: how do we make sure whatever solution we choose can be used by most people?


Python could be a variable-font language: the more you indent, the smaller the font gets. You could always see the whole program on one page. The editor would zoom automatically, of course.
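If one wanted to prototype that sizing rule, a minimal sketch (with entirely made-up parameters and a hypothetical `font_size_for` helper) might look like:

```python
def font_size_for(line, base=14, step=2, floor=6):
    """Shrink the font by `step` points per 4-space indent level, never below `floor`."""
    indent_level = (len(line) - len(line.lstrip(" "))) // 4
    return max(floor, base - step * indent_level)

print(font_size_for("def f():"))          # 14 -- top level, base size
print(font_size_for("        return 1"))  # 10 -- two indent levels in
```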


As much as I admire Mr Kamp's work, I think this is nonsense. Just because something looks familiar doesn't really mean it is the same. There are good reasons to pick syntax that is familiar. Novelty comes at a cost.

And I also think he is wrong if he thinks languages will be improved by making them harder to type and read.

I've spent a considerable amount of time trying to understand code written by someone in China. All the comments in that project are in Chinese - which I don't understand. Now imagine using symbol names in Chinese. Or Hangul. Or Russian. Or in Baybayin script. Or Sanskrit.


(2010)


Go hardly looks different from BCPL from 1967.


This is pretty much an intentional design choice.

Fun fact: the first commit in the Go repository is from 1972. In B.

https://github.com/golang/go/commit/7d7c6a97f8

Followed by a commit from 1974 to convert it to C, and 1988 to convert it to ANSI C:

https://github.com/golang/go/commit/0bb0b61d6a

https://github.com/golang/go/commit/0744ac9691

https://github.com/golang/go/commit/d82b11e4a4


Which is horrendous. In the forty years since BCPL (Go was 2008?), the best way we could come up with to coordinate groups of developers building applications is to type streams of ASCII text and save them on disk, with 0x0A indicating "the human wants a new line here". FML.

Imagine if, to change the graph of your Facebook feed, you had to checkout myname.person, add "friend Bob { ... }", and then try to commit it. But far more complex graphs of objects, classes, ASTs of implementation, sure, ASCII is fine for that.

It's embarrassing is what it is. I'm not saying it's not hard to come up with something better, but it's been fifty fucking years now.


https://wiki.c2.com/?WorseIsBetter

Your image-based computing system isn't better than text files, much like your online file storage system isn't better than emailing things to yourself.


So we shouldn't have invented the database?


I would say sqlite contains much of the wisdom of flat files, which other databases often don't.


Seems like this was done to pay homage to the roots of Golang. :)


It's a terrible idea. Sooner or later there will be identifiers with homoglyphs. Good luck debugging that!


Supporting unicode doesn't mean you can't defend against homograph attacks. See http://www.unicode.org/reports/tr39/.
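As a rough illustration of the kind of check TR39 suggests (a hypothetical `scripts_in` helper, not code from the report), you can flag mixed-script identifiers by inspecting each character's Unicode name:

```python
import unicodedata

def scripts_in(ident):
    # Crude heuristic: the first word of a character's Unicode name
    # ("LATIN", "CYRILLIC", ...) is enough to flag mixed-script identifiers.
    return {unicodedata.name(ch).split()[0] for ch in ident if ch.isalpha()}

print(scripts_in("paypal"))            # {'LATIN'}
print(scripts_in("p\u0430yp\u0430l"))  # CYRILLIC SMALL LETTER A swapped in -> Latin + Cyrillic
```

Real tooling would use the report's confusables data and script-extension tables rather than this name trick, but the principle is the same: reject or warn on identifiers that mix scripts for no good reason.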


I quote from that reference:

> As discussed in Unicode Technical Report #36, "Unicode Security Considerations" [UTR36], confusability among characters cannot be an exact science.


Most of that makes sense but it's worth remembering the ASR-33 was upper case only!


Emoji symbols for some funky new Haskell function, why not.

type (@@@#@$$$)

Could definitely be made more compact :).


Binary is too limiting; we need real number based transistors.


The article's argument in a nutshell:

We need to break free from the tyranny of the characters on our keyboard, and express ourselves using characters not on our keyboard.

So I hope you see why that argument keeps failing.


Your keyboard is not my keyboard, nor everybody else's. So I hope you see how US-centric your view is.


I've never been to the US.

Before you hurl incorrect accusations at others, check out how international keyboards look. All have a QWERTY-like layout (in some cases AZERTY etc., but still similar) as a BASE, and the local characters are on top of that. Yes, even in China, Japan, Russia, Greece, Bulgaria, Macedonia and so on.

This is why languages center around this charset, and why it'll stay like that.


> All have QWERTY-like (in some cases AZERTY etc. but still similar) layout as a BASE.

But not with the Latin charset as the only one that can be entered. Let's pick a random one from your list - Russia, for example. A funny thing appears: it's clear you haven't actually visited Russia; hint: Cyrillic on keyboards. Shall we continue? ASCII is not the world's default and never should be.

And incorrect? No, maybe I should've just said narrow-minded instead of American.


Did you just call Latin on keyboards "US-centric"?

How exactly do you think someone, say, in Russia, types their Gmail address in a form?

I feel you're the one accidentally revealing US-centric views.


Yes.

> How exactly do you think someone, say, in Russia, types their Gmail address in a form?

They switch their keyboard to Latin, from Cyrillic. They probably prefer mail.ru though, because of such stupid limitations.

Nice try though.


Here’s some F#

    let ``I  emoji`` = true


> My disappointment with Rob Pike's Go language is that the rest of the world has moved on from ASCII, but he did not.

Um, the Go reference explicitly states that "Source code is Unicode text encoded in UTF-8" [1], so I'm not sure what the hell this guy is talking about. The language itself maybe? Well contrary to the author of the article I'm really no fan of Rob Pike and I think he's a massive arrogant prick, but in this particular case he was absolutely right to be pragmatic for the language syntax and let people be stupid enough to use inaccessible characters in their source code. Which, come to think of it, actually flies in the face of the root principle of Go as stated by Pike himself, which is to shield dumb rookie Google developers from their supposed inexperience.

[1] https://golang.org/ref/spec#Source_code_representation


This feels a bit gratuitously rude. He has strong opinions, and literally decades of experience. He is said to dislike time wasters. As one of a very small set of people who built much of the modern development stack from the ground up, it would be tedious in the extreme to have to deal with every headstrong tyro determined to make their (likely highly pedestrian, repetitive) point.

Never met him, never really dealt with him, don't see why you feel you need to denigrate him.


I understand where you're coming from, but in the case of Rob Pike I get the impression that he enjoys having the reputation of a contrarian, and is not above trolling once in a while. Just look at the history section of the Mark V. Shaney wiki page [0].

[0] https://en.wikipedia.org/wiki/Mark_V._Shaney


I've met him. His desk was a couple meters away from mine while I was working on Google Wave, and he was working on early versions of Go. Over 6 months or so I had a lot of little interactions with him. He's an interesting guy. He's quiet and hard working, but he has some great stories. And lots of strong opinions that he's spent a long time thinking about.

I agree with the GP. Calling someone you've never met "a massive prick" with no evidence is rude and pretentious. Especially given how much he's contributed to computing.

He is a bit of a contrarian - but so what? All the interesting people are. There's a lot of daylight between someone being "a bit of a contrarian" and being "a massive prick". One does not imply the other.

You don't have to like Go. I sure don't. But have some respect for the people who have poured their lives and souls into making computing what it is today. Our industry wouldn't exist without them.


That's a bit different from how he has to deal with contemporary compsci nerds. And it's very old - we're talking about a 40-year-old story.

I attended my share of Unix User Group meetings. All of the big names had people riding point for them at drink time; adoration is tedious.


> This feels a bit gratuitously rude

Agreed, and I 100% believe that meeting him in person would most likely change my mind. I also 100% believe that he has every right to be an "arrogant prick" for all his contributions and achievements to the field. But that doesn't make him any less insufferable, unfortunately.


You're missing the point. You're just throwing insults around (again, in this comment) without even letting HN readers know what you are referring to. Why do you think Rob Pike is an arrogant prick, or insufferable?

Also note that HN has guidelines: https://news.ycombinator.com/newsguidelines.html


[flagged]


Please don't.


Good lord.

Pike developed a language designed to be "easy to understand and easy to adopt."

"It must be familiar, roughly C-like. Programmers working at Google are early in their careers and are most familiar with procedural languages, particularly from the C family. The need to get programmers productive quickly in a new language means that the language cannot be too radical." – Rob Pike

They are not researchers, they are engineers.

Do I think Pike could develop a language for researchers, filled with special Unicode characters that do magic things? Of course.

But this practical approach to a language doesn't make him a giant prick. As someone who has played with "Fancy" languages you get tired of them (and the people using them) after a while.


It's even funnier given that Rob Pike co-invented UTF-8 with Ken Thompson. So he must know a thing or two about Unicode.

Not using Unicode for Go syntax is a great thing.


Both of whom co-invented Golang, haha


> I am not sure what the hell this guy is talking about

Have a look at the APL language the author mentions; you will see you missed the point of his article.

The author's point is not just about having more characters supported in identifiers or within strings (as in the part of the Go specification you are pointing to). He has more operators and related constructs in mind, which could benefit from more elaborate characters, e.g. the whole set of mathematical operators.

The issue I see in his reasoning, however, is that precisely in the case of APL, dedicated keyboards had been built for it. The reason for limiting the character set to ASCII is its immediate accessibility on everybody's keyboard.


I noticed Rust recently added Unicode support for writing code - does anyone know if it can be disabled per crate?

I don't think having symbols that cannot be typed or pronounced is such a good idea.


"When I was a child, I used to speak like a child, think like a child, reason like a child; when I became a man, I did away with childish things.

[..]

Syntax highlighting is juvenile. When I was a child, I was taught arithmetic using colored rods. I grew up and today I use monochromatic numerals." - Rob Pike


A great example of how very smart capable people can express silly ideas wherein they justify personal preference with bad logic.

https://ppig.org/papers/2015-ppig-26th-dimitri/

Code is neither arithmetic nor prose, and highlighting is used by the majority because people intuitively grasp that life is easier with it.

It is perfectly OK to have different preferences, but one must be careful not to elevate a preference into a law built upon sand rather than bedrock.


I hypothesised that syntax highlighting had to provide some tangible benefit, simply because of how it allows the brain to process lots of information at a glance. After having witnessed a few geniuses work and swear by no syntax highlighting, I started to doubt my thesis, so thank you for linking the paper.

Maybe some people just perceive syntax highlighting as cognitive overload, although it's meant to achieve the exact opposite.


Maybe they want to try fixing syntax errors the old-school way: printed-out fanfold paper and highlighter pens.

narrator voice: They do not want to do that.



