
The best thing about using only ASCII for the main syntax is that everyone can type it with their keyboard. I think the recent fad of supporting Unicode identifiers is misguided. Of course, Unicode should be permitted in string literals, but not in the code itself.

*Also I don't think Go is better than modern C++, though it might be better than C++99 which was the main standard when the article was written.



Having to do math- and physics-heavy work, I very much support the "fad" of having Unicode identifiers. "omega_0" is much worse than "ω_0", especially if you have tons of these variables.

If you don't do math it's difficult to understand, but try writing without the alphabet while expressing the same concepts. Possible, but clunky.


I have a degree in math and I disagree with this. ω_0 definitely looks better than omega_0, but it is much harder to enter unless you have a special keyboard setup.

Suppose you write a function that uses a variable ω_0, and somebody else wants to change it. Unless they have the same keyboard setup as you, they will have to just copy-paste it everywhere. And what if you have several variables with special names?


Okay, so we need better keyboards then. If a character is annoying to type, that is a keyboard problem, not a language or charset problem. So let's improve the keyboard. Currently we are stuck with the garbage qwerty keyboard (and various slightly better permutations of it) which doesn't even let you type all of ascii. Computers should just do what humans want. If you see a character, you should just be able to type it. It's absurd that cpus can do billions of ops per second but you can't just type an omega. Absolutely shameful, really.

You should dream bigger. A better world is possible but only if people really want to throw out all the shitty designs from the ancient past


> If you see a character, you should just be able to type it.

How should we do this?

I am imagining that we have a "char-bank" that lives on a little touch display at the left of my keyboard. I can add chars to my bank by highlighting them and pressing a 'yank' key. I can scroll up and down through my charbank and type the chars with a tap or rearrange them with touch-and-hold similarly to apps on a phone's homepage.

If I want to map a char to my keyboard, I can press and hold a "map" key, tap the char, and press the key I want to map it to. If I want to map the char directly from the screen I highlight it and press the map and yank keys together.


Some other people in this thread suggested having a keyboard with little e-ink displays on the keycaps. Then the character map could be completely user-defined. Then you could just have a big bank of keys off to the side to select your desired set of characters, e.g. Greek, math, emoji, etc. You can already sort of do this in software, but you have to manually remember the location of every character in each extra mapping. But with displays on the keycaps, you could have as many different mappings as you want. And that's how you would be able to type in a sane world


Well damn. I don't have a dog in this fight but boy oh boy do I want that keyboard now. The possibilities of every key being changed to match context sound awesome. Especially since I love modal editors with visible prompts.

Vim/Kakoune users could have their help menu _be_ the keyboard. Every action changes all of your keys to whatever is possible. Sort of mind blowing.

Come to think of it, I bet someone has done this with Vim & LED keycaps. Huh


I remember 14 years ago salivating at the Optimus Maximus keyboard from Art. Lebedev studios: https://www.artlebedev.com/optimus/maximus/


>Vim/Kakoune users could have their help menu _be_ the keyboard. Every action changes all of your keys to whatever is possible.

Wow, this is an amazing idea. Think of all the other unimagined possibilities if we had real innovation in human-computer interface tech


You can see how IMEs are used for Chinese/Japanese: type on an ASCII keyboard, then transform that into the language's own characters.


To get there, though, you need everything upgraded. Keyboards, sure, but also compilers and editors and IDEs and static analysis tools and grep and...


Keep in mind that in Unicode there are several symbols that look alike. It is not just the character set that matters, but how many such look-alikes it contains.

Maybe instead of arguing for Unicode, the community needs to come up with an unambiguous, smaller set of symbols that languages should support.


And ASCII has the reverse problem: one symbol gets extremely overloaded semantically (syntax wars?!).

Tell me, what does ":" mean?

Now if I tell you the language is Typescript, what does "?" mean?

I personally believe semantic overloading causes a lot of problems, but we are so steeped in the existing limitations that we don't recognise it is a problem.


Odd choice. Punctuation, by itself, is virtually meaningless.

Language overloads things. Pretty much period. I doubt programming will get away from that.


The implied topic of my comment was programming languages, where your punctuation characters really do matter. So much so in some languages that a single misplaced " " can really ruin your day.

We are so short of semantic operators that languages start to use single letters like "s", "q", "u" as operators or modifiers - which I find ugly (although the best compromise).

I have experienced how the wrong Unicode character can ruin my day - but that is what tooling and editors and best practice are there to help with.

Keyboards are the main problem now I think (in the past it was your OS, your editor, and your tooling). I regularly type Unicode characters from my handheld devices, but hardly ever from my qwerty entry devices (I can't easily find ¡, ∆, ↑ or ç as examples).


Taken at minimal value, punctuation is merely another symbol. A variable named by the symbol "index" will have context dependent meaning.

So...I don't see how this changes my point. The list of symbols that is a program will have many of those symbols in need of context to understand. Do you really think programming can escape that?

Edit: I am sympathetic. I like lisp for having fewer signifiers than other languages.


I dread the day when I edit some code that has both \Sigma and \sum in the same scope.


You mean ∑ and Σ? Yeah, that'd be terrible. It doesn't seem like it should be Unicode's job to assign meaning to characters, so I don't know why ∑ ("U+2211 N-Ary Summation") exists.
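For what it's worth, a quick check (Python here, purely as an illustration) shows they really are two distinct code points with different properties, and at least Python will only accept the letter as an identifier:

    import unicodedata
    for ch in "Σ∑":
        print(hex(ord(ch)), unicodedata.name(ch), unicodedata.category(ch))
    # 0x3a3 GREEK CAPITAL LETTER SIGMA Lu
    # 0x2211 N-ARY SUMMATION Sm

    print("Σ".isidentifier(), "∑".isidentifier())
    # True False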


> I don't know why [it] exists.

Ill-considered backward 'compatibility' with block-drawing character sets for mathematical typesetting. (Fucked if I know which character set, but presumably the same one they found U+23B2 '⎲' and U+23B3 '⎳' in.)


Pretty sure those are necessary with the extenders when you want to sum larger things like integrals. Can't remember the details, though.


Yeah. I was offering up an example.


You don’t need a “special keyboard setup”. Linux with X11 can do this out of the box. Add a line to your xmodmap file to put the dead Greek key where you want, and you’re done. I type ω by tapping a modifier key and then w. It’s even easier than typing W, because I don’t even have to hold down the modifier.


That... sounds like a special keyboard setup to me.


OK, I guess that’s a reasonable use of the word “special”. Here is what I was trying to get at:

I can not effectively use my computer for anything without customizing the keyboard a little. I need to make the capslock key into an extra control key. I need to set up a compose key so I can type accents when writing in Spanish (even if I didn’t do that, what about writing people's names?). Even sticking with English, I need to type curly quotes and apostrophes, and em- and en- dashes, plus, now and then, symbols like ™ or ©. And I need to be able to type Greek letters when talking about physics or math. So, since my keyboard is already set up to handle all this, and it is easy to do so, for me it is delightful that I can use ω in my Julia code. The math looks more like math, equations take more familiar forms (which makes it easier to spot errors), and the code can be made more expressive. So, for me (but clearly not for everyone) this is not a “special” setup.


Right, but the whole point of language design is to be used by many people. I’ve programmed for 30 years and have never customized my keyboard setup. If you want to create a language for both of us, sticking with standard keyboard characters seems like the best choice.


What do you mean by standard keyboard characters? Pretty sure characters on keyboards are often country and language dependent.


And again, this helps other people who have to edit your code how?


Well, if my Julia code ever gets to the point where other people want to edit it, the community has embraced Unicode, so there won’t be a problem there. More generally, I’ll turn my comment into a question: how are people using computers without already having them set up to easily type these characters?


> how are people using computers without already having them set up to easily type these characters?

Current solution for most people is google + clipboard

For example, to type the word "cliché" here, I googled it, then copied it, then pasted it.


That’s not what I would describe as “easily type”.


This thread is making the point that the issue isn’t the character set, but rather the keyboards. We’re probably locked in; there is so much inertia around the standard QWERTY layout with a few arrows and Esc. Maybe an integrated system builder with large scale like Apple could shift things slightly.


Indeed; Mac OS has supported easy entry of alternate (non-ASCII) characters on "self-inserting" keys with Option (which can be further modified with Shift) since the 80s. Granted it's a limited set, but they're useful. Here's a sampling “” … ™ Ω ç ß ∂ ∑ † π «» ¬ ˚ ∆ ƒ ∂


I normally use the Unicode keyboard so I can write out propositions, predicates, and set algebra. How many of us would readily adjust to the keyboarding required to break away from ASCII?


CJK users are pretty successful at entering large numbers of Unicode characters with a QWERTY keyboard, and there's more of them than you.


Good point. Just lack of determination to make use of available techniques?


It was the default for me.


That it was your default doesn't mean it was everyone's. OP's argument is that programmers should make it easy for someone else to change their code. It doesn't matter what your defaults were, it matters what the defaults of the next maintainer are.


Glad you understand why there's no need to restrict everyone to US-ASCII.


It really doesn't have to be. For instance, in Julia IDEs \omega would complete to ω, which takes 2 extra characters to enter (\ and tab). It's not hard to imagine this concept, but applied at the OS level...


I respectfully disagree with this. I do a lot of maths-heavy work, but for me, in the languages I use (not Julia!), being able to quickly write the LaTeX names for symbols is more than enough. It's also much quicker, and easier to search for when reading someone else's code.

I personally also find it helpful to make the distinction between the maths and its implementation, though I accept that others would vehemently disagree with me.


It makes it easy to type, but does it make it easy to read/maintain?

As they say, it's already far easier to write code than to read it. A programmer should make every effort to make code more readable.

An editor should make it relatively easy to enter special symbols (especially if you can specify a limited set); it is a totally solvable problem.

An editor can only help you so far with reading code...


For me, the biggest problem with the whole maths ---> code mapping is that of nested brackets. For a random quick example, I mean things like

    (a*(1+exp(1i*theta[0,:]))+foo(x))/((b-cosh(bar(x))+(b+sinh(y[:,-1])). 
The sort of thing where it just _looks_ far nicer on a page with like _real fractions_ -- where missing a bracket or changing the order of two brackets can _totally_ bugger you. Yes, changing the form of the expression a bit can make it "far nicer" or simpler -- for example, by defining intermediate terms -- but sometimes there's something to be said for <expression x> matches <equation y> in the paper.

Another example where I think ASCII actually _is_ limiting is for entries of matrices directly. I mean, we try, but I'm not convinced we succeed. For an example (picked at random), hands up if you think this is a nice Euler angle transformation, where cphi/sphi etc are the cosine and sine of phi? No matter how you write it, it's going to be ugly.

   alpha = [cpsi*cphi-ctheta*sphi*spsi   cpsi*sphi+ctheta*cphi*spsi  spsi*stheta;
            -spsi*cphi-ctheta*sphi*cpsi  -spsi*sphi+ctheta*cphi*cpsi cpsi*stheta;
            stheta*sphi                  -stheta*cphi                ctheta];


> a nice Euler angle transformation

I'd probably go with something like:

  double cf = cos(phi),sf = sin(phi);
  double cp = cos(psi),sp = sin(psi);
  double ct = cos(theta),st = sin(theta);
  alpha = [ cp*cf-ct*sf*sp   cp*sf+ct*cf*sp   sp*st;
           -sp*cf-ct*sf*cp  -sp*sf+ct*cf*cp   cp*st;
            st*sf           -st*cf            ct];
but I agree it would look better with:

  alpha = [ cψ*cϕ-cθ*sϕ*sψ   cψ*sϕ+cθ*cϕ*sψ   sψ*sθ;
           -sψ*cϕ-cθ*sϕ*cψ  -sψ*sϕ+cθ*cϕ*cψ   cψ*sθ;
            sθ*sϕ           -sθ*cϕ            cθ];
Either way, I don't think Euler angles are ever going to be "nice".


In Raku you can also use × (U+D7) for multiplication

    # this assumes that ϕ, ψ, and θ have already been set

    my \cϕ = ϕ.cos; my \sϕ = ϕ.sin;
    my \cψ = ψ.cos; my \sψ = ψ.sin;
    my \cθ = θ.cos; my \sθ = θ.sin;

    my \alpha = [ cψ×cϕ−cθ×sϕ×sψ,   cψ×sϕ+cθ×cϕ×sψ,   sψ×sθ;
               −sψ×cϕ−cθ×sϕ×cψ,  −sψ×sϕ+cθ×cϕ×cψ,   cψ×sθ;
                sθ×sϕ,           −sθ×cϕ,            cθ];
I'm not sure if using × helps or hurts in this case since I'm not really experienced in this area.

These all work because Unicode defines ϕψθ as "Letter lowercase"

    say "ϕψθ".uniprops;
    # (Ll Ll Ll)
---

I would like to note that I used the Unicode "Minus Sign" "−" U+2212 so that it wouldn't complain about not being able to find a routine named "cϕ-cθ". (A space next to the "-" would have also sufficed.)


> × (U+D7) for multiplication

Blech; looks like a letter and normalizes cross products. Better to use "·" (U+B7)[0]:

  alpha = [ cψ·cϕ−cθ·sϕ·sψ,  cψ·sϕ+cθ·cϕ·sψ,  sψ·sθ;
           −sψ·cϕ−cθ·sϕ·cψ, −sψ·sϕ+cθ·cϕ·cψ,  cψ·sθ;
            sθ·sϕ,          −sθ·cϕ,           cθ];
Minus sign is a nice-to-have, though.

0: Also "∧" (U+2227, wedge), the real other vector product[1], but that doesn't matter for scalar multiplication.

1: http://en.wikipedia.org/wiki/Wedge_product


It would be easy to just make `·` an alias. Then that code would work.

  my &infix:< · > = &infix:< × >;
(After all `×` itself is just an alias of `*` in the source for Rakudo.)

If you need more control you can write it out

  sub infix:< · > (+@vals)
    is equiv(&[×])       # uses the same precedence level etc.
    is assoc<chaining>   # may not be necessary given previous line
  {
    [×] @vals            # reduction using infix operator
  }
I made it chaining for the same reason `+` and `×` are chaining.

---

I don't know enough about the topic to know how to properly write `∧`.

It looks like it may be useful to write it using multis.

  # I don't know what precedence level it is supposed to be
  proto infix:< ∧ > (|) is tighter(&[×]) {*}

  multi infix:< ∧ > (
    Numeric $l,
    Numeric $r,
  ) {
    $l × $r
  }

  multi infix:< ∧ > (
    Vector $l, # need to define this somewhere, or use List/Array
    Vector $r,
  ) {
    …
  }
If it was as simple as just a normal cross product, that would have been easy.

  [[1,2,3],[4,5,6]] »×« [[10,20,30],[40,50,60]]
  # [[10,40,90],[160,250,360]]

  # generate a synthetic `»×«` operator, and give it an alias
  my &infix:< ∧ > = &infix:< »×« >;

  [[1,2,3],[4,5,6]] ∧ [[10,20,30],[40,50,60]]
  # [[10,40,90],[160,250,360]]
Of course, I'm fairly confident that is wrong.


I believe it is most definitely easier to read/maintain -- restricted to that small domain-specific function. Some Greek letters have very specific meanings; replacing them with long-ass names straight up worsens readability. Writing out density, velocityX, etc. will quickly obscure the core logic of that function.


I don't think that there is a single example of a greek letter which has been universally adopted by all of the different engineering and mathematics communities to have a single non-overloaded meaning.

The use of greek letters in academic writing has a single purpose: Compact notation. It has nothing to do with abstract readability.


> The use of greek letters in academic writing has a single purpose: Compact notation. It has nothing to do with abstract readability.

The compactness of the notation is a major factor in overall readability. Long variable names make it harder to see the overall structure of an expression. When operators, numerical constants and parentheses are mostly represented with one or two characters each but variable names are all much longer, then the only thing you can tell about a long line of code at first glance is what variables it uses as inputs. Mathematical pretty printing can help with some operators (e.g. fractions) and make it more visually apparent what's being grouped together by parentheses, but even then long variable names will still detract from the ability to recognize the structure of an expression.


Greek letter abuse rarely happens in CS but there are a few cases. Try to guess what "lambda calculus" and "pi calculus" are without looking them up.


Maybe there could be a middle ground. For math-heavy applications, the editor could implement ligatures, the same way Fira Code does, and the LaTeX name would render as the symbol until you put a cursor on it.


Why not go all the way, and write ω₀ :-)

I don't often write complicated formulae, but I do find it easier to use the original symbols if possible. It's one less thing to think about when comparing the source code with a mathematical description of the algorithm.


Believe it or not, ₀ (U+2080) is not allowed in Python, and it's not for lack of trying :)

For some reason the entire Unicode "No" (Number, other) category is not included in the admissible character list for identifiers.
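A minimal check in CPython 3 shows the asymmetry (Greek letters fall into the allowed categories, subscript digits do not):

    ω_0 = 1.0     # fine: ω is category Ll, which identifiers may contain
    # ω₀ = 1.0    # SyntaxError: ₀ (U+2080) is category No and is rejected

    import unicodedata
    print(unicodedata.category("ω"))   # Ll
    print(unicodedata.category("₀"))   # No
    print("ω₀".isidentifier())         # False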


I disagree. Instead of writing one letter variable names, one could do the hard work of naming things properly, which software developers often do. Instead of going with one letter or the name of that letter as variable name, which tells your fellow non-mathematicians exactly nothing about what it contains, one could put in the effort and make the code readable.

Using these one-letter variables forces the reader of the code to either be already familiar with the concept that letter refers to, or keep the whole calculation in their head until the final result and hope that they can then make sense of it. It is implicitly requiring outside knowledge. It is shutting out many potential readers of the code.

It might still be saved/helped, though, if good prose comments are given that help the reader understand the meaning of the one-letter symbols. Whenever I have seen this one-letter code salad, I have seen few if any comments, as if the author of the code expects me to magically know what each symbol stands for.

Let's say omega is a vector of weights. Why not name it "weights"? That is much better naming than "omega" or that character, which most cannot easily input and have to copy-paste from somewhere.


> Using these one-letter variables forces the reader of the code to either be already familiar with the concept that letter refers to, or keep the whole calculation in their head until the final result and hope that they can then make sense of it. It is implicitly requiring outside knowledge. It is shutting out many potential readers of the code.

When you're writing code for physics, then yes - absolutely: It is intended for other physicists, and the prerequisite is precisely that you have outside knowledge. When they use a symbol like ℏ in a journal article, it is expected that the reader knows it is Planck's constant[1]. Why should they have a different expectation when writing code? To a physicist, -ℏ^2/(2 * m) is a lot more recognizable than -planck^2/(2 * mass).

To be clear: The chances that a person who doesn't recognize these symbols will need to read and understand your code are virtually nil.

Within a context, single character symbols are very useful. To take a different example, what if I insisted that people should write:

2 add 3 add 7

instead of:

2 + 3 + 7

Would anyone reasonably argue that the former is more readable? We all accept that it's OK that a reader is familiar with the '+' sign. While ℏ is not readable to most, it is likely readable to anyone who is expected to read the code.

The thing that really is annoying when writing physics code is the need to explicitly write * for multiplication, and not being able to write fractions the way you would on paper.

[1] divided by 2π
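As a small illustration, here is that prefactor in Python 3 (just a sketch with illustrative names and CODATA values; Julia and several other languages accept the same spelling):

    from math import pi

    ℏ = 6.62607015e-34 / (2 * pi)   # reduced Planck constant, J*s
    m = 9.1093837015e-31            # electron mass, kg

    # the kinetic-energy prefactor, spelled the way a physicist reads it
    prefactor = -ℏ**2 / (2 * m)

    # versus the spelled-out alternative
    planck_reduced = ℏ
    mass = m
    prefactor_verbose = -planck_reduced**2 / (2 * mass)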


I like the rule of thumb that an identifier in code should usually be longer and more specific as scope increases. The main exception is that if your problem domain has well-established terminology or notation for some concept, reflecting that as closely as possible in the code is often best. So, I personally have no problem with someone using short identifiers and concise notation if it’s either used very locally or idiomatic.

For example, suppose we have a function whose job is to calculate two vectors of weights and do something unless they are equal but opposite. Personally, I would rather read something like

    v = someCalculation()
    w = someOtherCalculation()
    if (v ≠ -w) {
        doSomething()
    }
where we use short names for the local variables and vector-aware inequality and negation operators, than read something like

    weights1 = someCalculation()
    weights2 = someOtherCalculation()
    if (not(vector.equals(weights1, vector.negate(weights2)))) {
        doSomething()
    }
The longer but still arbitrary names for the vectors add no value here and the longhand function names for manipulating them are horrible compared to the short and immediately recognisable vector notation.

In general, I think there could be advantages to using a broader but still selective symbol set for coding, as long as we also had good tool and font support to type and display all of the required symbols. I would be hesitant about including Greek letters in that symbol set, but that’s because several letters in the Greek alphabet look similar to other common letters or symbols, not because I think calling a value ω_0 is a bad idea if that’s how practitioners in the field would all write it mathematically.


Mathematicians have been thinking about nomenclature for a long time - certain fields have very specific usage of symbols. I see no evil in using these well-known one-letter variables in these rare/small parts of functions.


Sorta. Examples abound of notation that has changed or that is adapted for some fields. Complex numbers using j instead of i, for example.


Formulae simply aren't understood like prose, or more precisely, formulae aren't understood as isolated bits of logic. For plumbing code where the logic is important, descriptive variable names are useful in aiding comprehension of the code. For formulaic code, where what matters is verifying the equality of two written expressions (one in code and one in the derivation, possibly on a piece of paper), compactness--so-called simplification of equations--is far more important. A formula cannot be understood or verified in isolation by itself: as you say, it can only be truly understood by someone familiar with the content. It is, by definition, arcane. But the use of arcane variables aids the practitioner since it leads to more compact expressions which makes it easier to check against another source. You cannot eliminate this arcane aspect from formulae simply by spelling out the names of things.

  G_{ab} = 8 \pi T_{ab}
does not benefit from being converted in code to,

  einstein_tensor[index1][index2] = 8 * PI * stress_energy_tensor[index1][index2]
For a practitioner of physics, this is just unhelpful and takes more effort to comprehend. In the spirit of The Humble Programmer, we have unwisely spent the mental energy of the reader. If you actually expand out the Einstein tensor in terms of the Levi-Civita connection and spell it out instead of using the canonical symbol \Gamma and respectively g for the metric tensor then you will just make an incomprehensible wall of text--especially if you inline the summation. What is added by expanding the variable names? A practitioner has gained nothing and lost familiarity and compactness while a layperson has learned nothing about general relativity except the names of the objects. Don't apply best practices uncritically: they are always premised on context.

That being said, formulae would benefit greatly from editors that allow the formula to be visualized next to the code. Sort of like compiling latex; if IntelliJ or some other IDE would actually render the code as a formula in some pane next to the code, that would be the greatest benefit to comprehension of a formula.


In the example, you are not really naming your variable usefully. You are merely transcribing from a one letter variable to its abstract name.

You can do that in a general function, where the general concept is expressed, where it would still help me to understand your code and search for concepts and terms online, in case I do not understand what is going on.

However, in many contexts it won't be merely an "einstein_tensor", but something that has a meaning in the specific context. Ask yourself what you are doing with that einstein_tensor. What is it used for? Is there a real-world equivalent to the thing you are looking at in the code? Those are the names you should choose in a non-general context, and that is why naming things is hard.


Descriptive names will never help you understand the implementation of the business domain of math code. From the Domain-Driven design perspective, the symbols are the ubiquitous language which you should stick to.

Here is a more appropriate example, spot the bug in this Tolman-Oppenheimer-Volkoff equation:

    let radial_derivative_of_pressure = - (pressure + energy_density) * (4. * PI * pressure * radius_squared + mass_potential) / (radius_squared - 2. * mass_potential * radius)
vs (with the same bug)

    let dp_dr = - (p + rho) * (4. * PI * p * r2 + m) / (r2 - 2. * m * r);
Of course you have to look up the TOV equation to do so; even most domain experts would need to compare against the formula. One of these is much easier to compare, and it isn't the spelled-out version.

Your questions are unhelpful. It is merely the radial derivative of pressure. It is going to be passed to a general-purpose ODE integrator which just needs the radial derivative of pressure to integrate pressure over some range of radii. There is a real-world equivalent: dp/dr; it is a known entity with exactly that symbol. Naming is not at all hard in this case: dp_dr or dpdr, or even dp_by_dr. Any other choice and you are just creating problems through uncritical application of the belief that variable names should be descriptive, while ignoring the fact that there is a well-known ubiquitous language to describe these entities.

The closest thing in other business domains is common acronyms. Nobody is going to spell out GDPR in the implementation of their cookie banner. Nobody spells out HTTP, or JSON, or XML. Spelling them out won't help anyone trying to read the code, it just creates line noise.


Well yes, clarity and good practices can definitely help but to be fair, software developers might often do this:

(_ < 5)

Instead of this:

(e: Int) => (e < 5)

It’s just a trivial example, but I understand if someone wants to have the same comfort. I am not a mathematician, so I am not sure what is best here; I’d cut them some slack.


A variable named ω is no better than a list named l. Both are bad. Be explicit. Terse, but explicit.


Part of the problem here is that a variable in mathematics is a different concept from a variable in programming.

Mathematicians are used to using one letter variables. Using a set of unwritten naming rules, mathematicians almost always know what concepts the variables are referring to (in a mathematical context).

So, I'd say that if the code is only to be maintained by mathematicians, one-letter variables, used for mathematical concepts only, would be an advantage.

The danger comes when you start using them for everything.


<disclaimer>Not a mathematician, but took enough Ph.D. level classes to be dangerous with bad proofs.</disclaimer>

> Using a set of unwritten naming rules

Simplest being things like using upper case Greek for certain sets and lower case Greek for its elements. Or, in Statistics, using Greek letters for parameters and English equivalents for their estimates.

So, if I were programming something to do something in that realm, I would write:

    for my $ω ($Ω) {

      # ...
    
    }
where what Ω represents would be clear from the subfield of Math that is relevant to the program.

Then, of course, you get people like one of my former professors who liked to invent notation on the fly and would quickly end up cycling through Greek (π, Π), Hebrew (פ), Blackboard (ℙ), and reach for Sanskrit (?). (Symbols used here for illustration; this was a long time ago and we didn't even know if he was correct, as we had no way of checking with no intarwebs back then.)


I'm also under the impression that mathematicians generally work much more deeply with a smaller set of variables, so you actually don't need to name that many individual concepts.

A typical program contains thousands and thousands of "named things", so you're naturally going to see a proliferation of names. That just doesn't seem to be that necessary in math once you're working in a particular context (e.g. statistics).


Translation: I learned my domain’s variable names, so you should have to as well.

Currently learning some DSP. One of the biggest barriers to entry has been the inscrutable variable names inherited from the field’s math connections.


Feel free to reason about a complex statistics formula with full-blown variable names. There are reasons for short variable names. Pass into a specific function a descriptive name of what you do, but do use the mathematically "accepted" writing mode in the implementation of a well-known function.


> There are reasons for short variable names

Which are? I suspect the reasons are a combination of the price of paper and ink and the history of teaching on chalkboards. None of those mean we can’t make the canonical version of an equation the expanded representation.

Why do we all know e=mc^2 and not energy = mass * lightspeed^2?

The broader the base of people that have to interact with a formula, the less mathy the terminology tends to be. Think of trigonometry. Instead of Greek letters, the sides of a triangle and the functions get real-world names. Sine(angle) = opposite / hypotenuse. Sure, when you write it out you use sin ϴ = O/H, but that’s just a compressed representation of human-readable variable names.


How about the density function of the Gaussian distribution? Or a numerically stable version of a simpler formula? Also, you forgot the usual case of implementing a whitepaper with mathematical notation. It is much easier to proofread that when you keep largely the same notation in your program as well. And if you pass in the readable energy, mass, etc. variables, inside the function you can use the domain-specific mathematical notation. That way it will be readable both outside and inside.


I think we might be arguing different points. I'm not saying use different variable names from the canonical math version - I'm saying the canonical math version should use variable names instead of symbols.

So per your point - I'm not qualified to re-define mathematical notation so I won't go very far with it, but taking the formula you mentioned, Gaussian distribution density function (not focusing on the fact that I had to re-write it in pseudocode to represent it in a text format):

    f(x) = 1/(σ * sqrt(2*pi))*e^(-1/2*((x-μ)/σ)^2)
I would suggest a couple changes:

- change σ to `standard_deviation`, or if you don't like snake_case I could handle `stddev`

- change μ to `mean`.

- change x to `input` - there may be a better name than that, and x is pretty widely used in math so I'm not married to this one.

- e should probably stay the same - e means e no matter the mathematical context, while μ means different things in different fields of math.

    f(input) = 1/(standard_deviation * sqrt(2*pi))*e^(-1/2*((input-mean)/standard_deviation)^2)
This doesn't make it more or less readable, but it does mean I don't have to _just know_ or look up that μ is `mean` to parse it.


I accept your position, and perhaps my example of a difficult formula was not difficult enough — my point is more along the lines that the underlying structure of a formula becomes more easily visible with shorter symbols — and for syntactic manipulation one can more easily see patterns emerge.


Depending on the context, ω may be as explicit as it gets.


But "everyone" doesn't have the same keyboard nor does everyone speak the same language. ASCII is not a universal character set and treating it as such is nothing short of cultural imperialism: "If it's good enough for us, it's good enough for everyone".

Artificial limits on languages and character sets might sound simple to you, but they introduce a lot of complexity for others. Unicode solves that problem with not too much overhead.

Also, just to point out, this assumption is exactly why diversity in tech matters so much.


English is not my first language and my native language doesn't use Latin script; still, all the keyboards I have ever seen supported entering ASCII characters. As far as I know, all the keyboards in active use support an English layout.

We shouldn't dismiss this as "cultural imperialism". Instead we should use this to our advantage. Currently source code written in China or in Russia can be read by developers in India or in the US. This is amazing. Let's not forsake it!


Agreed. Growing up in a country and at a time with personally accessible computers and very little literature, I was able to learn stuff that withstood the test of time and did not subject me to switching costs.

Already, differences in menus and keyboard shortcuts make the skills of trained office workers less relevant when they immigrate.

We do not need more barriers to trade.

Some related thoughts:

https://www.nu42.com/2013/04/translation-of-programming-term...

and

https://www.nu42.com/2014/08/replacing-hash-keys-with-values...

The proponents of translation tend to believe that what they themselves or other translators produce will be clear to people just learning the concepts. That is most often not true.

> I am willing to bet the sentence “Çözümü SCALAR bağlamda veri döndüren scalar() fonksiyonunu kullanmaktır” does not make any more sense to a Turkish speaker who speaks no English than “The solution is to use the scalar() function that will create SCALAR context for its parameter.”

> Translation and hash-lookup are different things. If you want to convey meaning, you have to have a command of both languages, and the subject matter. Without that, you are only going to add to the word soup. Translating “big event” as “büyük okazyon” helps no one.


The availability of alternative commercial keyboards is possibly not the best metric to use to judge actual need/desire. There are a large number of non-ASCII layouts that can be mapped on top of a standard QWERTY[0] which show that ASCII is not a valid assumption for a modern general purpose programming language.

The idea that we should all be grateful for the unification that using one language brings is a sentiment that reinforces the idea that this is cultural imperialism. There are definitely benefits to conformity, no doubt, but imposing it unnecessarily is a design decision that should be questioned. Why take away options from people who might value that freedom?

[0] - https://en.wikipedia.org/wiki/Keyboard_layout#Keyboard_layou...

[1] - https://en.wikipedia.org/wiki/Cultural_imperialism


> There are a large number of non-ASCII layouts that can be mapped on top of a standard QWERTY[0] which show that ASCII is not a valid assumption for a modern general purpose programming language.

I have seen multiple "foreign" keyboards. They all have ASCII labeled on each key. Using the QWERTY layout demonstrates the dominance of ASCII, which is a valid assumption that a modern programming language can rely on.


Please look up a Russian keyboard and a Japanese keyboard in Image Search. What do they have in common? Right: besides their respective alphabets, they have a QWERTY Latin layout on them. Basically all keyboards can be used to input ASCII symbols out of the box.

I don't really care whether you call this cultural imperialism or not. What matters is that a whole industry can use the same standard worldwide. It's the same as the metric system. We already have to deal with the imperial system; imagine how much more painful it would be if every country used its own measurement system or its own calendar.


Agreed. My first language didn't use latin script--it didn't even read left-to-right. But I'm not offended that programming languages all do use latin script and a constrained character set.


>> The best thing about using only ASCII for the main syntax is that everyone can type it with their keyboard.

Well, some keyboard layouts make it harder though. I have spent a considerable amount of time trying to teach programming constructs over Zoom to budding developers in Japan over the past two years. The placement, access mode, and actual typing of characters such as `{`, `$`, `~`, `|`, `:`, etc. have been the biggest stumbling block during these sessions.

So even within ASCII, the subset that is equally easy to type on all keyboards is smaller than the full range of non-control characters.

> But "everyone" doesn't have the same keyboard nor does everyone speak the same language.

I like that Vim's digraph feature lets me solve the problem in my editor without having to rely on the keyboard layout or OS level preferences. So, typing these lines:

    my $μ = "İstanbul'da hava çok güzelmiş";
    say uc($μ);
takes the exact same keystrokes regardless of the OS/environment I am in:

    m y CTRL-K m * = "CTRL-K \ I s t a
On my own machines, this has the advantage of not having to switch languages in the act of typing (although Win+SPACE is pretty easy on Windows, cycling through the five I have installed is not trivial). And, do I really remember where ø is on the Danish keyboard as opposed to where ö is on the Turkish keyboard?


> ASCII is not a universal character set and treating it as such is nothing short of cultural imperialism: "If it's good enough for us, it's good enough for everyone".

SI units like meters and kilograms are also cultural imperialism and we should return to diversity of units, ideally different one for each town. /s


Yeah, go start using metric in the US.


That's the inverse. What we're talking about with keyboards is the opposite: some are advocating for more division. That's like starting a movement for scientists worldwide (who unflinchingly use metric even in the US, as far as I'm aware) to ditch metric and start using the units that were historically common in their culture.

And for what it's worth, I've never talked to a person who doesn't agree that the US using metric would be great; the problem is that getting over the inertia of the existing casual measurements is extremely difficult, because even when metric is on the label, imperial is the emphasized one (the entire traffic sign infrastructure, the vast majority of cooking implements, decades of cookbooks, most food packaging, medical records systems, drivers licenses, advertising campaigns, the personal fitness ecosystem, entire product names and trademarks). 60 years ago maybe it could have been doable, but at this point it's a bit of a lost cause without very much gain.


And you don't see how "inertia" actually applies to both of those directions?

The nuances of whether the States could or should switch are irrelevant, but the response demonstrated the point perfectly.


Railway track gauges came to mind.


I remember reading an article that the width of railway track was set by the width of chariots.

I don't remember much about the article except that it ended with some quip about how the width of a car was determined by a couple of horses' asses.


Not sure why you're downvoted; it's simply true that ASCII is not universal. The "A" in ASCII stands for American.

I am a monolingual English speaker but I think I should be able to write ÷, ≤ and ≥ in my Go code, and I think I should be able to do maths using π.


Given that the Go community does not seem to hold this as a majority opinion, you might consider whether an editor translation layer could allow you to tailor your environment to your preferences while maintaining source-level compatibility with the broader Go community.

If compatibility is not as important to you, privately forking the language is also an option, but that seems fraught with peril of your code dying with you.


GoLand actually does this.

But the point is not my preference to use the actual, proper symbols, but rather the technically unnecessary decision forced upon me that says I cannot.


> The "A" in ASCII stands for American.

Just call it ISO 646-IRV instead. ;)


Raku

    say π ≤ 3+⅘ ≤ τ
    # True

    say π² ÷ 4
    # 2.4674011002723395


Fine, do maths using π; this is about programming


This is just the English/Esperanto problem again. Do you do the thing that's the most fair or do you do the thing that's unfair but has the most benefits for everyone involved? Yes, it's unfair to limit people to ASCII but it's also the only way their code is going to see widespread engagement.


> it's also the only way their code is going to see widespread engagement

Wow, what a claim! I can see how this may have some anecdotal evidence behind it since programming has had such a strong English bias but it's conjecture to say that this is the only path. Not only does that claim lack evidence, it also lacks imagination. The future doesn't have to follow the same patterns of the past.

The technical constraints of ASCII have long been irrelevant, and it's only the cultural imposition that remains. While this has been the case, the size and importance of computing have exploded everywhere (e.g. 4 billion people worldwide use the internet).

Can you really say with confidence that global "widespread engagement" is (and will always be) a desirable property for code projects? Is software produced in this way and at this global scale really likely to be better quality for everyone?


That's the brilliant thing about programming: you can do it however you want. If you want to write your code in such a way that an Anglophone spends half their time looking up ALT key combinations to type the characters, go ahead. I'm just saying that I'm not touching that code with a ten foot pole, nor is anyone I know, nor would any of my customers. There's too much talent in the world for me to put up with one iota more hassle than necessary.


What's the purpose of supporting other character sets in a language's syntax? Good luck finding contributors from all over the world for software written in Turkish characters. I mean, for hobbyists or very special-purpose cases where such a thing is really needed (writing code in Turkish or Chinese or Arabic or whatever), people can add language support for that (after all, it's just a bunch of keywords). It takes some effort but is not impossible. People can also design programming languages with their own special character sets. It's wild to see someone objecting to writing software in ASCII because of diversity/cultural issues. Really wild!


Not sure about Turkish, but I know for a fact that when you have a team of Chinese devs working on a Chinese software project, Mandarin starts to show up in the codebase. And while it's usually written in pinyin (the standard Chinese romanization system), since most programming languages assume ASCII, it's obvious that those people could benefit from being able to use Chinese characters in place of ASCII and English.

If your argument is "sure, then let them invent/fork $LANGUAGE to suit their needs", just take note that if you're a language designer/developer that wants to see the language widely adopted, it's detrimental to have the attitude that ASCII is enough for everybody. There's no reason to have (for example) "Arabic-python" just because somebody insists on ASCII instead of UTF-8 even though there's little technical reason not to support it.


Non-ASCII characters make sense in national application domains, e.g. tax software, where you don’t want be forced to transliterate (or worse: try to translate) the non-English terminology.


Remind me how well typewriters worked in Japan and China


Let’s be realistic: 99% of software, weighted by value, is in English. This is a good thing. There are strong network effects, and fracturing the software world into multiple competing linguo-spheres would destroy economies of scale.

Would you similarly oppose the requirement that all civil aviation globally be done in English on the grounds of “cultural imperialism”?


> fracturing the software world into multiple competing linguo-spheres would destroy economies of scale

There is a very visible pattern in most Open Source projects where a small group of core maintainers will do most of the work even if there is a much wider body of casual contributors (see Nadia Eghbal's _Working In Public_). So, what are the economies of scale? Are you perhaps referring to problems experienced in the bygone age of the early internet?

Economists largely agree that healthy competition is beneficial for improving quality, spurring innovation and reducing costs. I don't see the problem with multiple language/culture specific software projects that all solve the same problem. This could even look like different programmers working on different projects for different audiences.

The only downside I can see for the everyday programmers is the FOMO generated from inaccessible software in a foreign language. Isn't that ironic?


> Would you similarly oppose the requirement that all civil aviation globally be done in English on the grounds of “cultural imperialism”?

Do you think that every pilot on earth always speaks only English? No.

People should have the choice not to speak English and to write in their own alphabets if they wish.

Enforcing ASCII on code is like enforcing ASCII on URIs, e-mail addresses. Unnecessarily restrictive to the rest of the world outside of the anglosphere. Of which there's a lot.


Yes. All civil aviation globally is conducted in English. Becoming a pilot or air traffic controller anywhere in the world requires proficiency.


It requires proficiency when you want to fly internationally or handle international traffic, but it isn't enforced as the single language that could ever be used, e.g. domestically. There's no point in pretending there's some magic filter that would enforce it.


Diversity? Like what, you want source code to be written in Cyrillic or Chinese? Please elaborate.

ASCII is the standard, and programmers already learned to deal with it. What's the problem?


Why not? English is a convention only, not a law of computing.


The Tower of Babel. If the source code is proprietary or for education then it doesn't matter, but for the open source world it'd mean division into multiple different languages where no one can understand or learn from each other. No one is going to learn 5 different alphabets and 12 different languages just so they can understand the source code.


There is already open source software written in other languages.

Sometimes identifiers and comments are transliterated into ASCII, sometimes they remain in the original script.

Of course, they won't get many contributions from English speakers. They will get more contributions from speakers of that language. That decision is for the project to make, not others to impose.


> They will get more contributions from speakers of that language.

Will they?


Realistically, English will still be the lingua franca for software in the short-term future, and I don't think anyone is saying we should change this. The thing is, for many people "interacting with the worldwide programming community" is not a thing they need.

Imagine a computer class teacher in (for example) China teaching primary school children. Why should they need to learn English before starting to write "Hello world!" (or rather, 「你好世界」)

In an alternate universe the lingua franca of programming languages is Chinese, would you still make the claim that it's better to take away choices and force ALL programmers into using the existing lingua franca regardless of their background?

I'm guessing you'd be crying cultural imperialism because you can't teach your 6 year old kid programming in your native language.


I already addressed education.

> If the source code is proprietary or for education then it doesn't matter

I'm not forcing you to speak and teach children in English. My country actually was under cultural imperialism and people back then weren't allowed to speak or learn in our native language. And you make it sound like a joke, but whatever.


That's also a good reason to standardize all open source code in C.


Overall, Unicode is a good idea. That said, languages should limit themselves to a subset of Unicode to avoid the use of glyphs that look too similar or that have too much variation (e.g. emoji).


Slight typo: It's C++98.


Thanks! I confused it with C99.


> The best thing about using only ASCII for the main syntax is that everyone can type it with their keyboard.

My keyboard doesn't have a key for NUL, BEL, VT, EOT. What kind of keyboard do you have which has buttons for these?


Let's not be so pedantic. I meant visible ASCII symbols, character codes 32-126 plus newline.


With vim at least, you can type them in as control-key sequences (preceded by Ctrl-V).


But then the statement is incorrect. I can write all of Unicode using keyboard sequences as well.

People from the US seem to have a strong fetish for ASCII.


ASCII is an abbreviation for American Standard Code for Information Interchange after all.



