Why not just write out ProgramCounter, StackPointer, Accumulator, InputFileDescr...

userbinator · on Feb 18, 2020

Is it faster to read? Maybe in terms of bytes/s, but not in lexemes/s nor in terms of getting an overall idea of how everything works.

Compare with this, which is probably more in the style you're thinking of:

https://github.com/dotnet/roslyn/blob/master/src/Compilers/C...

There's so much "noise" that it's hard to see the "big picture", and the repetition of VeryLongIdentifiers causes https://en.wikipedia.org/wiki/Semantic_satiation to occur quickly.

pavas · on Feb 18, 2020

If you're talking about a small, quickly-written, one-off piece of code, then I think truncated variable names are OK.

If the code is anything that anyone else (including future you) will have to read, or a part of a larger system, then descriptive variable names are best.

I can't count the number of times I've dropped into some source code with variable names that didn't mean anything and with no comments describing what they mean.

JustSomeNobody · on Feb 18, 2020

If one understands anything at all about a cpu, then pc, sp, a mean something instantly.

Aachen · on Feb 18, 2020

Agree on those, but t and tk can mean anything. We can symbolize everything and, as someone else argued, once you know tk is token you can just read it, but replacing variable names with the shortest possible names is still called obfuscation for a reason.

I personally also hate the Java (and to a lesser extent, C#) custom of writing MemoryLocationRepresentation when you can say pointer, but there is certainly a middle ground. Token is 5 characters, not 30.

Hnrobert42 · on Feb 18, 2020

Consider non-native English speakers. For them, the abbreviation makes it much harder to read.

GuB-42 · on Feb 18, 2020

I am not a native English speaker. Reasonably skilled, but nowhere near native.

Abbreviations don't make it harder because of that. If anything, it is less of a problem. Because using the proper English word doesn't help more than using an abbreviation if you don't know the meaning of the English word in the first place.

On a side note, I have more trouble understanding code written in French (my native language) than in English. Simply because when we learn programming, we learn it with the English terms. For example, we know what a "token" is in the context of a "parser"; that's what we call them. The French translations would be "symbole" and "analyseur syntaxique" respectively, but you will be better understood if you use the English words.

psychoslave · on Feb 19, 2020

>Because using the proper English word doesn't help more than using an abbreviation if you don't know the meaning of the English word in the first place.

If you don't know the meaning of an English word, you can use a dictionary. If you don't know the meaning of some ad hoc abbreviation, then unless you can waste even more human time by asking people who are already in on the secret, you are on your own.

> On a side note, I have more trouble understanding code written in French (my native language) than in English.

US soft power is strong, that's it. It's people's duty to take better care of mastering their own language if they don't want to see it become ineffective for their daily linguistic needs.

People know what a token is in the context of a parser only after they have learned it. When English is not the learner's native language, they will most likely learn it without having a clue how it fits into the semantic network of English. If a French speaker is first introduced to this notion using the term "lexie" (which also exists in English by the way, this time as a borrowing from French in linguistics), chances are far greater that it will evoke something meaningful to this person, as it's lexically close to the term lexic. Using French morphemes, one could also easily produce terms like métataxeur[1], or even distaxeur and transtaxeur.

>but you will be better understood if you use the English words.

Chances are greater that they will see what you are referring to, as they have come across the term more often before. It doesn't necessarily imply that they will better understand what it means. When a notion is well assimilated, it's recognized in any language one has mastered, even when it's expressed through a brand new metaphor.

[1] see https://fr.wiktionary.org/wiki/m%C3%A9tataxe and https://fr.wiktionary.org/wiki/-eur

schoen · on Feb 18, 2020

Was there a period in the 1960s or 1970s where French speakers used native terms instead of English for computing terminology?

I'm wondering about this because a Brazilian friend is doing a computer history project and he noticed that 1970s documentation used literal Portuguese translations of English technical terms, and the translations are no longer transparently comprehensible to present-day Brazilians because of the subsequent switch to using the English terminology. For example, the documentation refers to a "montador", and he had to translate that into English for his Brazilian audience ("assembler").

userbinator · on Feb 18, 2020

If they don't speak English, it matters even less...

(I've read code written by Chinese --- variables named dzhq, xljn, etc. are not uncommon. If anything, they like to abbreviate even more.)

LeifCarrotson · on Feb 18, 2020

If they're not fluent in the same abbreviations but have decent English-as-a-second-language skills, they can read Roslyn-style code but not 2-letter abbreviations.

Heck, I can't even read my own 2-letter abbreviations a year later sometimes.

When I write the code, I'm likely coming off reading a paper or datasheet that used certain abbreviations. I might have seen the word "token" so many times that week that, in the moment, I can't imagine what else 'tk' might mean. But when I come back a year later, fresh off a heat stake project that used K-type thermocouples, seeing 'token' is much clearer.

If those Chinese variables were named DaanZenghQian (sorry, I know my Mandarin sucks) instead of dzhq you might have a chance to translate that into "result of the upper thousands" for whatever that means in your context.

Pretend you're someone who doesn't have exactly the state of mind and background knowledge you have right now. That might be a Chinese person with limited English, it might be your coworker who was working in Delphi instead of assembler in the 90s, it might be yourself with a bit of time elapsed. That's the person who you need to be writing for, not for you in the moment of writing it.

lifthrasiir · on Feb 18, 2020

I have read a lot of Korean code, and while Latin transliterations are common, abbreviations were rare.

tkln · on Feb 18, 2020

I, as a non-native English speaker, disagree.

pvitz · on Feb 18, 2020

Because e.g. pc and sp are exactly the abbreviations used in assembler for some decades?

gmfawcett · on Feb 18, 2020

We should probably mention that "e.g." is an abbreviation for the Latin "exempli gratia", and means "for example." ;-)

earenndil · on Feb 18, 2020

You don't have to 'mentally substitute' the actual words. PC, SP, A, etc. are the words themselves. StackPointer is a pointless formalism.

psychoslave · on Feb 19, 2020

Could you also provide the meaning for the other (pointless) abbreviations? :)

GuB-42 · on Feb 18, 2020

Because it makes the lines longer and long lines are bad. If it results in a horizontal scrollbar, it is terrible, but even without it, there is a reason papers are often printed in column format and most coding rules specify a maximum line length (often 80, though 120 is becoming popular these days, with big wide screens and all that).

So long lines need to be split. Which is difficult to do properly and results in more lines, and more lines mean less of the code is visible at once and that makes it harder to see the big picture.

But to each his own I guess. Anyway, you can try it out yourself. Just take the code, do the replacements and see for yourself.

psychoslave · on Feb 19, 2020

Started here: https://github.com/psychoslave/c4

But help would be welcome in recovering the intended meaning of the many variable names that were turned into nonsense, be it a comment here, an issue on the repository, a pull request or anything else.

Minor49er · on Feb 18, 2020

There are often ways of reformatting a line that is too long that do not require renaming things. For example, a long list of conditions in an if statement can be broken into one condition per line. Results of comparisons can be put into their own variables. Logic flow can be adjusted to produce the same result. And so on.
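A minimal sketch of the first two techniques in C. The `struct request` and all of its field names are made up for this illustration; nothing here comes from c4 or Roslyn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record, invented for this illustration. */
struct request {
    bool   is_authenticated;
    size_t payload_length;
    size_t maximum_payload_length;
    int    retry_count;
    int    maximum_retry_count;
};

/* One long line: correct, but well past 80 columns. */
bool should_process_long(const struct request *request)
{
    assert(request != NULL);
    return request->is_authenticated && request->payload_length <= request->maximum_payload_length && request->retry_count < request->maximum_retry_count;
}

/* Same logic, one condition per line; no identifier was shortened. */
bool should_process_split(const struct request *request)
{
    assert(request != NULL);
    return request->is_authenticated
        && request->payload_length <= request->maximum_payload_length
        && request->retry_count < request->maximum_retry_count;
}

/* Same logic again, with each comparison stored in a named variable. */
bool should_process_named(const struct request *request)
{
    assert(request != NULL);
    bool payload_fits = request->payload_length <= request->maximum_payload_length;
    bool retries_left = request->retry_count < request->maximum_retry_count;
    return request->is_authenticated && payload_fits && retries_left;
}
```

One caveat with the named-variable version: every comparison is evaluated up front, so the short-circuiting of `&&` is lost. That is harmless here because the comparisons have no side effects, but it matters if a later condition is only safe to evaluate when an earlier one holds.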

Hnrobert42 · on Feb 18, 2020

Are you reading this on an Apple Watch? I still generally use 80 characters out of habit, but given how monitors have grown, 120 or even 140 should be the new norm.

clarry · on Feb 18, 2020

Adding an extra column for code|docs|other context is so much more useful than allowing longer lines for obese identifiers that rarely serve to make a point more clear.

I'll take my four or five columns of 80 chars over two columns of 120-140 chars any day.

32gbsd · on Feb 18, 2020

It takes longer to type and read. All the little seconds fiddling with the mouse, popup menus, and hand-eye coordination waste your time and prevent muscle memory. It's hard to reach max throughput with long variable names.

DagAgren · on Feb 18, 2020

Typing speed is absolutely not the limiting factor for programming productivity. If you are actually limited by typing speed, you are doing something very, very wrong.

pavas · on Feb 18, 2020

Besides, this issue has been solved for many years now. Auto-completion in modern IDEs has gotten really good.

psychoslave · on Feb 19, 2020

Humans don't read words letter by letter; you recognize the whole word pattern. Abbreviations actually slow you down here, at least the first few times you encounter each new one. A longer but more familiar term will take you less time to read and process.

Autocompletion will rarely ask you to type more than four keystrokes to select any arbitrarily long term.

Meaningful terms in context often happen to be far easier to grep.

Except for sounding far more impenetrable to the layman, there is not much left to these H4x0r turns. Of course the curse of jargon is not exclusive to CS; it is a common spontaneous social behaviour.

cestith · on Feb 18, 2020

Most of these abbreviations are well established. 'pc', 'sp', and 'a' are the names of those registers in many assembly languages.
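As a rough illustration of how those register names read in C, here is a toy stack machine. It is only loosely in the spirit of a small VM and is not c4's actual code; the opcode names, the fixed 16-slot stack and the `run` function are assumptions made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine. */
enum opcode { OP_IMM, OP_PSH, OP_ADD, OP_EXIT };

/* Run a tiny program and return the accumulator when OP_EXIT is reached. */
long run(const long *program)
{
    long stack[16];
    const long *pc;   /* program counter */
    long *sp;         /* stack pointer, grows downward */
    long a = 0;       /* accumulator */

    assert(program != NULL);
    pc = program;
    sp = stack + 16;

    for (;;) {
        switch (*pc++) {
        case OP_IMM:              /* load the next word into a */
            a = *pc++;
            break;
        case OP_PSH:              /* push a onto the stack */
            assert(sp > stack);
            *--sp = a;
            break;
        case OP_ADD:              /* pop the top of the stack and add it to a */
            assert(sp < stack + 16);
            a = *sp++ + a;
            break;
        case OP_EXIT:
            return a;
        default:
            assert(!"unknown opcode");
            return -1;
        }
    }
}
```

A program such as `{ OP_IMM, 2, OP_PSH, OP_IMM, 3, OP_ADD, OP_EXIT }` loads 2, pushes it, loads 3, and adds the two.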

cr0sh · on Feb 18, 2020

To clarify, you don't usually see just "a" for an accumulator, as there is usually more than one accumulator-style register in a CPU, and in many cases they are split along byte (possibly word) boundaries.

So you end up with accumulators called "A" and "B", which are composed of registers "AX" and "AY", and "BX" and "BY", with each being one byte (or word) wide; X and Y being high and low bytes/words of the register (and dependent on "endian-ness" too).

Sometimes multiple registers can even be referenced by a single name: "D" is a popular choice, and may be made up of "A" and "B" (being the two halves of the larger word). IIRC, the 6809 was like this (?): A and B were 8-bit registers, but could be referenced together as a 16-bit word "D" (or maybe I am thinking of the 68k or some other architecture; it's been a long while).
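For what it's worth, the pairing is easy to sketch in C. This assumes the 6809 layout, where A and B are 8-bit accumulators and D is the 16-bit pair with A as the high byte; the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file: two 8-bit accumulators, as on the 6809. */
struct registers {
    uint8_t a;  /* high byte of D */
    uint8_t b;  /* low byte of D */
};

/* Read A and B together as the 16-bit pair D. */
uint16_t read_d(const struct registers *regs)
{
    assert(regs != NULL);
    return (uint16_t)(((unsigned)regs->a << 8) | regs->b);
}

/* Write D by splitting the 16-bit value back into A and B. */
void write_d(struct registers *regs, uint16_t value)
{
    assert(regs != NULL);
    regs->a = (uint8_t)(value >> 8);
    regs->b = (uint8_t)(value & 0xFF);
}
```

With A = 0x12 and B = 0x34, `read_d` yields 0x1234. Shifts are used instead of a union so the result does not depend on the byte order of the machine compiling the code.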

The only other time I have ever seen single letters used for registers in assembly was for very old pre-microcomputer systems (beasts like the Univac and System/360, though I think the PDP-8 had a similar style). Some of the very early "microcontrollers" (which were more like glorified sequencers with some extra memory and rudimentary branching, if any) also had similar "registers". Radio Shack once sold, as a part of their "Science Fair" electronic kits, a "Microcomputer Trainer" that was something like a very small 4-bit microcontroller with 128 bytes of memory or something like that, meant to teach assembler and a bit of hardware interfacing; it had "small" registers like that, referred to by single letters.

gmfawcett · on Feb 18, 2020

The 6502 is still in production, and has single-character register names (A, X, Y, P, S).

cestith · on Feb 20, 2020

The 8080 had A, B, C, D, E, H and L. These mostly carried over to the 8085. Newer chips have ax/al/ah, eax, rax type names that grew out of the original names. The Zilog Z80 and Sharp LR35902 were mostly 8080 compatible.

The MOS 6502 has, as gmfawcett said, single-letter names. These in turn carried over to Western Design Center (WDC)'s 65C816. There are actually separate instructions for loading and storing in A, X, Y and Z at least on the '816. LDX, STX, and so on. This means the Ricoh 2A03, Ricoh 5A22, Hitachi 6309, MOS 8501, MOS 8502, and the later MOS 65xx series and the CSG chips. A fun fact is that the 6502 had especially fast access to its zero page memory and special instructions for some functions on that page, the first 256 bytes of RAM. Language implementers sometimes made up for the dearth of registers by treating certain addresses in the zero page as additional registers.
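The zero-page trick can be imitated in C. A rough sketch, assuming a flat 64 KiB memory array; the slot addresses and names (`ZP_R0`, `zp_write16`, `zp_read16`) are invented here and not taken from any particular language implementation.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_SIZE    0x10000u  /* 64 KiB address space */
#define ZERO_PAGE_SIZE 0x100u    /* the first 256 bytes */

/* Hypothetical pseudo-register slots in the zero page (2 bytes each). */
enum { ZP_R0 = 0x02, ZP_R1 = 0x04, ZP_R2 = 0x06 };

static uint8_t memory[MEMORY_SIZE];

/* Store a 16-bit value little-endian, the byte order the 6502 uses. */
void zp_write16(uint8_t slot, uint16_t value)
{
    assert(slot < ZERO_PAGE_SIZE - 1);  /* both bytes must fit in the zero page */
    memory[slot]     = (uint8_t)(value & 0xFF);
    memory[slot + 1] = (uint8_t)(value >> 8);
}

/* Load the 16-bit value held in a zero-page slot. */
uint16_t zp_read16(uint8_t slot)
{
    assert(slot < ZERO_PAGE_SIZE - 1);
    return (uint16_t)(memory[slot] | ((unsigned)memory[slot + 1] << 8));
}
```

On real hardware the payoff is that zero-page addresses fit in a single byte, so instructions that use them are shorter and faster than those using full 16-bit addresses. The C version only models the bookkeeping, not the speed.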

The Motorola 6800 had two accumulators, A and B. The stack pointer was merely S. X is the index register. It also treats the zero page specially. The 68000 series broke with this, having eight address registers a0-a7 and eight data registers d0-d7.

All of the above used A as an accumulator at least by convention in the materials.

SP is the literal name of the stack pointer on x86 in 16-bit mode. It's also used as an alias for R13 in at least some Arm (AArch32 on v7 and earlier for example). SP and PC are the stack pointer and program counter on the PDP-11. It's aliased to r1 on the Intel 80960 (i960) since that is the stack pointer on that platform.

The PDP-8 used similar zero-page tricks to the MOS 6502, only given that it had one (1 !!!) register, that was necessary.

All of these processors were CPUs for commercially successful systems. They might "only" be microcontrollers today.

The MOS 6502 / 6510 and its variant the WDC 65C816 were in the Commodore 64, Commodore PET, the Vic-20, the Apple II, the Atari 2600, the Atari 400/800/600XL/800XL/1200XL/800XE/65XE/130XE, Nintendo Famicom, SuperFamicom, the NES, the SuperNES, BBC Micro, Ohio Scientific Challenger 4, Atari Lynx, Apple III, Apple IIgs, Acorn Atom, Acorn Electron, Franklin Ace, and loads of clones.

The Z80 was in most Amstrad models, in the original TRS-80, the MSX standard, VTech Laser, Intercompex Hobbit, Mattel Aquarius, the Microbee, the NEC PC-6000 & PC-8800 series, Sinclair ZX line & Timex Sinclair, Coleco Adam, and again a bunch of clones.

The Motorola 6809 was in the Tandy Color Computer, while the smaller CoCo MC-10 used the 6803. A few other companies built around this chip family, too.

The Commodore 128 featured both a 6500 series processor and a Z80.

Several of these processors still have versions produced in 2020, although they're not for your main desktop or your phone. Several of them are targets for emulation or new hobbyist software due to the popularity of their platforms. And yes, some of them are used as microcontrollers. Microcontrollers need code written for them, too.