As someone who worked on a 35-million-LOC COBOL and JCL system for several years in the late '90s, I don't find this funny at all. COBOL was purported to be readable, with English-like syntax to make it more understandable, giving us gems such as:
MULTIPLY TAX-RATE OF STATE(43) BY BALANCE GIVING SALES-TAX-AMOUNT.
There are multiple responses to the parent noting how readable the above line of code is, and yet how easily it might go wrong.
The first pitfall of using English as a programming language that occurs to me (as a totally COBOL-ignorant person):
* Human languages tend to be nebulous around the edges and fluid, often with a single word taking on multiple meanings and the same function served by multiple words. There are multiple ways to express the same concept. This allows the language to change and evolve with the times.
* OTOH, programming languages need to be specific and exact so the computer can interpret them and they behave as expected across devices and over time.
This means that to make English function as a programming language, we will have to take the existing language, whittle away most of its senses and many of its words, assign one function to one word, and use a very trimmed-down, reduced version of English.
Now you will have to know two forms of English:
* The human one
* The computer-compatible one
Worse, our human version of the language can often trip up our computer-compatible one. I imagine even debugging would be harder, because when you look at the code, the brain sees perfectly good English and doesn't register any issue with the punctuation or tokens the interpreter expects.
Consider the below version.
MULTIPLY TAXRATE FROM STATE(43) WITH BALANCE GIVING SALES-TAX-AMOUNT.
I have made 3 changes here, which might or might not work with COBOL (I am COBOL-ignorant ;) ).
But if a person tries to find the changes or debug this, it would be difficult for the human brain to register what is wrong, as it is perfectly good English.
> I have made 3 changes here, which might or might not work with COBOL (I am COBOL-ignorant ;) ).
I've never used COBOL before, but this thread and your post made me curious and I installed GNU Cobol and found a sample program that demonstrates a few language features.
After playing around with it a little, I think I've understood that "OF" is like a struct element (or object property) accessor, the parens are an array index, and the hyphen is a syntactically significant part of the identifier. "BY" is apparently a mandatory part of the multiplication operator.
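To make that concrete, here is roughly the kind of toy program I ended up with (reconstructed from memory, so treat it as a sketch rather than verified code; the 50-entry table and every name other than those in the parent's line are just made up):
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TAX-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> STATE is a table of 50 entries; TAX-RATE is a field inside
      *> each entry, so "TAX-RATE OF STATE(43)" reaches into entry 43.
       01  STATE-TABLE.
           05  STATE OCCURS 50 TIMES.
               10  TAX-RATE        PIC 9V9999.
       01  BALANCE                 PIC 9(7)V99.
       01  SALES-TAX-AMOUNT        PIC 9(7)V99.
       PROCEDURE DIVISION.
           MOVE 0.0825 TO TAX-RATE OF STATE(43).
           MOVE 1000.00 TO BALANCE.
      *> "BY" is a fixed part of the MULTIPLY statement, and the
      *> hyphens belong to the identifiers, not to subtraction.
           MULTIPLY TAX-RATE OF STATE(43) BY BALANCE
               GIVING SALES-TAX-AMOUNT.
           DISPLAY "SALES-TAX-AMOUNT: " SALES-TAX-AMOUNT.
           STOP RUN.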
Conclusion: each of your three changes (while, as you said, perfectly reasonable from the English language point of view) would indeed break the COBOL code!
By the way, I remember this being a bit of an issue when I tried programming in HyperTalk: on occasion I would unintentionally come up with an English synonym for some natural-language HyperTalk code, and it wouldn't be valid HyperTalk.
It's actually a pretty interesting question why programming languages that work like that are a bad idea.
From decades of experience, I know intuitively that they are, but I find myself unable to formulate a concise description of what exactly is wrong with this approach.
Because they turn out to be just as precise and fiddly as any other programming language, but deceptively appear not to be.
If you treat them as English you'll get burned. You have to still treat them as very precise formal languages where apparently trivial/irrelevant details are significant, and minor hard-to-spot mistakes will break your program.
Having more explicitly formal/structured syntax makes it easier to distinguish the different parts of the language, make sense of the details, and figure out what is or isn't allowed.
The best description I have for languages like this, e.g. AppleScript, is that they are "read-only languages" (in the opposite direction, old-style Perl has been called a "write-only language"): that is, if you have an already written program in hand, it will be easier for a complete novice to read (at least, single isolated lines of code will be). But writing new programs is a huge pain in the ass.
My guess would be that it's because spoken languages and programming languages are fundamentally different, so in trying to make one fit the other you might end up with a programming language that looks and reads a lot like English but has almost become self-obfuscating: your brain automatically tries to parse it using the rules of English rather than the rules of the programming language, making things look like they should work even when they don't. The example in the comment you're replying to illustrates this pretty well. We largely ignore punctuation at the end of sentences, in the sense of consciously seeing it rather than just inserting a pause in our mental cadence, so if a period is suddenly an important piece of syntax in statements that look like English, you could easily start getting it wrong.
In my opinion it's because of English's evolution over time. Something that might have made perfect sense in the past becomes an antiquated way of saying the same thing in the future. It also deliberately encodes the author's beliefs about how the language should be spoken, ignoring any regional variance you see in the real world.
When you abstract away the English meaning of code into something new and unchanging, you provide stability not seen in natural language.
But why is that a problem? That's literally how we speak every single day.
In fact, given that this is how all the languages that humans are already familiar with work, it's hard to see why this wouldn't be the best approach for constructing a programming language.
- humans can ask questions back to clarify any ambiguity; computers don't have the capacity to understand ambiguity, let alone to ask for clarification.
- computers are trusted to work w/o failure almost 100% of the time; humans err (to err is human, after all), so we'd never put a human in charge of the critical things we use computers for.
I agree with your sarcastic reply that naturally grown languages are simply horrible for shared comprehension!
For instance, the poem (or should I say program, to use the example in the post) has several different interpretations, at least some of which teachers would say are flabbergastingly incorrect.
A number of libraries in the Southern states have biased maintainers who have merged in pull requests deleting a large number of classes in bulk, despite the fact that the tests have passed for years and those classes are in use by a large number of language users.
ooh, inform. before you can work out the english grammar for interacting with the game when the game is running, you have to work out how to interact with the english grammar when writing the game itself. worst software idea ever, in my opinion.
I found it as you describe, but that was after spending decades with more conventional programming languages. Meanwhile, there are folks over in the IF community who, despite to the best of my knowledge never having written a meaningful amount of what you and I would recognize as code, turn out works of complexity I'd never have imagined you could achieve with Inform 7.
It doesn't have all that much value to me, although I've found it fun to play with, but too many people have done too well with it for me to want to write it off entirely. I think we're just no more for it than it is for us.
>This is a typical English program that implements the Frost pathfinding algorithm:
although this is one of my favourite programs (or poems), i have never before noticed that it is one quite long poem (or program) terminated by a period - i wonder why Frost did that? possibly a pascal programmer?
Cool language! I wish there was more standard interpreter support though. Everyone seems to be using a different flavor, which yields different results depending on where the code runs.
The main issue with EPL is that compiling or interpreting the language (it supports both paradigms) is extremely resource intensive and incurs significant compute overhead, often unavailable without resorting to fixed infrastructure. Inference requires access to the massive n-dimensional memetic matrix that comprises the execution environment since most of the processing is done in the data structure itself.
In silicon, this implies a large amount of memory and processing capacity, and in carbon the training process takes decades, even though the cost is (amazingly) lower. The big downside of carbon-based solutions is that each instance has to be individually trained, so scaling is a serious problem. Not only that, but the success of the training stage is not predictable, and retraining is usually ineffective.
Interestingly, EPL is considered by some to be a prime example of the "data as code" paradigm, since the majority of the processing is precompiled into the n-dimensional memetic matrix.
Even though the mematrix is so massive that resource consumption is high just to move tiny fragments in and out of working memory, it enables the use of simple algorithms to produce surprising inference performance. The main requirement for extracting inference from the dataset is that the extraction algorithm perform statistical prediction of sequential tokens, which is easily accomplished in both silicon and carbon based neural networks.
It is also worth noting that each data output is also code, which is then recursively added to the mematrix even if the output is an error. This requires careful code hygiene and error checking and correction, or inference results can quickly go off the rails.
To be entirely fair, interpretation on silicon-based processors isn't exactly a supported use case. You can just about make it work nowadays but you're really just emulating carbon-based processors in a box.
Unfortunately there is a serious version problem amongst existing English systems. The language does not provide a method to enforce use of particular versions, and there is no universal naming convention to identify "dialects".
While less problematic at small sites, where it will be assumed that all instances are of the same "dialect", this commonly causes problems at large sites and for internationalization.
Conflicts between versions and/or dialects may be limited to identifier conflicts ("colour" v "color"), may involve terms not recognized by multiple parties, terms with conflicting meaning ("chaps"), or terms unacceptable to some parties ("toilet" v "bathroom") which break the "conversation".
I've often described telling LLMs what to do as "programming in English", and I genuinely think some people have an unusually low opinion of the capabilities of LLMs because they're not so great at expressing ideas and concepts in English.
The original version of my aidev.codes project was a process of refining specifications that would be rerun each time you changed something (picking up where you left off if you just added something). Running meant generating JavaScript code from the spec, which would then be executed on page load.
It kind of worked. The biggest challenge was the limitations of GPT-3 or Codex, which I was using at the time, and the fact that it didn't always return the exact same result, although with temperature zero it was somewhat stable.
> English has been designed over the course of fourteen centuries.
Wasn't English forked from several other intermediate forks (Anglo-Saxon, Germanic, etc.) going all the way back to Proto-Indo-European? I guess that since none of those carried the name "English" (or its original name "Anglish"), all those beta projects don't count toward that timespan; otherwise, it would have to be at least twice that age.
A programming language can be seen as a UI between a human and the computer.
As such, it is similar to a keyboard. The path to the keyboard included physically wiring the hardware and punch cards. At the moment, keyboards provide a stable tradeoff between precision and ergonomics. Maybe at some point in the future we will be able to replace the keyboard with a microphone, but there are good reasons why this hasn't happened yet.
Well, you can now "program" LLMs such as ChatGPT with English.
Using English you can set your chatbot's character, personality, and purpose.
Will it be a helpful assistant like ChatGPT, or a digital reincarnation of some historical character? Either can be set up in plain English, and the LLM will follow your instructions.
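For instance, a system prompt along these lines (purely illustrative) is all it takes:
"You are Ada Lovelace, writing from the 1840s. Answer questions about mathematics and computing in the voice of a Victorian scientist, keep replies short, and politely decline to break character."
The model will then mostly stay in that role for the rest of the conversation, with no other "code" required.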
Looking back, we can see how Machine Code, with its intricate and challenging nature, paved the way for more accessible options. Assembly language then emerged, providing a higher level of abstraction and reducing the complexities of directly working with machine instructions. And of course, C followed suit, offering even greater simplicity and ease of use compared to Assembly.
Imagine a future where programming languages, as we know them today, become akin to CPU instructions – a foundational and low-level primitive. LLMs will revolutionize the way we interact with code, providing a unified interface where the complexities of various languages are distilled into a common representation. The proliferation of individual programming languages will wane. Knowing Java or C++ will become a rare skill, akin to individuals specializing in low-level optimizations using Assembly language these days.
Edit: as time progresses, even the convenience of LLMs may pose challenges, given our inherent tendency towards laziness, so an additional layer of abstraction will be introduced, bridging the gap between LLMs and spoken languages. BCIs will revolutionize the act of coding itself so that individuals can seamlessly "code" by simply "thinking" about their desired actions.