As someone who worked on a 35-million-LOC COBOL and JCL system for several years in the late '90s, I don't find this funny at all. COBOL was purported to be readable, with English-like syntax to make it more understandable, giving us gems such as:
MULTIPLY TAX-RATE OF STATE(43) BY BALANCE GIVING SALES-TAX-AMOUNT.
There are multiple responses to the parent noting how readable the above line of code is, and yet how easily it might go wrong.
The first pitfall of using English as a programming language that occurs to me (as a totally COBOL-ignorant person):
* Human languages tend to be nebulous around the edges and fluid, often with a single word taking on multiple meanings and the same function served by multiple words. There are multiple ways to express the same concept. This allows the language to change and evolve with the times.
* OTOH, programming languages need to be specific and exact so the computer can interpret them and they behave as expected across devices and over time.
This means that to make English function as a programming language, we will have to take the existing language, whittle away most of its senses and many of its words, assign one function to one word, and use a very trimmed-down, reduced version of English.
Now you will have to know two forms of English:
* The human one
* The computer-compatible one
Worse, our human version of the language can often trip up our computer-compatible one. I imagine even debugging would be harder, because when you look at the code, the brain sees perfectly good English and doesn't register any issue with the punctuation or tokens the interpreter expects.
Consider the below version.
MULTIPLY TAXRATE FROM STATE(43) WITH BALANCE GIVING SALES-TAX-AMOUNT.
I have made 3 changes here, which might or might not work with COBOL (I am COBOL-ignorant ;) ).
But if a person tries to find the changes or debug this, it would be difficult for the human brain to register what is wrong, as it is perfectly good English.
> I have made 3 changes here, which might or might not work with COBOL (I am COBOL-ignorant ;) ).
I've never used COBOL before, but this thread and your post made me curious and I installed GNU Cobol and found a sample program that demonstrates a few language features.
After playing around with it a little, I think I've understood that "OF" is like a struct element (or object property) accessor, the parens are an array index, and the hyphen is a syntactically significant part of the identifier. "BY" is apparently a mandatory part of the multiplication operator.
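To make that concrete, here is roughly the kind of toy program I ended up with (reconstructed from memory, so treat it as a sketch rather than verified code; the 50-entry table and every name other than those in the parent's line are just made up):
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TAX-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> STATE is a table of 50 entries; TAX-RATE is a field inside
      *> each entry, so "TAX-RATE OF STATE(43)" reaches into entry 43.
       01  STATE-TABLE.
           05  STATE OCCURS 50 TIMES.
               10  TAX-RATE        PIC 9V9999.
       01  BALANCE                 PIC 9(7)V99.
       01  SALES-TAX-AMOUNT        PIC 9(7)V99.
       PROCEDURE DIVISION.
           MOVE 0.0825 TO TAX-RATE OF STATE(43).
           MOVE 1000.00 TO BALANCE.
      *> "BY" is a fixed part of the MULTIPLY statement, and the
      *> hyphens belong to the identifiers, not to subtraction.
           MULTIPLY TAX-RATE OF STATE(43) BY BALANCE
               GIVING SALES-TAX-AMOUNT.
           DISPLAY "SALES-TAX-AMOUNT: " SALES-TAX-AMOUNT.
           STOP RUN.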
Conclusion: each of your three changes (while, as you said, perfectly reasonable from the English language point of view) would indeed break the COBOL code!
By the way, I remember this being a bit of an issue when I tried programming in HyperTalk: on occasion I would unintentionally come up with an English synonym for some natural-language HyperTalk code, and it wouldn't be valid HyperTalk.
It's actually a pretty interesting question why programming languages that work like that are a bad idea.
From decades of experience, I know intuitively that they are, but I find myself unable to formulate a concise description of what exactly is wrong with this approach.
Because they turn out to be just as precise and fiddly as any other programming language, but deceptively appear not to be.
If you treat them as English you'll get burned. You have to still treat them as very precise formal languages where apparently trivial/irrelevant details are significant, and minor hard-to-spot mistakes will break your program.
Having more explicitly formal/structured syntax makes it easier to distinguish the different parts of the language, make sense of the details, and figure out what is or isn't allowed.
The best description I have for languages like this, e.g. AppleScript, is that they are "read-only languages" (in the opposite direction, old-style Perl has been called a "write-only language"): that is, if you have an already written program in hand, it will be easier for a complete novice to read (at least, single isolated lines of code will be). But writing new programs is a huge pain in the ass.
My guess would be that it's because spoken languages and programming languages are fundamentally different, so in trying to make one fit the other you might end up with a programming language that looks and reads a lot like English but has almost become self-obfuscating: your brain automatically tries to parse it using the rules of English rather than the rules of the programming language, making things look like they should work even when they don't. The example in the comment you're replying to illustrates this pretty well. We largely ignore punctuation at the end of sentences, in the sense of consciously seeing it rather than just inserting a pause in our mental cadence, so if a period is suddenly an important piece of syntax in statements that look like English, you could easily start getting it wrong.
In my opinion it's because of English's evolution over time. Something that might have made perfect sense in the past becomes an antiquated way of saying the same thing in the future. It also deliberately encodes the author's beliefs about how the language should be spoken, ignoring any regional variance you see in the real world.
When you abstract away the English meaning of code into something new and unchanging, you provide stability not seen in natural language.
But why is that a problem? That's literally how we speak every single day.
In fact, given that this is how all the languages that humans are already familiar with work, it's hard to see why this wouldn't be the best approach for constructing a programming language.
- humans can ask questions back to clarify any ambiguity; computers don't have the capacity to understand ambiguity, let alone to ask for clarification.
- computers are trusted to work w/o failure almost 100% of the time; humans err (to err is human, after all), so we'd never put a human in charge of the critical things we use computers for.
I agree with your sarcastic reply that naturally grown languages are simply horrible for shared comprehension!
For instance, the poem (or should I say program, to use the example in the post) has several different interpretations, at least some of which teachers would say are flabbergastingly incorrect.
A number of libraries in the Southern states have biased maintainers who have merged in pull requests deleting a large number of classes in bulk, despite the fact that the tests have passed for years and those classes are in use by a large number of language users.
ooh, inform. before you can work out the english grammar for interacting with the game when the game is running, you have to work out how to interact with the english grammar when writing the game itself. worst software idea ever, in my opinion.
I found it as you describe, but that was after spending decades with more conventional programming languages. Meanwhile, there are folks over in the IF community who, despite to the best of my knowledge never having written a meaningful amount of what you and I would recognize as code, turn out works of complexity I'd never have imagined you could achieve with Inform 7.
It doesn't have all that much value to me, although I've found it fun to play with, but too many people have done too well with it for me to want to write it off entirely. I think we're just no more for it than it is for us.
>This is a typical English program that implements the Frost pathfinding algorithm:
although this is one of my favourite programs (or poems), i have never before noticed that it is one quite long poem (or program) terminated by a period - i wonder why Frost did that? possibly a pascal programmer?
Cool language! I wish there was more standard interpreter support though. Everyone seems to be using a different flavor, which yields different results depending on where the code runs.
The main issue with EPL is that compiling or interpreting the language (it supports both paradigms) is extremely resource intensive and incurs significant compute overhead, often unavailable without resorting to fixed infrastructure. Inference requires access to the massive n-dimensional memetic matrix that comprises the execution environment since most of the processing is done in the data structure itself.
In silicon, this implies a large amount of memory and processing capacity, and in carbon the training process takes decades, even though the cost is (amazingly) lower. The big downside of carbon-based solutions is that each instance has to be individually trained, so scaling is a serious problem. Not only that, but the success of the training stage is not predictable, and retraining is usually ineffective.
Interestingly, EPL is considered by some to be a prime example of the "data as code" paradigm, since the majority of the processing is precompiled into the n-dimensional memetic matrix.
Even though the mematrix is so massive that resource consumption is high just to move tiny fragments in and out of working memory, it enables the use of simple algorithms to produce surprising inference performance. The main requirement for extracting inference from the dataset is that the extraction algorithm perform statistical prediction of sequential tokens, which is easily accomplished in both silicon and carbon based neural networks.
It is also worth noting that each data output is also code, which is then recursively added to the mematrix even if the output is an error. This requires careful code hygiene and error checking and correction, or inference results can quickly go off the rails.
To be entirely fair, interpretation on silicon-based processors isn't exactly a supported use case. You can just about make it work nowadays but you're really just emulating carbon-based processors in a box.
Unfortunately there is a serious version problem amongst existing English systems. The language does not provide a method to enforce use of particular versions, and there is no universal naming convention to identify "dialects".
While less problematic at small sites, where it will be assumed that all instances are of the same "dialect", this commonly causes problems at large sites and for internationalization.
Conflicts between versions and/or dialects may be limited to identifier conflicts ("colour" v "color"), may involve terms not recognized by multiple parties, terms with conflicting meaning ("chaps"), or terms unacceptable to some parties ("toilet" v "bathroom") which break the "conversation".
I've often described telling LLMs what to do as "programming in English", and I genuinely think some people have an unusually low opinion of the capabilities of LLMs because they're not so great at expressing ideas and concepts in English.
The original version of my aidev.codes project was a process of refining specifications that would be rerun each time you changed something (picking up where you left off if you just added something). Running meant generating JavaScript code from the spec, which would then be executed on page load.
It kind of worked. The biggest challenge was the limitations of GPT-3 or Codex, which I was using at the time, and the fact that it didn't always return the exact same result, although with temperature zero it was somewhat stable.
> English has been designed over the course of fourteen centuries.
Wasn't English forked from several other intermediate forks (Anglo-Saxon, Germanic, etc.) going all the way back to Proto-Indo-European? I guess that since none of those carried the name "English" (or its original name "Anglish"), all those beta projects don't count toward that timespan; otherwise, it would have to be at least twice that age.
A programming language can be seen as a UI between a human and the computer.
As such, it is similar to a keyboard. The path to the keyboard included physically wiring the hardware and punch cards. At the moment, keyboards provide a stable tradeoff between precision and ergonomics. Maybe at some point in the future we will be able to replace the keyboard with a microphone, but there are good reasons why this hasn't happened yet.
Well, you can now "program" LLMs such as ChatGPT with English.
Using English you can set your chatbot's character, personality, and purpose.
Will it be a helpful assistant like ChatGPT, or a digital reincarnation of some historical character? Either can be set up in plain English, and the LLM will follow your instructions.
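For instance, a system prompt along these lines (purely illustrative) is all it takes:
"You are Ada Lovelace, writing from the 1840s. Answer questions about mathematics and computing in the voice of a Victorian scientist, keep replies short, and politely decline to break character."
The model will then mostly stay in that role for the rest of the conversation, with no other "code" required.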
Looking back, we can see how Machine Code, with its intricate and challenging nature, paved the way for more accessible options. Assembly language then emerged, providing a higher level of abstraction and reducing the complexities of directly working with machine instructions. And of course, C followed suit, offering even greater simplicity and ease of use compared to Assembly.
Imagine a future where programming languages, as we know them today, become akin to CPU instructions – a foundational and low-level primitive. LLMs will revolutionize the way we interact with code, providing a unified interface where the complexities of various languages are distilled into a common representation. The proliferation of individual programming languages will wane. Knowing Java or C++ will become a rare skill, akin to individuals specializing in low-level optimizations using Assembly language these days.
Edit: as time progresses, even the convenience of LLMs may pose challenges, given our inherent tendency towards laziness, so an additional layer of abstraction will be introduced, bridging the gap between LLMs and spoken languages. BCIs will revolutionize the act of coding itself so that individuals can seamlessly "code" by simply "thinking" about their desired actions.