There's something magically expressive about S-expressions and processing them. For a linguistics project I was thinking about how to represent sentence syntax as data, and then I came across Chomsky's phrase structure analysis[1], which parses spoken language exactly like S-expressions in LISP! It's even illustrated with parentheses or syntax trees, just like the old SICP vids explain. The phrase structure even uses recursive definitions, so processing and inflecting the content of sentences can be done with recursive functions etc. (I did it in TypeScript), like something following a design exercise out of HTDP/SICP. I think the inherent parallel between human language phrase structure and S-expressions is beautiful.
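A simplified sketch of the idea in TypeScript (the grammar labels and the sentence are illustrative, not my actual project code):

```typescript
// A phrase is either a word (a leaf) or a labelled list of sub-phrases,
// mirroring recursive phrase-structure rules like S -> NP VP.
type Phrase = string | [string, ...Phrase[]];

// "The cat sat" as a phrase-structure tree, written like an S-expression.
const sentence: Phrase = ["S",
  ["NP", ["Det", "the"], ["N", "cat"]],
  ["VP", ["V", "sat"]]];

// Recursive processing in the HtDP/SICP style: walk the tree and
// collect the words back out of the leaves.
function words(p: Phrase): string[] {
  if (typeof p === "string") return [p];
  const [, ...children] = p;
  return children.flatMap(words);
}

console.log(words(sentence).join(" ")); // the cat sat
```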
> There's something magically expressive about S-expressions and processing them.
A downside I see with S-expressions – you often end up with something like (FOO BAR BAZ), where FOO identifies the type of thing, and BAR and BAZ are parameters/slots/etc – but you have to remember the (sometimes quite arbitrary) order that BAR and BAZ go in.
Some Lisp family languages (such as Common Lisp, Clojure, Scheme with SRFI 88/89) support keywords (aka named parameters), so you can do something like (FOO :bar BAR :baz BAZ), which is considered equivalent to (FOO :baz BAZ :bar BAR). However, this is somewhat of a late development in the Lisp tradition, the status of its adoption is mixed, and (arguably) it is moving away (even if only by a little bit) from pure S-expressions. When people say "S-expressions", they normally aren't thinking of CLOS.
I think JSON's model, in which lists and objects are independent first-class entities with distinct syntax, has something to be said for it. The downside of JSON objects is that their keys are strings, not symbols, and they don't have any kind of "class slot" (some people make up conventions here, for example a reserved property name such as "$class", but those conventions are not uniform or standardised). Oh, and we don't need all those commas. I think, if I were designing a LISP-ish language, I'd have a first-class syntax for objects which used different opening and closing characters – maybe (FOO bar baz) for a list, but [FOO :BAR bar :BAZ baz] for an object.
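For illustration, that "$class" convention maps naturally onto a TypeScript discriminated union (the Shape types here are made up, just to show the mechanism):

```typescript
// A reserved "$class" property as the type tag – one of the ad-hoc,
// non-standardised conventions for giving JSON objects a class slot.
type Shape =
  | { $class: "Circle"; radius: number }
  | { $class: "Rect"; w: number; h: number };

// TypeScript narrows on the tag, so each branch sees the right slots.
function area(s: Shape): number {
  switch (s.$class) {
    case "Circle": return Math.PI * s.radius ** 2;
    case "Rect": return s.w * s.h;
  }
}
```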
> I think JSON's model, in which lists and objects are independent first-class entities, with distinct syntax, has something to say for it.
> Oh, and we don't need all those commas. I think, if I was designing a LISP-ish language, I'd have a first-class syntax for objects which used different opening and closing characters – maybe (FOO bar baz) for a list, but [FOO :BAR bar :BAZ baz] for an object.
I think Clojure has what you're looking for, and the syntax is great, much more lightweight than JSON (the syntax even has its own name: EDN). 'Objects' are first-class hash maps; your example could look like this: `{:class FOO :bar BAR :baz BAZ}`.
The keyword vs symbol distinction still seems weird to me. :first vs 'first. {} delimited sequence of even length is a nice complement to () delimited sequence of arbitrary length though.
I think of it this way: keywords are like symbols, but they survive serialization/deserialization and macroexpansion intact, so you don't need to be extra careful with quoting to prevent automatic evaluation of the symbol to some other value. And since keywords are used in code very often as keys and enums, it reduces cognitive load immensely.
This makes it easy to implement things like inheritance or trees of lexical environments.
I sometimes wish that Lisp had better ergonomics for hash tables, but normally I am pretty happy that folks don't reach for them first. Frankly, I think positionality is normally good. Can you imagine how terrible it would be to program in a language in which addition is represented as something like (+ :augend 1 :addend 2)?
Positionality is good only when arguments are interchangeable/commutative, like addition, or when the types are different, which ensures that any mistakes in the order will be detected by the compiler.
In all the other cases positionality is bad: see memset, for example.
Yes, definitely a valid point. For my case I used TypeScript and made a typed array of objects (with recursive definitions to arrays of the same) that could be arranged and processed like an S-expression. This typing, and having named elements in the objects, really hit the sweet spot for me while still being able to process the list/array in a functional/recursive way.
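Roughly that shape (field names are illustrative): each node carries a type tag plus named slots, so there's no arbitrary positional order to remember, while the recursive children field keeps the S-expression structure:

```typescript
// Each node has a type tag and named slots instead of positional ones;
// "children" recursively holds arrays of the same node type.
interface SNode {
  type: string;
  value?: string;      // leaf payload (a word)
  children?: SNode[];  // recursive definition to arrays of the same
}

const np: SNode = {
  type: "NP",
  children: [
    { type: "Det", value: "a" },
    { type: "N", value: "dog" },
  ],
};

// Still processed functionally/recursively, just like a list.
const leaves = (n: SNode): string[] =>
  n.value !== undefined ? [n.value] : (n.children ?? []).flatMap(leaves);
```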
Another flaw in JSON is that there's no built-in way to express referential integrity. Which is another way of saying that JSON can represent trees but not arbitrary graphs.
Here's a circular list for example in Common Lisp:
#1='(foo bar baz #1#)
[Printing that is a good way to overflow your stack]
More usefully, here is an employee who is an acting supervisor of himself:
'(:employee #1="Frank Smith" :supervisor #1#)
There are ways around this problem in JSON but it requires a bespoke parser. In Common Lisp it's built in to the reader.
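To make the JSON side concrete, a sketch in TypeScript (using the same self-supervising employee): the stock serialiser has no #1#-style notation and simply refuses cycles.

```typescript
// Frank supervises himself – fine as an in-memory object graph...
const employee: { name: string; supervisor?: unknown } = { name: "Frank Smith" };
employee.supervisor = employee;

// ...but JSON.stringify throws "TypeError: Converting circular
// structure to JSON", so any cycle support needs bespoke handling.
let circularRejected = false;
try {
  JSON.stringify(employee);
} catch {
  circularRejected = true;
}
```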
Self-reference causes a lot of issues. Sometimes I wonder if it should just be prohibited, forcing everything to be a directed acyclic graph.
If a language has only immutable data structures, it is easy to make it so that self-reference becomes impossible, just by making sure your primitives for constructing values don't permit cycles to be constructed.
Even if you allow mutable data structures, it is still possible. For mutable compound heap objects (i.e. heap objects which possibly contain pointers to other heap objects), they can have a back-pointer pointing to their parent. If you think about it, you can flesh out some rules about how to manipulate these objects which will make cyclic references impossible. When the rules are violated, rather than failing the operation, you could just add a pointer to a deep copy instead. (Not completely deep, you can still share rather than copy non-compound heap objects, such as strings–sharing non-compound objects can't produce cycles.)
Banning direct self-reference doesn't mean nothing can ever refer to itself, just not directly. So, you couldn't have #1='(employee :name "Frank Smith" :supervisor #1#). But you could have something like '(employee :id 1 :name "Frank Smith" :supervisor (ref :to employee :id 1)). At a physical level there is no self-reference, but at a logical level there still is.
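A sketch of that logical-reference idea in TypeScript (names made up): the reference is ordinary data, and following it is an explicit lookup, so naive recursive traversals never loop.

```typescript
// A physically acyclic record: the self-reference is just data.
type Ref = { ref: { id: number } };
interface Employee { id: number; name: string; supervisor: Ref }

const frank: Employee = {
  id: 1,
  name: "Frank Smith",
  supervisor: { ref: { id: 1 } }, // logically himself, physically an id
};

// Resolving a ref is a deliberate step through a table.
const employees = new Map<number, Employee>([[frank.id, frank]]);
const boss = employees.get(frank.supervisor.ref.id);

// Plain serialisation works, since the physical structure is a tree.
const json = JSON.stringify(frank);
```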
Similarly, one would want to allow cyclic references via a symbol, so a symbol can occur in its own binding. So not completely banning self-reference, just limiting it, in the hope that limiting it will limit the issues it causes. I think, using the strategies I've outlined above, reference counting will work as a garbage collection strategy, without the possibility of memory leaks from cyclic references. (Allowing a symbol's binding to refer to itself will not leak memory, since a bound symbol should still exist even if nobody refers to it; to deallocate it, you have to explicitly unbind it, which will remove the binding, breaking the self-reference and allowing the reference count to fall to zero.)
> If a language has only immutable data structures, it is easy to make it so that self-reference becomes impossible, just by making sure your primitives for constructing values don't permit cycles to be constructed.
Much like the Y combinator comes out somewhat unexpectedly from the pure lambda calculus, my suspicion is that any sufficiently rich data-construction language will allow, intentionally or unintentionally, for some sort of self-reference.
Causes problems with functions though as you lose straightforward recursion. That leaves you with either define creating cycles, or introducing letrec to create cycles, or the dubious beauty of the Y combinator.
The acyclic heap is a great property for most data but it's annoying for representing programs. It might be reasonable to restrict letrec to creating functions, and ensure they're all statically allocated, which would give an acyclic data heap with direct representation of functions.
I still think cycles should be allowed via symbols – just not directly. Allowing cycles via symbols tends to be less of a problem: (1) you don't want to garbage collect bound symbols, and once the symbol is unbound the cycle is broken (it would remove the ability to treat environments as first class objects, but maybe the other advantages are worth that cost); (2) printing values generally just prints the symbol name not its binding; (3) very often when you recurse through a data structure, you don't recurse into the binding of symbols contained in it, so that entails a significant reduction in the potential for infinite recursion bugs.
It does mean you can't have an anonymous function which recurses, but that's a relatively rare use case. And actually, if you introduce some kind of special syntax which refers to "this lambda", you can even permit direct recursion of anonymous functions. Mutually recursive anonymous functions would be harder to support–maybe not impossible to support, but such an obscure use case it probably isn't worth thinking about.
The referential integrity dreamcompiler refers to isn't just for cycles in the structure, but for substructure sharing. Substructure sharing still occurs in a DAG, unless you rule it out also (e.g. with some strict one reference ownership model or whatever). In the absence of cycles, it is still useful to have a notation which says "these two objects in the DAG are actually the same node". If you implement that, the detection of cycles is almost free, because the assumption that a duplicate node appearing in the print is not a descendant of the previous one doesn't simplify the printing or reading algorithms very much.
In order to print this #1=(... #1#) you have to determine that the given object occurs inside itself. In order to print (#1=(whatever) #1#), you have to determine that the given object occurs later in a sibling expression. Knowing that #1# will not occur inside (whatever) doesn't help all that much with anything; it doesn't help you decide whether or not the #1= prefix needs to be printed.
If anything if the restrictions were reversed, it would be simpler: if the only substructure sharing that were allowed was backward references, that helps. Because then we know that the (whatever) object can only occur again inside (whatever), and not after it. To detect nothing but cycles, without caring about other sharing, we just have to maintain an object path across the recursion so that for every visited object we can ask "is this object already in the path from root to here?" If so we have a backreference.
The sibling case is harder. You can have situations like:
((((#1=(foo)))) (((#1#))))
Deep inside the left item in the list there is an object which occurs again deep in the right item. We need to put the objects we are visiting into a hash, which is pretty expensive.
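A sketch of the two detection strategies described above (assuming plain object identity for nodes): cycle detection only needs the current root-to-node path, while sibling sharing needs a set of everything ever visited.

```typescript
interface TreeNode { tag: string; kids: TreeNode[] }

// Cycles only: keep the current path; a node already on its own
// root-to-here path is a back-reference.
function hasCycle(n: TreeNode, path = new Set<TreeNode>()): boolean {
  if (path.has(n)) return true;
  path.add(n);
  const found = n.kids.some(k => hasCycle(k, path));
  path.delete(n); // leaving n – it is no longer an ancestor
  return found;
}

// Sharing (the sibling case): remember every node visited anywhere,
// which is the more expensive whole-traversal hash.
function hasSharing(n: TreeNode, seen = new Set<TreeNode>()): boolean {
  if (seen.has(n)) return true;
  seen.add(n); // never removed
  return n.kids.some(k => hasSharing(k, seen));
}
```

The ((((#1=(foo)))) (((#1#)))) example is acyclic, so hasCycle says no while hasSharing says yes.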
> Just by making sure your primitives for constructing values don't permit cycles to be constructed.
But lambda calculus teaches us that if all we have is functions being passed to functions and binding arguments we can build a circular reference. See the middle word of the domain in the browser address bar. So you have to further dumb things down from there if you want to absolutely eliminate cycles in the object graph.
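Concretely, with nothing but functions passed to functions and argument binding, the Z combinator (the strict-evaluation cousin of Y) lets an anonymous function reach itself – a sketch:

```typescript
type Fn = (n: number) => number;

// Z combinator: ties the recursive knot using only application and
// argument binding – no mutation, no named self-reference.
const Z = (f: (self: Fn) => Fn): Fn => {
  type Rec = (x: Rec) => Fn;
  const g: Rec = x => f(n => x(x)(n));
  return g(g);
};

// An "anonymous" factorial: self is supplied by Z, not by a name.
const fact = Z(self => n => (n <= 1 ? 1 : n * self(n - 1)));
console.log(fact(5)); // 120
```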
(You can have it so that you don't care about function-induced cycles (only the garbage collector does) by not having any printed notation or other API that traverses the internals of a function. The application can still build a graph using functions, but is "on its own" as far as providing its own functions for operating on it, and serializing it and whatnot.)
Keyword parameters may be a late development, but they are based on property lists. Property lists are so ancient, that Lisp 1 and 1.5 used them for storing bindings. For instance, a symbol having a SUBR (machine language routine) or EXPR (interpreted routine) binding was stored in that symbol's property list.
Property lists and keyword arguments aren't tied to CLOS, though of course CLOS uses them for the flexible constructor arguments passed to make-instance.
Yeah, that really sucks. Objective-S's object literals do. Otherwise they really wouldn't be object literals, would they?
It was really an accident, because Smalltalk's literal arrays use #(), and since I wanted to have dictionary literals use the same basic look, I chose #{}. And also I use {} for blocks and block structure, rather than []. Well, it turns out you can tuck a name between the # and the {}. Yay! Classes. And thus objects!
EDIT:
Just saw that edn uses a very similar mechanism. Neat!
Association lists have some obvious downsides – if they didn't, Common Lisp-style keywords would have never been invented.
One is that keywords are simpler syntax – `(foo :bar 1 :baz 0) is easier on the eyes than its association list equivalent.
Another is that their native implementation (linear search) can be a performance drain if you use them too heavily in an application (especially if you have too many keys). It is a data structure which optimises for insertion over retrieval, but most applications retrieve much more than they insert. The ability to "shadow" (have multiple bindings for the same key, with the first being used and the subsequent being ignored, but potentially becoming active again when the earlier binding is removed) is useful in some cases, but more often than not an unnecessary feature, more likely to induce bugs than be genuinely beneficial – "shadowing" by accident rather than intention.
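The shadowing semantics are easy to state in code; a minimal sketch of an alist with first-match-wins lookup:

```typescript
// An association list: ordered key/value pairs, looked up by linear
// scan, with the first match winning ("shadowing").
type Alist<V> = Array<[string, V]>;

const assoc = <V>(key: string, al: Alist<V>): V | undefined =>
  al.find(([k]) => k === key)?.[1];

let env: Alist<number> = [["x", 1], ["y", 2]];
env = [["x", 10], ...env];            // insertion is cheap: consed on front
const shadowed = assoc("x", env);     // 10 – the new binding shadows the old
env = env.slice(1);                   // remove the shadowing binding...
const restored = assoc("x", env);     // 1  – the earlier binding reactivates
```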
A sufficiently smart runtime could optimise around some of these issues (this looks an association list, I'll use some optimised data structure to store it, but fall back on a more traditional cons-cell based storage if you try to do with it things the optimised data structure doesn't support.) But, having an actually separate data type enables equivalent performance optimisations, without anywhere near as much "smartness" in the runtime (which can be difficult to understand and maintain.) Plus, that kind of "smartness" can lead to difficult to understand performance regressions (why does adding this new code suddenly make completely unrelated parts of the application a lot slower?)
Lua's tables with metatables are a particularly elegant solution to this because, as you say, it's always clear when a pair is being shadowed vs. replaced.
What you're describing with parameter ordering is Lisp's Mysterious Tuple Problem [0]. It's also part of why Lisp works so well for one or only a few developers, but breaks down for large teams. Javascript (and Self before it) has really started a structural revolution with the JSON object as a syntactic form. I'm currently working on a language very similar to your last paragraph, where all objects do have such a field indicating their class, so an object in the textual representation might look like:
{ Point x 2 y 3 }
Instead of n-ary tuple operation like Lisp, my design limits all functional application/message sends to a form of a receiver object, message, optional it object, and additional map of args in the preceding syntactic way (still not sure if they would have a class or be classless). Multiple dispatch would be limited to just the receiver and the it object.
You can see in Smalltalk where this syntax would be helpful in the awkward selector syntax it adopts.
bool ifTrue: [block] ifFalse: [block]
Looks better as
bool if {
  If
  $then block
  $else block
}
Where here the $ in slot names means the corresponding value isn't evaluated, a reference to the late John Shutt's Kernel [1] that explored replacing the lambda calculus at the base of Lisp with a lazy vau calculus in order to restore the historically maligned fexpr [2]. Originally they were removed from the language because they made compiling difficult at a time when performance really mattered (1980), and replaced with special forms that weren't first class subjects of the language (and neither really were the syntactic macros that accompanied them). Going back to the original topic, Alan Kay listed Lisp as one of the primary inspirations for Smalltalk, and especially the fexpr, which he saw as underappreciated:
> There were not just EXPRs (which evaluated their arguments), but FEXPRs (which did not). My next question was, why on earth call it a functional language? Why not just base everything on FEXPRs and force evaluation on the receiving side when needed? I could never get a good answer, but the question was very helpful when it came time to invent Smalltalk, because this started a line of thought that said "take the hardest and most profound thing you need to do, make it great, and then build every easier thing out of it". That was the promise of LISP and the lure of lambda—needed was a better "hardest and most profound" thing. Objects should be it. [3]
I'm still in search of the "hardest and most profound thing" - here's to hoping it's real.
I like your syntax, but wouldn't it be better as something like
{ Point :x 2 :y 3 }
or maybe:
{ Point x: 2 y: 3 }
That little bit of extra punctuation (it only needs to be a single character) is really helpful in making clear what is the field and what is the value. Syntactic minimalism is great, but sometimes it is taken a bit too far.
> Where here the $ in slot names means the corresponding value isn't evaluated
If parameters have type declarations, why not make laziness part of the type instead of part of the name?
I suppose it makes it clear at the invocation site that the parameter is lazy-evaluated. I wonder how important that is though? In Smalltalk, it is very clear, due to [] vs (). In most Lisps, you can't tell from the invocation site whether some symbol is an ordinary defun, or a special form or macro, you have to look up its definition or documentation. In my experience, that isn't a big problem – if you don't remember it, sooner or later you end up having to look it up anyway, to understand what the parameters mean and what it actually does.
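In an eager language you can make the laziness explicit at the call site with thunks, which is essentially what Smalltalk's [] blocks are; a sketch:

```typescript
// if-as-an-ordinary-function: both branches are thunks, and only the
// selected one is ever evaluated – the [] vs () distinction made explicit.
const myIf = <T>(cond: boolean, then_: () => T, else_: () => T): T =>
  cond ? then_() : else_();

const ran: string[] = [];
const result = myIf(true,
  () => { ran.push("then"); return 1; },
  () => { ran.push("else"); return 2; });
// result is 1, and the "else" thunk never ran
```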
I agree my syntax is quite spartan, though that's because I don't intend to rely on it as the user interface, but would rather extend to structured editing more like Self.
On the laziness part I'm torn - quoting is what that amounts to and it never seemed extremely elegant either. The call site disambiguation is nice for knowing whether you can compile something.
As a former professional Smalltalk developer, I would caution against the experience of greybeards who wax lyrical about the incredible power of Lisp and Smalltalk, because even in their heyday that power was largely illusory.
Firstly, anybody who encountered these environments early in their career will always be misty eyed about them.
Secondly, ask them how big the team they were working in was, and what mechanism they used to deploy their programs. Smalltalk in particular is difficult to package up into a deployable program that somebody else can run, and collaborating on development is fraught with code-sharing issues you just don't get with less exciting environments like .NET or Python or JavaScript/Node.
In a modern context, Smalltalk and Lisp make little to no sense, especially when you have available a rich system of code being shared in more prosaic environments. It's a modern miracle that you can build what you need without re-inventing the wheel. There is no better time to be a programmer than now.
Lisp and Smalltalk may well be great for reinventing wheels, for writing un-readable self modifying code and as a challenging intellectual exercise, but all your productivity just went out the window.
Sounds like you're evaluating software configuration management for Smalltalk from before that was hardly a thing for any development platform?
Decades ago, Smalltalk implementations would let you dump some part of a system (say, the part that held your application) to a human-readable ASCII file (which is pretty Git-able), and also load that. If you were using it today, while we're using Git, I can't think of a reason a Smalltalk system couldn't have more tight integration with Git.
Regarding Lisps, I'm not sure which school of thought you're coming from, but there are multiple. To someone without firsthand experience, the things you might've heard someone say, like "Foo is great because code and data are the same darn thing", or "Foo is great because it's dynamic up the wazoo", might not only not be the strengths others would claim, they might not even mean the same thing to different people.
For example, imagine some kinds of syntactic extension in Lisps, in which some region of syntax isn't a function call. The region is clearly delimited, and it has an identifier right there that can be looked up to tell you the behavior. Someone without experience with that, but who has used an unfortunate kind of "DSL" in some other language, or, worse, has experienced surprise "operator" overloading in a different language, or who's seen unfortunate choices of when to introduce languages and when not, might think this Lisps thing is more of the same thing, when it's not.
Pharo, at least, has very tight integration with git via Iceberg. Other variants have had (not from personal experience, only reading what others wrote) very good or at least decent (that is, no worse than the rest of industry) version control at least back to the 90s.
I worked on a team of about 20-25 devs. We used the Digitalk toolset, using Team/V for source control. You opened a repository/package browser instead of a class browser. There was versioning at all levels and good package management.
Envy/Developer was in heavy use where I worked, and it was pretty great, but still nothing like the package management tools we now have.
I'd say the change log in something like VisualWorks was definitely not git-able in any way that would facilitate merging of several developers code. In fact it was extremely difficult to keep images in sync without something like Envy/Developer.
When I used ParcPlace Objectworks Smalltalk 4.1. and then VisualWorks Smalltalk in the early '90s, you'd use a menu item to write a human-readable ASCII serialization of some part of the system (not changelog) to the host filesystem. And it'd be diffable and mergeable and loadable.
Doing Git with just that seems only a little different than what we'd do with a collection of source files today in some other language, including branches and merges.
At the same time, one commercial site I worked at was doing its SCM for C code using kludge scripts over SCCS to an NFS volume. The next was putting C and C++ into Apollo DSEE and making SPARCstations jump through hoops to use that (because ancient DSEE on some Apollo DN10ks for SCM and CI/CD was better than the state of the art almost everyone else was using) and my advanced R&D group had just gotten DSEE-descendant Atria ClearCase for greenfield work in C++ from our Sun and HP workstations. The next site was using some dumpster fire of `.BAT` scripts over Perforce(?) for Windows NT-dominated C++ development.
I'd say VisualWorks Smalltalk's wasn't the problem for team development at the time. The state of practice of everything else was more the problem.
Evidently you have some experience with Smalltalk, but what makes you say that about Lisp? Also which Lisp exactly do you have in mind? To me the Lisp family of languages is more like Classical music: it doesn’t satisfy some modern sensibilities, but it has a timeless elegance and will be around long after the latest fad is gone.
It's a personal opinion more than anything. I see Clojure getting attention but it isn't going to replace Node.js or .NET any time soon.
The major difference being the ecosystem built around something like C# is miles ahead of anything available for even Clojure. Some people might think that's an advantage, but for people like me whose job is to be pragmatic and "git er dun" it's a major impediment.
Hence, lisp has become a kind of toy from a golden age. I find myself reaching for C# or Python for experiments. I have dabbled in lisp occasionally over the years but never found Clojure very compelling.
Clojure has access to the same ecosystem as C# since it runs on the CLR and can use .Net libraries and frameworks.
It can also have access to the whole Java ecosystem or JavaScript ecosystem. Funny enough, it can also use most Python libraries. And if you're willing to go less official, it also can be used to leverage Erlang's ecosystem and recently Dart and Flutter.
Unless you meant something different than being able to leverage libraries, frameworks and some of the tooling?
You come across as someone with minimal -if any- experience with Lisp, to be making some of the generalizations and grandiose proclamations that you do. Also, lots of people don't consider Clojure a Lisp.
I too have a significant physical Lisp library, including TAotMOP. I’m also a bit of a curmudgeon and don’t love some of Clojure’s syntactic novelties. But even so I agree that Clojure is a Lisp. Lacking proper tail recursion really isn’t disqualifying. The workflow with CIDER is quite familiar for anyone with Common Lisp experience. I understand there are solid dev plugins for other editors too.
We had a previous discussion, and my impression was tools don't matter when — "Yeah, there were a lot of side channels for sharing code that made merging things more difficult than they could have been. Poor discipline when devs were in a mad rush."
> Lisp and Smalltalk may well be great for reinventing wheels, for writing un-readable self modifying code and as a challenging intellectual exercise, but all your productivity just went out the window.
This assessment seems very unfair to the current environments of both languages. Perhaps you ought to delve in a second time?
Although it's true that today it is easy for everyone to write, even those who have no experience or have had a disappointing one, the effects of what they write are often not measurable. That is, it is easier to measure the resources needed to write something than to measure its effects.
Smalltalk is a tool for the creative spirit. It's a very powerful tool for those who have something inside and want to develop it. On the other hand, nowadays, the alternatives that are massively propagated have other characteristics. They are for those who feel better (or calmer) absorbing what someone else has done. They are for those who hide behind the phrase "it is better to ride on the shoulders of others" trying to show wisdom, but in reality revealing dependency and unproductivity.
The problem with Smalltalk is that in computing, the "creative spirit" is a dying breed. Smalltalk is for the creative. Most think that creating is expensive and settle for copying (imitating) or using (claiming to re-use!).
Whoever thinks someone has something to contribute, let him use a Smalltalk, whoever doesn't, let him buy the latest.
Contribute to what? If you have something that is fundamentally difficult to share (like a Smalltalk image) what are you contributing to other than entropy?
The Smalltalk image is just a file, so that's not that difficult to share?
Maybe you mean the Smalltalk code that's been "saved" as part of the Smalltalk image snapshot? In which case, why wouldn't we export the Smalltalk code from the image and share those source code files?
Because poor Smalltalkers often extend stupid things like "Object" and none of their code will run on your image without finding all the changes they made.
So you either try to file out your own changes and merge them into somebody else's image, or try to extract theirs for your image.
I realise the state of the art must have moved on since the 1990s but change files back then were a constant source of problems along with how you deploy your image once you are happy with it.
Contrast that with pip or npm. Once I've built my requirements.txt or whatever, I have not only solved how to share my environment with my collaborators, I can use it to build a Docker container and also solve my deployment dependencies.
Smalltalk: play whack-a-mole with change sets until it looks like it's working then strip the image of unwanted stuff and hope it still works. It's caveman stuff.
> I realise the state of the art must have moved on since the 1990s…
A few hours ago you told us "Envy/Developer was in heavy use where I worked" so you know that what you describe was not state of the art even back in the 90's.
> especially when you have available a rich system of code being shared in more prosaic environments
I don't think this is as big an issue as you think it is. You can find a number of Lisp family language implementations which compile to JVM bytecode, .Net bytecode, or JavaScript – Clojure, ClojureCLR and ClojureScript are the most notable and active (even if Clojure is somewhat of an unorthodox Lisp), but there are others (such as Armed Bear Common Lisp, ABCL). So you can write your Lisp code, and easily consume those massive library ecosystems. Similarly, Smalltalk has PharoJS, and I know there have been ports of Smalltalk to JVM and .Net too (although I don't think they've been as successful).
Technologies for in-process multiple language integration, such as FFI libraries, have greatly improved over the last 10–20 years. Mixing code from different languages is easier than it ever was in the past, and there are further improvements on the horizon (for example, in Java, the planned replacement of JNI with the FFM API).
I think WebAssembly is really exciting – especially if they improve the WebAssembly-JavaScript integration story, which I know they are working on. Compile whatever language you like to WebAssembly, and then you have full access to the JavaScript library ecosystem; and an improved WebAssembly-JavaScript integration will likely also help simplify cross-language integration between different languages targeting WebAssembly.
We have an abundant choice of technologies for inter-process/inter-machine cross-language code integration – everything from relatively modern approaches such as REST, GraphQL, gRPC, Thrift, through to more old-fashioned approaches such as SOAP, CORBA, DCOM, etc. With the rise of micro-services, server-less, etc, it is easier than it ever was to build an application containing a mix of services in different languages. Few care if your app is composed of two or three or four different containers, and even fewer care if the code in those containers is written in different languages.
This is a very insightful comment, thank you. I have much the same experience from the Lisp side of things.
I will say though, these languages have their niches where they really shine.
The oft understated power of Lisp is in creating software that is infinitely configurable assuming you know a bit of Lisp. Extensions are a first class citizen in Lisp programs. The configuration is just some lisp code that's compiled into the running program upon loading. The config can even redefine existing functions. One time I used the config file to fix an actual bug in a program that was abandoned by the maintainer. Of course, only programmers want this functionality, and so its usefulness is limited to things like Emacs. But it still has its place.
Interesting. I did 20 years of professional Smalltalk development as well, 6 years at Cincom. This coming Wednesday will mark 10 years out of the balloon, wandering in the wilderness as a polyglot. My conclusions are not the same as yours.
Smalltalk was (for me) unparalleled at modelling things, including its own self (pun?). I think it also thrived in an era where a high percentage of the programming population pursued it as a form of mastery. The percentage of people that still program with some sort of mastery of the art is much lower now (I think there are more of them in total, but fewer as a percentage). Working with the stack of internet technologies is like becoming a city-planning engineer: a little bit of engineering, a whole lot of byzantine rules and policies to navigate, a jobs-creation program in itself given the number of "specialties" that need to be mastered to assemble anything.
I left Smalltalk because the number of places I could advocate its use was shrinking. Desktop was dying. There aren't a lot of technologies for browser-delivered programming where going outside the mainstream is worth the impedance mismatch it causes. And when it comes to handhelds, the hardware vendors have cobbled together such an oligarchy of "use our approved/supported technologies" that it's rarely worth bucking the idiomatic trends. All mainstream code movements live and die on the blessing of iOS, Android, or the browser(s). There's a small remainder of competitive evolution that goes on in the server space, but not a lot. Node may be more modern than Fortran, but half of the basic programs running on the back side are simple enough that it doesn't really matter what you write them in; you could write them in Fortran or BASIC.
I miss Smalltalk's simplicity. I miss Smalltalk's block closures: they sure did make ad hoc functional programming easy. I miss Smalltalk's very nicely balanced syntax: kind of code-like, but also uncannily natural language at times. I miss Smalltalk's ability to see code and code artifacts as something other than text files. I miss Smalltalk's code navigation and execution tools.
Once Smalltalk became a commercial offering, with the need to support and maintain an installed base, it quit changing. Any vestiges of Smalltalk today are basically Smalltalk-80+. For Smalltalk to "go on", they (we) needed to "burn the disk pack" again.
For me, the ultimate modern language would look something like Smalltalkish syntax and IDE, with Elixir/Erlang-like execution semantics, but retaining a strong object/class declaration story (instead of "let's pretend dictionaries are structures"), a namespace story somewhat akin to Python's (but without the shadow-binding behavior), and a VM extensibility story similar to Smalltalk/X (unlike other Smalltalks, which always ran into a <primitive> when you got to the "performance" parts, Smalltalk/X just supported inline C extensions, so you could optimize right in place, with no JNI/FFI two-worlds nonsense). And a world where delivery vendors (browsers, handhelds, etc) had an incentive to simplify program development instead of maintaining the status quo.
Smalltalk deployment was a PITA compared to compiling a command line tool from C; it was improving, but slowly. Ironically, compared to the docker images and microservices and kubernetes and everything else that goes into deploying any "complicated" program nowadays, Smalltalk was a breeze.
For me there was a huge divide between those who were supplying Smalltalk and worked that side of the fence, and the customers who were trying to solve problems. I saw that as why Smalltalk lost to simpler/arguably stupider tech like Java.
I have sometimes pondered whether to try making a system that kind of rhymed with Smalltalk, but using Roslyn[0] instead. Although I envisage it as being a kind of playground where you code like Smalltalk, with the artifacts being produced as conventional class files and whatnot so the final build could be done with the normal command line tools.
> For me there was a huge divide between those who were supplying Smalltalk and worked that side of the fence, and the customers who were trying to solve problems. I saw that as why Smalltalk lost to simpler/arguably stupider tech like Java.
Possibly. I did the customer side for 15 years at two different companies, and then Cincom (née ParcPlace). Smalltalk WAS/IS "weird" (er, "pink plane" paradigm-shifty, etc) to begin with.
The total staffs at ParcPlace, the IBM Smalltalk team, and Digitalk combined were smaller than the marketing team alone that Sun allocated for Java, and Sun spent an insane amount of money bootstrapping college kids into training. Given that, I'm not sure how the "divide" would mark Smalltalk as different from the other offerings at the time. Sun just did it on a scale easily 2 orders of magnitude greater than Smalltalk.
Java also had the advantage of being marketed to the large pool of C++ programmers while riding the rising popularity of the web. There was heavy emphasis on applets and networking. Meanwhile, Smalltalk was a bit early to the windowing/pc revolution, OOP and VMs.
I was hoping you'd stumbled on more ideas in the subsequent years. It's surprisingly hard to come up with simple + general.
Ever since you mentioned thread-locals, I've been seeing the pattern everywhere. Our current C++ codebase passes around a "Chip" object (basically, a global configuration holding the current settings for the compiler), and it's remarkable how many lines of code it ends up touching.
And it's not so easy to port your idea to C++ using thread_local. It turns out that Racket thread cells != thread_local storage. The key difference is that when you spawn a new thread, the new thread needs to inherit the current value -- thread_local doesn't automatically give that!
And storing the current "Chip" as a global doesn't quite work either. If we spin up two different compilation threads for two different chips, each thread needs to see its own Chip independently of the other. We sidestep that for now by explicitly passing a Chip to any new threads, but it's not a general solution like your idea was. The moment I realized that, I thought wistfully about your comment from 6 years ago, and I've wanted to bug you ever since for "more like that, please."
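A minimal JavaScript sketch of the inheritance behavior in question, with made-up names (`Chip`, `withChip`, `spawn`) and a toy task queue standing in for real threads. The point is the snapshot at spawn time, which plain `thread_local` doesn't give you:

```javascript
// A dynamic "current chip" value, Racket-parameter style.
let currentChip = null;
const queue = [];

// Install a chip for the dynamic extent of `body`, then restore.
function withChip(chip, body) {
  const saved = currentChip;
  currentChip = chip;
  try { return body(); } finally { currentChip = saved; }
}

// The key difference from thread_local: a spawned task snapshots the
// current value at *spawn* time and re-installs it when it later runs,
// the way a Racket thread cell inherits its value into a new thread.
function spawn(task) {
  const inherited = currentChip;
  queue.push(() => withChip(inherited, task));
}

const seen = [];
withChip({ name: "chipA" }, () => spawn(() => seen.push(currentChip.name)));
withChip({ name: "chipB" }, () => spawn(() => seen.push(currentChip.name)));

// Run the tasks outside either dynamic extent: each still sees the
// chip that was current when it was spawned.
while (queue.length) queue.shift()();
console.log(seen); // [ 'chipA', 'chipB' ]
```

Two "compilation tasks" for two different chips each see their own chip independently, which is exactly what a bare global or a bare thread_local fails to provide.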
I do not think a seasoned Lisper thinks in terms of S-expressions at all, counting brackets and doing other such shit. In his mind he sees only the 3D representation of pointer trees. The S-expression is just a way to represent this tree in textual format.
I always imagine a 3D Lisp environment, where you handle those trees with magical hand gestures like Tom Cruise. But this dream always fails when you get into the further details, like symbols and numbers: why use this archaic system of letters at all? Atoms should just be shapes and sizes.
Even in a somewhat flexible language like JavaScript, it is often painful when you are limited by the lack of this kind of flexibility. The code is _right there_, you can look at it, but you cannot do anything with it. (Exceptions exist where an API provides a sane data representation but that is rare.)
As a JS programmer by day and a Clojure hobbyist, it feels like being a ghost in one of those movies where they cannot interact with the world directly anymore. It's very frustrating. I sometimes think about all of the unnecessary work that is being done, the proliferation of libraries, tools and "transpilers", just because we lack this basic thing.
I have never played with Smalltalk or some of its dialects or descendants. But judging from articles like this one, I have the feeling that I will end up liking it and then feel even more constrained by mainstream languages.
If you decide to get your feet wet with the Smalltalk-family way of programming, I recommend the Pharo MOOC. I program mostly in Python, but OOP didn't click for me until I took the Pharo MOOC and saw how it was supposed to be: https://mooc.pharo.org/
I have this idea that the dichotomy between functional & imperative (OOP?) programming is the "same as" the dichotomy in mathematics between algebra and geometry. Some more discussion of this here [1].
I'm not sure that objects are data, depending on how data is defined, and I'd like to link to a very interesting discussion between Alan Kay, one of the creators of Smalltalk, and Rich Hickey, creator of Clojure (a Lisp), discussing between themselves the merits and the meaning of data: https://news.ycombinator.com/item?id=11945722
Objects, as seen from the outside, are entities with an identity and with behavior (hopefully associated with some interface contract). In addition, state may be part of the observable behavior of objects (and of their internal implementation). Data has no identity and no behavior; it is an immutable value. (Mutable data structures are actually objects.) Of course, you can realize objects in terms of data (e.g. machine code is data, and state is commonly represented by data), and conversely you can create objects that behave like data (like an immutable data structure), that is, realize data in terms of objects. That doesn't mean that objects are data or vice versa.
The “identity” and “behavior” encapsulated in an “object” is just more data to be interpreted in the right context. x86-encoded behavior of an object will all be gibberish to an ARM CPU. Or 64-bit code to a 32-bit CPU. You can add more data to say how the other data should be interpreted, but then you need even more data somewhere to state what the set of addressable interpreters is, which needs an interpreter as well. Data upon data upon data…
Agreed, but I prefer to think of immutable data as a message because, in line with Alan Kay's thinking, data is always going to be interpreted by something (even if that something is a person).
I think Alan Kay is saying that messages aren't good enough, and so he favors objects over data, because objects are more than data: they are also the interpreter of the data.
My understanding is that he'd favor something where the code to interpret the message is also serialized and stored/retrieved and passed around, which is how I think Smalltalk works.
Thus Alan Kay favors objects, which are not actually data but data+interpreter bundled together. Whereas Rich Hickey favors data on its own, and facilities to represent and model data alone, independently of a given interpreter, with various external things then applying their own interpretation of the data.
I also thought about the same issue; the term "data" was used a bit sloppily in the article. Similarly, the quote from pg explains programming Lisp as manipulating ASTs, which isn't true in a strict sense. More precisely: you manipulate a representation that is much closer to an AST.
But I don't think these make the larger points any weaker.
I can add my 2 cents to the Smalltalk discussion, as I've recently been watching many of Alan Kay's videos on YouTube.
I can summarize the Smalltalk thing this way: existing Smalltalk implementations (Squeak, Pharo, etc.) are not something we should use in the real world; they are proofs of concept and food for thought. Real Smalltalk-like/object-oriented systems were written bespoke for a specific machine and a specific customer. Current-day examples of OO (object-oriented) systems really are... the Internet with all of its nodes, operating systems' UI windowing systems to some degree (although very crippled and simple), maybe something like Kubernetes, but again, very crippled and awkward. The ultimate goal is to create an environment in which there can exist objects (say, processes/agents) which talk to the environment and to other objects using messages. On top of all of them you should be able to create a full programming language which would allow you to compose/iterate/send messages to objects in any way you want, creating as many levels of abstraction as you wish, maybe creating new wrapper objects that keep other objects inside, etc. The programming language itself should also be subject to real object orientation; for example, the debugger or a running program should be an object reacting to messages like "stop", "resume", etc. Languages like Rust/Go/JS should be used only to implement the internals of some object and to react to incoming messages in its narrow area: managing memory, data structures, and IO. This is partially true today; we use the OS to send messages, pipe sockets, etc. to objects/processes. Sending messages should ignore the physical location of the object, just like the Internet does.
Thinking this way, 99% of the glue code we write today should disappear, and we should be able to start thinking in higher abstractions: about running processes, computations in time and space, how they relate to each other, making queries to the processes, creating interceptors, etc. This is what I see when Alan Kay says the computer revolution hasn't happened yet and we're still dealing with a bag of bad ideas from the 60s and 70s, just packaged in nicer clothes because we have faster computers and better screens.
We should start thinking longer term and not grab for immediate money as soon as we see some results, because that's running in circles, and real wealth is not created.
The article says "Smalltalk classes are powerful because they themselves are objects".
This is true of CLOS (the Common Lisp Object System) as well. Every class is also an object. Which itself is an instance of...a metaclass. Which is also an object. Which you can introspect and modify at runtime.
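JavaScript offers a much weaker analogue of this that is still illustrative: a class is itself a first-class runtime object you can introspect and modify, and live instances pick up the changes through the prototype. A sketch (this is not CLOS, just the same "classes are objects" idea):

```javascript
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
}

// The class is itself an object you can inspect at runtime.
console.log(typeof Point);                                 // 'function'
console.log(Object.getOwnPropertyNames(Point.prototype));  // [ 'constructor' ]

// Modify the class at runtime: existing and future instances both
// see the new method, because lookup goes through the prototype.
const p = new Point(3, 4);
Point.prototype.norm = function () { return Math.hypot(this.x, this.y); };
console.log(p.norm()); // 5
```

CLOS goes much further, of course: the class of a class (its metaclass) is itself programmable, so you can change how instance allocation, slot access, and dispatch work.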
One can get wrapped around the axle by modifying such things casually, but when you really need it, it's quite handy.
> Languages such as Java, C++, and even Python seem to think that “object-oriented” means mostly “classes and inheritance.” Which is sort of like saying that “driving” means mostly “buttons and pedals.”
It's true. The aforementioned languages not only boil down Smalltalk's object system into a basic "code sharing" scheme, they also perpetuate the unnecessary class and instance dichotomy. There is no reason an "instance" cannot be a "class" by itself. Self (a successor to Smalltalk) gets rid of this dichotomy by assuming a prototype-based object system instead of Smalltalk's classes. In Self, a prototype can be copied and that copy can then be used as its own prototype (this is used heavily in its UI toolkit Morphic).
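A rough JavaScript approximation of the Self flavor described above. (JavaScript delegates to the prototype rather than copying slots down the way Self's clone does, so this isn't faithful clone semantics, but the class-free shape is similar.)

```javascript
// No class, no instance: just an object.
const window1 = {
  width: 800,
  height: 600,
  area() { return this.width * this.height; },
};

// A "copy" that overrides one slot and is a thing in its own right.
const window2 = Object.create(window1);
window2.width = 400;

console.log(window1.area()); // 480000
console.log(window2.area()); // 240000 -- same behavior, own state

// And that copy can in turn serve as the prototype for a third object,
// the way a Morphic copy can be used as its own prototype.
const window3 = Object.create(window2);
console.log(window3.area()); // 240000, inherited from window2
```
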
> The aforementioned languages not only boil down Smalltalk's object system into a basic "code sharing" scheme
This is wrong. C++ wasn't based on Smalltalk. It was inspired by ideas in Simula, which predated Smalltalk. Alan Kay tried to convince people (and succeeded with many) that he owned the term object-oriented, because he was the first person to use it. He wanted it to mean only his ideas, but from very early on people were applying it to a variety of ideas that were developing concurrently in the sixties and seventies, his being only one flavor among others. He never gave up trying to convince people that all the other competing ideas were a distorted, mistaken version of his own, but the truth is that those other ideas had their own history and earned their success on their own merits.
I think Smalltalk was inspired by Simula, and Kay has given it credit. For me the big insight of Simula was making the object first class. Once you make something like that first class you get a lot of benefits, and the distinctions between the styles of doing that (e.g. Lisp vs Smalltalk or Simula) are important, but nowhere near as important as that first insight.
> In Simula I, Dahl made two changes to the Algol 60 block: small changes, but changes with far-reaching consequences. First, “a block instance is permitted to outlive its calling statement, and to remain in existence for as long as the program needs to refer to it” [11]. Second, references to those block instances are treated as data, which gives the program a way to refer to them as independent objects.
> Alan Kay tried to convince people (and succeeded with many) that he owned the term object-oriented, because he was the first person to use it. He wanted it mean only his ideas, but from very early on people were applying it to a variety of ideas that were developing concurrently in the sixties and seventies, his being only one flavor among others.
Do you have any citations for this? From the quotes I've seen, it sounds like he regrets using the term 'object-oriented' for the approaches he was thinking of:
I think you get a good idea of his attitude in the first sentence:
> terms are also "colonized" for political and fad reasons
His version of the story always casts aspersions on alternative OO ideas and frequently, as here, implies devious and mercenary motives to people who were simply working along different lines.
Don't get me wrong, Alan Kay was brilliant, but like many visionaries he had utopian expectations for how his ideas would impact the world. When the future didn't materialize the way he envisioned it, he committed to this narrative that his ideas had been robbed of their success by imposter ideas that stole away attention and energy. It's a shame because there's a much more positive story to be told about the proliferation of different OO ideas that mutually influenced each other and succeeded in different ways.
Speaking of prototypal object systems, is there still an application for JavaScript prototypes today? I can't remember interacting with them even once in the last five years unless it was some legacy code from long before that.
That said, I don't think classes were much of an improvement. I've seen a lot of profoundly stupid code come out of it, like singletons with a static getInstance(), just because that's what you do in Java, never mind that JavaScript has object literals. Then again, before that I saw the same stupid singletons, but implemented with prototypes instead.
Which, it is worth pointing out, has (relatively recently, in 2015's ES6) added classes too – mostly just syntactic sugar for prototypes (but not entirely, since they also add some new semantics for which there is no equivalent in pure prototype-based code). Looking to the future, the language is likely to add more new semantics and syntax to classes with no direct equivalent in pure prototypes, causing JavaScript/ECMAScript to evolve from a purely prototype-based language into more of a hybrid between prototype-based and class-based OO.
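A small sketch of the "mostly sugar, but not entirely" point: the two definitions below behave the same for ordinary use, but the class form adds semantics the function form lacks, such as refusing to be called without `new`.

```javascript
// ES6 class form.
class Greeter {
  constructor(name) { this.name = name; }
  greet() { return `hello, ${this.name}`; }
}

// Hand-rolled prototype form the class largely desugars to.
function GreeterProto(name) { this.name = name; }
GreeterProto.prototype.greet = function () { return `hello, ${this.name}`; };

console.log(new Greeter("ada").greet());      // hello, ada
console.log(new GreeterProto("ada").greet()); // hello, ada

// But class constructors are not plain functions: calling one without
// `new` is a TypeError, with no prototype-only equivalent.
try { Greeter("ada"); } catch (e) { console.log(e instanceof TypeError); } // true
```
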
I think, even in prototype systems, the class-vs-instance dichotomy still exists at a conceptual level, even if the language/runtime doesn't make it explicit. My "window" prototype might be the same sort of thing as an individual window, but it would be a big mistake to try to actually display the prototype on the screen. Being explicit about the class-vs-instance dichotomy is a kind of strong typing – the language and runtime is aware of the conceptual distinction; whereas prototype-based languages are comparatively weak typing – the conceptual distinction still exists in the mind of the developer, but the language/runtime doesn't know about it. Class-based languages catch the "trying to use the prototype as if it was an actual instance" bug at compile-time, prototype-based languages leave it to blow up in bizarre ways at runtime (accidentally treating the prototype as an instance can lead to modifying the prototype, which can then cause unpredictable consequences for all the other objects having that prototype.)
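The "treating the prototype as an instance" bug is easy to reproduce in JavaScript when mutable state ends up on the prototype:

```javascript
// Prototype accidentally carries mutable state.
const windowProto = { tags: [], title: "untitled" };

const a = Object.create(windowProto);
const b = Object.create(windowProto);

// This looks like we're tagging `a`, but `tags` lives on the shared
// prototype, so the mutation leaks to every derived object.
a.tags.push("important");

console.log(b.tags); // [ 'important' ] -- b was never touched directly
```

A class-based language makes the proto/instance boundary explicit, so this category of bug tends to show up at compile time (or not arise at all) rather than as spooky action at a distance at runtime.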
One big problem with Java is the lack of meta-classes. Why can't I subclass Class? There are cases where doing so might be a sensible solution, but Java won't let you. In Java, for my instance methods, I can inherit them from a superclass, or have them implement an interface; but static class methods cannot be inherited, nor can the static methods of a class implement an interface. This forces people into workarounds such as the factory pattern (which many Java applications/libraries massively overuse), when having a static method implement an interface might be a simpler solution – but that would require us to be able to (effectively) subclass Class. It also produces weird things like 'Class<T>' to mean a class extending class (or implementing interface) T, where meta-classes would offer a much more elegant solution. I don't think the real problem with Java is that it is class-based rather than prototype-based; I think the problem is that its class system is half-baked (no metaclasses, no reified generics, etc.)
A point this article raises, which is very true – in Smalltalk, reflection is read-write – modifying classes at runtime is as easy as querying them. Java reflection is essentially read-only. It is possible to do read-write reflection in Java, but it involves immense complexity with source code or byte code generation libraries, custom class loaders, etc – to do something which Smalltalk supports trivially. The argument on the Java side, is that read-write makes optimisations (JIT/etc) a lot harder. No doubt true, but there are alternatives which Java has not pursued – for example, support multiple versions of a class, code is JITted to work with the latest version at JIT time, but other versions are detected and cause a fallback to interpreter mode (possibly followed by re-JITting to add support for that other version.)
> My "window" prototype might be the same sort of thing as an individual window, but it would be a big mistake to try to actually display the prototype on the screen.
This is not how Self works. In Self, it is legitimate to take an existing window, clone it, and change properties and methods to taste. The first window serves as prototype for the second, despite also being a regular window. It's also possible to use what I call the Master Mold pattern (after the Sentinel from Marvel Comics whose only purpose is to make more Sentinels). That is, to construct an object that isn't actually used except as a prototype for other objects. In that case, the prototype will indeed function like a class. But this isn't necessary or even encouraged in Self.
> In Self, it is legitimate to take an existing window, clone it, and change properties and methods to taste. The first window serves as prototype for the second, despite also being a regular window.
But doesn't this have the risk, that I change some property on the first window, without realising the second window is using the first as its prototype – and while that property change is beneficial for the first window, it is detrimental to the second?
I suppose the same thing can happen in class-based OO – I might make some change to a superclass which is beneficial for one subclass but breaks another. However, I think the class-vs-instance dichotomy helps here – if I'm trying to evaluate whether a change to the superclass might break a subclass, I can focus on the subclasses, and often I can get away with not studying every piece of code which instantiates or uses the class and its subclasses. I can't use that heuristic in prototype-based languages, since the language syntax and semantics don't distinguish unextended uses from extensions.
> It's also possible to use what I call the Master Mold pattern (after the Sentinel from Marvel Comics whose only purpose is to make more Sentinels). That is, to construct an object that isn't actually used except as a prototype for other objects.
Oh sure, but here we are back to "weak-typing" vs "strong-typing". A class-based language enforces the "Master Mold" pattern, and if you try to violate it, your code won't even compile. A prototype-based language doesn't enforce the pattern, it is just in the programmer's head. Being able to break it in an exceptional case might be helpful; on the other hand, there is the risk the programmer might break it by accident rather than intention, producing bugs which would never happen in a more strongly-typed language.
> But doesn't this have the risk, that I change some property on the first window, without realising the second window is using the first as its prototype – and while that property change is beneficial for the first window, it is detrimental to the second?
As I recall in Self, the two objects are independent of each other aside from the second being derived from the first upon its creation. Meaning changing one does not affect the other.
"But what if you want to change something about both windows, or all windows derived from a prototype?" you might say. "You would have to change them all manually, for every single affected window!" To which I say the Self developers apparently thought the rigidity of class hierarchy and the fragile base class problem were bigger problems than the unergonomic nature of making sweeping changes across an entire set of derived objects.
Class-based object systems present shortcomings because you cannot accurately predict a class hierarchy that will solve your problem, especially if the problem is not yet fully understood. Even in a language as flexible as Smalltalk, class-hierarchy decisions made before a problem was fully characterized impeded the programmers' ability to fit a solution to the problem. OO developers have learned to accept these shortcomings, and the extra work it takes to work around them. Self is an experiment in eliminating these problems, and it just presents a different set of tradeoffs.
What was really interesting about Self was its optimizability - they were able to implement a Smalltalk on Self that was much faster than the contemporary Smalltalk systems. The impression I got from poking around at Self was that there was a class/inheritance system informally approximated in the prototypical relations of the objects and traits, but that doesn't mean you can't formally define one.
For anyone curious, I recommend this talk [0] by David Ungar, one of Self's creators, that explains the history, philosophy, successes, and failures of the Self project. It's a shame more didn't get done with it (though V8, arguably the most important language runtime, is very similar). Someone has even been making a Zig version of Self recently [1].
>accidentally treating the prototype as an instance can lead to modifying the prototype, which can then cause unpredictable consequences for all the other objects having that prototype
In my head, that's not how prototypes work -- prototypes are objects that serve as a template for other objects. The behavior of the object is defined by its _traits object_ which is pointed to by a parent slot. Therefore there is no difference between modifying the prototype object and its copies - sending the copy message to them will still yield the same result, and one does not affect the other. The only difference is that the prototype object is in a well-known location for copying convenience. Note that this does change once you modify the traits object, but that's not an operation you do lightly anyway.
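A loose JavaScript sketch of that arrangement: shared behavior on a separate "traits" object (standing in for Self's parent slot), with the prototype being just a well-known instance that delegates to it. This is an approximation of Self's model, not a faithful one.

```javascript
// Behavior lives on the traits object.
const windowTraits = {
  describe() { return `${this.width}x${this.height}`; },
};

// The prototype is an ordinary object: state slots plus a parent
// link to the traits object.
const windowPrototype = Object.assign(Object.create(windowTraits), {
  width: 800,
  height: 600,
});

// "Copying" the prototype yields an object with its own state slots
// and the same traits parent, so prototype and copy behave alike.
const copy = Object.assign(Object.create(windowTraits), windowPrototype);
copy.width = 400;

console.log(windowPrototype.describe()); // 800x600
console.log(copy.describe());            // 400x600
```

Note that modifying `copy` or even `windowPrototype`'s own slots affects only that object; only a change to `windowTraits` ripples out to everything, which matches the comment's point that touching the traits object is not done lightly.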
> Java reflection is essentially read-only.
I believe part of this comes from the bytecode being read-only, no?
> The argument on the Java side, is that read-write makes optimisations (JIT/etc) a lot harder.
> No doubt true, but there are alternatives which Java has not pursued – for example, support multiple versions of a class, code is JITted to work with the latest version at JIT time, but other versions are detected and cause a fallback to interpreter mode
This is already how it works. You can redefine any method at runtime and the JIT will fall back until it can optimise the new version again. What Java doesn't allow is modifying classes or interfaces, because that would compromise the type safety of the language, which AFAIK is not a concern in Smalltalk.
Another power of Lisp is that it can accept functions as first-class arguments and produce functions as returned values.
This means you can do obvious things like sort an array given a comparison function, or produce a generator from a static array. But it also means you can convert one function into another kind of function. Functional analysis calls these operators. Differentiation is an operator, as is integration, as is the Fourier transform, as is function composition. Many such things don't even need macros, but macros can help.
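A quick JavaScript illustration of the point: functions go in, functions come out. `D` below is a numerical differentiation "operator" that maps a function to an approximation of its derivative, and `compose` builds new functions out of old ones.

```javascript
// Function composition: (f ∘ g)(x) = f(g(x)).
const compose = (f, g) => (x) => f(g(x));

// D takes a function and returns (a central-difference approximation
// of) its derivative -- a function-to-function operator.
const D = (f, h = 1e-6) => (x) => (f(x + h) - f(x - h)) / (2 * h);

const square = (x) => x * x;
const dSquare = D(square); // ≈ x => 2x

console.log(dSquare(3));                  // ≈ 6
console.log(compose(square, dSquare)(3)); // ≈ 36
```

In Lisp this style is pervasive, and macros can go a step further by operating on the code of functions rather than just their values, but as the comment says, much of it needs no macros at all.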
The dynamic part of Lisp and its CLOS (from Smalltalk), Smalltalk itself, Objective-C (from Smalltalk?), and JavaScript (from Scheme and its object-prototype heritage) is the interesting one. The system can be modified whilst running … like a mainframe that cannot shut down but whose code can be changed in flight (strange to say that, as that is CICS's role, etc).
[1]: This book introduces the phrase structure in a shockingly LISP-like way. https://www.amazon.com/Grammar-Frank-Palmer/dp/B000S5VSAS
https://en.wikipedia.org/wiki/Phrase_structure_rules