
Text as code, for instance, is just an extension of a concept which has been around in one form or another since the scribes of ancient Babylon, or thereabouts. I can imagine more complex ways of representing code but I can't really imagine anything more efficient for translating human concepts into machine language. Except maybe direct, augmented telepathy, but even then people would probably think to the computer in glyphs and symbols. Complexity doesn't always correlate with advancement.



I once had the pleasure of asking Marvin Minsky if he thought we'd ever use something other than text to create computer programs. Without missing a beat he said, "If it's good enough for Plato and Aristotle, it's good enough for me."


The problem with code as text is not the visual representation but the manipulation. From the point of view of tooling for live-coding, it's incredibly painful to deal with an unstructured pile of text that at any given point may be in an invalid state (i.e., partially edited). Structured editing (http://en.wikipedia.org/wiki/Structure_editor) allows the tooling to deal with consistent, structured data whilst still taking advantage of the expressive power of language.
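As a rough illustration, here is a minimal sketch in Python (hypothetical, not any particular editor's API) of what that buys you: edits are tree operations, so the program is well-formed at every step and never passes through a half-parsed state.

    from dataclasses import dataclass, field

    @dataclass
    class Node:
        kind: str                       # e.g. "call", "symbol", "number"
        value: str = ""
        children: list = field(default_factory=list)

    def replace_child(parent: Node, index: int, new_node: Node) -> None:
        # A structured edit: swap one subtree for another. There is no
        # intermediate "half-typed" text for the tooling to choke on.
        parent.children[index] = new_node

    # (range 10) as a tree, roughly
    call = Node("call", children=[Node("symbol", "range"),
                                  Node("number", "10")])
    # Changing the bound to 20 is a single, always-valid tree operation:
    replace_child(call, 1, Node("number", "20"))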


It's actually not that hard to deal with text, even in partially edited states. It's just that most people don't know how to build a decent incremental parser with fairly good error recovery; some of us do.


But the hard part isn't dealing with invalid parses and error recovery. The hard part is dealing with completely valid bits of code that aren't yet finished:

    (doseq [x (range|cursor|)]
      )
You can wait until the end of time, but that won't finish ;) Whereas I probably wanted to get (range 10) typed out before it went off and looped forever.

Text also doesn't provide you with stable IDs. If I previously had function "foo" in a file and now that's gone, but there's a function "bar" in the same place with roughly the same code, does that mean it was renamed? Or did foo get deleted, so we need to surface that anything using foo is an error? How would you know? The only thing you have to go on is position, and position can change for all sorts of other reasons. I ran into this problem when trying to track functions in a single editor: when the file can change in any way, reconciling that file with your previous view of the world is tremendously difficult.


I don't have a problem with this. If you accidentally encode an infinite loop, you just break it on the next code change. You can memoize your token stream in an incremental lexer to use tokens as stable IDs: you use the same symbol for bar as you did for foo because the token/tree start position never changed; only the contents of the existing token happened to change! This is what I do in YinYang, and it works well. Of course, to get here, you have to be 100% incremental and not use existing batch tools (or adapt them first to be incremental).
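A rough sketch in Python of what I mean (hypothetical names, much simpler than the real YinYang code): the token object itself is the stable ID, so anything attached to it survives the edit.

    import itertools

    _fresh = itertools.count()

    class Token:
        # A memoized token: its identity is stable across edits even when
        # the text inside it changes.
        def __init__(self, text):
            self.id = next(_fresh)
            self.text = text

    def relex(old_tokens, new_texts):
        # Toy reuse by position: if a token still starts in the same place,
        # keep the same Token object and just update its contents. (A real
        # incremental lexer matches by edit ranges, not list position.)
        result = []
        for i, text in enumerate(new_texts):
            if i < len(old_tokens):
                old_tokens[i].text = text    # same identity, new contents
                result.append(old_tokens[i])
            else:
                result.append(Token(text))
        return result

    tokens = relex([Token("foo")], ["bar"])
    # tokens[0].id is unchanged: "foo" becoming "bar" reads as a rename,
    # not as a delete plus an unrelated addition.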

I'd be happy to share these techniques sometime.


I'd enjoy reading.


Ok, I'll make an effort to really write this up sometime, maybe as an SLE paper.


Please do even if only as an informal blog post or something. :)


I agree that the problem is not error recovery, but is more generally described by Peter Naur's "Programming as Theory Building"[1]. With that framing, programming is hard because collectively creating a theory, and teaching other people about it, is hard. Making it easier is therefore solving the problem Douglas Engelbart envisioned solving: Augmenting the Human Intellect.

[1] http://catenary.wordpress.com/2011/04/19/naurs-programming-a...


I apologize for the fanboy post in advance, but Peter Naur was such a badass.


Any chance you could point me at some resources on incremental parsing? It's something I've been interested in for a while.


I wrote a workshop paper last year, but it wasn't very deep. Actually, incremental parsing, at least the way I do it, doesn't really involve any special algorithms. Here are the three points:

* Make your lexer incremental first: memoize tokens between edits so you have something to attach trees to that will still be there after the next edit!

* Match braces (or determine indented blocks) separately before you even bother parsing. This makes incremental parsing more robust, and it is really easy to do without parsing information (unless your language overloads brace lexemes, like Java generics do! Scala doesn't have this problem, however). Also, autocomplete braces even if your users really hate that, because otherwise you won't have anything nice to reason about 99% of the time (indentation-based languages are really better here, for obvious reasons). Tell them to just use emacs if they don't like it.

* The last thing to do is just memoize parse trees (by attaching them to your memoized tokens) and then replay parsing on trees whenever their parses could have changed. So if there is an edit, invalidate the innermost tree, or multiple trees if the edit occurs at a boundary. If a child tree's parse changes, invalidate its parents. Log your symbol-table additions so you can undo them when they are no longer performed on a parse (or the parse tree is deleted); trace your symbol-table references so lookups can be replayed if the binding for the name being looked up changes. Don't worry about type checking in a separate pass, because you can just replay the parse multiple times if necessary (see the sketch below).
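Putting the three points together, a rough sketch of the invalidation machinery (hypothetical names, heavily simplified, and not the actual YinYang code):

    class Tree:
        # A memoized parse tree, attached to the memoized token it starts at.
        def __init__(self, start_token, parent=None):
            self.start_token = start_token
            self.parent = parent
            self.children = []
            self.valid = True
            if parent is not None:
                parent.children.append(self)

    def edit_inside(tree):
        # An edit invalidates the innermost enclosing tree. Here we walk up
        # eagerly; the real version only invalidates a parent when the
        # child's parse result actually changed.
        while tree is not None and tree.valid:
            tree.valid = False
            tree = tree.parent

    def reparse(tree, parse_fn):
        # Replay parsing only on invalidated trees; valid subtrees are
        # reused as-is. parse_fn would also log symbol-table additions so
        # they can be undone if a later parse no longer performs them.
        if not tree.valid:
            parse_fn(tree)
            tree.valid = True
        for child in tree.children:
            reparse(child, parse_fn)

    root = Tree(start_token="(")            # e.g. a top-level form
    inner = Tree(start_token="(", parent=root)
    edit_inside(inner)                      # a keystroke lands inside `inner`
    reparse(root, lambda t: None)           # only root and inner re-derived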

The last point is quite fascinating and something I'm working on right now to turn into a more general programming paradigm. Check out:

http://research.microsoft.com/apps/pubs/default.aspx?id=2112...


Yeah, all those structure editors and node editors and meta-node editors, and after all those years, simple text editors still keep winning. My personal impression is that developers are actually moving away from those uber-tools, even IDEs, toward vim plus shell tooling...


Fair enough.



