*But removing or harming the ability to do text analysis [...] does stop us from...

rlpb · on Aug 2, 2013

> Well, OK, but we’ve been using these text-based tools for a year or two now, and I don’t see many radical advances taking place in how we use or combine them. Are you sure you’re not chasing an illusion here?

I'm not claiming recent advances. I'm claiming existing power that has been around for decades, which we would lose if we compromised the text tooling available today.

Have you read TAOUP? Do you understand the extent of the power that existing text tooling gives us today? Are you experienced in the advanced use of the existing tools, so you are able to make comparisons about their power?

> As long as everything is limited to manipulation of freeform text files, perhaps you never will. That doesn’t mean better tools aren’t possible; it just means they aren’t possible within the constraints you’re choosing to impose.

This is backwards. We move forward when people show how it can be done. Please show us how we can improve diffs, patches and merges by moving to semantic data structures over text, without compromising any existing capabilities. Even just illustrating specifics of how these tools might work, rather than implementing them, will do something for your argument. The onus is on you.

Chris_Newton · on Aug 2, 2013

> I'm claiming existing power that has been around for decades, which we would lose if we compromised the text tooling available today.

Why would we lose it? The power of those tools isn’t in a particular executable, it’s in the algorithms they embody. For example, it is useful to compare two text streams reasonably efficiently and identify differences. How those differences are then presented obviously matters, but if you’ve got the algorithms and the ideas underlying them, producing a new tool to apply those ideas in a different context is the easy part.
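A minimal sketch of that point, assuming Python and its standard `difflib` module: the comparison algorithm is a library call, and the "tool" around it is a few lines of presentation. The `line_diff` helper and the sample strings are hypothetical, made up purely for illustration.

```python
import difflib


def line_diff(old: str, new: str) -> list[str]:
    """Compare two texts line by line and return unified-diff output lines."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",  # inputs have no trailing newlines, so emit none
        )
    )


# Hypothetical sample inputs: the same function before and after a rename.
old = "def area(w, h):\n    return w * h\n"
new = "def area(width, height):\n    return width * height\n"

for line in line_diff(old, new):
    print(line)
```

The same `unified_diff` call works on any sequence of strings, so the algorithm is not tied to one executable or one file format.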

The only significant difference I see is that if you made a major change, for example adopting a more structured storage model or using some sort of action/history analysis to better capture a programmer’s intent, then you would have more data to use in your algorithms, and you might therefore be able to present more interesting results.
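As a rough illustration of what "more data" could buy, here is a sketch that compares two versions of a function by their parsed syntax trees instead of their characters. It assumes Python's standard `ast` module as a stand-in for a structured storage model; `same_structure` and the sample sources are hypothetical names for this example, not an existing tool.

```python
import ast


def same_structure(old_src: str, new_src: str) -> bool:
    """Return True if both sources parse to the same syntax tree.

    ast.dump() omits line/column attributes by default, so whitespace,
    comments and redundant parentheses do not affect the comparison.
    """
    return ast.dump(ast.parse(old_src)) == ast.dump(ast.parse(new_src))


# Hypothetical sample inputs.
original = "def area(w, h):\n    return w * h\n"
reformatted = "def area(w,h):\n    # width times height\n    return (w*h)\n"
changed = "def area(w, h):\n    return w + h\n"

print(same_structure(original, reformatted))  # True: only layout differs
print(same_structure(original, changed))      # False: the operator changed
```

A line-based diff would report every line of `reformatted` as modified, whereas the tree comparison reports no change. This only sketches the detection side; presenting, patching and merging tree-level differences is the harder part.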

> Have you read TAOUP? Do you understand the extent of the power that existing text tooling gives us today? Are you experienced in the advanced use of the existing tools, so you are able to make comparisons about their power?

Yes, though I find your emphasis on that one book a little surprising. For one thing, the UNIX philosophy had been established for decades before Raymond wrote that particular work. For another, I seem to recall that he gives examples in the book of both text and binary formats being useful. I don’t think his point was that text formats were good and non-text ones bad; I think he was arguing that things like adaptability and composability were good and that flexible and standardised formats helped to achieve those things.

> We move forward when people show how it can be done.

Right, so why aren’t the programming language community picking up on decades of research and industrial progress with databases and HCI? Programming languages and the related tools are, fundamentally, just a user interface to design and control a complex, highly structured set of data.

> Please show us how we can improve diffs, patches and merges by moving to semantic data structures over text, without compromising any existing capabilities.

You’re begging the question, by starting from the position that having an equivalent to today’s text-based diffs, patches and merge tools is desirable. I don’t think that is necessarily true.

As a programmer, I want to be able to specify how my software should work, and I want to be able to explore and modify that specification effectively, and I need to be able to do these things in collaboration with others. My claim is that to do those things much better than we do them today, we may need to move to a different representation than freeform text and then build new tools that are designed to solve our problems in terms of that new representation.

The problem is that there is so much momentum behind text-based formats today that we are effectively stuck around a local maximum. No one individual could possibly meet your challenge today, and I’m sure you were well aware of that when you made it. That doesn’t mean the community as a whole couldn’t do it, but it would need some serious collaboration, which realistically means one of the heavyweight organisations with the resources to bootstrap a whole new software development ecosystem would need to throw its weight behind such a project. Unfortunately, most if not all such organisations are commercial in nature, and the commercial incentives don’t align with moving in that direction.