
I think what we are missing is smarter tools. I think that's the argument here. What we really need is the ability to say something like "I want a data structure D holding objects of type Q, with an ordering using this key accessor (A) called Alpha, and an index with this key accessor (B) called Beta", and then later in the code say "I want to observe D for the following conditions ...", and let the compiler figure out how to store that data in memory, how to most efficiently build a data structure to trigger those conditions, etc. (instead of saying List<> or IList<>, or writing a custom data structure, etc.).
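
A rough, runnable sketch (Java 16+) of what such a declaration might look like. Every name here (Declared, observe, Order, etc.) is invented for illustration, and instead of a compiler choosing the layout, this toy just hard-codes a TreeMap for the ordering and a HashMap for the index:

    import java.util.*;
    import java.util.function.*;

    // Hypothetical "declared" structure: an ordering ("Alpha") plus an index ("Beta"),
    // both specified only by key accessors; a real tool would pick the storage itself.
    class Declared<Q> {
        private final TreeMap<Long, Q> alpha = new TreeMap<>();    // ordering Alpha (keys assumed unique in this toy)
        private final Map<String, List<Q>> beta = new HashMap<>(); // index Beta
        private final Function<Q, Long> orderKey;                  // key accessor A
        private final Function<Q, String> indexKey;                // key accessor B
        private final List<Consumer<Q>> observers = new ArrayList<>();

        Declared(Function<Q, Long> orderKey, Function<Q, String> indexKey) {
            this.orderKey = orderKey;
            this.indexKey = indexKey;
        }

        // "I want to observe D for the following conditions ..."
        void observe(Predicate<Q> condition, Consumer<Q> action) {
            observers.add(q -> { if (condition.test(q)) action.accept(q); });
        }

        void insert(Q item) {
            alpha.put(orderKey.apply(item), item);
            beta.computeIfAbsent(indexKey.apply(item), k -> new ArrayList<>()).add(item);
            observers.forEach(obs -> obs.accept(item));
        }
    }

    record Order(String id, long timestamp, String customer) {}

    class Demo {
        public static void main(String[] args) {
            Declared<Order> d = new Declared<>(Order::timestamp, Order::customer);
            d.observe(o -> o.timestamp() > 100, o -> System.out.println("late order: " + o.id()));
            d.insert(new Order("a1", 50, "acme"));
            d.insert(new Order("a2", 150, "acme")); // triggers the observer
        }
    }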

And then if you really care (and this is where the non-text/denser ASTs come in, as well as a deeper understanding of computer science and programming), you can add annotations to your description of the data structure like "I want reverse traversal of ordering Alpha to be O(n)". Annotations could even cause a compile error, like "I want to insert via index Beta in O(1) time, with an O(n) in-order traversal time of Beta". And even fine-grained annotations like "For index Beta use a red-black tree", or more specifically, "To implement index Beta use module 'foo.bar.baz' by 'FancyBeansCorp', using class 'AwesomeEnterpriseIndex'".
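
As Java annotations, those constraints might look something like the sketch below. The annotation types are made up; nothing in today's javac would enforce them, so the "compile error for an unsatisfiable combination" part is imagined:

    import java.lang.annotation.*;

    @Retention(RetentionPolicy.SOURCE)
    @Repeatable(Complexities.class)
    @interface Complexity {
        String operation(); // e.g. "insert via index Beta"
        String bound();     // e.g. "O(1)"
    }

    @Retention(RetentionPolicy.SOURCE)
    @interface Complexities { Complexity[] value(); }

    @Retention(RetentionPolicy.SOURCE)
    @interface Implementation {
        String index();     // e.g. "Beta"
        String module();    // e.g. "foo.bar.baz"
        String className(); // e.g. "AwesomeEnterpriseIndex"
    }

    // A hypothetical checker could reject combinations no implementation can satisfy.
    @Complexity(operation = "reverse traversal of ordering Alpha", bound = "O(n)")
    @Complexity(operation = "insert via index Beta", bound = "O(1)")
    @Implementation(index = "Beta", module = "foo.bar.baz", className = "AwesomeEnterpriseIndex")
    class AnnotatedDeclaration { }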

Let the tools do the heavy lifting! Let data structure and systems programming experts write data structure implementation strategies. Hell, I bet you could build a marketplace of code just for optimized data structures, if it were easy to swap them in and out. Heck, those sorts of dependencies should be at the company level, without even touching the project in question (a company-level code policy to use a specific library of data structure implementation strategies, regardless of the project being built, unless the project or user overrides it). Let alone a marketplace of implementations for algorithms, general code patterns (conditionals, while loops), AI, networking, UI, etc.

Compilers should take a long time. A very long time. Sure, we can short-circuit them for debugging and just use the best workable code we have after a few seconds, so we can test (and besides, with constant compilation, it should have already solved most of the problems reasonably well by the time you click run). Our compilers should be smart enough to change their output for different machines, not just at the assembly-optimization level (better intrinsics, etc.), but for different cache sizes and different memory performance characteristics (switching between memory-intensive and processor-intensive algorithms) as necessary. And yes, I expect it to be deterministic for every platform.



So, basically you want a uniform way of defining data structures that can use different kinds of indices on different fields. Let me add a few more desiderata to your list:

- It would be cool if these data structures could be transparently saved to disk, because data usually lives longer than code;

- It would be cool if they could be used by multiple applications, with some sort of guarantee that each application sees a consistent view of the data at any particular time;

- It would also be cool to have a declarative language for accessing several of these data structures at once, "joining" them on the values of certain fields and dynamically picking the best indices to use. Since you want a marketplace of implementations, it would make sense to standardize that language, so different data structure libraries can support it. A good name for it would be Structured Query Language, or SQL for short.
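
To make that last point concrete, here's roughly what those three bullets already look like today through JDBC (this assumes the H2 in-memory database is on the classpath; any SQL engine would do). The indices are declared separately from the query, and the engine decides which ones to use for the join:

    import java.sql.*;

    class SqlDemo {
        public static void main(String[] args) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:h2:mem:demo");
                 Statement s = c.createStatement()) {
                s.execute("CREATE TABLE orders(id INT PRIMARY KEY, customer_id INT, total INT)");
                s.execute("CREATE TABLE customers(id INT PRIMARY KEY, name VARCHAR(64))");
                s.execute("CREATE INDEX beta ON orders(customer_id)"); // the 'index Beta' idea
                s.execute("INSERT INTO customers VALUES (1, 'acme')");
                s.execute("INSERT INTO orders VALUES (10, 1, 250)");

                // Declarative join; the planner decides whether to use the index.
                ResultSet rs = s.executeQuery(
                    "SELECT c.name, o.total FROM orders o JOIN customers c ON o.customer_id = c.id");
                while (rs.next()) System.out.println(rs.getString(1) + ": " + rs.getInt(2));
            }
        }
    }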

Tl;dr: You are completely right that network databases (object graphs and special-purpose data structures) are inferior to relational databases (declarative general-purpose data structures that separate querying and indexing, and allow a host of other general-purpose functionality). That idea has the potential to transform the whole software industry and lead to billion-dollar profits, as it has amply demonstrated since the 1970s. But since today's programmers are a forgetful lot, perhaps you could rebrand the idea and sell it to them as "Big Data 2.0" or something.


I understand how SQL ties into what I want, but there are a few points you might have overlooked about what I want:

* Micro-optimizations. The implementations should be granular, not monolithic like SQL servers (let alone the fact that optimized SQL rarely translates between implementations, since most optimization commands are not cross-platform).

* Embeddable DSL. This one is honestly a problem with programming languages in general, but I've yet to see a static programming language which can tell me when my query is malformed at compile time (or hell, even provide decent syntax highlighting). Let alone importable custom extensions to SQL, or embedding lazy functions within the data structure.

* Intent vs. exact. The ability to describe partial intent, rather than exactly how to organize the data (e.g. describing the operations expected to be executed across it, rather than the indexes to build), as well as having information available from the compiler about how the data structure is used (so that even without the explicit declarations, the intent can still be optimized for).

SQL is a fine direction to approach the big-picture problem from, but it isn't anywhere near the goal I care about. I'm not interested in dealing with big data. I'm just tired of having to know the difference between arcane implementation names for ordered sequences [1]; I want to be able to describe what I want my ordered sequence to do, and let the compiler figure out the best implementation available.
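
For what it's worth, even in plain Java you can fake a small slice of that "describe what I want" style. The Usage hints and chooseSequence() below are invented for illustration; the point is only that the caller states its expected operations and never names ArrayList, LinkedList, or TreeSet itself:

    import java.util.*;

    enum Usage { RANDOM_ACCESS, FREQUENT_MIDDLE_INSERTS, SORTED_ITERATION }

    class SequencePicker {
        static <T extends Comparable<T>> Collection<T> chooseSequence(Set<Usage> hints) {
            if (hints.contains(Usage.SORTED_ITERATION)) return new TreeSet<>();
            if (hints.contains(Usage.FREQUENT_MIDDLE_INSERTS)) return new LinkedList<>();
            return new ArrayList<>(); // default: cheap random access and appends
        }

        public static void main(String[] args) {
            Collection<Integer> seq =
                SequencePicker.chooseSequence(EnumSet.of(Usage.SORTED_ITERATION));
            seq.add(3); seq.add(1); seq.add(2);
            System.out.println(seq); // [1, 2, 3] -- the caller never named TreeSet
        }
    }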

[1] http://www.programcreek.com/2013/03/arraylist-vs-linkedlist-...


Absolutely. This is actually part of our plan. We're writing mid-level-ish specifications in a logic language and then compiling them into an incremental dataflow network. Our current prototype just uses the same implementation for each rule, but later we intend to be able to pick and choose different data structures and join algorithms per index/rule.
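
To give a feel for the incremental part (a toy illustration in Java, not our actual system): instead of re-running a query over all the data, each new fact flows through a compiled node and updates the derived result in place:

    import java.util.*;
    import java.util.function.Predicate;

    class IncrementalFilter<T> {
        private final Predicate<T> rule;            // e.g. "alert(x) :- reading(x), x > 100"
        private final List<T> derived = new ArrayList<>();

        IncrementalFilter(Predicate<T> rule) { this.rule = rule; }

        void onInsert(T fact) {                      // called once per new fact
            if (rule.test(fact)) derived.add(fact);  // no rescan of earlier facts
        }

        List<T> result() { return derived; }

        public static void main(String[] args) {
            IncrementalFilter<Integer> alerts = new IncrementalFilter<>(x -> x > 100);
            for (int reading : new int[] {42, 150, 7, 300}) alerts.onInsert(reading);
            System.out.println(alerts.result());     // [150, 300]
        }
    }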

I have a lot to say about separating specification from implementation and trying to better capture programmer intent (e.g. http://www.freelists.org/post/luajit/Ramblings-on-languages-...) but the blog post was already pretty long.


To me it seems the only way to smarter tools is static analysis, but there's only so much one can do in a dynamic language.

There seems to be this dissonance between dynamic languages, which tend to be popular with new programmers for a variety of reasons, and the better tools enabled by static analysis of (usually) static languages.

Static languages seem to present a "hump" that a lot of people are averse to. If we really want these smarter tools, we're going to have to find a way to get programmers to use more static techniques; we need to get them over that "hump". These smarter tools might then ease things in the long run, but they don't have quite the same immediate-gratification appeal as dynamic languages, because of the up-front effort.


There is also only so much one can do with a static language.

Dynamic languages are preferred by those who don't want a conservative, verbose static type system to get in the way of writing code. You can design a static type system that is less verbose (e.g. via more inference), but then it can become more conservative (Hindley-Milner's inability to deal very well with semi-unification in the form of subtyping and assignment). That is the "hump."

Smarter "more magical" compilers are a nice idea in theory, but hard to realize in practice. There are even limits to the kind of analysis we can do dynamically, but they are a bit less constrained then what we can do statically.


You can do a lot of static analysis of dynamic languages. Check what PyLint/PyFlakes do, for example.

It's probably not the dynamic language that is the problem, but its constructs.



