
I think what we are missing is smarter tools. I think that's the argument here. What we really need is the ability to say something like "I want a data structure D holding objects of type Q, with an ordering using this key accessor (A) called Alpha, and an index with this key accessor (B) called Beta", and then later in the code say "I want to observe D for the following conditions ...", and let the compiler figure out how to store that data in memory, how to most efficiently build a data structure to trigger those conditions, etc. (instead of saying List<> or IList<>, or writing a custom data structure, etc.).
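
A rough, runnable sketch (Java 16+) of what such a declaration might look like. Every name here (Declared, observe, Order, etc.) is invented for illustration, and instead of a compiler choosing the layout, this toy just hard-codes a TreeMap for the ordering and a HashMap for the index:

    import java.util.*;
    import java.util.function.*;

    // Hypothetical "declared" structure: an ordering ("Alpha") plus an index ("Beta"),
    // both specified only by key accessors; a real tool would pick the storage itself.
    class Declared<Q> {
        private final TreeMap<Long, Q> alpha = new TreeMap<>();    // ordering Alpha (keys assumed unique in this toy)
        private final Map<String, List<Q>> beta = new HashMap<>(); // index Beta
        private final Function<Q, Long> orderKey;                  // key accessor A
        private final Function<Q, String> indexKey;                // key accessor B
        private final List<Consumer<Q>> observers = new ArrayList<>();

        Declared(Function<Q, Long> orderKey, Function<Q, String> indexKey) {
            this.orderKey = orderKey;
            this.indexKey = indexKey;
        }

        // "I want to observe D for the following conditions ..."
        void observe(Predicate<Q> condition, Consumer<Q> action) {
            observers.add(q -> { if (condition.test(q)) action.accept(q); });
        }

        void insert(Q item) {
            alpha.put(orderKey.apply(item), item);
            beta.computeIfAbsent(indexKey.apply(item), k -> new ArrayList<>()).add(item);
            observers.forEach(obs -> obs.accept(item));
        }
    }

    record Order(String id, long timestamp, String customer) {}

    class Demo {
        public static void main(String[] args) {
            Declared<Order> d = new Declared<>(Order::timestamp, Order::customer);
            d.observe(o -> o.timestamp() > 100, o -> System.out.println("late order: " + o.id()));
            d.insert(new Order("a1", 50, "acme"));
            d.insert(new Order("a2", 150, "acme")); // triggers the observer
        }
    }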

And then if you really care (and this is where the non-text/denser ASTs come in, as well as a deeper understanding of computer science and programming), you can add annotations to your description of the data structure like "I want reverse traversal of ordering Alpha to be O(n)". Annotations could even cause a compile error, like "I want to insert via index Beta in O(1) time, with an O(n) in-order traversal time of Beta". And even fine-grained annotations like "For index Beta use a red-black tree", or more specifically, "To implement index Beta use module 'foo.bar.baz' by 'FancyBeansCorp', using class 'AwesomeEnterpriseIndex'".
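
As Java annotations, those constraints might look something like the sketch below. The annotation types are made up; nothing in today's javac would enforce them, so the "compile error for an unsatisfiable combination" part is imagined:

    import java.lang.annotation.*;

    @Retention(RetentionPolicy.SOURCE)
    @Repeatable(Complexities.class)
    @interface Complexity {
        String operation(); // e.g. "insert via index Beta"
        String bound();     // e.g. "O(1)"
    }

    @Retention(RetentionPolicy.SOURCE)
    @interface Complexities { Complexity[] value(); }

    @Retention(RetentionPolicy.SOURCE)
    @interface Implementation {
        String index();     // e.g. "Beta"
        String module();    // e.g. "foo.bar.baz"
        String className(); // e.g. "AwesomeEnterpriseIndex"
    }

    // A hypothetical checker could reject combinations no implementation can satisfy.
    @Complexity(operation = "reverse traversal of ordering Alpha", bound = "O(n)")
    @Complexity(operation = "insert via index Beta", bound = "O(1)")
    @Implementation(index = "Beta", module = "foo.bar.baz", className = "AwesomeEnterpriseIndex")
    class AnnotatedDeclaration { }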

Let the tools do the heavy lifting! Let data structure and systems programming experts write data structure implementation strategies. Hell, I bet you could build a marketplace of code just for optimized data structures, if it were easy to swap them in and out. Heck, those sorts of dependencies should be at the company level, without even touching the project in question (a company-level code policy to use a specific library of data structure implementation strategies, regardless of the project being built, unless the project or user overrides it). Let alone a marketplace of implementations for algorithms, general code patterns (conditionals, while loops), AI, networking, UI, etc.

Compilers should take a long time. A very long time. Sure, we can short-circuit them for debugging and just use the best workable code we have after a few seconds, so we can test (and besides, with constant compilation, it should have already solved most of the problems reasonably well by the time you click run). Our compilers should be smart enough to change their output for different machines, not just at the assembly-optimization level (better intrinsics, etc.), but for different cache sizes and different memory performance characteristics (switching between memory-intensive and processor-intensive algorithms) as necessary. And yes, I expect it to be deterministic for every platform.



So, basically you want a uniform way of defining data structures that can use different kinds of indices on different fields. Let me add a few more desiderata to your list:

- It would be cool if these data structures could be transparently saved to disk, because data usually lives longer than code;

- It would be cool if they could be used by multiple applications, with some sort of guarantee that each application sees a consistent view of the data at any particular time;

- It would also be cool to have a declarative language for accessing several of these data structures at once, "joining" them on the values of certain fields and dynamically picking the best indices to use. Since you want a marketplace of implementations, it would make sense to standardize that language, so different data structure libraries can support it. A good name for it would be Structured Query Language, or SQL for short.
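
To make that last point concrete, here's roughly what those three bullets already look like today through JDBC (this assumes the H2 in-memory database is on the classpath; any SQL engine would do). The indices are declared separately from the query, and the engine decides which ones to use for the join:

    import java.sql.*;

    class SqlDemo {
        public static void main(String[] args) throws SQLException {
            try (Connection c = DriverManager.getConnection("jdbc:h2:mem:demo");
                 Statement s = c.createStatement()) {
                s.execute("CREATE TABLE orders(id INT PRIMARY KEY, customer_id INT, total INT)");
                s.execute("CREATE TABLE customers(id INT PRIMARY KEY, name VARCHAR(64))");
                s.execute("CREATE INDEX beta ON orders(customer_id)"); // the 'index Beta' idea
                s.execute("INSERT INTO customers VALUES (1, 'acme')");
                s.execute("INSERT INTO orders VALUES (10, 1, 250)");

                // Declarative join; the planner decides whether to use the index.
                ResultSet rs = s.executeQuery(
                    "SELECT c.name, o.total FROM orders o JOIN customers c ON o.customer_id = c.id");
                while (rs.next()) System.out.println(rs.getString(1) + ": " + rs.getInt(2));
            }
        }
    }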

Tl;dr: You are completely right that network databases (object graphs and special-purpose data structures) are inferior to relational databases (declarative general-purpose data structures that separate querying and indexing, and allow a host of other general-purpose functionality). That idea has the potential to transform the whole software industry and lead to billion-dollar profits, as it has amply demonstrated since the 1970s. But since today's programmers are a forgetful lot, perhaps you could rebrand the idea and sell it to them as "Big Data 2.0" or something.


I understand how SQL ties into what I want, but there are a few points you might have overlooked about what I want:

* Micro-optimizations. The implementations should be granular, not monolithic like SQL servers (let alone the fact that optimized SQL rarely translates between implementations, since most optimization commands are not cross-platform).

* Embeddable DSL. This one is honestly a problem with programming languages in general, but I've yet to see a static programming language which can tell me when my query is malformed at compile time (or hell, even provide decent syntax highlighting). Let alone importable custom extensions to SQL, or embedding lazy functions within the data structure.

* Intent vs. exact. The ability to describe partial intent, rather than exactly how to organize the data (e.g. describing the operations expected to be executed across it, rather than the indexes to build), as well as having information available from the compiler about how the data structure is used (so that even without the explicit declarations, the intent can still be optimized for).

SQL is a fine direction to approach the big-picture problem from, but it isn't anywhere near the goal I care about. I'm not interested in dealing with big data. I'm just tired of having to know the difference between arcane implementation names for ordered sequences [1]; I want to be able to describe what I want my ordered sequence to do, and let the compiler figure out the best implementation available.
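
For what it's worth, even in plain Java you can fake a small slice of that "describe what I want" style. The Usage hints and chooseSequence() below are invented for illustration; the point is only that the caller states its expected operations and never names ArrayList, LinkedList, or TreeSet itself:

    import java.util.*;

    enum Usage { RANDOM_ACCESS, FREQUENT_MIDDLE_INSERTS, SORTED_ITERATION }

    class SequencePicker {
        static <T extends Comparable<T>> Collection<T> chooseSequence(Set<Usage> hints) {
            if (hints.contains(Usage.SORTED_ITERATION)) return new TreeSet<>();
            if (hints.contains(Usage.FREQUENT_MIDDLE_INSERTS)) return new LinkedList<>();
            return new ArrayList<>(); // default: cheap random access and appends
        }

        public static void main(String[] args) {
            Collection<Integer> seq =
                SequencePicker.chooseSequence(EnumSet.of(Usage.SORTED_ITERATION));
            seq.add(3); seq.add(1); seq.add(2);
            System.out.println(seq); // [1, 2, 3] -- the caller never named TreeSet
        }
    }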

[1] http://www.programcreek.com/2013/03/arraylist-vs-linkedlist-...


Absolutely. This is actually part of our plan. We're writing mid-level-ish specifications in a logic language and then compiling them into an incremental dataflow network. Our current prototype just uses the same implementation for each rule, but later we intend to be able to pick and choose different data structures and join algorithms per index/rule.
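
To give a feel for the incremental part (a toy illustration in Java, not our actual system): instead of re-running a query over all the data, each new fact flows through a compiled node and updates the derived result in place:

    import java.util.*;
    import java.util.function.Predicate;

    class IncrementalFilter<T> {
        private final Predicate<T> rule;            // e.g. "alert(x) :- reading(x), x > 100"
        private final List<T> derived = new ArrayList<>();

        IncrementalFilter(Predicate<T> rule) { this.rule = rule; }

        void onInsert(T fact) {                      // called once per new fact
            if (rule.test(fact)) derived.add(fact);  // no rescan of earlier facts
        }

        List<T> result() { return derived; }

        public static void main(String[] args) {
            IncrementalFilter<Integer> alerts = new IncrementalFilter<>(x -> x > 100);
            for (int reading : new int[] {42, 150, 7, 300}) alerts.onInsert(reading);
            System.out.println(alerts.result());     // [150, 300]
        }
    }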

I have a lot to say about separating specification from implementation and trying to better capture programmer intent (e.g. http://www.freelists.org/post/luajit/Ramblings-on-languages-...) but the blog post was already pretty long.


To me it seems the only way to smarter tools is static analysis, but there's only so much one can do in a dynamic language.

There seems to be this dissonance between dynamic languages, which tend to be popular with new programmers for a variety of reasons, and the better tools enabled by static analysis of (usually) static languages.

Static languages seem to present a "hump" that a lot of people are averse to. If we really want these smarter tools, we're going to have to find a way to get programmers to use more static techniques; we need to get them over that "hump". These smarter tools might then ease things in the long run, but they don't have quite the same immediate-gratification appeal as dynamic languages, because of the up-front effort.


There is also only so much one can do with a static language.

Dynamic languages are preferred by those who don't want a conservative, verbose static type system to get in the way of writing code. You can design a static type system that is less verbose (e.g. via more inference), but then it can become more conservative (Hindley-Milner's inability to deal very well with semi-unification in the form of subtyping and assignment). That is the "hump."

Smarter "more magical" compilers are a nice idea in theory, but hard to realize in practice. There are even limits to the kind of analysis we can do dynamically, but they are a bit less constrained then what we can do statically.


You can do a lot of static analysis of dynamic languages. Check what PyLint/PyFlakes do, for example.

It's probably not the dynamic language that is the problem, but its constructs.



