
Kinda with you until "terrible query language" as, in my mind, it's the best query language.

You can also opt out of a rigid schema by using JSON columns, which I generally promote as a new best practice when the database is being used as a dumb store for a smart application. It comes down to who should be the source of truth for the schema.
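A minimal sketch of that idea using SQLite's JSON1 functions via Python's stdlib (the same pattern works with Postgres `jsonb` or MySQL JSON columns); the table and field names here are made up for illustration:

```python
import sqlite3

# Hypothetical example: the application owns the schema, the database
# just stores a JSON blob per row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute(
    "INSERT INTO events (payload) VALUES (?)",
    ('{"type": "signup", "plan": "pro"}',),
)

# Query inside the JSON without declaring a column per field.
row = conn.execute(
    "SELECT json_extract(payload, '$.plan') FROM events "
    "WHERE json_extract(payload, '$.type') = 'signup'"
).fetchone()
print(row[0])
```

The database stays a dumb store; adding a new field is an application-level change, not a migration.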




> Kinda with you until "terrible query language" as, in my mind, it's the best query language.

It's a decent language for ad-hoc querying by humans; the problem is it's the only interface to the database that you get. It was never designed for machine use; in a modern RDBMS, 3/4 of the time to execute a pkey lookup is spent parsing the SQL string. Yes, prepared statements can help in some cases, but they come with their own overheads that make them difficult to use safely in a large system.
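For context, "prepared statements" here means parsing the SQL once and re-executing it with new parameters. A client-side illustration of the same effect with Python's stdlib sqlite3, which keeps a statement cache keyed on the SQL string (all names made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ada"), (2, "alan")])

# The parameterized query string is identical on every call, so sqlite3
# reuses the already-parsed statement -- repeated pkey lookups skip the
# parse step, which is what a server-side prepared statement buys you
# over the wire.
names = [
    conn.execute("SELECT name FROM users WHERE id = ?", (pk,)).fetchone()[0]
    for pk in (1, 2)
]
```

The catch in a large system is statement lifecycle: server-side prepared statements are tied to a connection/session, which interacts badly with connection pools unless the driver manages them for you.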

> You can also opt out of a rigid schema by using JSON columns.

You can, but usually in a database-specific way, and support for that in drivers and especially at the ORM level is pretty spotty.


All the databases seem to have database-specific ways. SQL is the least vulnerable to this. Obviously if you leave SQL for e.g. Mongo, then you could not have picked a more bespoke and unique interface. Notionally SQL is standardized, but who migrates databases anyway.

Network latency is the real performance drag, not string parsing.

I think JSON support will improve, and SQLAlchemy for example already offers a common JSON interface over MySQL and Postgres. I am sure this will resolve itself in time; it's just a bit new for SQL.


> Obviously if you leave SQL for e.g. Mongo, then you could not have picked a more bespoke and unique interface.

In principle yes; in practice you can expect to find full-featured drivers in all major languages, and anything higher-level that claims support for Mongo will also have support. Certainly even the most basic Mongo drivers will let you have things like collection columns.

> Network latency is the real performance drag, not string parsing.

Depends what kind of network, if any, is between the two endpoints. But the performance aspect is just illustrative of what a poor format for expressing machine-level queries it is.


> I think JSON support will improve, and SQLAlchemy for example already offers a common JSON interface over MySQL and Postgres.

This works extremely well with Postgres. The MySQL one is, I think, just saving it as a text string, which may become a performance killer.


> In a modern RDBMS, 3/4 of the time to execute a pkey lookup is spent parsing the SQL string

These times are sub-millisecond; about 0.1ms with PostgreSQL on my modest laptop against a ~30 million row table.

You make it sound like some sort of horrible performance hog, but 0.1ms to parse a dynamic language really isn't that bad. Actually fetching the row takes about 1ms uncached (faster otherwise), so that's hardly "3/4" either. With a warm cache the fetch is about 0.1ms, which makes parsing about half.

But in reality most queries are more complex, and the parsing time becomes negligible; even for a fairly complex query it's about 0.6ms, which is hardly worth thinking about if you consider that the query execution takes about 60ms.
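One way to eyeball the parse overhead yourself, sketched with stdlib sqlite3 (numbers will vary by machine and engine; this is illustrative, not the commenter's PostgreSQL measurement): compare a connection with the default statement cache against one where caching is disabled, so every call re-parses the SQL.

```python
import sqlite3
import timeit

# Two connections with the same data: one keeps parsed statements cached,
# the other (cached_statements=0) re-parses the SQL on every execute().
cached = sqlite3.connect(":memory:")
uncached = sqlite3.connect(":memory:", cached_statements=0)
for conn in (cached, uncached):
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    conn.execute("INSERT INTO t VALUES (1, 'x')")

lookup = "SELECT v FROM t WHERE id = 1"
t_cached = timeit.timeit(lambda: cached.execute(lookup).fetchone(), number=10_000)
t_uncached = timeit.timeit(lambda: uncached.execute(lookup).fetchone(), number=10_000)
print(f"cached: {t_cached:.3f}s  re-parse every call: {t_uncached:.3f}s")
```

The gap between the two runs is roughly the parse cost of a trivial pkey lookup; for a complex analytical query the same absolute cost disappears into execution time.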


> But in reality most queries are more complex, and the parsing time becomes negligible

Depends on your usage pattern. If you're actually doing a bunch of different ad-hoc aggregations (which is what SQL was designed for) then yes, query parse time is irrelevant. If you're using an RDBMS as a glorified key-value store (which is what most web applications tend to do) then it's very possible for pkey lookups to be the majority of your queries. (My point isn't really about performance, it's about SQL not being designed for that use style in general).
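The "glorified key-value store" pattern being described looks something like this sketch (hypothetical session table, stdlib sqlite3 standing in for the RDBMS): every query is a single indexed lookup, with nothing relational about it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (token TEXT PRIMARY KEY, data TEXT)")

def put(token: str, data: str) -> None:
    # Upsert by primary key: dict-style set.
    conn.execute("INSERT OR REPLACE INTO sessions VALUES (?, ?)", (token, data))

def get(token: str):
    # Single indexed lookup by primary key: dict-style get.
    row = conn.execute(
        "SELECT data FROM sessions WHERE token = ?", (token,)
    ).fetchone()
    return row[0] if row else None

put("abc123", "user=42")
print(get("abc123"))
```

When this is the dominant query shape, the SQL string is pure ceremony around what is semantically `dict.get`, which is the point being made about SQL not fitting that use style.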


I have never seen SQL being used as a key-value store. That most web apps use it like that is an incredible claim to me. Do you have any source for that?


I've seen database profiling numbers from some of my clients, but obviously I can't publish those. To be clear all I'm claiming is that for most webapps the overwhelming majority of database queries are simple indexed column lookups, not that most webapps are putting document blobs in the database or using EAV style or anything like that.


Claiming that most queries are simple indexed column lookups is a far cry from claiming most SQL databases are used as simple key-value stores.


Well, "key-value store" means different things to different people. If most queries are simple indexed column lookups then that carries my original point: query parse time is actually a significant proportion of overall datastore request processing time.


You're whinging about performance and then pining for a full-featured ORM. Choose one.


The performance aspect is just illustrative of what a poor representation SQL is for machine-level use. And I'm by no means demanding a full-featured ORM; even basic querying requires a layer or two above SQL to do cleanly.





