A bit shameless to ask a question under the top HN record, but I've always wondered why SQL engines don't give access to their SQL AST for generating queries. Of course, one can assemble a query from strings with placeholders, and that doesn't cost much compared to the actual query execution, but it feels so hacky. Why not provide something like the sketch below?
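Purely as a hypothetical shape (none of these names exist in any real driver; the point is just that the query travels as a structure instead of as concatenated SQL text):

    # Hypothetical sketch only - no real driver exposes this.
    query = {
        "select": ["id", "name"],
        "from": "companies",
        "where": ("=", ("column", "country"), ("param", "country")),  # placeholder stays a node
        "limit": 10,
    }
    params = {"country": "JP"}

    # Imagined call - today you'd instead render `query` to an SQL string yourself:
    # rows = conn.execute_ast(query, params)
    print(query)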
> A bit shameless to ask a question under the top HN record, but I've always wondered why SQL engines don't give access to their SQL AST for generating queries.
A number of reasons from the PostgreSQL POV (I work on it):
- our internal parser representation isn't stable and frequently changes from major release to major release, occasionally even in minor releases.
- we'd need a lot of additional verbose error-checking code to make sure only sensible ASTs can be submitted
- there are two forms of parser AST, "raw" and "analyzed". The former is produced without access to the database; the latter has had validity checks, type lookups, etc. applied. The former is what you could validly produce remotely - but it's also more cumbersome to deal with.
I think there are arguments for supporting a useful representation different from SQL, but I think doing this with the already existing AST isn't going to fly. And a new proper language / representation would be a lot of work...
> And a new proper language / representation would be a lot of work...
I don't think a new language would be needed, and most of the work has already been done in projects like LINQ. You'd just need to expose a stable AST front-end, which would be a lot easier than a whole new parser. I mean, it's just relational algebra right? ;-)
> You'd just need to expose a stable AST front-end, which would be a lot easier than a whole new parser.
"Just". You still need a parser for that AST if it comes from the client, you need to convert that into the actual internal representation with validation and everything.
> I mean, it's just relational algebra right? ;-)
It's really not. A lot of SQL can't conveniently be expressed in relational algebra (good luck with recursive CTEs), and a lot of other queries aren't as efficiently representable.
> "Just". You still need a parser for that AST if it comes from the client, you need to convert that into the actual internal representation with validation and everything.
> A lot of SQL can't conveniently be expressed in relational algebra (good luck with recursive CTEs) and a lot of other queries aren't as efficiently representable.
Clearly some language is being used to reason about these things, and evidently this language is reflected in some SQL-like surface syntax. It would be no more of a burden to provide an extensible AST than an extensible SQL-like language. In fact, it'd be considerably simpler, since tokenizing, parsing and verification would all be simpler.
This isn't usable for all sorts of statements as, among other things, it's hard to maintain a stable API for all aspects. But it might be useful for common cases. If there's more you need, I'd be interested in a feature request!
(I work on MySQL's clients team.)
There shouldn't be much of a difference in performance. There are some areas on the server we can improve in the longer run, and the hope is that clients can write requests more efficiently ... in a typical situation the parsing of the query hopefully isn't the bottleneck, but rather the actual execution or the processing of the result in the application ;-)
This is essentially what an ORM does, e.g. SQLAlchemy [1]. Except instead of providing a slightly nicer API to generate query strings, an ORM tends to be more tied to the host language, using its object system to abstract out most of the SQL complexity and make declaring SQL tables/queries look like normal operations on the data themselves instead of on the database. Django's ORM [2] for example autogenerates joins for you under certain query conditions, which is super handy.
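For illustration, a minimal SQLAlchemy Core sketch - you compose the query as objects, but it is still rendered to an SQL string before it goes to the server:

    from sqlalchemy import MetaData, Table, Column, Integer, String, select

    metadata = MetaData()
    companies = Table(
        "companies", metadata,
        Column("id", Integer, primary_key=True),
        Column("name", String),
        Column("country", String),
    )

    # Compose the query as Python objects...
    stmt = select(companies.c.id, companies.c.name).where(companies.c.country == "JP")

    # ...but it is compiled to SQL text, roughly:
    #   SELECT companies.id, companies.name FROM companies WHERE companies.country = :country_1
    print(stmt)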
I'm aware of SQLAlchemy Core/ORM (and ORMs in general); my question is more about why we have to concatenate strings in these projects' internals instead of using some API to pass query structures without the intermediate composing/decomposing. For the small queries that ORMs often issue, it feels like needless and annoying overhead.
For now, even '?' placeholder escaping is actually manually printf'ed under the driver's hood, AFAIK, so in the end we send plain text to the server. SQLite probably avoids that via the sqlite3_bind() call, which allows passing SQLITE3_STATIC string args if you want, but that's the benefit of an in-process engine.
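For comparison, Python's sqlite3 module does exactly that - as far as I know the '?' parameters go through sqlite3_bind_*() on the prepared statement rather than being spliced into the text:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT)")

    # The '?' is never textually replaced; the value is bound to the prepared
    # statement, so quoting/escaping never enters the picture.
    conn.execute("INSERT INTO companies (name) VALUES (?)", ("O'Reilly",))

    print(conn.execute("SELECT name FROM companies").fetchone())  # ("O'Reilly",)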
You need some way of serializing your data structures to a wire format, and SQL has worked ok so far.
Besides a potential performance gain in parsing on the server side, I'm not quite sure there's much you would gain - there might not be that much difference between serializing those data structures to "plain text" and serializing them to any other format you come up with.
See the part about support-costs in a different link.
But even more, I think it's "in the nature of programming" that for a given system some parts are regularly structured and externally visible, while other parts seem like they ought to be visible and changeable from outside but aren't, and can't easily be made so.
This might be because they use intermediate variables, or because the implementation has changed over time, or for a variety of other reasons.
And this ultimately comes down to the difference between the way you'd intuitively think feature X should be implemented and the real, much messier way feature X is actually implemented - the difference between internals and an API.
Having internals and externals be equally easy to work with was the dream of object orientation, but sadly that has remained a fantasy and unfortunately looks like it will remain so.
I highly doubt that placeholders are interpolated into the SQL statement. Doing so would defeat the purpose of having them in the first place. Rather, they are transferred out-of-band to the SQL "engine", as it were, and are used as straight-up variables.
I just checked for PostgreSQL, and it seems that libpq can send them out-of-band via PQexecParams() and network-order binary buffers. But that's not the case with OpenResty's pgmoon driver, which doesn't have such an interface. E.g. the Lapis framework built on top of it simply escape_literal()-izes all arguments and interpolates them right into the query text. So, YMMV with specific (or generic) drivers.
Placeholders are there for those who try to concatenate queries by hand; having them properly escaped in the protocol should not defeat the client-side security purpose.
How come it's impossible to properly escape user input? I get that it's hard if the format is complex, but I don't see what makes it categorically impossible, especially for a simple format like an SQL string. You just double every single quote and then surround the whole thing with single quotes.
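i.e. something like this toy escaper (assuming the server follows the standard SQL rule for string literals and you control the connection encoding):

    def quote_sql_string(value: str) -> str:
        # Standard SQL string literal: double every single quote, wrap in single quotes.
        return "'" + value.replace("'", "''") + "'"

    print(quote_sql_string("O'Reilly"))            # 'O''Reilly'
    print(quote_sql_string("'; DROP TABLE x;--"))  # '''; DROP TABLE x;--'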
Do you know any decent server that interprets 6-char \uxxxx sequences as part of its top-level query syntax? I mean, how do you safely pass a '\u0027' literal to it anyway? If you can't, then its literals and escaping are broken by design, and it's good to know that from the start of using it.
Edit: removed double backslash from literal to not confuse it with host language's escaping.
I know a lot (a lot) of non-technical SQL users. SQL is a widespread abstraction from the underlying databases. It's very powerful.
What you describe is useful from a programmer's point of view, but maybe the wrong abstraction. You could write such a thing which actually generates SQL from an API like you've defined. You could write an alternate ODBC engine.
But there is a lot of overhead for database authors to actually provide an API like you describe. It's far simpler to provide something SQL-compliant with an ODBC or JDBC driver, and now you can plug in almost anywhere.
I have used a database engine that exposes primitives to build filters and small programs similar to what you describe, and to selectively push them to the server. It's not a very deep abstraction.
Part of the reason is support. Long ago Britton Lee[1] in fact provided an AST API mainly for tool vendors. They stopped doing that when one of their customers called them asking for help with the "GUIDE BY" clause. Nobody at BL knew what a "GUIDE BY" was since it was something implemented by the frontend, but that didn't matter to the customer.
Product Manager for MySQL here. Yes, I agree with you that the reason is most likely support.
Consider SQL; it's declarative and highly expressive. Your users can craft queries that you did not consider, and you have to make sure they work consistently between versions. It's a very hard problem.
Creating more public APIs? It's a hard sell, and something vendors are sensitive to. Reducing the surface area makes it easier to add future improvements.
This is worse than a useless comment. It is downright harmful to getting people to use KDB. Maybe you should explain it to people?
For those reading:
parse is a function that takes a string and returns the parse tree, and -3! (gotta love APL) takes a tree and turns it back into a string.
The parsed syntax it spits out lets you compose pieces of queries or modify them (there are easier ways, such as the fact that queries return tables, which can always be used as inputs to other queries).
Oh no, please don't attack others like this in comments here. Even if you clearly mean well (which I'd say is the case here), it can only cause harm.
geocar's comments are consistently excellent (including for explaining the niceties of kdb and array languages to the community) and having fewer of them would make this place worse. Not everything needs to be spelled out, and besides, sometimes there just isn't time to do so. It can be a lot more work!
As a long-time K evangelist (before KDB 2 even) I understand that, but this post just played into all the bad stereotypes of KDB+ being gibberish and line noise. Even for somebody versed in K it took me a minute to figure out -3!. He isn't a@kx.com posting to the K mailing list.
I did vote the comment up before responding to him, but I'm not sure why really.
Well, "companies" is a string, isnt't it? Ultimately, there will be strings down there.
Working with a pure AST is a huge pain. One can trivially construct something that cannot realistically be represented in the SQL dialect of choice.
LINQ is (IMO) one of the best ways to do query composition, and with the nameof() operator you can really get away without a single hardcoded string in your application-level code.
Ehhh...something as simple as a left-outer join is near ungrokkable for me in LINQ. Cross apply / lateral join on a table-valued function? Forget about it.
Not to mention - due to the way LINQ works a lot of errors that seem like they should be caught at compile time don't get caught until run time (e.g. you wrote something in a where clause that can't be translated to SQL).
LINQ is like flying a plane. It will take you where you need to go extremely fast, but it will also crash you into the ground if you don't know what you're doing. Personally I get immense value from it daily, but I also know when not to use it.
from c in categories
join p in products on c.Category equals p.Category into ps
from p in ps.DefaultIfEmpty()
select new { Category = c, ProductName = p == null ? "(No products)" : p.ProductName };
You're doing something called a 'group join' (second line) 'into' a new identifier (ps) above. I'm guessing it stuffs each matching result in 'p' into a list, which sorta makes sense. The second part mystifies me though - somehow 'from p in ps.DefaultIfEmpty()' flattens the list back out into a denormalized result set while simultaneously preserving entries in 'c' with no matches.
What does 'p' point to now? Why is it seemingly created twice? Is 'ps' still available? For a simple left join this is way, WAY too convoluted IMO.
You can read about the second thing I'm talking about by finding out when EF throws a 'NotSupportedException' at runtime (e.g. https://stackoverflow.com/questions/25337974/entity-framewor...). Basically, not everything you can do in LINQ can be translated to SQL, but it won't tell you at compile time. You have to wait until run time, which is garbage IMO. The example I linked is trivial, but it's easy to miss this kind of thing when you're optimizing ORM use (and trying to move as much computation to the DB as possible).
Dynamically adding filters or parts of the query, being able to have different parties construct the query, doing transformations, replacing objects... that would be great. And having it in a well-known, standard format, supported across languages, with execution plans exposed the same way - that would be the dream... It would really simplify life.
I started to write something a bit like this a while ago for SQLite and C++: https://github.com/Gloopio/SQLighter . We also have a more complete/full-featured version that we use internally, although I was not sure whether enough people would want it for it to be turned into an open source project.
IIRC, stored procedures let you do something like this. If you compile your query as a stored proc, there's usually an API through which you can pass values to it as parameters in their "native" form, i.e. without having to convert them to strings first.
Yes please! A couple of years ago I wrote an SQL-like layer to federate queries across a few different databases, with the intent of unifying the data we had spread out over Postgres, Cassandra and S3. The best decision I made was not writing a real query parser and just using s-expressions, which is pretty much what you're asking for. It was great.
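It looked roughly like this (a toy reconstruction of the idea, not the actual code): the "parser" is just reading nested lists, and composing or rewriting queries is ordinary data manipulation:

    # Toy illustration only. A query is just nested tuples/lists.
    query = (
        "select", ["user_id", "total"],
        ("from", "orders"),
        ("where", ("and",
            (">", "total", 100),
            ("=", "region", "eu"))),
    )

    def add_filter(q, extra):
        """Return a copy of the query with an extra condition AND-ed in."""
        op, cols, frm, (_, cond) = q
        return (op, cols, frm, ("where", ("and", cond, extra)))

    print(add_filter(query, ("=", "status", "paid")))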
Generating an AST client-side, then serializing it, can take a substantial amount of CPU, especially in languages like JavaScript or Python. If you do include the ability to build an AST, it would be good to have a facility for parameterized queries so that you only have to build and serialize the AST once.
The language for accessing a relational database is always SQL - so much so that a lot of people don't make any distinction between SQL and the DBMS.
Aren't we in the same situation as JavaScript on the web?
I've always wondered why there is no effort like WebAssembly for relational databases.
I would love to see a more vibrant scene with different syntaxes, paradigms and ideas.
What would be the Rust equivalent of a relational language, or the Go equivalent?
Still waiting for SQLite4 with its awesome LSM engine :P (shameless plug: I ported the engine over to Windows, https://github.com/maxpert/lsm-windows , and it works :D). I wish somebody would do a detailed architecture write-up of SQLite4; that kind of knowledge is scarce and valuable.
FWIW, I'm presenting a talk on exactly this topic tomorrow at 12:30 at http://www.db-tech-showcase.com/dbts/tokyo - unfortunately I do not think it will be recorded.
Slightly off topic, but thank you for your work on sqlite and fossil. I've used both daily for years and am always impressed by how consistently fantastic they are.
LSM engines tend to have great average but inconsistent performance. How does sqlite4's LSM fare?
Last time I looked at benchmarks, sqlite4 with an LMDB backend was faster on average and had much lower variation in performance. Is that still the case?
To be honest, it always shocks me how long it takes for the average modern webpage to load. All the build-up of things, the time it takes, and how it still catches you out: you think you can click somewhere, and just 0.1 seconds before you click, something else renders and moves everything up or down, making you click something else, which causes another page to load, etc. I have a relatively new Mac with 8GB RAM - it should just work, but no...
There is Firebird, which is essentially an open-source InterBase (IIRC it is forked from the InterBase 6 source). It is probably more widely used than the new commercial versions of IB. One interesting feature is that it is probably the only open-source DBMS that not only works with multiple engines accessing the same database stored on a Windows share, but where that is actually a supported use case.
The typical approach to replication is to somehow record the changes made to the database and somehow push them to the slave replicas. There are different approaches to both "somehows".
For the format of the changes themselves there is a continuum of levels of abstraction, ranging from propagating the user's commands directly to the slave servers (essentially what MySQL does) to propagating committed writes to the on-disk files (which is what default PostgreSQL replication does). Both extremes have their trade-offs: in the SQL-level case the user has to make sure that the results of transactions are deterministic (which is more complex than just not calling functions like rand() and now()), and in the physical-page case all involved servers need to have exactly the same on-disk layout (i.e. the same CPU architecture, and you have to copy the actual on-disk files for the initial synchronisation; dump and restore usually does not work).
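To make the determinism point concrete, here's a toy illustration (not how any real engine implements it): re-executing a non-deterministic statement on the replica can diverge, while shipping the value the primary actually wrote cannot:

    import random

    def apply_statement(db):
        # Statement-based "replication": re-execute the command on each server.
        db["discount"] = random.randint(1, 10)      # non-deterministic -> replicas diverge

    def apply_row_change(db, logged_value):
        # Row/physical-level "replication": ship the value the primary actually wrote.
        db["discount"] = logged_value

    primary, replica = {}, {}
    apply_statement(primary)
    apply_statement(replica)                         # almost certainly a different value
    print(primary, replica)

    replica2 = {}
    apply_row_change(replica2, primary["discount"])  # identical by construction
    print(primary, replica2)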
For the actual propagation there are two approaches: write the change records somewhere and let the slaves read them at whatever pace they can, or push the changes to the slaves before the transaction is reported as committed to the client.
The full spectrum of PostgreSQL replication solutions in fact completely fills the space of these two choices, and you should carefully select what matches your use case best. (The two significant questions to ask are whether you can live with replicating the whole database cluster as one indivisible unit, and whether you can live with possibly stale state on read-only replicas. When both answers are yes, streaming replication is what you want. If the answer to the second one is no, then you should probably redesign your application so that the data requiring this kind of guarantee is as small as possible.)
"I called the janitor the other day to see what he could do about my dingy linoleum floor. He said he would have been happy to loan me a polisher, but that he hadn't the slightest idea what he had done with it. I told him not to worry about it - that as a programmer it wasn't the first time I had experienced a buffer allocation failure due to a memory error."
---
"Q: Why did the concurrent chicken cross the road?
The most common strategy for database replication is to copy over an initial snapshot and then synchronize the replica by replaying transaction logs. So essentially, replication is continuous incremental recovery on another server.
I built a webserver from scratch as a term project at university. Due to the fact that it actually did something self-contained and useful, I got away with turning in a C99 project for a C++ course :) (and probably also because I wrote my own talloc/apr_pool-style malloc wrapper to base it on; when I presented the thing, that was about as far as I had got, and I was told, "well, this is a C++ course, but you get an A, now go away").
For another MSc-level course, the whole term project assignment was: "choose or design some language and try to somehow conjure a compiler and, if required, a VM for it". "Somehow conjure" is significant because nobody said you had to do it from scratch: hacking tcc/gcc/lcc/llvm to accept your language was fine, and basing your language on top of some Smalltalk VM (preferably ST/X ;)) or Common Lisp was essentially encouraged.
I think that topic is larger than the presentation thus far. Hopefully memory-mapped flash will obviate the need for a lot of this, but:
o you need to make an on-disk structure (e.g. a btree) to let you search your tables (indices)
o that has to be fronted by an in-memory cache
o your on-disk structure should maintain consistency even if the power fails and some writes get lost
- often but not always this is facilitated by keeping a separate write-optimized structure called a write-ahead log
- if you use a WAL, you'll also need to implement the replay mechanisms to get the primary indices back up to date after a failure (see the toy sketch after this list)
o because the disk operations are expensive and high-latency, you'll need to start managing concurrency explicitly
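To give a flavour of the WAL item above, here is a toy append-only log with replay (Python, definitely not production-grade - a real one also has to worry about checksums, torn writes, checkpointing and fsync ordering):

    import json, os

    LOG = "toy.wal"

    def log_write(key, value):
        # Append the intended change to the log and force it to disk *before*
        # touching the main structure - that's the whole trick.
        with open(LOG, "a") as f:
            f.write(json.dumps({"key": key, "value": value}) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay():
        # On startup, rebuild (or patch up) the primary structure from the log.
        table = {}
        if os.path.exists(LOG):
            with open(LOG) as f:
                for line in f:
                    rec = json.loads(line)
                    table[rec["key"]] = rec["value"]
        return table

    log_write("k1", "v1")
    log_write("k2", "v2")
    print(replay())   # {'k1': 'v1', 'k2': 'v2'}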
Back ends are a lot of work. Unfortunately this is a place where the lack of decent concurrency mechanisms in your language and OS interface can really cause a lot of headaches.
Pedagogically, I guess you would start with a completely synchronous, non-fault-tolerant btree? Or maybe just the log, and introduce trees as a read accelerator?
That's a really good question; unfortunately, everything I learned was by working with people who knew more than I did. It's pretty sad that a lot of systems work is that way.
This book looks to be pretty comprehensive, and explicitly discusses on-disk structures, write-ahead logging, and recovery:
http://infolab.stanford.edu/~ullman/dscb.html
Relational languages and transactions get a lot more play in the academy... I think it's because you can make sense of them in some abstract way without getting sucked into involved discussions about caching heuristics, OS write guarantees, scheduling and fussy performance characteristics.
> Very nice! If you wanted it to save to disk instead of living wholly in memory, how would you do that in C?
The idea would be to map your memory space to disk space (using the random access provided by the C stdlib); the problems would then be:
1. Caching - I guess you would then have to provide your own cache implementation, and I would guess this opens a can of sync problems...
2. Finding a way to efficiently write the data to disk (e.g. which data should stay contiguous? How much "slack" space should I leave in "extents" (Oracle slang for a container of many data blocks)? Should you do a column store?)
A C++ solution (the solution you didn't ask for, sorry) for saving from memory to disk (as some sort of periodic, expensive sync) would be to write a stream operator for your objects and stream them to disk. Loading would involve reading them back; it would only work on the same OS and architecture.
Alternative approach: on Windows, you could just use a disk file but pass FILE_ATTRIBUTE_TEMPORARY to CreateFile, which hints to the system to keep it in memory if the file is small enough.
I mean, not really, but it's a surprisingly viable starting point for simple problems. I've shipped it. Some production-grade software still uses mmap, albeit with a bunch of additional complexity to make sure it works okay in weird cases.
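The "naive" version really can be just a few lines - here's a Python sketch of the idea (with none of the crash-safety concerns discussed below handled):

    import mmap, os

    PATH, SIZE = "store.bin", 4096

    # Create a fixed-size file and map it; writes are ordinary memory operations.
    if not os.path.exists(PATH) or os.path.getsize(PATH) < SIZE:
        with open(PATH, "wb") as f:
            f.truncate(SIZE)

    with open(PATH, "r+b") as f:
        mm = mmap.mmap(f.fileno(), SIZE)
        mm[0:5] = b"hello"      # "writing to the database"
        mm.flush()              # ask the OS to write it back - ordering is still up to the OS
        print(bytes(mm[0:5]))   # b'hello'
        mm.close()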
It seems like Redis might currently make use of mmap for some of its data, but I couldn't find an up-to-date source for that, just some old blog posts by the developer.
The trouble is that naive mmap doesn't give you the control over write ordering that you need to implement reliability.
Sadly, about 50% of database design is trying to ensure the thing has a decent chance of starting up again if it crashes or loses power. This mostly consists of fighting file systems and disk caches.
I used to work on a system that used mmap for basically everything, including a proprietary database that processed financial transactions. The original production system ran on a commercial Unix. (It's been about 15+ years, but I think it was AIX.)
At one point, we ported it to a different Unix platform which had slightly different mmap semantics. It looked like everything worked, except for one minor detail: the data was never actually synced to disk. Ever. Until shutdown. Unfortunately, since the system ran 24/7, it effectively never synced. First time the system had a power failure, there was massive data loss on startup.
That sounds like a terrible idea even if it were 100% reliable. When you shut down, it has to flush the entire memory at once. If you're syncing to an HDD, that can mean several minutes of downtime.
A correctly working system will "flush behind", so there should be no difference between a clean shutdown and a power loss. If there's several minutes of unsynced data that's data that will be lost on power off.
Just going to chip in here to mention LMDB, which essentially provides an mmap btree database with single writer + multi-reader snapshot concurrency, ACID transactions, and is extremely robust.
If you think something may benefit from a shared memory data store, lmdb may be worth considering as a fast, reliable, high concurrency alternative to reinventing wheels with raw mmap + manual sync.
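For a feel of the API from Python (a minimal sketch using the py-lmdb binding):

    import lmdb

    # One writer at a time, many concurrent readers, each seeing a consistent snapshot.
    env = lmdb.open("/tmp/example-lmdb", map_size=10 * 1024 * 1024)

    with env.begin(write=True) as txn:   # write transaction, committed on exit
        txn.put(b"user:1", b"alice")
        txn.put(b"user:2", b"bob")

    with env.begin() as txn:             # read-only snapshot
        print(txn.get(b"user:1"))        # b'alice'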
The poster's comment reads a bit like an advertisement, but LMDB would be a good solution for persisting data, in my opinion. It has a pretty simple C API and it's been described by its authors as "crash-proof".
You think you can, but you can't! Linux always syncs the whole file; see https://lwn.net/Articles/502612/ for a discussion of why fixing this horrible bug is too dangerous.
Beginning with version 3.7.17 (2013-05-20), SQLite has the option of accessing disk content directly using memory-mapped I/O and the new xFetch() and xUnfetch() methods on sqlite3_io_methods.
I would say it's less that you can't or shouldn't use mmap, and more that it's not as simple as throwing mmap at it - if you care about crash recovery / not corrupting data (which most databases should care about) - but that certainly doesn't mean mmap can't be part of the solution.
> .. The ultimate milestone is to be able to compile the SQLite3 source code and have it working without modification. This will be the 1.0.0 release.
So if that project does work out, there will be a Go SQLite implementation of sorts. No idea how messy/usable the result would be as a base for development though.
Note - I haven't personally been tracking the project at all. It's more that your statement prompted me to remember it. :)
The use case is mostly to learn how to be a Go ninja by building something as powerful as SQLite 100% in Go. For anything production-grade, the existing Go drivers or wrappers for SQLite are perfectly good. It would be nice to get a single-binary, no-dependency Go program though.
I'd just write protobuf "blobs" or another serialization format in an SQLite text column. If you need to query across the typed data, you could build an index on another column populated from the data in your protobuf blob.
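A rough sketch of that pattern (using Python's sqlite3 and JSON as a stand-in for protobuf, just to keep it self-contained):

    import json, sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE events (
            id      INTEGER PRIMARY KEY,
            user_id INTEGER,          -- field copied out of the blob for querying
            payload TEXT              -- the serialized blob (protobuf, JSON, ...)
        )""")
    conn.execute("CREATE INDEX idx_events_user ON events(user_id)")

    event = {"user_id": 42, "action": "login", "meta": {"ip": "10.0.0.1"}}
    conn.execute(
        "INSERT INTO events (user_id, payload) VALUES (?, ?)",
        (event["user_id"], json.dumps(event)),
    )

    # Typed queries hit the indexed column; the full record is still in the blob.
    row = conn.execute("SELECT payload FROM events WHERE user_id = ?", (42,)).fetchone()
    print(json.loads(row[0]))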
I didn't know SQLite was dynamically typed. That's a strange choice for a database, to say the least.
I would look into Berkeley DB (BDB) which is one of the most widely-used embedded databases, albeit not a SQL-based one. Otherwise take a look at the list of embedded databases on Wikipedia: https://en.wikipedia.org/wiki/Embedded_database
Initially (v < 3), all data fields in sqlite were text internally - type information was accepted for SQL compatibility, but ignored. This is actually an improvement.
I'm not sure of the rationale; perhaps it's easier to keep the code / database files portable amongst different systems this way, and since it was designed for embedded use, many systems (and thus incompatibilities) were envisioned?
I don't know why but unfortunately the various parts are effectively unreadable on an iPhone.
The screen zooms in and out as you scroll, so that when you scroll the text is a little too tiny and when you stop the text is cut off on the left edge.
I'm going to give it a try on my iPad. Just a heads up.
SQLite has a very rigorous testing methodology to comply with aeronautics standards. Cloning the tests would be a decade's worth of work. But it's a cool idea nonetheless. https://sqlite.org/testing.html
Yes, but the article also says that file is generated by a makefile from more than a hundred .c and .h files, and that it's done for ease of distribution and because optimizers do better with a single compilation unit.
It's not like the authors spend a lot of time working in the amalgamation itself, or that that's the way the project is developed.
Why? SQLite is one of the highest-quality codebases out there! For fun, sure. But I think you are underestimating the size of the SQLite project and its tests. Just because the library is small doesn't mean it's simple.
How would a rewrite change that? Addressing that objection would require some sort of open standards process and commitment to follow it, which is a rather orthogonal concern and would be somewhat expensive as well.
You'd also need to turn "whatever sqlite 3.x does" into a spec, which is indeed a lot of work that will never happen, as there's only one implementation that passes the test suite.
It would be a colossal waste of time. At 125 KSLOC of code + 91 MSLOC of tests, the engineering effort is near-prohibitive. There is no single area where SQLite3 would be severely lacking. Just look at SQLite4 and say whether you care about any of the differences.