Chasing flavor-of-the-month seems like a pretty lost cause. I would like to see the patch in as well, but this seems like a pretty terrible reason for it.
You misunderstand me. I am not commenting on the nature of the feature, but rather the motivation cited in the SP for holding the release open to get it in. That reason could be summarized as "node.js and mongo!!".
Its not flavor-of-the-month, its flavor-of-the-generation. That's a 4 or 5 year span- not a ship you want to miss. Django and Rails have brought significant Postgres adoption. If widespread OS adoption and support is a goal then it is wise to pay attention to the features and platforms that will drive adoption.
no, Django and Rails have been a generation. Node.js and Go are potentially the incoming generation (according to the article)
Node has encouraged MongoDB usage because there were nice packages that made working with Mongo very easy. If Postgres supports JSON with powerful features then developers will make npm packages that support it and this will make Postgres very attractive.
I disagree. Node and MongoDb have both had very strong growth for several years now across a wide variety of users from individual hackers to the largest companies. While I'm skeptical of MongoDb's long term prospects due to its underlying technology and tight sponsorship I can understand the appeal of it's ease of use, especially for development. Node and JavaScript, on the other hand, seem clearly here to stay. While far from perfect are becoming deeply embedded in many key areas.
And even if you don't like MongoDb or Node, JSON is clearly one of the most popular data formats - better support is surely a good thing.
I agree in the sense that nobody should re-architect their product around a trend.
But this is different because the postgres architecture is very much capable of great support here, and getting even better. Postgres has the best support for adding in new types, with amazing capabilities around indexing.
I'm not just talking about different kinds of BTrees, I mean real non-scalar indexing strategies like inverted indexes (used for full text, but also great for JSON), similarity matching indexes, unbalanced indexes, spatial indexes, etc.
The author merely wants to use this infrastructure as intended for the JSON use case. Easier said than done, of course, but no architectural compromises need to be made in the process of providing this support.
> While I don't mind seeing some non relational additions like JSON, it shouldn't be detrimental to Postgres relational core.
The patchset in question is an improvement of hstore[0] (typed hstore, nested hstore) and unification of json and hstore (by using the binary hstore format for both). There is no more detriment to the relational core than when hstore or XML were originally added. Also, next-GIN[1] by the same people[2], not sure it's part of the patch mentioned.
I know that the GIN compression patch has been committed and some of the patch for speeding up searches by skipping posting list entires, but I have not seen anything about the third part, speeding up FTS by adding ordering info to the indexes.
> PostgreSQL is the best open source relational database and I hope it remains that way.
Yes, and it will. In fact the best post-relational features of PostgreSQL work best when used in a relational way.
> While I don't mind seeing some non relational additions like JSON, it shouldn't be detrimental to Postgres relational core.
As I usually tell people "Good object-relational design is usually good object-oriented design and good relational design, and that makes it really, really hard."
> It's not a pissing contest : if MongoDB is more popular than Postgres for non relational (NoSql) data so what ?
I would agree. The fact is that for the last 10 years, PostgreSQL has been the choice of open source rdbms's in business apps for the last 12 years and I don't see that changing.
> It's not like Postgres is a commercial product and need wider adoption to keep the parent company afloat.
Adding to what masklinn said, JSON is just another type with related functions and operators. It uses the already implemented extensible type system, and is no different from any other type including varchar and integer.
I am working on a not-yet-released (needs test cases!) CPAN module called PGObject::Type::JSON.[1] This module would let you effectively grab JSON types from the db, serialize and deserialize on the db query so you don't have to worry about it.
The serialization and deserialization is cool, but the question is what you can do with it. With the json functions if they become more functional, we could use JSON as an input to stored procedures, allowing arbitrarily complex data types to be fed into or out of stored procedures easily without having to serialize in tuple/array format.
A single, cannonical, well supported format would be huge in that it would allow you to do things with the database with extraordinary ease that are not trivial to do today. For example, really good json support + composite types as tuple elements (including arrays of composite types) + really good indexing support should give you an ability to approximate the power (for at least some workloads) of full custom type development, without needing to go to C.
I don't tkink people care so much for storing json, rather than "talk" json to the database. A different interface with simple insert and update with json is typically what people are after. A sort of ORM that simplifies inserts and update (and retrieval per primary key even) where data is provided as json would be a much better tool to attract the newer generation of programmers.
And like some of the comments said, PLEASE auto-partitionning!
Just having a JSON type should fix this without the need for a JSON based protocol. Actually the current JSON support in 9.3 should be enough for this. Just call a stored procedure with the JSON object describing the transactions.
It's reasonably easy to wrap the SQL involved in calling stored procedures.[1]
The JSON in 9.3 is almost good enough for this. It doesn't handle nested data structures to sql nested complex types properly, and so it won't quite get you there but otherwise agreed.
I.e. to make it work currently you have to slice and dice the JSON in some fun ways first. I find it less painful to just handle the complex type serialization/deserialization on the client side.
Yes, it is easy, but people using Mongo, CouchDB etc... will argue that not having to do it at all is better, and will use that argument to justify their choice.
Same idea as ORMs... Making any CRUD operation is easy in SQL, why did we need ORMs? Yet, there is a fragment of the population who wouldn't use a database they couldn't talk to through an ORM. Today, there is a segment of programers that won't use a database if they can't PUT({"name": "Big Bird", "address": "123 Sesame street"}), because, you know, that's the modern way to do things.
No. Adding support for a json NoSQL like interface doesn't remove the SQL access. You could let the front end guys do Create Update Retrieve by primary key using the json interface, while the backend analysis guys use SQL through the normal SQL driver.
I think you misunderstood my point. The "modern" way seems to be to avoid SQL using either NoSQL or something like an ORM.
Personally I prefer to tackle this from the other side with stored procedures and service locators so my application code is free of SQL (aside from the service locator), and all my SQL is in .sql files.....
> This is not trivial to do right now. Top-notch JSON support would make it a breeze
What does it matter if it is JSON or not. In financial transactions you want:
* Maximum fault tolerance
* Auditability
* Security
* Performance
* Ability to roll it back with all the cleanly defined states in between
* Ability to talk to legacy / standard systems
JSON-ifying the contents of the transaction is nice. But I wouldn't think that is the main pain point in the industry at the moment. It is really the behavior of the systems outside the transaction contents itself that matters.
As far as I can tell hstore is one of the most popular extensions to PostgreSQL, so people do indeed care about storing non-relational data in their database.
I'd also add that SPA[1]/client-side applications are also an important consumer of databases with decent JSON support.
The "backend" for these types of applications are often relatively thin API wrappers around a database, with the bulk of application logic on the client-side.
Postgres' JSON improvements over the last couple of years have made it a pretty good choice already for these sorts of things, but the JSON indexing improving would make this a no-brainer.
I think it's valuable to show how an "old, traditional"[1] system like postgres can actually be much more flexible and powerful even working in a fashionable environment.
If nothing else, it's a great exercise to make sure that postgres isn't missing out on a real, fundamental use case. Fashionable technologies don't always have something new to offer, but sometimes they help you see something from a new perspective.
[1] Postgres is very non-traditional in many respects, and I personally believe it's revolutionary; but many people perceive it as more traditional.
Postgres has never been popular. Why the sudden race for popularity now? MongoDB is replacing MySQL not Postgres. I would be interested to know what proportion of Postgres users use the JSON features, I suspect it is still small, even amongst the Node users. The popularity of Postgres is growing at a manageable rate, not an exploding rate and that seems fine to me.
> Postgres has never been popular. [..] MongoDB is replacing MySQL not Postgres
I don't know what you are basing that on, but it's the complete opposite of what I am seeing. Postgres is now the default open source database as far as I can see, especially for Rails. MongoDB enjoyed a brief hype-fuelled day in the sun but is now viewed as a specialised tool whose choice over a more general RDBMS would require considerable justification.
I suspect the question of whether its popularity is "exploding" or not depends on your definition of the word. I have certainly noticed a very marked uptick in the last few years. Not exploding perhaps, but certainly ballooning.
Thats what I meant by "growing at a manageable rate".
Well maybe Postgres and Mongo are replacing MySQL. I haven't seen Postgres being the default, but who knows, I mostly hang around with people who have been long term postgres users anyway.
I suppose my main experience is with rails people, and as of rails 4 postgres is pretty clearly the "default" DB, emphatically displacing MySQL.
The author appears to come from much more of a node.js angle and so probably what he sees is different. I'm probably underestimating the number of node.js projects so the author likely has a good point. Still, I'd raise an eyebrow pretty highly at anyone who considered MongoDB a solid default choice for a general purpose application. Most companies I know who opted for MongoDB regret it, for reasons that have been rehashed here any number of times.
Yeah, I realised after typing it that there's probably a huge amount of availability bias in my statement. It wouldn't surprise me if Wordpress MySQL deploys outnumber all rails/django postgres deploys combined. And the PHP mainstream isn't moving away anytime soon.
MongoDB, though - I do question that, however I have no figures and it again probably depends on the circles in which you move whether you see a lot of it or not. I don't move in node circles much, so I hardly see it at all.
Mea culpa. That said, in the startup-filled tech incubator in which I work, and in which rails is pretty dominant, postgres is king. Make of that what you will.
I work in the enterprise space and MongoDB is very popular. It isn't replacing MySQL or PostgreSQL at all (neither of which I've ever seen) but rather Oracle. And it's big in the higher end internet space as well:
Postgres is the "default" database for django installations.
> MongoDB is replacing MySQL not Postgres.
That's part of the point, if user bases are switching databases, that's a chance for postgres.
> I would be interested to know what proportion of Postgres users use the JSON features, I suspect it is still small
For good reasons since it's still young (first introduced in 9.2) and fairly limited (9.3 did add operator and indexation as well as a bunch of functions). I'm in a postgres shop and we aren't quite ready yet to drop support for all postgres versions preceding 9.3 (to my dismay).
Hence "default". You can use any db you want, but if you follow the documentation or ask which one you ought use, Postgres will be suggested and recommended. Sounds like a default to me.
I've only used Django with SQLite but I would be shocked if it was actively recommending a single database. It would imply that somehow Django doesn't work as well with the others.
the migration document also clearly notes that postgres has the best migration support, although it does not outright recommend it (probably because migration is generally an afterthought and production database has already been decided by then)
South does complain every time a MySQL migration has problems and you get recommended to use Postgres there. Other than that I haven't noticed any preference in the Django documentation in 3 years.
Not a Django guy but my guess is that it has nothing to do with South and all to do with that PostgreSQL supports transactional DDL while MySQL does not.
This only demonstrates that rollbacking an insertion after creating a table within a transaction now works. That is, it shows DDL does not completely break a transaction it's performed in.
According to the documentation DDL remains non-transactional in MySQL 5.5[0]
> The CREATE TABLE statement in InnoDB is processed as a single transaction. This means that a ROLLBACK from the user does not undo CREATE TABLE statements the user made during that transaction.
Not my experience. I have found that PostgreSQL has a remarkably stealthy popularity. I went to a conference a couple years ago and everyone knew PostgreSQL, and even there were more booths advertising Pg services than MySQL services.
Postgres is not popular if we talk enterprise world that has been buying Oracle. Oracle is an immense behemoth in that space with skilled salesmen, and they make lots of money.
I once asked an Oracle salesman what he thought of MySQL and PostgreSQL and he laughed, and basically said they are seen as joke by his customers. Maybe he was brainwashed sure, but I suspect that reflects (reflected? that was 2-4 years ago) the reality in that market.
Now with companies outside the enterprise (startups) and for internal projects, I feel Postgres has been very popular. MySQL effectively become unappealing after its trademark issues. Postgres gained a some of those users. It also gained disillusioned NoSQL people, some didn't RTFM, some were mislead, some didn't understand the problem well enough and so on.
I think this addition is extremely valuable, but courting node users is a weak argument for it. For me, it is more about the growing importance of semi-structured data, as well as other forms of data that are difficult to model outside of row-tuples and joins. Hstore was a hack...really useful, but a hack nonetheless. Hstore2 is a legitimate kv-store. Now if we could just do the same with a graph-store.
The linked post says : "I think we shouldn't release 9.4 unless it goes in."
I haven't been following the PG developers mailing list that closely recently and the article doesn't go into any details as to why it may be delayed.
From the link that @masklinn posted (Thanks, very interesting!) to a hstore presentation, it looks like the hstore side was developed by November last year, so I'm wondering why it wouldn't be ready for release by this September or whether its just the integration of hstore/JSON side of things that is the hold up.
For me, the new PG feature - although its an outside project - which I am most looking forward to in the 9.4 time frame is a newer JDBC driver for Postgres ( https://github.com/impossibl/pgjdbc-ng ).
http://postgresql.1045698.n5.nabble.com/jsonb-and-nested-hst... seems to be the relevant threads. Apparently there are issues over wire format, and jsonb being in core (and unless I'm misreading things jsonb is to be a new type separate from json, which is unwelcome news, I didn't know about that although it does make sense from a "binary copy of database files" sense)
I would like to suggest a modification to the author's thought about an ORM. What would be very cool would be a PG extension that wraps a mongodb compatible Api on top of postgresql.
That should completely short circuit the whole value proposition discussion of pg vs mongo.
This could be done fairly easily, perhaps more easily, as a standalone program, i think. Something that exposes a MongoDB-compatible API, and translates requests into libpq to go to a PostgreSQL backend.
You could run it either on the PG host, or on the same host as the apps wanting to talk to PG.
Based on a quick look at the documentation, i think the API Waterline exposes to clients is not an imitation of the MongoDB API, which would allow PostgreSQL to be used as a drop-in replacement for MongoDB. Rather, it's its own thing.
Waterline does support both MongoDB and PostgreSQL, though, so you could use it as part of a migration strategy: start with MongoDB, install Waterline backed by MongoDB, migrate your app to use Waterline rather than using MongoDB directly, replace MongoDB with PostgreSQL.
Just by observing from the outside, Node.js and MongoDB, don't know what it is about those two platforms, but the communities formed around them seem to attract a large number of people I wouldn't want to interact with. There is a lot of immaturity and drama. Maybe it is the unreasonable and over-hyped marketing, not sure, can't quite put my figure on it.
Go is also a new language and framework that is popular but it has a different feel, a better one. Python so far, has one of the largest and most pleasant community (given its size) I've seen.
Json/KV friendly features are welcome, but perhaps looking to impress Node.js and MongoDB users is just barking at the wrong tree.