JSON API (jsonapi.org)
227 points by steveklabnik on May 4, 2013 | 135 comments



I think it's great that we're talking about standardising how we build RESTful JSON APIs. However, I don't think this has got it quite right yet.

What's the reason for the top-level rel? It seems like it's just there to stop the URLs from being repeated and to save space, but isn't that what gzip is for? Why complicate the data format and require all that extra logic, when gzip would remove most of the redundancy before transmission anyway?

Also the name is a bit of an annoying land grab. It'll make it hard to talk about JSON APIs without getting them confused with JSON APIs that specifically use "JSON API".

Lastly, it really seems based on a Rails ActiveRecord-style data store: it's assuming IDs are the most important thing and that links are all relations that point to other objects within the system. Proper hyperlinks can point anywhere and can link together disparate systems which don't necessarily all use the exact same formats.


> What's the reason for the top-level rel? It seems like it's just there to stop the URLs from being repeated and to save space, but isn't that what gzip is for?

It also makes it possible to cache things locally indexed on their IDs, and to form URLs that make requests for just the precise documents that aren't available locally. In order to achieve this, it's necessary to have both (1) IDs, and (2) a way to convert a list of IDs into a single request for all of the documents at once.
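To make that concrete, here's a minimal sketch (the key names and URLs are illustrative, following the examples elsewhere in this thread): a top-level rel carries a URL template, documents reference each other by ID, and the client only expands the template for IDs it hasn't cached yet.

    {
      "rels": {
        "posts.author": "http://example.com/people/{posts.author}"
      },
      "posts": [{
        "id": "1",
        "title": "Rails is Omakase",
        "rels": { "author": 9 }
      }]
    }

A client that already has person 9 in its local store makes no further request; a client missing several authors can expand the template once with all of the missing IDs.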

> Also the name is a bit of an annoying land grab

HAL is "The Hypertext Application Language". In general, people tend to be using generic names for these things, so I chose an available, generic name.

> Lastly, it really seems based on a Rails ActiveRecord-style data store: it's assuming IDs are the most important thing and that links are all relations that point to other objects within the system

I reviewed a large number of server-side solutions (Firebase, Parse, CouchDB, Django, Rails) and they all had the concept of an ID for the document. As I said above, this ID is useful for keeping track of which documents have already been cached locally, and for formulating a URL that requests just the missing documents. I don't consider this solution to be particularly tied to ActiveRecord.

Less importantly, it is also more convenient to cache documents on the server using their IDs (or slugs, or whatever the storage wants to use), and allow a top-level configuration to define how to generate URLs. This allows server-side solutions to serialize and cache documents without having to be plugged into the router architecture, but enforces a URL-centric view once the HTTP response is built.


The thing is that "JSON API" was not available, it's already in common use to describe APIs that use JSON. "HAL" is totally not the same, it's clearly a name chosen to not conflict with existing terminology.

The proper layer for caching is HTTP, do we really want to end up with duplicated overlapping functionality between the layers? I see you have an application specific need for a certain thing but I don't think it generalises enough.

A few other things I noticed:

- There's no namespacing of your special properties; they're just mixed up with the domain-specific properties. So I can't have a property of my object named "rel", and worse still, you can't ever add features to the spec without breaking backwards compatibility.

- No type information in items, which kind of breaks automatic caching.

- Multiple entities returned for a URL with no indication as to which is the main one represented by that URL and which are just extra associated ones.


> The proper layer for caching is HTTP, do we really want to end up with duplicated overlapping functionality between the layers?

HTTP caching doesn't work well with compound documents that represent a graph of objects. The kind of caching described in JSON API allows an application to group together requests for documents while avoiding making requests for documents it already has. The only way HTTP caching works is if every request for a document is 1:1 with an HTTP request, which doesn't map onto my experience (and our experience with Ember Data) at all.
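As an illustration (the payload shape is assumed from examples elsewhere in the thread), a single compound response like this bundles three kinds of documents under one URL, so an HTTP cache can only store or expire it as a unit:

    GET /posts/1

    {
      "posts": [{ "id": "1", "rels": { "comments": [5, 12] } }],
      "comments": [{ "id": "5", "rels": { "author": 9 } },
                   { "id": "12", "rels": { "author": 9 } }],
      "people": [{ "id": "9", "name": "@d2h" }]
    }

ID-based caching lets the client keep the person and comment documents even after the cache entry for /posts/1 is gone.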

> I see you have an application specific need for a certain thing but I don't think it generalises enough.

This functionality was extracted out of a general purpose framework used by a number of applications that made heavy use of it. The ability to send a normalized graph of objects in a single payload, and then make a small number of additional requests, only for the documents that the client doesn't already have, is a huge win for clients working with a non-trivial number of related objects.

The rest of your concerns are very valid. FWIW, I envision future extensions being added to the `meta` section, which is already reserved. ID, URL, and rel are the building blocks of the graph, while other kinds of extensions (like pagination, etc.) are optional metadata. But you raise a good point and I will give it some thought :)


> This functionality was extracted out of a general purpose framework used by a number of applications that made heavy use of it

That's exactly it! This looks like a sensible design for an API served from Rails and consumed by Ember.js. It's just overselling it a bit to imply that this should be how all JSON APIs should be.


I don't think "served from Rails" and "consumed by Ember.js" is quite right. It was designed to work with existing servers that have facilities for easily generating JSON (Rails, Django, various Node frameworks), and smart clients that want to index local data by ID (Angular, Backbone, Ember, etc).

In short, it's an attempt to extract the learnings about efficiently transporting a non-trivial number of objects over a REST-like transport in a sometimes-incremental way. Many different server/client combinations have been attempting to do this in an ad-hoc way for years, and Ember Data was simply an attempt to try something general-purpose out in the real world.


"HAL" describes something specific and has uniqueness to it- seriously dude, pick a name that's something short of "this is the internet," even if it is doing something core such as defining interlinked data in 21c.

How's that, Interlinked Json, IJ? Relational Json? Seriously, pick something that's not going to muddy the highest level of namespace we swim in, wycats.


> It also makes it possible to cache things locally indexed on their IDs, and to form URLs that make requests for just the precise documents that aren't available locally. In order to achieve this, it's necessary to have both (1) IDs, and (2) a way to convert a list of IDs into a single request for all of the documents at once.

Couldn't the id column contain a canonical url then? E.g.:

    {
      "posts": {
        "id": "http://example.com/posts/1",
        "title": "Rails is Omakase",
        "rels": {
          "author": "http://example.com/people/9"
        }
      },
      "people": [{
        "id": "http://example.com/people/9",
        "name": "@d2h"
      }]
    }


I think the biggest benefit from this kind of specification would be for data (partly) distributed and (partly) shared between different hosts, as there would be at least some common ground for both clients and servers on how to communicate. IDs are very abstract and do not necessarily tie data to a particular host. Of course, for some data domains the ID could be a URL, but that is a decision made by the data provider. Making it a spec decision would be crippling for overall use.


I'd say it's the other way around. URLs are opaque to the client; IDs imply more knowledge of the implementation. E.g. the client would have to know which host to communicate with and how to construct URLs from IDs. With hyperlinked documents, all the client needs to know is HTTP.


URLs are opaque, and can often serve as very useful IDs, but, alone, they imply a one-at-a-time model of fetching documents, and this spec is trying to provide a way to easily request only the documents a client needs in a compound document.

Keep in mind that this spec actually requires that every ID be able to be readily converted into a URL based on information found in the same payload, so URLs are still front-and-center in the design. It just separates out the notion of a unique identifier, so that it can be used in other kinds of requests.


How would you combine multiple authors into a single request? With IDs + a template, you can form a single request for all of the "people" you don't have yet. If you use URLs as IDs, you lose that ability - and combining requests is one of the primary goals here.
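For example (the URL shape is assumed), if the client is missing people 9, 12, and 17, a template like /people/{people.id} expands the list into a single comma-separated request under RFC 6570's rules:

    GET /people/9,12,17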


If I understand your question correctly, I would say that http pipelining solves that issue. It can only be used for GET requests, so there are limitations.


At least if you're going to dive into the transport layer, pick a capability that hasn't been added and then unceremoniously backed out- https://insouciant.org/tech/status-of-http-pipelining-in-chr...

SPDY begins to offer some very appealing alternatives, where, when sending a document, the transport can push all of the individual dependent documents. It really does fix things - it begins pushing all the data at once, in a glorious resource-oriented fashion. That said, I would also enjoy a spec that does resource description of sub-resources, so we can send linked data around without having to have every piece of data be an endpoint.

That said, the immediate follow-on question arises: now that we're sending sub-resources, can we get the most important agent to understand and grok them? Can the browser follow our sub-resources to their canonical URLs, and serve those sub-resources if it's seen them inside another document? There are two questions. First, is your spec good enough to enable that facility, where addressing can be well known? Here, in the JSON resource-description spec presented, yes, via URI templates - very good. And second, does the browser bother to inspect the JSON it sees? No? Well, I'm not super bothered by this academic interest not being materialized, knowing that at least in principle the specs make it possible.


Thanks for the link - I wasn't aware there were so many issues with pipelining. I have mainly used it server to server, where it seems to present fewer problems (not surprisingly, really - I'm in much better control of the chain of components).


It can only be used for GET requests, has problems in web browsers, and still requires the overhead of individual requests on the server side to construct and return many responses.

In theory, things like pipelining allow you to never have to worry about compound documents. In practice, I don't know anyone who has gotten this to work well for browser clients and general-purpose frameworks when dealing with non-trivial numbers of documents.


I see.

The server-side support seems like the wrong thing to base the protocol design on, but I admit that is probably just me showing my limited experience with very high-traffic APIs. I would think, though, that much of this could be alleviated by proper caching. As requests would naturally be finely granulated into individual resources, presumably that could be done efficiently.

The point of browser support is probably more pressing. I'm curious as to how big an issue that still is? Which browsers support it properly these days and which don't?

I wonder if it would be worth building an API around the assumption of support for pipelining and then providing a fallback hack for those that lack support. E.g. something similar to the good old _method hidden-field hack for lack of HTTP method support. I'm thinking something like an optional "batch request endpoint" that would tunnel through multiple requests, similar to what a pipeline would do. I believe Facebook is offering something similar in their APIs.


N.B. Pipelining can be used for any idempotent request (so PUTs and DELETEs work too). That said, lack of broad implementation support for pipelining is still an issue. Since HTTP/2.0 is being based on SPDY, hopefully we'll see a day where this is less of an issue.


It amazes me how programming languages and APIs look more and more like Lisp. Modern languages copy essential features from Lisp, and JSON, one of the most popular data formats in JS, looks almost identical to Lisp s-expressions.

Someday also more people will realize how useful and effective the equivalence of control structures and data really is.

JSON: { "posts": { "id": "1", "title": "Rails is Omakase", "rels": { "author": 9, "comments": [ 5, 12, 17, 20 ] } } }

LISP: (posts (id 1) (title "Rails is Omakase") (rels (author 9) (comments (5 12 17 20))))


Yeah. Well. Also known as "in the end, everything is just an abstract syntax tree". But while a few people will always delight and excel in reading and writing everything as s-expressions, many of us will probably always find either line-breaks or different kinds of braces, brackets and little syntactic doodads make for easier and saner writing and reading -- even if the parser needs to do a bit more work.

No matter what kind of code I'm looking at: "could I express this as s-expressions?" Sure. "Would I want to?" Hell no.


> "could I express this as s-expressions?" Sure. "Would I want to?" Hell no.

Of course it is possible to implement syntactic sugar in Lisp which supports JSON style expressions. DSLs are common in Lisp, and that actually became a weakness of Lisp (so-called "DSL hell").

The interesting thing about s-expr is that Lisp doesn't need special data conversion tools to handle them. Even control structures are expressed as s-expr, and they can be created and modified dynamically which means that even code can be exchanged at runtime on the fly.


This is one of the most attractive things about Lisp: the fact that the language has no notion of compile time, eval time and runtime. The user just doesn't have to care about it. Very powerful.


Everyone says that, but the thing I loved most about dabbling in elisp was how much nicer things become when your text editor works at the s-expression level. Moving around and manipulating a lisp program is just plain easier than it is in nearly any other language.


This actually isn't true for Common Lisp.

There is a distinction between reader macros and compiler macros, for example, which is relevant for allowing the use of special syntax to be optional for end users.

Certain things also need to be defined if you want them to be available in the compile-time environment. And, sometimes you have to do a bit of extra work if you want to have literal objects in your compilation environment and pass them to runtime.

Check out http://www.lispworks.com/documentation/HyperSpec/Body/03_bc.... though it will probably take you a few readings to make sense of it; I know it did for me.

But for the most part things happen automatically.


> easier and saner writing

No. Just no. I could agree about reading, but writing is much easier with s-exps. There are just fewer symbols to type. And remember, you can still put newlines and indent however you want.


It's JavaScript; that's the syntax notation for JavaScript objects. Sure it resembles Lisp and Python, but that's just how languages come to be. If I were to invent a new language, I'd likely use {} for dictionaries and [] for lists too.

Lisp is harder to read than JavaScript because it doesn't disambiguate between lists and dictionaries in a clear way like JS and Python do.


I think it's more about the tree data structure than about LISP. LISP just makes the underlying data structure obvious, but as you implied we are far from seeing any LISP-semantics in JS(ON).


Or in Rebol:

  [posts: [id: 1 title: "Rails is Omakase" rels: [author: 9 comments: [5 12 17 20]]]]
See relevant|related|interesting blog post On JSON and REBOL by Carl Sassenrath - http://www.rebol.com/cgi-bin/blog.r?view=0522


S-exps are "just" data structures, so that you are reading s-exps when you see other data-structures presented is no surprise.

This is easy to identify: nearly all JSON is just atoms, though - very few JSON definitions encode any kind of machine-works or program code.

So, to put it another way: it amazes you how all Lisp looks like data structures. And hopefully I've helped let you know why you are so amazed here, thanks for reading.


Or in Clojure:

  { "posts" { "id" "1", "title" "Rails is Omakase", "rels" { "author" 9, "comments" [ 5, 12, 17, 20 ] } } }
Commas are unnecessary; I kept them for clarity.


I'm not sure I like this syntax. It's unfamiliar both to Lispers and "mainstreamers". The former will be irritated with { and commas; the latter will try to insert : everywhere.

I read an interview with Rich Hickey where he said that adding these three syntax constructs reduced cognitive burden on programmers. He meant () for lists, [] for vectors and {} for hashes.

I should implement it in Racket and just see how it feels.


There was a discussion on the mailing list concerning colon in maps, here's the JIRA issue [1]. IIRC it was kind of decided to let it die, but maybe someone will implement it.

Personally I think that differentiating between lists and maps is important, and the way Clojure does it (as well as JSON) is quite easy to write and read. FWIW, the example given by bitcracker is broken because there's no way to tell what the following means:

   (rels (author 9) (comments (5 12 17 20)))
Is the outermost element a list or a map?

[1] http://dev.clojure.org/jira/browse/CLJ-899


Wouldn't it be better to use keywords here?:

  { :posts { :id 1 :title "Rails is Omakase" :rels { :author 9 :comments [5 12 17 20] }}}


Yes, this is the usual way in Common Lisp.



I've read a lot of RFCs and drafts for media types lately, and what strikes me reading this spec is the very liberal use of MUST, which seems to me like an unnecessary violation of the robustness principle that Jon Postel first introduced in RFC 761 (TCP). Mike Amundsen describes it in his book 'Building Hypermedia APIs […]':

    Media type designers should keep Postel in mind. Designers can make
    supporting the Robustness Principle easier for implementor by keeping the
    number of MUST elements in the compliance profile to a minimum. The fewer
    MUST elements implementors need to support, the more likely it is that they
    will be able to craft compliant representations using that media type.


Postel's law is the worst kind of law ever made in IT. Take a look at how many security bugs come from being 'accepting' of bad HTML/CSS/JavaScript/PDF files.

Be a nazi in what you accept; fail loudly and early enough and there will be no problem for the user to correct his error - it is only when you have 50,000 PHP pages full of echo statements that you can't change anything. If it had complained when he made the first one, he wouldn't be in that situation now.

And don't tell me that nobody would use it then. C++, C# and Java are the top most-used languages and they are all anal-retentive.


Yes, I want to reduce some of the MUSTs, or at least justify them more strongly. Right now they're based on what our running code absolutely needs.


Indeed. In my experience, looser requirements in this kind of thing just leads to tears on the part of client and server implementations. I'll happily reduce some of the MUSTs to SHOULDs or MAYs if it makes sense for the communication channel to consider them optional.


That's great to hear. One particular example which I found needlessly strict is this:

    The request MUST contain a Content-Type header whose value is
    application/json.  It MUST also include application/json as the only or
    highest quality factor.
It makes sense for a fully compliant implementation to have those headers, but the way I understand MUST here is that a server would reject any request without them.


I've filed an issue for you about this: https://github.com/json-api/json-api/issues/2


It's great to see this laid out in a single place.

Couple of things that might be nice to see here:

* Pagination concerns

You call out "meta: meta-information about a resource, such as pagination", but that doesn't say whether things are 0/1 based, what the names of the values for per-page are, how to indicate length of the underlying collection, etc.

* Search concerns

I don't know whether this is an area that has best practices yet, but having it said and decided on something called "jsonapi.org" could save many people many hours of pain in the future.

* Elective compound documents

Bit more of a reach, but there have been a bunch of times I wanted to say "this resource, and these relations of it, and those relations of those". And in some cases partial documents (id and name alone) of those tertiary relations.


I am literally sitting on a plane right now, but as soon as I get off, will be registering this type with IANA. So that helps #2.

#3 will be do-able I think. I'm interested in this too.

#1 is something that needs a good answer, yes. I THINK it's out of the scope of this, as there are already registered REL values that handle this, but we should clarify.

I will be happy to answer anyone else's questions after I land, it's time to turn off electronic devices.


I worded #2 really, really poorly.

I'd love to see a spec for how to search a collection against an API that is jsonapi compliant. Different folks are all over the place here.

The base URL for search is different for different API producers. I like

  GET /teams?homecity=los+angeles
but have also seen things like

  GET /search/teams?homecity=los+angeles
or

  GET /teams/search?homecity=los+angeles

How to specify mildly complex search parameters is also unclear in the wild. For referenced objects, "/teams?homecity.state=ca" seems to fit with the style used for URL templates.

For bounded queries, should it be mongodb style?

  GET /teams?homecity.population.$lt=100000&homecity.population.$gt=100
And since we're talking about searches and pagination, do we need to specify a parameter for ordering results?

  GET /teams?wins.$gt=3&$order=homecity.population.$desc


> GET /teams/search?homecity=los+angeles

Wouldn't this be:

GET /teams/los+angeles

We could have query-able attributes; this is where I think the OPTIONS verb should come into play. So I would send OPTIONS /teams. This returns some JSON with the meta information about the collection, for instance: `{ key: 'wins', type: 'string', description: 'some api description', sortable: true, filter: true, group_by: false }`. This should probably hint at the abilities of the attribute (sortable/filter/group_by). This really raises the question of what a good querying API for collections looks like. I think our answer may be in the form of a SQL-like interface.

GET /teams?wins[gt]=1&losses[gt]=1&wins[sort]=desc&limit=10&offset=0

SELECT * FROM teams WHERE wins > 1 AND losses > 1 ORDER BY wins LIMIT 10 OFFSET 0

So what about supporting ORs or unions and joins? Well, ORs are simple; they could act like a named scope.

GET /teams?wins_or_losses[gt]=1&wins[sort]=desc&limit=10&offset=0

SELECT * FROM teams WHERE (wins > 1 OR losses > 1) ORDER BY wins LIMIT 10 OFFSET 0

What about a Union or Join? These should be handled by a new endpoint (joins) or avoided by querying individual collections (unions).


Isn't this what OData is trying to establish?

Quick intro video http://www.odata.org

Query formatting documentation http://www.odata.org/documentation/odata-v2-documentation/ur...

JSON representation documentation http://www.odata.org/documentation/odata-v2-documentation/js...


It's close in a lot of ways. I didn't want to mention a specific format and get bogged down in discussing it, but rather focus on what I think a JSON API standard should be.

So, yes, I guess, in some ways, more OData like.


> I'd love to see a spec for how to search a collection against an API that is jsonapi compliant.

I would imagine they'd use OpenSearch in some form. Not 100% sure yet.

> The base URL for search is different for different API producers.

You've hit upon the reason why this needs to not be ad-hoc anymore; a few dozen ways of searching isn't good!


Of all people.

What about hypermedia instead of rels?

I feel like the `ids` param is a hack; clearly the system has a group of objects, so should that not be its own collection?

For instances:

`GET /friends?ids=1,34,54`

Could be:

`GET /friends/best.json`, where `best`, to the system in some form, represents 1,34,54.

I don't want a RESTful JSON API. I want a REST JSON API.


Things bothering me in this proposal: the need for the relation; the use of "posts/1,2,3" to represent a list of resources; the "resource template". None of these are needed with REST/Hypermedia.

I've been using djangorestframework, and one of the things that I love about it (instead of tastypie) is the "Browsable API" renderer/parser mode. It's an excellent way to see if your API makes sense: if I'm not able to discover/manipulate resources just by clicking on the links or posting forms, it is a sign that the API design is ill-conceived.


I'm sorry, I'm very confused here. Why does hypermedia let you get rid of link relations?

How is a 'resource template' different than a <form> hypermedia control in HTML?

Why does the URI format matter? It's hypermedia, that's basically irrelevant.

It seems you want a "REST"ful API, and that's great. Build that. But then why is the first sentence seemingly upset about some sort of imagined lack of hypermedia support?


> What about hypermedia instead of rels?

... can you elaborate a bit more on what this means? I don't understand what you're trying to say.

> I feel like the `ids` param is a hack; clearly the system has a group of objects, so should that not be its own collection?

I'm not 100% sure what you mean here either, but I'm reading it as "Why not use a comma rather than passing a list of GET parameters?"

The answer is "I don't think that's particularly important either." Constructing your own URIs is against the very spirit of REST. Let the server do that for you.


I think 1qaz2wsx3edc means naming the key "hypermedia" instead of "rels". I don't like "rels" because it's hard to pronounce (do I say relationships or rails?) and because it's an abbreviation. Unfortunately I can't think of something better.

When you're competing against nothing, lack of elegance is a difficult problem: the need to be elegant is stronger, and you can't simply be as elegant as, or more elegant than, your competition.


You shouldn't ever request ids from a server; you should request an id -- as a single item -- or a collection defined by the server and named (such as user/friends.json, not users?id=foo,bar,baz,foobar).

Basically, REST APIs map exactly one resource to a URL and should never use the hack that is ?.


That is simply not true. Can you provide me with some sort of citation on this? Fielding doesn't talk about URL construction in his thesis, and, in fact, specifically mentions things like collections and multiple entities residing in one resource.

Also, ? is not a 'hack', I don't know where you're getting that from either.


I used URL Templates (RFC 6570 - http://tools.ietf.org/html/rfc6570) so that the request to the server is not ad-hoc, but instead fully described by the resource body.
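As a small illustration (key names assumed from the examples in this thread), the template lives in the payload, so the client never invents a URL shape on its own:

    "rels": {
      "posts.comments": "http://example.com/comments/{posts.comments}"
    }

Expanding {posts.comments} with the list [5, 12, 17, 20] yields http://example.com/comments/5,12,17,20 - the request is fully determined by the document plus RFC 6570's expansion rules.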


What if I've selected 5 arbitrary people to unfriend?

`PUT /friends/???.json`


OP here.

Indeed. And importantly, this API makes it possible to restrict the query to a list of documents that the client doesn't already have from another source. In practice, once you start using compound documents with an identity map, there are many cases where existing documents exist on the client and shouldn't be re-fetched.

One really simple example is having a number of documents that are related to "people" documents. For example, imagine a blog with blog posts and comments, each of which points at an author. When the list of comments for a post is downloaded, each comment will point to an author. Some of those authors will be the same, and some of them will already have been seen from previous posts. This structure allows the client to request only the authors that have not yet been seen (and if you're lucky, that may even be the empty set!).


Yes, the first point is valid, data could be re-fetched.

Compound documents or unions/nested data should just be avoided. I think strong normalization is a good idea.

In the second case. The flow could be:

    GET /posts
    GET /posts/1/comments
    GET /posts/1/comments/authors

Comments only need to rel to authors, which can be loaded next. Data-binding for the win.

I'm not against embedded data as an optimization. It's difficult to fit.


Think of it like your trash bin: you move the files there, one at a time, then empty it. State is captured, then invoked. This is only necessary for transactional requirements.

Another method is having a collection sorted and sliced.

DELETE /teams?id[gt]=10&id[lt]=13

As for non-sequential IDs or random IDs, it could be possible to support ?id[]=12&id[]=9&id[]=4. Sure, it could be /teams/1,2,3 - this is just a small detail of the parser. I'm more interested in how a collection is queried and represented.


Actually, the current Ember Data protocol uses ?id[]=12&id[]=9&id[]=4. There's no good reason for the extra verbosity; it's pretty easy to split over commas and the current format is tricky to parse for platforms that don't already understand it.



I second this; what's the point of not using HAL?


There are several reasons I chose not to use HAL:

* HAL embeds child documents recursively, while JSON API flattens the entire graph of objects at the top level. This means that if the same "people" are referenced from different kinds of objects (say, the author of both posts and comments), this format ensures that there is only a single representation of each person document in the payload (see the sketch after this list).

* Similarly, JSON API uses IDs for linkage, which makes it possible to cache documents from compound responses and then limit subsequent requests to only the documents that aren't already present locally. If you're lucky, this can even completely eliminate HTTP requests.

* HAL is a serialization format, but says nothing about how to update documents. JSON API thinks through how to update existing records (leaning on PATCH and JSON Patch), and how those updates interact with compound documents returned from GET requests. It also describes how to create and delete documents, and what 200 and 204 responses from those updates mean.
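As a rough sketch of the first point (HAL's `_embedded` key is real; the payloads are otherwise illustrative), the same author embedded under two comments appears twice in HAL, while the flattened form carries a single copy referenced by ID.

HAL:

    { "_embedded": { "comments": [
      { "body": "Nice post", "_embedded": { "author": { "id": 9, "name": "@d2h" } } },
      { "body": "Agreed", "_embedded": { "author": { "id": 9, "name": "@d2h" } } }
    ] } }

JSON API:

    { "comments": [
        { "id": 5, "body": "Nice post", "rels": { "author": 9 } },
        { "id": 12, "body": "Agreed", "rels": { "author": 9 } }
      ],
      "people": [{ "id": 9, "name": "@d2h" }] }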

In short, JSON API is an attempt to formalize similar ad hoc client-server interfaces that use JSON as an interchange format. It is specifically focused around using those APIs with a smart client that knows how to cache documents it has already seen and avoid asking for them again.

It is extracted from a real-world library already used by a number of projects, which has informed both the request/response aspects (absent from HAL) and the interchange format itself.


Well, I'm guessing your second and third points could be tacked on to HAL.

I see your point with the first one, although I must say in our APIs we have rarely encountered duplication. And the recursive nature of HAL makes it really easy to generate on the server-side.

(By the way, we've introduced a sideloading convention that only when you pass along ?embedded=author,comments those sub-resources will be present under _embedded. This way the clients can easily request only what's needed.)


Wouldn't (most of) the shortcomings of HAL you mention be overcome by Collection+JSON [0]? It seems better suited as an already existing media type that accomplishes what you're trying to do with JSON API.

[0] http://amundsen.com/media-types/collection/


This is a reasonable response.


Speaking of 200 vs 204: is there any way for the client to signal that he's OK with just a 204 after updates/deletes?


This protocol seems to be a solution for Ember; there are already other very similar protocols. Why is this called jsonapi.org and not emberjsonapi.org?

As noted in other comments, there is JSON HAL.

There is also OData (you might not appreciate that it is an MS initiative, but it's pretty well established and has many providers): http://www.odata.org/libraries/


JSON HAL is a document format only; it does not formalize a protocol. JSON API is a solution for any "smart" client that is capable of caching documents and intelligently limiting subsequent requests. In general, I believe it will be broadly useful for JavaScript frameworks (and native libraries) that want to abstract the nitty gritty of how a document comes over the wire from its "model" representation.


I think this is an excellent proposal. My only comment would be to clarify, next to the references to Ember, that this isn't exclusive to Ember (it is kind of obvious given the name/domain, but I think it may trip some people up).


Got it; so how would you compare it to OData?



:) yes i'm very familiar with it...


And [this][1] looks similar to JSON API to you?

[1]: http://www.odata.org/documentation/odata-v2-documentation/js...


The format is not identical! I was looking for a philosophical comparison between the standard (OData) and jsonapi; I am not seeing much more than a simplified response format (a transform, really).

I was hoping to see something more substantive in a comparison other than: 'results are returned under "d" instead of directly under "posts"'.

Most projects need to answer the simple question of 'why?' - 'why do I as a project exist?'; jsonapi.org's answer is full of 'Ember', hence why my original comment hinted that it should possibly be called 'emberjsonapi'.

If this is intended to be generic, it should ditch references to Ember and instead refer to similar standards, explaining 'why?' it is better than them.

By all means, if your intent is to make a more elegant standard (again, comparing to OData) then that's a worthy goal. Stating _that_ will help your 'consumers' understand what they are getting.

edit: clean up


The references to Ember are in the introduction only and provide some historical context for the project. I believe that it's important for standards to come out of real experience, and that the context of that real experience will help others understand the goals. That's why I provided the historical background.

JSON API's design is based around a smart client that wants the ability to avoid making unnecessary requests for documents it already has, and to provide a format that avoids unnecessary duplication in compound documents. It also aims to be relatively easy to implement, both on the client and server, using tools and frameworks that are already widely in use using familiar idioms.


But you haven't yet answered Akamel's question of why this exists, with comparisons to the existing standards (OData in this case).

Everything I've seen just says that this is simply a subset of OData's functionality. And the features that are being asked for by users commenting on the OP are ones already provided by the existing standard.

And what about querying the data? There have already been questions elsewhere in this thread about how this would work. OData already provides very rich query support, including document projection (returning a subset of a document) through to complex queries navigating multiple relationships (I could easily ask, through a GET query string, for all the actors who have starred in films that belong to the comedy genre - navigating 3 collections: actors, films, genres). If queries are out of scope, then fine, but it's clear people want to query their data.

So please answer what this offers over existing standards if you want to compete.

Oh, and in response to your earlier post of what OData JSON looks like, you're much better served by linking to v4, not v2 as you did:

http://docs.oasis-open.org/odata/odata-json-format/v4.0/cspr...

That doesn't look so dissimilar.


Thank you; That clears it up :)


Is there any reason the top-level objects are represented as an array of objects vs. an object keyed on ID? Sure, the ID would then have to be a string, but I feel that keying on ID with direct lookup far outweighs having to search for your item each time.

I feel that if you have an author with many comments:

    { authors: { '1': { id: '1', comments: [1, 2] } }, comments: { '1': {}, '2': {} } }

it would be easier to say results['comments']['1'] instead of performing some type of search across them each time.


Ordering. The ECMAScript standard does not specify property enumeration order.


The related documents are indexed by ID from the primary document, so ordering doesn't matter, and it makes sense to use the format suggested by calebio, as it's both more efficient and terse. However, the primary document and other nested entities which require ordering should return an array of objects.

Edit: clarification


We did something similar to this, alexkcd, at Mavenlink using our Brainstem gem (you can see an example of the JSON at https://github.com/mavenlink/brainstem).

We generate a results array that is ordered based on the default order or supplied order. You can easily iterate over that then do direct lookups in the top level keys.


I may be missing something, but how is searching an array of JSON objects more efficient than direct lookup by ID?


I agree with you. "OP" was referring to your suggestion. I edited my previous comment to clarify.


OOP here. There are a lot of these kinds of decisions that need to be made for an API like this. In general, I went with what we're already doing if there was a toss-up. A big strength of JSON API, imho, is that it's an extraction from a real world system that a number of people are already using in some form.

It's important to note that the goal of JSON API is to be consumed by a general-purpose client (like Ember Data), so the JSON will likely be processed once and indexed as needed. In the system we extracted this from, the Array is loaded into a Store, which indexes the documents by type and id, so future lookups are quite efficient.


That's a fair point, but using arrays only where ordering matters adds nice semantics that can be used by a general-purpose client. Using a map for related documents makes it explicit that document entries are unique and unordered (it's implicitly assumed to be true in OOP's case).

In addition, you'll find that you're using an array at the toplevel only for the primary document when it is a collection, and for nested collections within documents (such as comment ids). Semantics that general-purpose clients can make use of.


Also a fair point.

The original reason for using Arrays (and something that still carries some weight with me) is that people expected Arrays to be presented in the particular order returned by the server. Indeed, the semantics of a to-many relationship need to be set-like (in order to avoid nasty concurrent modification issues), but people really wanted the ability to return an array and have it "work as expected". In general, the right way to handle position, imho, is to use a `position` attribute and sort on the client. After saying all of that, perhaps this is a good reason to use ID indices, so people don't get the wrong idea.
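A tiny sketch of that convention (the payload is illustrative): the server returns a set, and the client derives the order from the attribute rather than from array position.

    { "comments": [
      { "id": 12, "position": 1, "body": "First!" },
      { "id": 5, "position": 2, "body": "Nice post." }
    ] }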

I'll sleep on it :)


I'm not saying you shouldn't use arrays altogether, just that you shouldn't use them when you have uniqueness & no order.

For example, consider the posts.comments.users relationship. Here "posts" is the primary document (and let's say a collection), "comments" and "users" are related documents. The same user may have commented in multiple posts within a single response, so which `position` attribute would you use in the related "users" document? The answer is you don't, you can't, because the user appears in different positions in different posts. The order of comments is defined by the "post" document's "comments" array. Each comment document contains a user id. You look up the user from the unique "users" map by that id. There is no order that makes sense for related documents, since by nature of being related documents their entries may appear in different places/positions in the parent document, where they are already referred to from ordered collections (e.g. arrays of ids) or singular fields.

Hope that clears it all up :)


First, I think it's awesome that you guys are documenting this for others to re-use, whether it ends up being the one true format or not. Having options to choose from is doubtlessly good.

Some thoughts:

1. I'm not sure about the name. There will definitely be many JSON APIs that don't use (your) JSON API for a long time, even if this becomes hugely popular. I don't see how this will not lead to avoidable confusion in the future. Given that it's very document-centric, why not use something like "JSON Doc API" or similar?

2. In the ID approach, why are the base URIs the client needs to know about not always discoverable, e.g. using standard link relations? Or phrased differently, why would I ever not want to use the "URL Template Shorthands" approach mentioned later?

3. Why use application/json and not something more specific? I can see some reasons, but would be interested in yours.

4. On creation, if I accept the pain of generating an ID on the client and can construct the URI using the template, why can't I use PUT instead of POST?

5. If I use a POST to create something, why don't I get a 201 Created with a Location header?

6. I'd suggest to upgrade the "MAY" for caching to a "SHOULD".

/edited to match @steveklabnik's numbers


1. All media types are 'document centric.' And real REST APIs serve up documents. So it seems fine to me. Also, IANA did not have an 'api+json' type registered (until I registered one last night), so the name isn't taken.

2. If you're transitioning _to_ this kind from some sort of older kind. Remember, this is extracted from real, working software; it's not some sort of thing we imagined up. Not everyone is super on the hypermedia bandwagon yet, and some will need to transition kind of slowly.

3. I filed for 'application/vnd.api+json' yesterday, and so we'll be changing the document as soon as the IANA gets back to me.

4. You could, in theory. Allowing PUT seems fine, it just doesn't often seem to be the case, so we didn't include it. I wouldn't mind having that in there.

5. You should be, this is an oversight.

6. That's very possible.


> 1. All media types are 'document centric.'

What I meant is that this is a particular kind of backend API, a very "model-centric" one. Nothing wrong with that, I just don't think this is the one and only kind and thus should take on the generic name.

> 2. If you're transitioning _to_ this kind from some sort of older kind.

Understood. Maybe an approach is to allow for this to be specified optionally, with the fallback of being hard-coded if it's not present?


I think that the examples appear to be a 'model-centric' one, because we're trying to reach people that build very model-centric sites as of now, but resources can be anything, so I don't think that it's super specific. This is a good thing to think about though.

> with the fallback of being hard-coded if it's not present?

See above for some other good stuff about the IDs that wycats knew that I wasn't as current on.


> I'm not sure about the name

Good feedback. Many people have said this, and I'm thinking about an alternative.

> Or phrased differently, why would I ever not want to use the "URL Template Shorthands" approach mentioned later?

You can think of the ID-based approach as just coming with a set of default URL templates in a top-level rel. The primary reason I included it (and I considered not including it), is that it may be easier for servers, when getting started, to adhere to both strict URL naming and skip generating URLs in the JSON serialization layer of their application. I'm working on some tooling for Rails that will pretty much eliminate these considerations, but I weighed ease of server-implementation when I built this. Again, don't think of the ID form as being URL-less, think of it as coming with a default, easy to implement URL template.

> Why use application/json and not something more specific? I can see some reasons, but would be interested in yours.

Two reasons: (1) Many existing clients and servers already support easy generation of JSON requests and JSON responses. (2) I haven't yet registered an alternative MIME type.

> On creation, if I accept the pain of generating an ID on the client and can construct the URI using the template, why can't I use PUT instead of POST?

Good point. That seems fine.

> If I use a POST to create something, why don't I get a 201 Created with a Location header?

Also good point, and an embarrassing oversight on my part. That should be how it works.

> I'd suggest to upgrade the "MAY" for caching to a "SHOULD".

Hmm. You think a server SHOULD use HTTP caching? RFC terminology is pretty dodgy, but caching really seems more like an optional feature ("One vendor may choose to include the item because a particular marketplace requires it or because the vendor feels that it enhances the product while another vendor may omit the same item") than a strong recommendation ("This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course").

Thanks for all the feedback. It was really valuable.


> You can think of the ID-based approach as just coming with a set of default URL templates in a top-level rel.

Then I'd be perfectly happy. But why is the paragraph that mentions exactly this ("... The top-level of a JSON API document MAY have the following keys") in the "URL-Based JSON API" section?

> You think a server SHOULD use HTTP caching?

No, you're probably right - a SHOULD is too strong. I guess my reaction was more negative to the "MAY" in your text about caching than anything else. The caching section (which is currently in the working document?) doesn't seem to actually add much value beyond what HTTP says anyway. Maybe dropping it is the easiest path?


I've filed an Issue for you about PUT: https://github.com/json-api/json-api/issues/4


In terms of http://en.wikipedia.org/wiki/Linked_data , there are a number of standard (overlapping) URI-based schema for describing data with structured attributes:

* http://schema.org/docs/full.html

* http://schema.rdfs.org/all.json

* http://schema.rdfs.org/all.ttl (Turtle RDF Triples)

* http://rdfs.org/sioc/spec/

* http://json-ld.org/

* http://json-ld.org/spec/latest/json-ld/

* http://json-ld.org/spec/latest/json-ld-api/

* http://www.w3.org/TR/ldp/ Linked Data Platform TR defines a RESTful API standard

* http://wiki.apache.org/incubator/MarmottaProposal implements LDP 1.0 Draft and SPARQL 1.1


The Twitter API originally used pages, but they realized it was a mistake: https://dev.twitter.com/docs/working-with-timelines. The way the Facebook API does it is a lot more sane: http://developers.facebook.com/docs/reference/api/pagination...

I think that you should specify the format for cursor based paging of resource collections. One way to do it would be to require a url to get more results:

    {
      "posts": [...],
      "meta": {
        "next": "/posts/search?q=baseball&after=1234"
      }
    }
Another option would be for it to be a key/value pair that must be added to the url:

    {
      "posts": [...],
      "meta": {
        "next": "after=1234"
      }
    }
Either way, REST clients should treat it as a meaningless string.


Currently, we don't say anything about searching. That's really an application-level concern, not something that needs to be in this spec.

(So you'd define your own rels and use them, doesn't affect this level of abstraction.)


Agreed. If I standardize pagination, I'll be using a "since" token, not pages (this is actually something Ember Data already supports, but weakly).


A few additions that I'd like to see:

* Standardized paging

* Optional side-loading - GET /albums.json?include=artists,songs

* Multiple meta elements - so we'd have "albums_meta", "artists_meta" and "songs_meta" in the example above. This allows us to include 'has_n' relationship paging data.


Thanks for the feedback :)

Standardized paging seems to come up a lot, so it seems like a good thing to add once the core spec stabilizes. Optional sideloading also has come up a few times, and seems easy to add as a MAY in the spec. I need to flesh out the meta stuff in general, and the ability to have "meta anywhere" as well as top-level metas is coming.


Nice, thanks. I've been building an alternative to ActiveModel::Serializers which provides these features.

https://github.com/RestPack/restpack-serializer

In the light of your proposals, I'll either implement JSON API or switch back to AM:Serializers. Side-loading, paging and Ember Data compatibility are my main goals.


I'd love to have your feedback from what you've learned about building stuff with restpack. https://github.com/json-api/json-api/ is the repo, please open up issues for any questions/comments :)


I've filed an issue for you about sideloading: https://github.com/json-api/json-api/issues/5


This is important. We've just been implementing a SOAP interface to our software, and have been discussing the best way to provide a JSON API using the same mechanism. The sticking point is probably that there is no standard way to define an API like there is with SOAP.

One question, though: I notice that the language they use is very similar to a typical RFC. Is this an RFC? And if not, why not? I'm a little naive about the process for submitting them and getting them accepted, but it would be great if this ended up as an official RFC that could be referred to just like SOAP/WSDL.


I strongly suggest you have a look at OData. The latest developments define a similarly lightweight JSON data format. It does everything this proposal does, and more (methinks the things this doesn't do will likely be added as time goes on, simply reinventing the wheel - e.g. see the discussions above on pagination or rich query support).

To address your issues around defining a standard JSON API - OData is itself the standard API for any OData service. All that would differ between your OData service and mine is the schema and the data inside. How you explore that schema and access that data is what OData defines.


I find the MUST/SHOULD/MAY (NOT) wording so natural and obvious that I end up using it pretty much anywhere it's applicable, and I assume I'm not alone in that. I do sometimes edit out the caps, though.


RFC vocabulary, and indeed, the entire format is perfect for writing specifications. And software people have become good at reading them. So it's common to see it outside IETF standards and docs.


1) I'm assuming this is Poe's Law trolling about SOAP. If not, then I apologize, but I'm not sure I can assume good faith on this one.

If you're not, then this is significantly different than SOAP, so no, this isn't what you're looking for.

This is not an RFC. It may eventually become an internet-draft and then (hopefully) an RFC.


What timing! I just released the first version of a rails engine that automatically builds APIs to match Ember Data. It also uses active model serializers. Check it out here: https://github.com/southpolesteve/api_engine

I would love to get some feedback on the initial version. I will definitely be implementing more of the OP's spec this weekend.


This is great, but I can't seem to find the NoSQL-flavored support. It seems great for relational-style data models, but (I'm sure I'm missing something) I didn't see support for fully nested document models. Did I entirely miss the point?


The implementation detail of "SQL or NoSQL" should not bubble up to your API. The data store you use is totally irrelevant, that's the entire point of encapsulation.


You've just summed up what bothers me about this format.


This sounds to me as if you're not really in search of a loosely coupled hypermedia API schema, but an RPC mechanism.


Exactly the opposite: I don't want a scheme that's tied to assumptions based on relational databases.


The PATCH mechanisms seem like RPC to me. Having an operation that is passed in the payload, i.e. "replace", is awkward.

Why can't you just PATCH a resource?

    PATCH /resource

    {
        "src": "newvalue.png"
    }


The `PATCH` mechanism is an HTTP verb (RFC 5789: http://tools.ietf.org/html/rfc5789) using a standard patching mechanism (RFC 6902: http://tools.ietf.org/html/rfc6902). Both are RFCs that seemed like good foundations to build on.
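For anyone who hasn't seen RFC 6902, a request in that format looks like this (the resource and field are illustrative):

    PATCH /posts/1
    Content-Type: application/json-patch+json

    [
      { "op": "replace", "path": "/title", "value": "Rails is Omakase" }
    ]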


Ah, I hadn't heard of RFC6902. I'll give it a read.

I'm still of the opinion that it might be overkill to use only replace from that RFC when you can just PATCH the actual field to change.

Is there some bit of wisdom or experience that I'm missing?


The main reason was to unify patches to attributes with patches to relationships, which do require richer semantics.

It also makes it really easy to add a compound PATCH (updates to posts/1/title, posts/1/rels/author, posts/2/body, etc. all at the same time) in a single format. Once I bought into JSON Patch for the rest of this stuff, I figured I may as well use it for attributes :)
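Since a JSON Patch document is just an array of operations, the compound update described above fits in a single request body (the paths are taken from the examples in this comment):

    [
      { "op": "replace", "path": "/posts/1/title", "value": "New title" },
      { "op": "replace", "path": "/posts/1/rels/author", "value": 9 },
      { "op": "replace", "path": "/posts/2/body", "value": "Updated body" }
    ]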


Interesting, how can I get more involved with how this is going to shape up?



You'll see me link to a repo soon with the text of this.


It's hard to do key removal that way, you end up reasserting all of the other fields in that subdocument, whose values may have changed since your last retrieval.

There's also some value in the patch format looking somewhat different to the main format, to avoid confusion.


Ah, key removal is indeed much easier this way.


The biggest advantage, though, is the same as any reason why you follow a standard: JSON-patch and the PATCH verb are understood by everyone, and so re-using them to take advantage of generality is better than inventing your own special take on updating JSON documents.


This is fantastic, looking forward to promoting it for widespread adoption once it's stable.


That scheme is horrible. ? should never be part of a REST-like URL and you certainly shouldn't request more than one ID at a time -- the data you need to show should be included in the JSON string.


> ? should never be part of a REST-like URL

Why would you say something like that? There's no basis for that at all. From a REST POV, URIs are just opaque identifiers, the characters they're made up from don't matter a bit.


I always interpreted /collection?xxx=yyy as adding constraints to a collection - that's perfectly "REST-like" (actually URIs don't really matter, but I suppose you're talking about conventions). And in that respect /collection?ids=1,2,3,4 is indeed just a subset of /collection

For a single resource you can still use /collection/1


I was a little confused that an "author" rel would be retrieved from a "people" resource. Is there a way to define that authors are people, other than the client knowing this?


1. I think that this is a typo.

2. This spec is for protocol-level semantics; you define your application semantics with a profile link in the meta section.


Needs an introduction paragraph, e.g. what the purpose of the site is.


Please at least include a referer link in the response.


That'd normally be included in headers, it's not really relevant to a media type.



