JSON vs. YAML – try it yourself (json2yaml.com)
49 points by lfender6445 on April 7, 2015 | hide | past | favorite | 68 comments


This is valid JSON in the same LOC: http://i.imgur.com/gv79uNM.png

It just seems like YAML is making things more implicit. I like explicitness. Not against YAML, but it's no holy grail in my mind.

I'm more experienced with JSON than YAML and could be swayed on this, but so far I'm not. They seem like two roughly equal ways to do the same thing.

This seems oddly similar to the nit-picky problems people have when learning Python from a C-like background and being annoyed that semicolons and flexible indentation aren't the norm. When I spend months working on JS code, looking at Python slightly annoys me, and vice-versa, so I think I get it. I just don't think it's an important, game-changing thing.

I'll be learning about Ansible soon, so I'll have to also learn YAML syntax. I guess I'll have a more educated view once I've gotten deep into that.


It may not look like it but it's just as explicit in YAML as it is in JSON. In fact JSON is valid YAML even: http://viewsourcecode.org/why/redhanded/inspect/yamlIsJson.h...


YAML has a couple big advantages (in the proper situations) over JSON. Self-referencing, complex datatypes, and (my personal favorite) embedded block literals. I've, perhaps unwisely, embedded entire webpages a single indentation level after a variable and retained all of its formatting.

I'm no YAML zealot though, I do use JSON most of the time, but in the past couple years I've started writing most of my config files in YAML where I have the opportunity.


We recently moved much of our application's schema definitions and configurations out of code/JSON and into YAML files (we generate JSON during the build process in places where it is necessary). A pleasant side effect was that the YAML file set began to double as basic documentation.

e.g.

    name: user
    plural: users
    label: User
    ...
    primary_key: uid
    display_key: full_name
    unique_keys:
        - username
        - uid
    client:
        storage:
            type: IndexedDB
    ...
        actions:
            create:
                label: Save
                permission: create_user
    ...
    server:
        storage:
            type: mysql
    ...
        api:
            create_user:
                url: /users
                method: post
    fields:
        username:
            name: username
            label: Username
            type: string
            length: 20
            required: true
            editable: false


This is very interesting.

So is this the equivalent of multiple JSON documents? I'm looking at:

http://yaml.org/spec/1.1/#id857577 and http://yaml.org/spec/1.1/#stream/information%20model

I'm actually having a hard time, even with some google-fu, finding information about how this might be used. YAML->json parsers have failed to handle docs with a '...' separating parts of a document so far. Would this triple-dot structure be used to create multiple JS objects? How would I find more information about how this might be used in practice?

I have a feeling this would be easier to figure out if I just went and played around with it in practice, but it also seems like something that could be documented in more beginner-friendly terms.

I'll probably be kicking myself once I figure it out, but I also only have a cursory understanding of how streams work. Maybe this is an opportunity to fix that. Once I figure it out, I'll respond to this if nobody else has.


I think the GP was using '...' as a standard "stuff removed for brevity" ellipsis.

However, since you asked:

In a YAML document, there is an implicit top-level object, either a hash/dict/associative array or a list/array depending on your formatting.

The '...' marker ends a document and '---' starts one, so together they separate multiple documents, or top-level objects, within a single file or IO stream.

YAML parsers generally stop when they hit the separator, even in languages that can do multiple assignment. If your IO stream behaves like a file handle, you can read it repeatedly into different variables until EOF.

Multiple documents in a single YAML stream are fairly uncommon. I think most people don't know they exist, but I appreciate the flexibility and use it whenever it makes sense.


Actually the '...' is for parts I've omitted for brevity.


The Stack Overflow page is disappointing because in my opinion all of the answers miss the point: YAML is designed to be readable and editable by humans. JSON is only designed to be human readable - and intentionally does not have features to support editing by people.

JSON intentionally does not support comments. Think about that for a bit and you'll realise what JSON is for and what it isn't for.

As SeoxyS said: use YAML for config files, JSON for APIs.
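
To make the comments point concrete, here is a minimal sketch using Python's standard `json` module (my choice of parser, not something the thread depends on). The config text and the `try_parse` helper are made up for illustration.

```python
import json

# A config a human might reasonably want to write, with an explanatory comment.
config_with_comment = """
{
    // retry count for the upload job
    "retries": 3
}
"""


def try_parse(text):
    """Return the parsed value, or the parser's complaint as a string."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        return f"rejected: {err.msg}"


result_with_comment = try_parse(config_with_comment)
result_plain = try_parse('{"retries": 3}')

print(result_with_comment)  # the strict parser refuses the commented version
print(result_plain)         # the comment-free version parses into a dict
```

Some parsers accept comments as an extension, but a strict one like this treats them as a syntax error, which is the friction people hit when hand-editing JSON config.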


Actually, after finding out about TOML[1], I would suggest looking at it. It is a lot simpler than YAML and explicitly does not support dangerous features like deserializing arbitrary data structures, while being very readable. The spec is not entirely stable yet, though.

1: https://github.com/toml-lang/toml


Agreed. After using both YAML and TOML, I'd say that TOML is much more readable than YAML.


TOML does indeed look good, reminds me of an extended .ini format


can you give me a dumbed-down example of 'deserializing arbitrary data structures'? not sure i follow
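
A rough illustration, with a big caveat: this uses Python's `pickle` from the standard library as a stand-in, not a YAML parser, because it shows the same class of problem without extra dependencies. The `Innocent` class is hypothetical. The point is that the serialized bytes, not the loading program, decide which function gets called during loading.

```python
import json
import pickle


class Innocent:
    """Hypothetical class; imagine its serialized form arrived as untrusted input."""

    def __reduce__(self):
        # pickle records "call len('payload')" as the recipe for rebuilding this object.
        # A hostile payload would name a dangerous callable here instead of len.
        return (len, ("payload",))


blob = pickle.dumps(Innocent())

# Loading runs the recorded call: the result is whatever that call returned,
# not an Innocent instance.
loaded = pickle.loads(blob)

# A plain-data format has no way to say "call this function"; it can only describe it.
plain = json.loads('{"callable": "len", "args": ["payload"]}')

print(loaded)       # the value returned by len("payload")
print(type(plain))  # a dict; nothing was executed
```

A format restricted to dicts, lists, strings, numbers, booleans, and null cannot express the first case at all, which is the property being praised above.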


We don't need two different languages for this.


Oh, yes, we do. We need a language targeted at humans that machines can process, and we need a language targeted at machine processing that is inspectable by humans. Those are two contradicting targets: the latter calls for simplicity, but the former calls for shortcuts[#] to make humans' work easier, which adds, not reduces, complexity.

[#] Shortcuts like not quoting keys and omitting braces and commas in hash definitions.


I like both JSON & YAML for different reasons. YAML is a great configuration format; it's fantastic for being read and written by humans. JSON is much better as a serialization format.

Write config files in YAML. Write APIs in JSON.


Why is it a better serialization format compared to yaml?


it's more explicit.


I don't really see how it is more explicit. Can you explain? Because of the braces?


i would say json is better because it is natively supported by browsers. assuming yaml was as well, would it come out a winner?


Biggest thing about yaml for me was that you can have comments. This is very useful especially when you use it for configuration.

For data interchange, I don't see a lot of advantages to yaml though.


One thing that comes to mind is (for data interchange), it allows for custom objects to be explicitly marked as such

http://www.yaml.org/spec/1.2/spec.html#id2805712

additionally, it allows multiple top level objects in one file (which JSON can't do (not counting 3rd party modifications))

other than that, not much other use


> additionally, it allows multiple top level objects in one file (which JSON can't do (not counting 3rd party modifications))

That's just different syntax for a top-level list of objects.
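
A small sketch of that equivalence, assuming Python's standard `json` module. Back-to-back JSON values are not a single valid JSON document, but `JSONDecoder.raw_decode` can walk through them one at a time. The `iter_json_values` helper and the sample data are made up for illustration.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found back to back in text."""
    decoder = json.JSONDecoder()
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            return
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Two top-level objects in one "file", one per line.
stream = '{"name": "user"}\n{"name": "group"}\n'
docs = list(iter_json_values(stream))

# The same data written as a single top-level list.
as_list = json.loads('[{"name": "user"}, {"name": "group"}]')

print(docs == as_list)
```

The practical difference is mostly about streaming: with separate documents a reader can handle each one as it arrives instead of waiting for the closing bracket.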


I hadn't thought about comments - that is a clear advantage to YAML. Great point to bring up, and supports the popular view I'm seeing that YAML is maybe better for config files. I can think of numerous scenarios where a configuration decision could use some context.


For some use cases - and configuration is a prime example - it can make sense to add the comment directly to the data, as another property of the relevant object. that then allows you to use the comment e.g. in a front-end config editor.
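
Something like the following, as a sketch only. The `value` / `comment` key names and the settings themselves are hypothetical, and the parsing uses Python's standard `json` module.

```python
import json

# Each setting carries its own explanation as ordinary data.
config_text = """
{
    "timeout_seconds": {
        "value": 30,
        "comment": "Raised from 10 after slow uploads timed out"
    },
    "max_retries": {
        "value": 3,
        "comment": "Keep low; the upstream API rate-limits retries"
    }
}
"""

settings = json.loads(config_text)

# The application reads the values...
values = {key: entry["value"] for key, entry in settings.items()}

# ...and a front-end config editor can show the comments as help text.
help_text = {key: entry["comment"] for key, entry in settings.items()}

print(values)
print(help_text["timeout_seconds"])
```

The trade-off is that the comment is now part of the schema, so every consumer has to know to unwrap `value`.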


I could see comments being really useful in an API.


HHVM (FB's PHP/Hack) moved from YAML (hdf flavor) to INI for its configuration file format. Sad but true, it was just awful to work with - good that they changed it. https://github.com/facebook/hhvm/wiki/Runtime-options


yaml was awful to work with? can you provide an explanation?


YAML IS INSECURE.

Your parser might not support instantiating arbitrary objects, but those your programs interact with might... :(

Anyone else recall the problems with YAML vulnerabilities? An old blog post by me: http://williamedwardscoder.tumblr.com/post/43394068341/rubys...

So never ever use YAML for tainted input.


One problem I've hit a few times is parsing of strings with escapes in YAML. In JSON it's absolutely clear how escaping works; in YAML, in practice, different parsers do subtly different things.


I have had a similar experience with whitespace handling in yaml: the application configuration was producing an error because tabs were being used instead of spaces ....


Shameless plug: Human JSON, http://hjson.org

A different approach to using JSON for configuration files. It sits between JSON and YAML.


I like the idea. I like that you even round-trip comments. I suppose that doesn't add too much complexity to the implementation?

Have you ever been bitten by the dwimmy typing of numeric and true/false/null data?


No, the whole implementation is actually very simple. It's based on the 'standard' parsers for JS/C#/Python and only differs for handling quoteless strings and the optional syntax.

No problems with dwimmy typing ;) - not saying that it couldn't happen but I think that would be the exception. Not having to use quotes/escape characters helps a lot more.


I hadn't really looked into YAML before, so I fed it some data for a web app I'm working on.

Man, it sure is condensed, but because of that, not very readable to me. If this is the standard for YAML, it's definitely interesting, but not my cup of tea. It's sometimes hard to parse where a list ends or an object begins. If you want to argue that it's more space efficient, I'd say, just use gzipped JSON. If you want to say you're using it for the spacing and/or line breaks, I'd just say find a viewer that prettifies your JSON well.

Having a delimiter, at least when you're representing chunks of data, is really useful, easier to code for, and easier to read than tabs or other systems. This, not so much. It's kind of a hot mess. Maybe that thinking works well in python, but I'm not so sure the principle translates away from code. If I have a list of objects, I really need to know where one thing ends and the next begins without having to keep track of more than one thing at a time.


Your argument is inconsistent. You're saying that YAML is hard to read, but you suggest people who find JSON hard to read ought to use a viewer that prettifies it. Why aren't you applying your own logic to YAML, and finding a viewer that makes it easier for you to read?


YAML requires a specific indentation and spacing to work. There's no changing the layout.

JSON is free to do whatever. You could have everything in one line if you wanted. You could use indentations of 8 spaces or 21 tabs. White space does not matter! Point is, it is not hard to figure out a sensible indentation and line breaking scheme if you need it. This is the beauty of braces when dealing with data.
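
A quick sketch of the whitespace point, assuming Python's standard `json` module. The sample object is made up.

```python
import json

# Everything on one line, no spaces at all.
one_line = '{"name":"user","unique_keys":["username","uid"]}'

# The same data re-emitted with an 8-space indent.
spread_out = json.dumps(json.loads(one_line), indent=8)

# Both layouts parse to identical data; layout is purely cosmetic in JSON.
same_data = json.loads(one_line) == json.loads(spread_out)

print(spread_out)
print(same_data)
```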


Yaml is in fact a superset of JSON. You can parse JSON with a yaml parser. You can also mix json and yaml in the same document and use [..., ..] for annotating arrays and { "bar" : "foo"} for objects. Some yaml libraries allow you to generate yaml that has one style for top level and json style for nested elements / long arrays to improve readability.

However I have also learned the following:

- in python, parsing a yaml file is hundreds of times slower than json (in fact, my benchmark was showing around a 400x slowdown). Therefore you can't really use it in cases where performance matters at least a little bit. A 1k-line yaml file can take more than half a second to load (yrmv)

- if your yaml doc is longer than two screens, it loses its readability benefits.

Therefore it is best to use a whole directory of yaml files, each describing a specific feature. E.g. Ansible is a good example of how to use yaml files.


> Ansible and YAML

I am searching for a replacement for / alternative to Ansible written in Go or Rust that uses JSON or INI as its config format.

(Ansible is written in Python. Both Python and YAML rely on outline indentation which causes many headaches)


He should compare the specs too. I can learn JSON in less than 2 minutes. I cannot say the same about YAML.


And it might just be confirmation bias, but I've seen far more exploits in YAML parsers than in JSON parsers. YAML might be okay for configuration files, but I would never use it for data exchange…


YAML = configuration, JSON = serialization.


I wish they'd called it YACL instead. At least JSON is an accurate acronym!


Hmm, I don't like the fact that strings don't have quotes; 1 vs '1', false vs 'false', with javascript's === operator, it's nice to be explicit about the type of data, since javascript is where most json (and later YAML maybe?) gets consumed...
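
For reference, this is the explicitness being described on the JSON side, sketched with Python's standard `json` module instead of JavaScript. The keys are made up.

```python
import json

# Quoted and unquoted forms of "the same" value, side by side.
doc = json.loads('{"count": 1, "label": "1", "enabled": false, "flag": "false"}')

# The quotes alone decide the type the consumer sees.
types = {key: type(value).__name__ for key, value in doc.items()}

print(types)
print(doc["count"] == doc["label"])  # a number never equals its quoted twin
```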


YAML seems inherently unsafe as it's indentation based, which makes copy-pasting from different levels very difficult. The best (and one of the oldest) serialization format is S-expressions. You have the best of both worlds, compactness and non-significant whitespace.


Comments, trailing commas, and 64-bit integers. That's all I want. Is that really so much? :(
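
On the 64-bit integer wish, a small sketch of where the trouble comes from. The JSON grammar itself puts no limit on number size; the loss happens in parsers that store every number as a double. Python's standard `json` module keeps integers exact, so `float()` is used here as an assumed stand-in for such a parser.

```python
import json

big = 2**63 - 1  # largest signed 64-bit integer

# Python's json module keeps integers exact in both directions.
round_tripped = json.loads(json.dumps({"id": big}))["id"]

# Assumption: float() mimics a parser that reads every number as an IEEE-754 double.
as_double = int(float(big))

print(round_tripped == big)  # exact
print(as_double == big)      # not exact: the double cannot hold all 63 bits
```

This is why APIs that hand out 64-bit IDs often send them as strings.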


I really like yaml. Especially for writing config files. But I still wish there was a cut down version without the bells and whistles that aren't really useful/relevant to human read/written documents. Just a superset of json syntax, not semantics.



I've heard of YAML, it seems ok, but it'll never catch on. JSON is too established.


I got my own small language too: https://github.com/Cirru/cirru-json


I'm surprised nobody mentioned that JSON lacks a date data type.

This is annoying, as not everybody uses ISO 8601 strings for dates.


As several comments here point out, the intended usage of JSON is as a serialization format. In which case, I would expect dates to be in epoch time (if you need to include timezone data, that should be a separate field).

What annoys me is that JSON lacks a binary data type. The best you can do is base64, which really sucks if 99% of your binary data falls into the ascii range, but you have the occasional high bit character and you explicitly aren't trying to treat it as unicode.


"I would expect ..." that's the problem. Expectations fail when both sides don't have the same expectations. As there is no way in JSON to specify what type a field is and as dates aren't even specified in JSON, you aren't sure if dates are in epoch time or any common or uncommon variant of ISO 8601 (or worse, neither of these two options).
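
A sketch of the ambiguity, assuming Python's standard `json` and `datetime` modules. The `created` field name is made up. Both encodings are valid JSON, and nothing in the payload says which one is in use.

```python
import json
from datetime import datetime, timezone

moment = datetime(2015, 4, 7, 12, 0, 0, tzinfo=timezone.utc)

# Convention A: seconds since the Unix epoch, sent as a number.
as_epoch = json.dumps({"created": int(moment.timestamp())})

# Convention B: an ISO 8601 string.
as_iso = json.dumps({"created": moment.isoformat()})

# The receiver only gets "a number" or "a string" and must already know the convention.
epoch_value = json.loads(as_epoch)["created"]
iso_value = json.loads(as_iso)["created"]

decoded_from_epoch = datetime.fromtimestamp(epoch_value, tz=timezone.utc)
decoded_from_iso = datetime.fromisoformat(iso_value)

print(as_epoch)
print(as_iso)
print(decoded_from_epoch == decoded_from_iso == moment)
```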

JSON hasn't been designed for binary data, so it's not surprising that it lacks a binary data type. There are several options besides base64, you could e.g. use yEnc or BSON.

But unless you have really large binary data (in that case, instead of embedding it in JSON, I would only embed a URL and let the client download the data separately), I wouldn't bother with any encoding other than base64. It is easily compressible and this is handled transparently, so you reach such a low overhead that it is hard to justify using a non-standard option like yEnc.
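
A rough way to check the overhead claim, using Python's standard `base64`, `json`, and `zlib` modules. The sample bytes are invented and deliberately repetitive, which flatters the compressor; real payloads will compress less well.

```python
import base64
import json
import zlib

# Mostly-ASCII data with an occasional high byte, as in the parent comment.
raw = b"mostly ascii text with an occasional high byte \xff\n" * 200

# base64 turns every 3 bytes into 4 ASCII characters.
encoded = base64.b64encode(raw).decode("ascii")
document = json.dumps({"blob": encoded})

# Stand-in for transparent transport compression (e.g. gzip on HTTP).
compressed = zlib.compress(document.encode("ascii"))

# Round trip to confirm nothing is lost along the way.
restored = base64.b64decode(json.loads(zlib.decompress(compressed))["blob"])

print(len(raw), len(encoded), len(compressed))
print(restored == raw)
```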


Is compression rate nowadays still a factor? If so, is the performance of YAML worse than JSON?


wow, there are some really amazing points here. the takeaway i've gathered is that yaml is preferred for configuration, with json being the clear winner for data interchange. any suggestions for improvements to the site or content?


I'd already be pretty happy if we moved from XML to JSON. YAML would be icing.


Think again if you really want that for every service.

I've encountered many poorly written services where I would have been happy if there were a schema to be at least sure what kind of messages are exchanged.


JSON is the new XML.


Yes, BUT. The other day someone proposed a JSON 'standard' that mimicked SOAP - don't do that.

JSON is beloved because it's easy and simple. XML was originally simple too, with just DTD as schema. Then they came up with XML-RPC, XMLSchema, XSLT, SOAP and many other complex concepts that in the end more or less failed.


> XML was originally simple too with just DTD as Schema.

The complexity of XML has not changed since its inception.

> Then they come up with XML-RPC, XMLSchema, XSLT, SOAP and many other complex concepts that in the end more or less failed.

This doesn't really have any bearing on the specification of XML itself. XML is simple (with some caveats like entity expansion), and extremely flexible. The problem is that it is designed to be read and written by machines, while still being debuggable by people. It is extremely verbose. It shines in some use cases, e.g. for documents with a complex structure and a lot of semantic metadata. But it is even less suitable than JSON for configuration files, for instance.


the problems with xml (as in "outside the XML-RPC, XMLSchema, XSLT, SOAP stuff") IMHO arise from these three points

* it's a little too verbose (<xml></xml>.. while even S-expressions use just ')' as terminator)

* there is no obvious way to transform any xml to an object (pojo is the best approximation, but how do you differentiate between a sub-tag, a text node, and an attribute?)

* it's essentially typeless (how do you serialize a number? how to differentiate it from a string? from a boolean?)
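
The typeless point can be sketched with Python's standard `xml.etree.ElementTree` and `json` modules. The `<user>` document is made up, and no schema is involved, which is exactly the situation being described.

```python
import json
import xml.etree.ElementTree as ET

xml_text = "<user><age>30</age><active>true</active><name>30</name></user>"
json_text = '{"age": 30, "active": true, "name": "30"}'

# Without a schema, every XML text node comes back as a string.
root = ET.fromstring(xml_text)
xml_types = {child.tag: type(child.text).__name__ for child in root}

# JSON carries the number / boolean / string distinction in its syntax.
json_types = {key: type(value).__name__ for key, value in json.loads(json_text).items()}

print(xml_types)
print(json_types)
```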


I agree that it is verbose, but I am not convinced that S-expressions are the solution. In a complex, deeply nested XML document, you should be able to tell where a given tag is inserted without counting parentheses.

> there is no obvious way to transform any xml to an object (pojo is the best approximation, but how do you differentiate between a sub-tag, a text node, and an attribute?)

That's OK. Just don't use XML to serialize data structures. IMHO, the use cases at which it is good are a lot closer to use cases for which HTML works than when you hesitate between XML and JSON.

> it's essentially typeless (how do you serialize a number? how to differentiate it from a string? from a boolean?)

That's not entirely wrong, but you can enforce a lot of things via XML schemas (eg, something like XSD or RelaxNG).


> you can enforce a lot of things via XML schemas (eg, something like XSD or RelaxNG).

XML is structured text that can be validated. XML's biggest selling point!



The conciseness of XML with the power of JSON!


I wish the author had _not_ used flash for the buttons.


there was difficulty getting the copy feature to work outside of flash. did you run into a specific issue?


Nothing too specific, but I usually have flash disabled on the browser I use the most and had to view the website in a basic Chrome install instead (which I do have installed for these kinds of cases). ;)

I thought, honestly, that html5 api was able to provide the same capability as flash, as far as file/clipboard was concerned.


thanks, i'll investigate



