Hacker News

JSON is a serialization format that was based on the data structure notation for JavaScript. It is, however, only a serialization format. JavaScript objects are a superset of JSON: they can contain arbitrary objects and functions, which JSON cannot, and "true" JavaScript object notation can elide quote marks or use apostrophes for keys, whereas JSON strictly requires double quotes around keys.
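To make the difference concrete, here is a small Python sketch using only the standard `json` module. The helper names `is_valid_json` and `can_serialize` are made up for illustration; the point is just that a strict JSON parser rejects the quoting styles JavaScript accepts, and that functions have no JSON form.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as strict JSON (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


def can_serialize(value) -> bool:
    """Return True if json.dumps accepts the value (hypothetical helper)."""
    try:
        json.dumps(value)
    except TypeError:
        return False
    return True


# Double-quoted keys are valid JSON.
print(is_valid_json('{"name": "x"}'))
# Unquoted or apostrophe-quoted keys are fine in JavaScript source, not in JSON.
print(is_valid_json("{name: 'x'}"))
print(is_valid_json("{'name': 'x'}"))
# A function value cannot be expressed in JSON at all.
print(can_serialize({"callback": len}))
```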

The problem that arises in Python and other dynamically-typed languages is that there exists a default deserialization that is so good it very strongly tempts the programmer to use that exclusively. However, as good as the default deserialization may be, it's also quite dangerous, for the reasons you mention and more. In strongly-typed languages there's a stronger focus on parsing the JSON instead, which has the advantage of producing objects with stronger guarantees. (That isn't quite the same as saying the typing itself is what helps; you could theoretically get the same guarantees in Python with some sort of JSON schema library or something.) It has the disadvantage of generally being more challenging, since it's hard to beat the conciseness of "json.loads(s)" in Python. There generally is a default deserialization in strongly-typed languages too, but it's far more likely to become inconvenient if you need anything beyond simple numbers and strings, and in my experience people generally learn to prefer true deserialization. The exception, alas, is when they start their program out from day 1 inputting and outputting JSON, accidentally structure their entire program around the default JSON structures, and end up with the exact same problems as you'd get in Python. But as long as JSON isn't the very first thing to go in, you're generally in better shape.
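As a rough illustration of the two styles in Python, the sketch below contrasts plain `json.loads` with a hand-written parse step. The `User` type, its fields, and `parse_user` are hypothetical names invented for this example, and a real project might use a schema library instead of hand-rolled checks.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class User:  # hypothetical record type, for illustration only
    name: str
    age: int


def parse_user(text: str) -> User:
    """Decode JSON, then check shape and types before building a User."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object")
    name = raw.get("name")
    age = raw.get("age")
    if not isinstance(name, str):
        raise ValueError("'name' must be a string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(age, int) or isinstance(age, bool):
        raise ValueError("'age' must be an integer")
    return User(name=name, age=age)


# Default deserialization: concise, but nothing checks that "age" is a number.
loose = json.loads('{"name": "Ada", "age": "36"}')
print(type(loose["age"]))

# Parsed deserialization: malformed input is rejected at the boundary.
print(parse_user('{"name": "Ada", "age": 36}'))
```

With the first style the string "36" travels into the rest of the program unnoticed; with the second, the same input raises `ValueError` before any other code sees it.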

(I have witnessed a Java program primarily written as a map of string to map of string to map of string to string. It was unsalvageable. And for all the cynicism I may occasionally muster, I don't say that often, because refactoring can be pretty powerful in Java, but this was beyond help. It actually had no JSON in sight, but the same fundamental forces were in play.)

Personally, despite preferring the more strongly typed approach in most ways, I must confess that when I'm working in Perl I am generally unable to resist the temptation to just JSON::XS::decode_json, and cover over the differences with unit testing rather than dealing with "true" deserialization. I make myself feel better by also telling myself that if I do anything else, I will confuse my fellow programmers who don't generally expect to see fancy deserialization routines when dealing with JSON, which is true enough, but in my heart I still know guilt.



> In strongly-typed languages there's a stronger focus on parsing the JSON instead

I think you mean _statically_ typed languages here. Python is a strongly typed language.


The definition of "strongly-typed" that Python conforms to is a useless one, because very few weakly-typed languages exist anymore, and even those are only very, very partially "weakly typed" by having operators that are defined to do automatic coercion, generally only between strings and numbers, which isn't even the way the original "weakly typed" was meant. The only truly "weakly typed" language I know of that is still extant is assembler/machine language, which can never really go away, where all you ever have are numbers, and thus absolutely nothing stops you from adding a string pointer to the first element of some structure. (Modulo some distinctions that still may exist between floats and integers and such... even assembler isn't as weak as it used to be, though it is still by no means a strongly-typed language.)

So I don't use the useless definition. By any useful definition of strong vs. weak typing, that is, one that actually creates two or more non-empty, non-trivial sets of members in the universe of discourse, Python is a weakly-typed language.


You can use whatever definition you want in your own head, but you cannot expect anyone else to accept it.

Furthermore, one of the most popular languages in use today is weakly typed: JavaScript!


Here's a plausible definition: Strongly typed languages do not allow programs to escape the type system (e.g. put an integer into memory, then treat it as a float, or vice versa, as in the famous Quake fast-inverse-sqrt hack). Oh, look, Python is strongly-typed and C is weakly-typed.

Note that I do not endorse using "strongly-typed" to mean this definition, or any other. There are no useful definitions of this phrase, don't use it, except when correcting people.
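For what the definition above looks like in practice, here is a hedged Python sketch. Ordinary Python code has no cast that reinterprets an integer's memory as a float; the nearest standard-library equivalent is an explicit round trip through bytes with `struct`. The function name `float_bits` is invented for this example, and this leaves aside escape hatches such as `ctypes`.

```python
import struct


def float_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an unsigned int."""
    # Pack as a 32-bit little-endian float, then unpack the same 4 bytes as uint32.
    return struct.unpack("<I", struct.pack("<f", x))[0]


# Value conversion: int -> float changes the representation, not just the label.
print(float(1))

# Bit reinterpretation has to be spelled out explicitly via bytes.
print(hex(float_bits(1.0)))
```

In C the same reinterpretation is a pointer cast or a union away, which is the distinction that definition is drawing.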


I guess you need a term for "typing of moderate strength"


I'm not sure I prefer the strongly typed approach: I come from Lisp, so the approach is: "read it, validate it, and then wrap it in functions to hide the implementation in case we change it."

This works well in Lisp, where the line where objects end, and structs, lists and functions begin is hazy at best. Besides, with a bit of wrangling, you could probably just pass your validated serialized data to the object constructor as the arguments. Or you could just write a struct, which is simpler than an object, provides O(1) access, and you can still probably easily pass your data structure, or something close, into the constructor as the arglist.
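The "read it, validate it, wrap it in functions" approach translates loosely outside Lisp too. The sketch below is Python, not Lisp, and every name in it (`read_point`, `point_x`, `point_y`, `Point`) is hypothetical; it only illustrates the shape of the idea, including passing the validated data to a struct-like constructor as its argument list.

```python
import json
from typing import NamedTuple


def read_point(text: str) -> dict:
    """Read, then validate; return the plain decoded dict unchanged."""
    data = json.loads(text)
    if not isinstance(data, dict) or set(data) != {"x", "y"}:
        raise ValueError("expected an object with exactly 'x' and 'y'")
    if not all(isinstance(data[k], (int, float)) for k in ("x", "y")):
        raise ValueError("'x' and 'y' must be numbers")
    return data


# Accessor functions hide the representation in case it changes later.
def point_x(p):
    return p["x"]


def point_y(p):
    return p["y"]


class Point(NamedTuple):  # a lightweight struct-like type
    x: float
    y: float


p = read_point('{"x": 1, "y": 2}')
print(point_x(p), point_y(p))
# The validated dict can be handed to the constructor as keyword arguments.
print(Point(**p))
```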



