> jq appears to be largely unmaintained at this point, which is sad
They might be happy to learn that, after several years of halted development and unreleased improvements and bugfixes, ownership of jq was transferred from the original author to a new organization, which allowed development to resume:
The real lesson here is that sometimes building the correct interface calls for one to build an entirely new grammar to better express the task at hand.
Languages like Lisp make this easy. Ruby a little less so. C-type languages way less so!
DSLs in Java, C or JavaScript are so atrociously ugly and illegible that it almost defeats the purpose of a custom grammar, IMO.
C actually wins the family DSL award because it has a preprocessor with macros, a foot gun if I ever saw one!
My pet theory is that LLMs will soon be competent enough to build the tools for a little language without much hassle, allowing for software to better express the task at hand.
When jq came out about a decade ago I fell victim to a nerdsnipe and decided to try to mimic its syntax by building a parser in JavaScript: https://github.com/tjoekbezoer/jsq. I never completely finished it, nor used it in a professional context, but I learned a fair bit and it works quite well.
It's old code (still using Google's Java-based compiler), so be gentle please :)
It can rather be argued that antirez chose his mini-language so that it fits into a small amount of code. Indeed, such mini-languages are not uncommon in C: not only printf and scanf, but also, for example, Python's PyArg_ParseTuple and friends [1]. It is also common that they are underdocumented, have unexpected pitfalls (for example, this selector library won't tell you that you have a typo in the typecheck string), have strange contortions to work around C's lack of features (for example, this library reserves * in the keys because C makes string concatenation difficult, so you cannot use that character verbatim), and tend to be difficult to change because everything is a bare string.
ADDED (just in case the comment sounds like mindless C bashing): I wanted to highlight the point that this particular design is not general, but stems from the language choice in many ways. Those weaknesses largely overlap with those of DSLs in general, so another way to phrase this is that C kinda fosters the creation of DSLs, which is an interesting observation! By comparison, the language choice itself is not a concern for me; I have written a number of sizable programs in C, after all.
json1 [1] is basically part of SQLite3 these days and I don't see the -DSQLITE_OMIT_JSON flag on this code.
I find it very strange, on a minimalist system, to build and bundle yet another thing when the same functionality is already "there" (not sure whether by mistake or intentionally).
I also trust things in the SQLite3 codebase way more than my own code.
My knowledge of SQLite features kind of got frozen in time when I stopped doing python work as frequently as I used to a few years ago. I recall lamenting that I couldn't use their fancy new features because I was more or less constrained to whatever version Python bundled for its default. Can I now trust modern python installations to have native-json enabled sqlite built in?
> My knowledge of SQLite features kind of got frozen in time when I stopped doing python work as frequently as I used to a few years ago. I recall lamenting that I couldn't use their fancy new features because I was more or less constrained to whatever version Python bundled for its default.
Python uses an SQLite3 shared library; all you need to do to upgrade the SQLite3 engine is replace the shared library (.DLL on Windows, .so.0 or something like that on Linux). You've never been locked into a version of the SQLite engine (well, if SQLite4 came out, you'd probably have to wait for Python to support that instead of only SQLite3, but...)
> Can I now trust modern python installations to have native-json enabled sqlite built in?
JSON has been compiled in by default since SQLite 3.38. My Python 3.10 installation (3.12 is the current Python) has 3.40 bundled, so, depending on what you mean by "modern", that seems reasonably safe (but, again, you aren't limited to what is bundled).
OTOH, the JSON5 support is only from 3.42 onward, so even reasonably modern Python might not have that built in.
> Can I now trust modern python installations to have native-json enabled sqlite built in?
There is no absolute guarantee because distros can use any supported SQLite version to build Python (though Debian [1] and thus Ubuntu have long enabled -DSQLITE_ENABLE_JSON1), but you can expect that any Python version released shortly after 2022-03-16 [2] supports JSON functions in the Windows and macOS builds.
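For anyone who would rather check their own installation than guess, here is a minimal sketch using only the standard `sqlite3` module. The function name `sqlite_json_support` is made up for illustration. It reports the SQLite version Python is linked against and probes whether `json_extract` is actually usable.

```python
import sqlite3


def sqlite_json_support():
    """Return (sqlite_version_string, has_json) for this Python installation.

    Hypothetical helper name; it relies only on the standard sqlite3 module.
    """
    conn = sqlite3.connect(":memory:")
    try:
        try:
            # Probe the JSON functions directly instead of trusting the version.
            conn.execute("""SELECT json_extract('{"a": 1}', '$.a')""").fetchone()
            has_json = True
        except sqlite3.OperationalError:
            # Raised as "no such function" when JSON support is not compiled in.
            has_json = False
    finally:
        conn.close()
    return sqlite3.sqlite_version, has_json


if __name__ == "__main__":
    version, has_json = sqlite_json_support()
    print(f"SQLite {version}, JSON functions available: {has_json}")
```

Probing the function is more reliable than comparing version numbers, since a distro could build an older SQLite with JSON enabled or a newer one with it omitted.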
jq has a very simple `jv_getpath()` that is a smaller "language" still (the "language" being an array of string and integer indices). I believe Jansson and others also have similar get-path functionality. They don't have the asterisk feature that this has (which is not like .[] in jq), and which looks admittedly quite useful (in terms of ease of use but also in terms of heap allocation avoidance). OTOH this doesn't allow access to non-ident-like object fields except via the asterisk feature. Neat!
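To make the "array of string and integer indices" idea concrete, here is a rough Python sketch of that kind of get-path lookup. It is not jq's or Jansson's actual implementation; the name `get_path` and the `default` parameter are assumptions for illustration.

```python
def get_path(value, path, default=None):
    """Follow a list of string keys and integer indices into JSON-like data.

    Hypothetical helper: strings index objects (dicts), integers index
    arrays (lists). Any mismatch or missing step yields `default`.
    """
    for step in path:
        if isinstance(step, str) and isinstance(value, dict) and step in value:
            value = value[step]
        elif isinstance(step, int) and isinstance(value, list) and 0 <= step < len(value):
            value = value[step]
        else:
            return default
    return value


if __name__ == "__main__":
    doc = {"users": [{"name": "ann"}, {"name": "bob"}]}
    print(get_path(doc, ["users", 1, "name"]))
```

Because the path is already a list, keys that are not identifier-like (spaces, dots, asterisks) need no escaping, which is the trade-off against a string-based selector.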
Goto in C is the tamed and civilized structured-programming goto; it doesn't have much to do with the "considered harmful" unstructured goto from days of yore (e.g. you can't jump out of a function into a completely unrelated part of the program with C's goto like you can in BASIC).
It's the normal use of goto in C: you run into some exceptional circumstance, you don't know how to deal with it, so you goto some code that cleans up any unfinished business you might have, like memory allocations – and then you return some error code as soon as you can, once everything is in a good state.
'goto' is more universal and may make for cleaner code even in loops. E.g. to loop backwards without 'goto' one has to resort to an idiom like this:
for (i = n; i--; )
and I wouldn't say this idiom is that clear. You can get used to it, of course. Another interesting case is printing a list of items with separators. C loop constructs are tailored to the most frequently used kinds of loops, but they do not cover all possible cases. 'goto' is unbiased.
Python loops are a similar case. There are “Pythonic” ways:
for item in iterable:
    ...
They look sweet but are useful only in some cases. E.g. if you need to modify `iterable`, the Pythonic way won’t work. There are Pythonic workarounds, but they may involve copying a whole iterable, which is hardly a good idea. But there is also a less Pythonic way:
i = 0; n = len(iterable)
while i < n:
    ...
This works all the time and serves all possible cases.
> This works all the time and serves all possible cases.
Unless you want the check to be always against the initial length of your iterable, rather than its modified state, you'd have to repeat the `n = len(iterable)` inside the loop, or introduce some syntactic caramel of the walrus kind (though I'm not sure if it would work here).
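A small sketch of the cursor loop with the length re-checked on every pass, here deleting items from a list in place. The function name `remove_even_in_place` is made up for the example.

```python
def remove_even_in_place(items):
    """Delete even numbers from `items` without copying the list."""
    i = 0
    while i < len(items):      # len() is re-evaluated, so it tracks deletions
        if items[i] % 2 == 0:
            del items[i]       # do not advance: the next item slid into slot i
        else:
            i += 1             # only step forward when nothing was removed
    return items


if __name__ == "__main__":
    print(remove_even_in_place([1, 2, 2, 3, 4]))
```

Caching `n = len(items)` before the loop would run past the end here once anything is deleted, which is exactly the pitfall described above. Note that `del` on a list shifts the tail, so this avoids a copy but is not free either.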
Yes, all this works. These are Pythonic ways. But see, here the `enumerate` function will allocate a new number and a tuple for the number and the item. Then `i, item` will unpack that tuple and deallocate the tuple memory. All this happens quickly (in a loop it may be essentially the same memory for every tuple) and CPython is well optimized for that, of course, but it is not free.
With an integer cursor (`i`) there are no unique numbers for each item and no temporary tuples. This is slightly cheaper. And also more flexible, which is the main point I'm trying to make. Other loop constructs are meant to be used in a certain way and will not work otherwise. Integer cursors are not meant to be used in any special way; you can combine them as you wish. They give the maximum possible flexibility from the beginning. You can loop in any direction, by multiple items or by a variable number of items, over many iterables at once, changing the iterables as you go, and so on.
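As one illustration of walking "many iterables at once" with independent cursors, here is a sketch that merges two sorted lists. The name `merge_sorted` is an assumption for the example (the standard library's `heapq.merge` covers the same job with iterators).

```python
def merge_sorted(a, b):
    """Merge two already-sorted lists using one integer cursor per list."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        # Each cursor advances at its own pace, depending on the data.
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # At most one of these slices is non-empty.
    out.extend(a[i:])
    out.extend(b[j:])
    return out


if __name__ == "__main__":
    print(merge_sorted([1, 4, 9], [2, 3, 10, 11]))
```

The point is that neither cursor is tied to the loop construct: which one moves, and by how much, is decided inside the body.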
(A special case in Python are list comprehensions; these are useful because most of that “comprehending” happens inside CPython and thus they are faster.)
There are languages that tried to make these kinds of loops a bit more explicit, like Ada with the reverse keyword and named loops/blocks, which, used with the exit keyword, make for more compact, succinct loops that still read without mental gymnastics.
These kinds of choices don't come alone, but together with the 'step' keyword, the opinionated absence of the 'continue' keyword and the availability of inner/embedded functions/procedures.
For me, classic foreach iterators that mask the actual iterator fall apart when I need some kind of 'join' operation (doing something specific for the first element, the last element, and the elements in between).
It's known that many developers implement various flavors of similar simple languages. What would be nice is to have a library of functions that makes implementing such languages much simpler than it is in C. One of the benefits of the J language, and of other APL descendants, is the set of built-in operations that really simplifies working with structures of this kind, even though that is not their only purpose.
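A quick sketch of that 'join' case with an explicit index, where the first, last and middle elements each get different treatment. The function name `join_human` and the choice of separators are made up for illustration.

```python
def join_human(items):
    """Render items as 'a, b and c', treating first/middle/last differently."""
    parts = []
    last = len(items) - 1
    for i in range(len(items)):
        if i == 0:
            sep = ""          # first element: nothing in front
        elif i == last:
            sep = " and "     # last element: final conjunction
        else:
            sep = ", "        # everything in between: plain separator
        parts.append(sep + str(items[i]))
    return "".join(parts)


if __name__ == "__main__":
    print(join_human(["a", "b", "c"]))  # prints: a, b and c
```

With a bare `for item in items` the loop body cannot tell where it is in the sequence, so the position has to be smuggled in some other way; here the index carries it directly.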
The query string is treated as a format string in the JSON lib, whilst the similar SQLite wrapper uses varargs.
Here it is easier to use a format string.
Btw please let's not argue about this program being written in C—it's not the spirit of the OP and it's mostly a boring digression.