I'm not sure about a cheese being prescribed as a cure... And also, I'm very tired of all those articles that talk about yet another new 'miraculous food'.
Very true. I've noticed in the past couple of years that the quantity of these kinds of articles on the BBC seems to be increasing, which is a shame because I've long considered them 'above' this kind of journalism. Just looking at their front page it seems over half of the articles are like this. On the other hand, I can understand why - people click and share these kinds of articles much more frequently, so I imagine it makes their advertisers very happy.
That paper is pretty good, but it's comparing the bw-tree with much simpler in-memory data structures. I think bw-trees might work especially well for fast disk storage.
This was my interpretation as well. I'm going to compare a disk-backed bwtree with a disk-backed ART, both backed by the same pagecache, and maybe end up with an ART that scatters partial pages on disk, bwtree style. But I need to measure apples to apples on the metrics that matter for storage first. The pagecache is where most of the complexity is in my implementation, and it makes building different kinds of persistent structures on top of it pretty easy. docs.rs/pagecache
I understand the benefits of a single numeric type, but a type that doesn't support full 64-bit integers is just painful. It makes the documentation smaller and the benchmarks look good, but it often becomes a problem in real-world use.
Wren does support 64-bit floats. As for 64-bit ints, I find their absence most awkward for interop with other languages, such as when using protobufs. If you control both ends, you can avoid this.
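To make the interop problem concrete, here is a minimal Python sketch (not Wren code) of what happens when a 64-bit integer has to pass through a 64-bit float. The helper name `survives_double` is made up for illustration. The underlying fact is that an IEEE 754 double has a 53-bit significand, so not every integer above 2**53 is representable.

```python
# IEEE 754 doubles carry a 53-bit significand, so integers beyond 2**53
# are no longer all exactly representable.
MAX_EXACT = 2**53


def survives_double(n: int) -> bool:
    """Return True if n is unchanged by a round trip through a 64-bit float.

    Hypothetical helper, named here for illustration only.
    """
    return int(float(n)) == n


print(survives_double(MAX_EXACT))      # True
print(survives_double(MAX_EXACT + 1))  # False: rounds back to 2**53
print(survives_double(2**63 - 1))      # False: a typical int64 maximum
```

This is why a 64-bit ID coming out of a protobuf field can silently change when the receiving side only has doubles; if you control both ends, sending such values as strings is one common workaround.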
The storage engine is, and always was, a fairly heavily modified asynchronous version of sqlite's btree. It's been extremely reliable, which was always our top priority, and the performance isn't bad. But honestly, when there was a problem with it, our development velocity in improving it wasn't great.
It's super easily pluggable[1], so now that it is open source people can experiment with other engines. I think there is a lot of room for improvement. Also architecturally it's designed in anticipation of being able to run different storage engines for different key ranges and for different replicas. For example, you might keep one replica in a btree on SSD (for random reads) and two on spinning disks in a log structured engine.
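As a rough illustration of the "different engines for different key ranges" idea, here is a toy Python sketch. It is not the project's actual storage interface: `StorageEngine`, `DictEngine`, and `RangeRouter` are all hypothetical names, and the in-memory dict merely stands in for a real btree or log-structured engine.

```python
from abc import ABC, abstractmethod
from bisect import bisect_right
from typing import Dict, List, Optional


class StorageEngine(ABC):
    """Hypothetical minimal interface a pluggable engine would implement."""

    @abstractmethod
    def get(self, key: bytes) -> Optional[bytes]: ...

    @abstractmethod
    def set(self, key: bytes, value: bytes) -> None: ...


class DictEngine(StorageEngine):
    """In-memory stand-in for a real engine (btree, log-structured, ...)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[bytes, bytes] = {}

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def set(self, key: bytes, value: bytes) -> None:
        self._data[key] = value


class RangeRouter:
    """Routes each key to the engine that owns its key range."""

    def __init__(self, default: StorageEngine) -> None:
        # Sorted range start keys; b"" sorts before every other key.
        self._starts: List[bytes] = [b""]
        self._engines: List[StorageEngine] = [default]

    def assign(self, start: bytes, engine: StorageEngine) -> None:
        """Use `engine` for keys >= start, up to the next assigned boundary."""
        i = bisect_right(self._starts, start)
        if self._starts[i - 1] == start:
            self._engines[i - 1] = engine
        else:
            self._starts.insert(i, start)
            self._engines.insert(i, engine)

    def engine_for(self, key: bytes) -> StorageEngine:
        return self._engines[bisect_right(self._starts, key) - 1]

    def get(self, key: bytes) -> Optional[bytes]:
        return self.engine_for(key).get(key)

    def set(self, key: bytes, value: bytes) -> None:
        self.engine_for(key).set(key, value)


router = RangeRouter(DictEngine("btree-ssd"))
router.assign(b"m", DictEngine("log-structured"))
router.set(b"apple", b"1")   # lands in the btree-ssd engine
router.set(b"zebra", b"2")   # lands in the log-structured engine
```

The point of the design is that callers only see `get` and `set`; which engine backs a given range (or a given replica) becomes a placement decision rather than an API change.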
It looks to me like Apple has made a pretty complete release of the key/value store. What's missing is
(1) Layers! Everything from relational databases to full text search engines to message queues
(2) Monitoring stuff. Unsurprisingly it doesn't look like we have the tools for monitoring log files, etc. Wavefront (also a major user!) is a great commercial solution, but there should be something OSS
Truthfully, at Wavefront we've taken the JSON status directly into Telegraf, plus a bunch of Python tooling to massage additional telemetry on a cluster's health (coordinator reachability, for example).
Plus even more tooling (mostly Ansible) for managing large fleets.
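For anyone wondering what that kind of massaging might look like, here is a minimal Python sketch that flattens a JSON status document into dotted metric names and counts unreachable coordinators. The `SAMPLE` document and its field names are made up for illustration and are not the real status schema; this is also not Wavefront's actual tooling.

```python
import json
from typing import Any, Iterator, Tuple, Union

Number = Union[int, float]


def flatten(doc: Any, prefix: str = "") -> Iterator[Tuple[str, Number]]:
    """Yield (dotted_path, number) for every numeric or boolean leaf."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else str(key))
    elif isinstance(doc, list):
        for index, value in enumerate(doc):
            yield from flatten(value, f"{prefix}.{index}" if prefix else str(index))
    elif isinstance(doc, bool):
        # Check bool before int: bool is a subclass of int in Python.
        yield prefix, int(doc)
    elif isinstance(doc, (int, float)):
        yield prefix, doc
    # Strings and nulls are skipped: they are not numeric metrics.


# Hypothetical status document; the shape is an assumption, not the real schema.
SAMPLE = """
{"cluster": {"coordinators": [{"address": "10.0.0.1:4500", "reachable": true},
                              {"address": "10.0.0.2:4500", "reachable": false}],
             "latency_ms": 1.5}}
"""

metrics = dict(flatten(json.loads(SAMPLE)))
unreachable = sum(
    1 for name, value in metrics.items()
    if name.endswith(".reachable") and value == 0
)
print(metrics)
print("unreachable coordinators:", unreachable)
```

Flat numeric name/value pairs like these are easy to hand to most metrics collectors, and a derived count such as `unreachable` is the sort of thing worth alerting on.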
I did in my previous job. It was a good experience, except around the time we started seeing corrupt LDTs (Large Data Types). I think a proper solution was never found, and they eventually deprecated the feature. I'd recommend it IF you really think you can model your business on top of the KV interface.