Hacker News

> Reading data, at even a GB/second from disk (which is currently not possible) is going to mean a second spent of a GB of data, just to read, let alone deserialize.

If that's just saying that startup time can be an issue, I agree. There are a variety of techniques to mitigate it, though. The simplest is to compress snapshots and/or put them on RAID to boost effective read speed. The most involved is to run mirrored servers and restart only the one not currently in use.
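A minimal sketch of the compressed-snapshot idea, in Python for illustration (the function names and pickle/gzip choices are mine, not from the comment):

```python
import gzip
import pickle


def save_snapshot(state, path):
    """Serialize the whole in-memory state and gzip it.

    Compressing trades a little CPU on save/load for much less
    disk I/O, which shortens restart time when reads are the
    bottleneck.
    """
    with gzip.open(path, "wb") as f:
        pickle.dump(state, f)


def load_snapshot(path):
    """Decompress and deserialize a snapshot back into memory."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)
```

On real data, which is often repetitive, the compressed snapshot can be a fraction of the raw size, so the disk reads fewer bytes per GB of in-memory state.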

> I'm talking about having more data than fits in an reasonable amount of RAM (say 1TB).

For something where you need transactions across all of that? Then this architecture is probably not a reasonable approach; the basic precondition is that everything fits in RAM. Sharding is certainly possible, though, if you can split your data into domains that don't need consistent transactions across them.

> So it's going to at the speed of the disk.

Sort of.

Because mutations are just appended to a log, they go at the speed of streaming writes, which is very fast on modern disks. And there are further techniques for speeding that up, so I'm not aware of a NoDB system for which write speed is the major problem.
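A toy version of that append-only command log, as a sketch (the class name and JSON-lines format are illustrative assumptions):

```python
import json
import os


class CommandLog:
    """Append-only mutation log: one JSON-encoded command per line.

    Appends are purely sequential I/O, so they run at the disk's
    streaming-write speed instead of being limited by seeks, which
    is why writes in this architecture are fast.
    """

    def __init__(self, path):
        # a+b: append-only writes, but still readable for replay
        self.f = open(path, "a+b")

    def append(self, command):
        self.f.write(json.dumps(command).encode() + b"\n")
        self.f.flush()
        os.fsync(self.f.fileno())  # durable on disk before we acknowledge

    def replay(self):
        """Yield every logged command in order, oldest first."""
        self.f.seek(0)
        for line in self.f:
            yield json.loads(line)
```

Batching several commands per fsync is one of the standard tricks for speeding this up further, at the cost of a small durability window.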

Regardless, write performance is a lot better than that of an SQL database on the same hardware.

> My other question is how much of a pain in the ass is it to debug such a system?

It seemed fine. A big upside is that you have a full log of every change, so there's no more wondering "how did X get like Y?"; if you want to know, you just replay the log until you see it change.
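The replay-until-it-changes debugging step might look like this sketch (the helper name, command shape, and predicate are all hypothetical):

```python
def find_culprit(commands, apply_command, state, predicate):
    """Replay logged commands against a fresh state and return the
    index and command after which predicate(state) first becomes
    true -- i.e. the mutation that made "X get like Y".
    Returns None if the predicate never fires.
    """
    for i, cmd in enumerate(commands):
        apply_command(state, cmd)
        if predicate(state):
            return i, cmd
    return None
```

Because the log is the complete history, this search is deterministic: replaying the same prefix always reproduces the same state.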

Last I did this we used BeanShell to let us rummage through the running system. It was basically like the Rails console.


