
Sure, for aggregating many small databases this makes sense. Disaggregation of storage and compute has been popular at many points in database history, and the tradeoffs are well understood. Some databases are not particularly performance sensitive, and disaggregation eliminates the resource waste of overhang when a database only needs a fraction of a server. For single large databases that span many servers, the economics tend to follow the size of the cluster, which can be significantly smaller with direct-attached storage. Scaling storage independently of compute also introduces some interesting operational edge cases, because the two are intrinsically coupled to some extent: scaling one creates resource pressure and bottlenecks elsewhere.
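A rough back-of-envelope, with made-up numbers, of where that coupling bites: in a disaggregated design, growing storage without growing compute dilutes the scan bandwidth available per TB, since every byte still has to cross the compute tier's NICs.

    # hypothetical numbers, just to show the shape of the problem
    network_gbps_per_compute_node = 100   # assumed NIC bandwidth per compute node
    compute_nodes = 10
    storage_tb = 500                      # scaled independently of compute

    aggregate_scan_gbps = compute_nodes * network_gbps_per_compute_node
    print(aggregate_scan_gbps / storage_tb)   # 2.0 Gbps per TB of data

    # double the storage without touching compute and per-TB bandwidth halves
    storage_tb *= 2
    print(aggregate_scan_gbps / storage_tb)   # 1.0 Gbps per TB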

An alternative model to disaggregation that can produce a similar dynamic balancing of compute to storage without sacrificing bandwidth in scale-out systems is to use a dynamic mix of heterogeneous nodes. The storage is still tightly coupled to the compute but the ratio of compute to storage of the aggregate system can be quickly adjusted on the fly by adjusting the mix of server types. I haven’t seen a lot of work on heterogeneous cluster architectures in a long time — it was impractical when database clusters were capex — but it is a proven model albeit complex to implement, and easier to deploy today thanks to the cloud.
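To make the balancing concrete, here is a minimal sketch of the arithmetic, assuming two hypothetical node shapes: given a compute-heavy type and a storage-heavy type, solve a 2x2 linear system for the node counts that hit a target aggregate compute and storage.

    # each node shape is (vcpu, tb); returns fractional node counts,
    # which you would round up in practice
    def mix_for_target(compute_heavy, storage_heavy, target_vcpu, target_tb):
        c1, s1 = compute_heavy
        c2, s2 = storage_heavy
        det = c1 * s2 - c2 * s1
        if det == 0:
            raise ValueError("shapes are proportional; the ratio is fixed")
        n1 = (target_vcpu * s2 - target_tb * c2) / det
        n2 = (target_tb * c1 - target_vcpu * s1) / det
        return n1, n2

    # e.g. 64 vCPU / 2 TB vs 16 vCPU / 20 TB nodes (made-up shapes)
    print(mix_for_target((64, 2), (16, 20), target_vcpu=1024, target_tb=400))
    # -> (~11.3, ~18.9)

The point is that the aggregate ratio becomes a fleet-level knob even though each node's ratio is fixed, and in the cloud you can converge on a new mix incrementally as nodes are replaced.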

Many of the emerging data problems at the edge look like heterogeneous cluster architecture problems too if you squint, so it likely has value beyond the data center. The edge has a lot of surprisingly-shaped problems that don’t fit any of our current tools e.g. some of the most extreme scale-up database problems I’ve seen in any context.


