
> Well it turns 100k inserts into tens of millions.

I don't think "number of inserts" is the right metric, because the total number of bytes written is almost the same; it's just written in different areas, and still mostly contiguously. I think "number of distinct tables being written" is a better metric. Assuming all 400 columns become 400 records in the data-type tables, say evenly divided among the 4 most common data types (int, decimal, text, date), that would be more like (4 or 5) tables x 100k inserts = 400k-500k per-table writes. I would still hesitate to compare it this naively without a benchmark, though, because with 4 or 5 tables being written there's also less contention than when everyone is writing to 1 table.
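
To make the layout concrete, here's a rough sketch of the per-type field tables I have in mind, using SQLite through Python. The entity/field_* names, the 4 types, and the batching are placeholders for illustration, not a prescription for any particular database:

    import sqlite3

    # Rough sketch only: one "entity" table plus one value table per data type.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entity     (id INTEGER PRIMARY KEY);
        CREATE TABLE field_int  (entity_id INTEGER, name TEXT, value INTEGER);
        CREATE TABLE field_real (entity_id INTEGER, name TEXT, value REAL);
        CREATE TABLE field_text (entity_id INTEGER, name TEXT, value TEXT);
        CREATE TABLE field_date (entity_id INTEGER, name TEXT, value TEXT);  -- ISO-8601 strings
    """)

    def insert_entity(conn, fields):
        """Insert one logical row: its columns become rows in the typed
        field tables, batched with one executemany per table."""
        eid = conn.execute("INSERT INTO entity DEFAULT VALUES").lastrowid
        by_table = {}
        for name, (table, value) in fields.items():
            by_table.setdefault(table, []).append((eid, name, value))
        for table, rows in by_table.items():
            # Table names come from a fixed set above, not user input.
            conn.executemany(
                f"INSERT INTO {table} (entity_id, name, value) VALUES (?, ?, ?)", rows)
        return eid

    # A toy "row" with one field of each type (a real one would have ~400).
    insert_entity(conn, {
        "age":        ("field_int",  42),
        "balance":    ("field_real", 13.37),
        "nickname":   ("field_text", "zaphod"),
        "created_on": ("field_date", "2024-01-01"),
    })
    conn.commit()

The point is that each logical insert becomes one batched statement per typed table rather than 400 independent inserts.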

Regarding indexes, you can index the field tables, but obviously such an index covers the whole field table rather than one logical column. Without more understanding of your domain I can't really say whether this breaks down. There's also the possibility that this one table is serving two competing goals: e.g. perhaps the user should be able to add data incrementally (so this pattern applies), but then at some point, when all of the data is filled in, it should be migrated to an actual table with all non-null columns that can be indexed and queried/processed as you usually do.
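
Continuing the same sketch for both points (an index on a field table necessarily spans every attribute stored in it, and completed data can be pivoted into a plain wide table); the person/age/nickname names here are made up:

    # Continuing the sketch above: index the field tables, then materialize a
    # conventional wide table once an entity's data is complete.
    conn.executescript("""
        -- Selectivity comes from (name, value) together, since one field table
        -- holds every attribute of that type, not a single logical column.
        CREATE INDEX idx_field_int_nv  ON field_int  (name, value);
        CREATE INDEX idx_field_text_nv ON field_text (name, value);

        -- Hypothetical migration target: an ordinary table with real columns
        -- that can be indexed and queried as usual.
        CREATE TABLE person (id INTEGER PRIMARY KEY, age INTEGER, nickname TEXT);
        INSERT INTO person (id, age, nickname)
        SELECT e.id,
               (SELECT value FROM field_int  WHERE entity_id = e.id AND name = 'age'),
               (SELECT value FROM field_text WHERE entity_id = e.id AND name = 'nickname')
        FROM entity e;
    """)
    conn.commit()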

In any case, what I've sketched out isn't really new; it's been used at least since the 90s under "Dynamic Object Model" and "Entity-Attribute-Value" [1,2], so if people were using it on 90s hardware, I can't imagine the pattern would be unusable on modern hardware.

[1] https://www.cs.sjsu.edu/~pearce/oom/patterns/new/Riehle.pdf

[2] https://softwarepatternslexicon.com/patterns-sql/4/5/


