
Utter insanity. So much cost and complexity, and for what? Startups don’t think about costs or runway anymore, all they care about is “modern infrastructure”.

The argument for RDS seems to be “we can’t automate backups”. What on earth?




Is spending time to make it reliable worth it vs working on your actual product? Databases are THE most critical things your company has.


I see this argument a lot. Then most startups use that time to create rushed half-assed features instead of spending a week on their db that'll end up saving hundreds of thousands of dollars. Forever.

For me that's short-sighted.


Startups are in the job of earning millions. If they can spend $100k on a managed DB now and just spend every braincell on getting their product right, it's a win for their investors.


That mentality is working wonders right now, isn't it?

Hundreds of dead startups, because after all that unnecessary spending they still have unnecessarily buggy software, sold to other startups that, when push comes to shove, will cut their spending on those same startups offering half-baked, buggy products.

What you say is definitely what they preach. But I don't agree, and I don't see it as good logic.

Way too many startup founders decide to build shitty products with short-sighted solutions like these, following whatever is trendy (crypto, AI, etc.) because investors advise them to. Guess what: investors don't care about creating a good business. They want a unicorn. So they advise founders to make all-or-nothing moves, knowing it will most likely kill the startup.

It's definitely "a strategy". But I think it's short-sighted as hell.


That's the whole point of startups and VC: not spending money on safe investments that provide a 10% return, but spending large amounts of money on risky investments that provide, when averaged together (rare unicorns on top of a mound of dead startups), a 20% return. Both numbers completely arbitrary, of course.


Isn't this only the latest definition of "startup"? What exactly defines a startup? Is it growth at all costs? Is it reckless spending? Complete focus on short-term gains? Something else?

These investment rounds are only there to provide money to start. Unless the founders sign away the control of their company to investors, they are still at the helm of the ship. They can choose how they will approach growth.

We can see how the entire scene is gearing towards profitability now that the money dried up, so this growth focus is no longer the only game in town.

You can still take investments and accelerate growth without having to recklessly go all-in. But I've never taken VC money. Maybe that's baked into the contracts?


All that infra doesn’t integrate itself. Everywhere I’ve worked that had this kind of stack employed at least one DevOps person, if not a whole team, to maintain it all, full time, year round. Automating a database backup and testing that it works takes half a day unless you’re doing something weird.


Setting up a multi-AZ DB with automatic failover, incremental backups and PITR, automated runbooks, and monitoring for all of that takes a lot more than half a day, even with RDS.


No, but again, that sounds like a lot of complexity your average startup does not need. Multi-AZ? Why?


Because their enterprise client requires it in their due diligence paperwork.


Which makes little sense anyway as in practice the real problems you have are from region/connectivity issues, not AZ failures.


A startup sized company using this many tools? They're for sure doing something weird (and that's not a compliment :) )

Totally on your side with this one - but alas, people associate value with complexity.


> Automating a database backup and testing it works takes half a day unless you’re doing something weird

True story bro

I'm sure that's possible if you're storing the backup on the same server you're restoring on and everything is on top-of-the-line NVMe storage. Otherwise your backup has only just started running and will need another few days to finish. And that's only if you're running a single master.

You're massively underestimating the challenge to get that kind of automation done in a stable manner - and the maintenance required to keep it working over the years.


I’ve implemented such a process for companies multiple times, bro. I know what I’m talking about.


And that's the problem. "It's easy for me because I've done it a dozen times so it's easy for everyone" is a very common fallacy.


This is an oversimplification, but! Dumping Postgres to a file is one command. Adding an scp of the file to a different server makes it two commands. (Granted, you need to set up SSH keys there too.) I have implemented backups this way.

With SQLite you only need the scp part.

You can even push your backup file to an S3 bucket... with one command!

Honestly, this argument mystifies me.

Of course you can make it as complicated as you want to, too. I've also worked on replicating anonymized data from a production OLTP database to a data warehouse. That's a lot more work.
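
For illustration, the dump, copy, and upload steps described above might be scripted roughly like this. It is only a minimal sketch: the database, host, directory, and bucket names are hypothetical, and it assumes `pg_dump`, `scp`, and the AWS CLI are installed and already authenticated. The function names are made up for this example.

```python
import subprocess
from datetime import datetime, timezone


def backup_commands(db_name, remote_host, remote_dir, bucket, now=None):
    """Return (dump_file, commands) for a dump, an off-host copy, and an S3 upload."""
    now = now or datetime.now(timezone.utc)
    dump_file = f"{db_name}-{now:%Y%m%dT%H%M%SZ}.dump"
    commands = [
        # Custom-format dump, which pg_restore can read back later.
        ["pg_dump", "--format=custom", f"--file={dump_file}", db_name],
        # Copy to a second server; assumes SSH keys are already in place.
        ["scp", dump_file, f"{remote_host}:{remote_dir}/"],
        # Push to S3; assumes the AWS CLI is configured with credentials.
        ["aws", "s3", "cp", dump_file, f"s3://{bucket}/{dump_file}"],
    ]
    return dump_file, commands


def run_backup(commands, dry_run=True):
    """Run each command in order; check=True aborts at the first failure."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

With `dry_run=True` the script only prints the three commands, so it can be tried without touching a real database. Nothing here verifies that the resulting dump can actually be restored.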


And that works right until you get to publish an incident report like this:

https://about.gitlab.com/blog/2017/02/01/gitlab-dot-com-data...


> Our backups to S3 apparently don’t work either: the bucket is empty

It took them a data loss incident to find this out? This is just one of the many red flags mentioned in the article. IMO this incident isn't about relying on cloud backups vs. managing them yourself.


Yeah. Testing that your backup works can be almost as much work as setting the thing up in the first place, too, but you do need to do it.
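
As a rough idea of what "testing the backup" could involve, here is a minimal sketch with two parts: a cheap sanity check that a backup file exists, is non-empty, and is recent, and the commands for a trial restore into a throwaway database. The thresholds, file names, and scratch database name are assumptions for illustration, and the function names are made up; a real check would also query the restored data.

```python
import os
import time


def backup_problems(path, max_age_seconds=86400, min_size_bytes=1, now=None):
    """Return a list of problems with a backup file; an empty list means it passed."""
    if not os.path.exists(path):
        return [f"{path}: missing"]
    now = time.time() if now is None else now
    info = os.stat(path)
    problems = []
    if info.st_size < min_size_bytes:
        problems.append(f"{path}: only {info.st_size} bytes")
    if now - info.st_mtime > max_age_seconds:
        problems.append(f"{path}: older than {max_age_seconds} seconds")
    return problems


def restore_check_commands(dump_file, scratch_db):
    """Commands for a trial restore into a throwaway database (hypothetical name)."""
    return [
        ["createdb", scratch_db],
        # Stop at the first error instead of restoring a partial database.
        ["pg_restore", "--exit-on-error", f"--dbname={scratch_db}", dump_file],
        ["dropdb", scratch_db],
    ]
```

The file check would catch the "bucket is empty" class of failure once the backup is pulled back down; only the trial restore shows whether the dump is usable.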


What happened to having people trained by external trainers for what you need? That’s much cheaper than having everything externally “managed” and still having to integrate all of it. The number of services listed in TFA is just ridiculous.


I've done it before, too. For a toy project, it's easy, as you said. It's not once you're at scale. It's hilarious that people are downvoting my comment. I guess there are a lot of juniors suffering from the Dunning-Kruger effect around right now.


I worked at a place with its own colo where they ran several multi TB MySQL database servers. We did weekly backups and it could take days. Our backups were stored on external USB disks. The I/O performance was abysmal. Taking a filesystem snapshot and copying it to USB could take days. The disks would occasionally lock up and someone would have to power cycle them. Total clown show.

I would rather pay for RDS. Databases are the one thing you don't want to screw up.


So investing in a critical part of my business is the bad thing to do?


> The argument for RDS seems to be “we can’t automate backups”. What on earth?

I can automate backups, and I'm extremely happy that, for some extra cost in RDS, I don't have to.

Also, at some size automating the database backup becomes non-trivial. I mean, I can manage a replica (which needs to be updated at specific times after the writer), then regularly stop replication for a snapshot, which is then encrypted, shipped to storage, then manage the lifecycle of that storage, then set up monitoring for all of that, then... Or I can set one parameter on the Aurora cluster and have all of that happen automatically.
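
To give a sense of just one of those steps, here is a minimal sketch of the storage lifecycle piece: deciding which old backups to delete. The retention policy (keep anything from the last `keep_days` days, and always keep the `keep_min` newest) is an assumption for illustration, not something from the comment, and the function name is made up. Replication, snapshotting, encryption, shipping, and monitoring are all left out.

```python
from datetime import datetime, timedelta


def expired_backups(backups, now, keep_days=7, keep_min=3):
    """Return the names of backups to delete, newest first.

    backups maps a backup name to its creation datetime. Anything newer
    than keep_days is kept, and the keep_min newest are always kept.
    """
    newest_first = sorted(backups, key=backups.get, reverse=True)
    cutoff = now - timedelta(days=keep_days)
    protected = set(newest_first[:keep_min])
    return [
        name
        for name in newest_first
        if name not in protected and backups[name] < cutoff
    ]
```

Always keeping the newest few means a stalled backup job cannot cause the policy to delete the last good copies.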


The argument for RDS (and other services along those lines) is "we can't do it as well, for less".

And, when factoring in all costs and considering all things the service takes care of, it seems like a reasonable assumption that in a free market a team that specializes in optimizing this entire operation will sell you a db service at a better net rate than you would be able to achieve on your own.

Which might still turn out to be false, but I don't think it's obvious why.


I agree, but I'm also not entirely sure how much of this is avoidable. Even the simplest web applications are full of what feels like needless complexity, but I think a lot of it is actually surprisingly essential. That said, there is definitely a huge amount of "I'm using this because I'm told that we should" over "I'm using this because we actually need it".


As the famous quote goes, "If I'd had more time, I would've written a shorter letter".


Also does primary / secondary global clusters with automated failover. Saves a ton of time not to manage that manually


Everyone who says they can run a database better than Amazon is probably lying or has a story about how they had to miss a family event because of an outage.

The point isn’t that you can’t do it, the point is that it’s less work for extremely high standards. It is not easy to configure multi-region failover without an entire network team and database team unless you don’t give a shit about it actually working. Oh yeah, and wait until you see how much SOC2 costs if you roll your own database.


One doesn’t necessarily need to run a DB better than Amazon. Just well enough for the product/service you’re working on. And depending on the specifics it may cost much less (but your mileage may vary).


There are other providers with better value for service within AWS or GCP, like Crunchy.



