RDS pricing is deranged at the scales I've seen too.
$60k/year for something I could run on just a slice of one of my on-prem $20k servers. This is something we would have run 10s of. $600k/year operational against sub-$100k capital cost pays DBAs, backups, etc with money to spare.
Sure, maybe if you are some sort of SaaS with a need for a small single DB, that also needs to be resilient, backed up, rock solid bulletproof.. it makes sense? But how many cases are there of this? If it's so fundamental to your product and needs such uptime & redundancy, what are the odds it's also reasonably small?
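To make the arithmetic above concrete, here is a rough sketch of the comparison. The $60k/year per database, the roughly ten databases, and the sub-$100k capital cost come from the comment; the five-year hardware write-off period is purely an assumption for illustration, and the function names are made up.

```python
# Figures quoted in the comment above.
RDS_COST_PER_DB_PER_YEAR = 60_000
DATABASE_COUNT = 10
ONPREM_CAPITAL = 100_000  # upper bound, the comment says "sub-$100k"

# Assumption (not from the thread): hardware is written off over 5 years.
AMORTIZATION_YEARS = 5


def annual_rds_cost(db_count: int, cost_per_db: int) -> int:
    """Yearly RDS spend for db_count databases at cost_per_db each."""
    return db_count * cost_per_db


def annual_staff_budget(rds_annual: int, capital: int, years: int) -> float:
    """Money left per year for DBAs, backups, etc. if you spent the RDS
    budget on-prem instead, after amortizing the hardware purchase."""
    return rds_annual - capital / years


if __name__ == "__main__":
    rds = annual_rds_cost(DATABASE_COUNT, RDS_COST_PER_DB_PER_YEAR)
    left = annual_staff_budget(rds, ONPREM_CAPITAL, AMORTIZATION_YEARS)
    print(f"RDS per year: ${rds:,}")
    print(f"Left over on-prem per year: ${left:,.0f}")
```

This leaves out colocation, power, and network costs, which the comment does not quantify, so treat the leftover figure as an upper bound.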
> Sure, maybe if you are some sort of SaaS with a need for a small single DB, that also needs to be resilient, backed up, rock solid bulletproof.. it makes sense? But how many cases are there of this?
Most software startups these days? The blog post is about work done at a startup after all. By the time your db is big enough to cost an unreasonable amount on RDS, you’re likely a big enough team to have options. If you’re a small startup, saving a couple hundred bucks a month by self managing your database is rarely a good choice. There’re more valuable things to work on.
> By the time your db is big enough to cost an unreasonable amount on RDS, you’re likely a big enough team to have options.
By the time your db is big enough to cost an unreasonable amount on RDS, you've likely got so much momentum that getting off is nearly impossible as you bleed cash.
You can buy a used server and find colocation space and still be pennies on the dollar for even the smallest database. If you're doing more than prototyping, you're probably wasting money.
In the small SaaS startup case, I’d say the production database is typically the most critical single piece of infra, so self hosting is just not a compelling proposition unless you have a strong technical reason where having super powerful database hardware is important, or a team with multiple people who have sysadmin or DBA experience. I think both of those cases are unusual.
I’ve been the guy managing a critical self-hosted database in a small team, and it’s such a distraction from focusing on the actual core product.
To me, the cost of RDS covers tons of risks and time sinks: having to document the db server setup so I’m not the only one on the team who actually knows how to operate it, setting up monitoring, foolproof backups so I don’t need to worry that they’re silently failing because a volume is full and I misconfigured the monitoring, PITR for when someone ships a bad migration, one click HA so the database itself is very unlikely to wake me at 3am, blue/green deploys to make major version upgrades totally painless, never having to think about hardware failures or borked dist-upgrades, and so on.
Each of those is ultimately either undifferentiated work to develop in-house RDS features that could have been better spent on product, or a risk of significant data loss, downtime, or firefighting. RDS looks like a pretty good deal, up to a point.
I like fiddling with databases, but I totally agree with this. Unless you really need a big database and are going to save 100k+ per year by going self managed then RDS or similar just saves you so much stress. We've been using it for the best part of 10 years and uptime and latency have consistently been excellent, and functionality is all rock solid. I never have to think about it, which is just what I want from something so core to the business.
I am good at databases (have been a DBA in the past), and 100% agree with this. With RDS it's easy to stand up and get all the things you mentioned, and then not have to think about it again. If we grow to the point where the overhead is more than a FT DBA, awesome. It means we are successful, and are fortunate to have options.
Unfortunately there are so many people and teams who think that simply running their databases on RDS means that they're backed up, highly available, and can be easily load balanced, upgraded, partitioned, migrated and so on, which is simply not the case with the basic configuration.
RDS is a great choice for prototyping, and for production only if you know what you're doing when setting it up.
FWIW, this is common in all cloud deployments: people assume that running something "serverless" is a magical silver bullet.
Well…just using the defaults when creating an RDS Postgres in the console gives you an HA cluster with two read replicas, 7 days of backups restorable to any point in time, automatic minor version upgrades, and very easy major upgrades. So unless you start actively unchecking stuff those are not entirely invalid assumptions.
I agree, but I also classify some of these as "learn them once and you're all set".
Maybe it takes you a month the first time around and a week the 10th time around. First product suffers, the other products not so much. Now it just takes a week of your time and does not require you to pay large AWS fees, which means you are not bleeding money.
I like to set up scrappy products that do not rack up large monthly fees. This means I can let them run unprofitable for longer and I don't have to seek an investor early, which would light up a large fire under everyone's butts and start influencing timelines because now they have the money and want a return asap.
I'll launch a week later - no biggie usually. I could have come up with the idea a month later, so I'm still 3 weeks early ;)
It doesn't work for all projects, obviously, but I've seen plenty of SaaS start out with a shopping spree, then pay monthly fees and purchase licenses for stuff that they could have set up for free if they put some (usually not a lot) effort into it. When times get rough, the shorter runway becomes a hard fact of life. Maybe they wouldn't have needed a VC and could have bootstrapped and also survived for longer.
Learning it all is what gave me an appreciation for RDS! I’ve self managed a number of Postgres and MySQL databases, including a 10TB Postgres cluster with all of the HA and backup niceties.
While I generally agree as far as initial setup time goes, I favor RDS because I can forget about it, whereas the hand rolled version demands ongoing maintenance, and incurs a nonzero chance of simple mistakes that, if made, could result in a 100% dataloss unrecoverable scenario.
I’m also mostly talking about typical, funded startups here, as opposed to indie/solo devs. If you’re flying solo launching a tiny proof of concept that may only ever have a few users, by all means run it yourself if you’d like, but if you’ve raised money to grow faster and are paying employees to iterate rapidly searching for PMF…just pay for RDS and make sure as much time as possible is spent on product features that provide actual business value. It starts at like $15/month. The cost of simply not being laser-focused on product is far greater.
> you've likely got so much momentum that getting off is nearly impossible as you bleed cash.
Databases are not particularly difficult to migrate between machines. Of all the cloud services to migrate, they might actually be the easiest, since the databases don't have different APIs that need to be rewritten for, and database replication is a well-established thing.
Getting off is quite the opposite of nearly impossible.
That’s just another way of saying the opportunity cost isn’t worth paying to do the migration.
Optionality and flexibility are extremely valuable, and that is why cloud compute continues to be popular, especially for rapidly/burstily growing businesses like startups.
I don't mean to pick on your specific comments, but I find these analyses almost always lack a crucial perspective: level of knowledge. This is the single biggest factor, and it's the hardest one to be honest about. No one wants to say "RDS is a good choice . . . because I don't know how nor have I ever self managed a database."
If you want a different opportunity cost, get people with different experience. If RDS is objectively expensive, objectively slow, but subjectively easy, change the subject.
> No one wants to say "RDS is a good choice . . . because I don't know how nor have I ever self managed a database."
I don't think that's accurate. I've self-managed databases, and I still think that RDS is compelling for small engineering teams.
There's a lot to get right when managing a database, and it's easy to screw something up. Perhaps none of the individual parts are super-complicated, but the cost of failure is high. Outsourcing that cost to AWS is pretty compelling.
At a certain team size, you'll end up with a section of the team that's dedicated to these sorts of careful processes. But the first place these issues come up is with the database, and if you can put off that bit of organizational scaling until later, then that's a great path to choose.
I disagree here. This falls apart when you zoom out one step. I'm perfectly capable of managing a database. I'm also capable of maintaining load balancers, redis, container orchestrators, Jenkins, perforce, grafana, Loki, Oncall, individually. But each of those has the high chance of being a distraction from what our software actually does.
It's about tradeoffs, and some tradeoffs are often more applicable than others. Getting a ping at 7am on a Sunday because your EC2 instance filled its drive up with logs, and your log rotation script failed because it didn't have a long enough retry, is a problem I'm happy to outsource when I should be focusing on the actual app.
Lack of expertise in some particular technology is simply another opportunity cost. I can learn how to operate a production DB at scale (I have racked servers and run other production workloads) but as cofounder/CTO in a startup is that the best use of my time?
If the cost of a hosted DB is going to sink the company, then of course, I will figure it out and run it myself. But it’s not, for most startups. And therefore that knowledge isn’t providing much leverage.
Starting an AI company with deep expertise in training models - that is an example of knowledge providing huge leverage. DB tech is not in this bucket for most businesses.
People do not really understand the value of the former. Even dealing with financial options (buy/sell and underlying) which are a pure form of it, people either do not understand the value, or do so in a very abstract way they do not intuit.
Good point. And, since you brought up financials, you also see this when people use a majority of their savings to lump sum pay off a mortgage. They take an overweighted view of saving on interest and, IMO, underweight the flexibility of liquidity.
I have a small MySQL database that’s rather important, and RDS was a complete failure.
It would have cost a negligible amount. But the sheer amount of time I wasted before I gave up was honestly quite surprising. Let’s see:
- I wanted one simple extension. I could have compromised on this, but getting it to work on RDS was a nonstarter.
- I wanted RDS to _import the data_. Nope, RDS isn’t “SUPER,” so it rejects a bunch of stuff that mysqldump emits. Hacking around it with sed was not confidence-inspiring.
- The database uses GTIDs and needed to maintain replication to a non-AWS system. RDS nominally supports GTID, but the documented way to enable it at import time strongly suggests that whoever wrote the docs doesn’t actually understand the purpose of GTID, and it wasn’t clear that RDS could do it right. At least Azure’s docs suggested that I could have written code to target some strange APIs to program the thing correctly.
Time wasted: a surprising number of hours. I’d rather give someone a bit of money to manage the thing, but it’s still on a combination of plain cloud servers and bare metal. Oh well.
> Sure, maybe if you are some sort of SaaS with a need for a small single DB, that also needs to be resilient, backed up, rock solid bulletproof.. it makes sense? But how many cases are there of this?
Very small businesses with phone apps or web apps are often using it. There are cheaper options of course, but when there is no "prem" and there are 1-5 employees then it doesn't make much sense to hire for infra. You outsource all digital work to an agency who sets you up a cloud account so you have ownership, but they do all software dev and infra work.
> If it's so fundamental to your product and needs such uptime & redundancy, what are the odds it's also reasonably small?
Small businesses again. Some of my clients could probably run off a Pentium 4 from 2008, but due to the nature of the org and agency engagement it often needs to live in the cloud somewhere.
I am constantly beating the drum to reduce costs and use as little infra as needed though, so in a sense I agree, but the engagement is what it is.
Additionally, everyone wants to believe they will need to hyperscale, so even medium scale businesses over-provision and some agencies are happy to do that for them as they profit off the margin.
A lot of my clients are small businesses in that range or bigger.
AWS and the like are rarely a cost effective option, but it is something a lot of agencies like, largely because they are not paying the bills. The clients do not usually care because they are comfortable with a known brand and the costs are a small proportion of the overall costs.
A real small business will be fine just using a VPS provider or a rented server. This solves the problem of not having on premise hardware. They can then run everything on a single server, which is a lot simpler to set up, and a lot simpler to secure. That means the cost of paying someone to run it is a lot lower too as they are needed only occasionally.
They rarely need very resilient systems as the amount of money lost to downtime is relatively small - so even on AWS they are not going to be running in multiple availability zones etc.
Lots of cases. It doesn't even have to be a tiny database. Within the <1TB range there's a huge number of online companies that don't need to do more than hundreds of queries per second, but need the reliability and quick failover that RDS gives them. The $600k cost is absurd indeed, but it's not in the range of what those companies spend.
Also, Aurora gives you the block level cluster that you can't deploy on your own - it's way easier to work with than the usual replication.
Once you commit to more deeply Amazon flavored parts of AWS like Aurora, aren't you now fairly committed to hoping your scale never exceeds the cost-benefit tradeoff?
If my scale exceeds the cost benefit tradeoff, then I will thank God/Allah/Buddha/Spaghetti Monster.
These questions always sound flawed to me. It's like asking won't I regret moving to California and paying high taxes once I start making millions of dollars? Maybe? But that's an amazing problem to have and one that I may be much better equipped to solve.
If you are small, RDS is much cheaper, and many company-killing events, such as not testing your backups, are solved. If you are big and you can afford a 60K/yr RDS bill then you can make changes to move on-prem. Or you can open up Excel and do the math if your margins are meaningfully affected by moving on-prem.
Agree. "What if you're wildly successful and get huge?" Awesome, we'll solve the problem then. The other part is what if AWS was a part of becoming successful? IE, it freed my small team from having to worry all that much about a database and instead focused on features.
I assume that you do that math on all your new features too, right? The calculation of how much extra money they will bring in?
On some level, AWS/GCP/California relies on you doing this calculation for the things that you can do it on easily (the savings of moving away), while not doing this calculation on things where it's hard to do (new development). That way, you can pretend that your new features are a lot more valuable than the $Xk/year you will save by moving your infra.
> The calculation of how much extra money they will bring in?
Yes, I've done the math. The piece you are missing is, saving money on infra will bring in $0 new dollars. There is a floor to how much money I can save. There is no ceiling to how much money the right feature can bring in. Penny pinching on infra, especially when the amount of money saved is less than the cost of an engineer, is almost always a waste of time while you are growing a company. If you are at the point where you are wasting 1x, 2x, 3x an engineer's salary on superfluous infrastructure, then congratulations, you have survived the great filter for 99% of startups.
> That way, you can pretend that your new features are a lot more valuable than the $Xk/year you will save by moving your infra.
Finding product market fit is 1000x harder than moving from RDS to On-prem. If you haven't solved PMF, then no amount of $Xk/year in savings will save you from having to shut down your company.
I am well aware of the math on that. Also, switching to faster infra can be a surprising benefit to your revenue, by the way, if it makes your app feel nicer.
The thing is, most features, particularly later in the life of a company, don't have an easy-to-measure revenue impact, and I suspect that many features are actually worth $0 of revenue. However, they cost money to implement (both in engineering time and infra), making them very much net negative value propositions. This is why Facebook and Google can cut tons of staff and lose nothing off their revenue number.
Also, there's a bit of a gambling mentality here which is that a feature could be worth effectively infinite revenue (ie it could be the thing that gives you PMF), so it's always worth doing over things with known, bounded impact on your bottom line. However, improving your efficiency gives you more cracks at finding good features before you run out of money.
So moving to/from Aurora/RDS/own EC2/on-prem should be a matter of networking and changing connection strings in the clients.
Your operational requirements and processes (backup/restore, failover, DR etc) will change, but that's because you're making a deliberate decision weighing up those costs vs benefits.
You can use DNS to mitigate the pain of changing those connection strings, decoupling client change management from backend change process, or if you had foresight, not having to change client connection strings at all.
Nope, nope, nope! When you change DNS entries, they will take effect at some point in the future when the cache expires and when your app decides to reconnect. (Possibly after a restart) At that point, why not be sure and change the config?
I mean, DNS change can work, but when you're doing that one-in-years change, why risk the extra failure modes.
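The timing concern here can be illustrated with a toy simulation. This is only a sketch: `TtlResolver` and `Client` are hypothetical stand-ins (not a real DNS library or database driver), the hostname and addresses are made up, and time is passed in explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass


@dataclass
class CachedRecord:
    address: str
    expires_at: float


class TtlResolver:
    """Toy caching resolver: serves a cached answer until its TTL expires."""

    def __init__(self, authoritative: dict, ttl_seconds: float):
        self.authoritative = authoritative  # hostname -> current address
        self.ttl = ttl_seconds
        self.cache = {}

    def resolve(self, host: str, now: float) -> str:
        record = self.cache.get(host)
        if record is None or now >= record.expires_at:
            record = CachedRecord(self.authoritative[host], now + self.ttl)
            self.cache[host] = record
        return record.address


class Client:
    """Toy app client: only looks up the host when it (re)connects."""

    def __init__(self, resolver: TtlResolver, host: str):
        self.resolver = resolver
        self.host = host
        self.connected_to = None

    def connect(self, now: float) -> str:
        self.connected_to = self.resolver.resolve(self.host, now)
        return self.connected_to


def simulate_cutover() -> list:
    records = {"db.internal.example": "10.0.0.1"}  # made-up name and address
    client = Client(TtlResolver(records, ttl_seconds=300), "db.internal.example")
    seen = [client.connect(now=0)]                 # initial connection
    records["db.internal.example"] = "10.0.0.2"    # the DNS change happens
    seen.append(client.connected_to)               # open connection is unaffected
    seen.append(client.connect(now=100))           # reconnect, cache still valid
    seen.append(client.connect(now=300))           # reconnect after TTL expiry
    return seen
```

In this model the client only lands on the new address once both conditions hold: the cached record has expired and the app has reconnected. A config change plus restart collapses that into a single, observable step.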
Sure, but if you're paying anywhere near list price for your on-prem hardware at scale you're also doing it wrong. I've never seen a scenario where Amazon discounts exceed what you would get from a hardware or software vendor at the same scale.
It's more interesting how cloud services are sold like any other consumables or corporate services.
No one runs their own electricity supply (well until recently with renewables/storage), they buy it as a service, up to a pretty high scale before it becomes more economic to invest the capex and opex to run your own.
Or you're realistic about what you're doing. Will you ever need to scale more than 10x? And on the timescales where you do grow over 10x, would it be better to reconsider/re-architect everything anyway?
I mean, I'm looking after a 4 instance Aurora cluster which is great feature wise, is slightly overprovisioned for special events, and is more likely to shrink than grow 2x in the next decade. If we start experiencing any issues, there's lots of optimisations that can still be gained from better caching and that work will be cheaper than the instance size upgrade.
There’s still a defined cost to swapping your DB code over to a different backend. At the point where it becomes uneconomical, you’re also at a scale you can afford rewriting a module.
That’s why we have things like “hexagonal architecture”, which focus on isolating the storage protocol from the code. There’s an art to designing such that your prototype can scale with only minor rework — but that’s why we have senior engineers.
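As a minimal sketch of that isolation (not anyone's production design): the core logic below depends only on a small storage interface, and two interchangeable adapters implement it. `UserStore`, `change_email`, and both adapter classes are hypothetical names invented for this example; SQLite from the standard library stands in for whatever backend you actually run.

```python
import sqlite3
from typing import Optional, Protocol


class UserStore(Protocol):
    """The 'port': the only storage surface the core logic knows about."""

    def get_email(self, user_id: int) -> Optional[str]: ...
    def set_email(self, user_id: int, email: str) -> None: ...


class InMemoryUserStore:
    """Adapter for prototypes and tests."""

    def __init__(self):
        self._emails = {}

    def get_email(self, user_id: int) -> Optional[str]:
        return self._emails.get(user_id)

    def set_email(self, user_id: int, email: str) -> None:
        self._emails[user_id] = email


class SqliteUserStore:
    """Adapter backed by SQLite; swapping backends means writing another of these."""

    def __init__(self, path: str):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
        )

    def get_email(self, user_id: int) -> Optional[str]:
        row = self._conn.execute(
            "SELECT email FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None

    def set_email(self, user_id: int, email: str) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO users (id, email) VALUES (?, ?)", (user_id, email)
        )
        self._conn.commit()


def change_email(store: UserStore, user_id: int, new_email: str) -> str:
    """Core logic: normalizes and validates, with no knowledge of the backend."""
    email = new_email.strip().lower()
    if "@" not in email:
        raise ValueError(f"not an email address: {new_email!r}")
    store.set_email(user_id, email)
    return email
```

Calling `change_email(InMemoryUserStore(), 1, "a@b.example")` and `change_email(SqliteUserStore(":memory:"), 1, "a@b.example")` runs the same core code against either backend, which is the "rewrite a module" cost the comment refers to.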
RDS is not so bulletproof as advertised, and the support is first arrogant then (maybe) helpful.
People pay for RDS because they want to believe in a fairy tale that it will keep potential problems away and that it worked well for other customers. But those mythical other customers also paid based on such belief. Plus, no one wants to admit that they pay money in such an irrational way.
It's a bubble.
> $600k/year operational against sub-$100k capital cost pays DBAs, backups, etc with money to spare.
One of these is not like the others (DBAs are not capex.)
Have you ever considered that if a company can get the same result for the same price ($100K opex for RDS vs same for human DBA), it actually makes much more sense to go the route that takes the human out of the loop?
The human shows up hungover, goes crazy, gropes Stacy from HR, etc.
You'll need an engineer with database skills, not a dedicated DBA. I haven't seen a small company with a full time DBA in well over a decade. If you can learn a programming language, you can learn about indexes and basic tuning parameters (buffer pool, cache, etc.)
Not only that, you can't just have one DBA. You need a team of them, otherwise that person is going to be on call 24/7, can never take a vacation, etc. You're probably looking at a minimum of 3.