Along the same lines, if you can avoid a distributed architecture, things get a lot more reliable. You can get a crazy amount of RAM, SSD, CPU cores on a single machine. If you run your system on a powerful machine with some other ones on hot standby, a lot of complexity goes away.
If you can run your system on a single machine, you don't need an SRE.
If you have hundreds or thousands of machines, that's an indicator that you /may/ have the complexity that requires the discipline a dedicated SRE brings. The trap is conflating two things: hiring someone titled SRE to plug operations gaps, and actually adopting the best practices that will help you scale and improve reliability.
Not even Bay Area prices. A Jr. SWE after overhead (benefits, HR, laptop, office space, etc.) easily costs the company 150k+/year in most markets.
Indeed, but take it a step further. Two ten-thousand-dollar servers in your basement with a UPS and some rudimentary failover configuration are basically fire and forget. Remote in monthly and install updates. Done.
Until there's a power outage, flooding, malice, etc.
I think the main issue is that the cloud providers don't publish much about outages that don't affect the end user. I mean, hard drives fail all the time, but S3 is never affected by that.
Depends on your bandwidth requirements. Also, if you want even higher reliability, you might consider getting two independent internet links into your basement, which is pretty doable in an urban setting.
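To put rough numbers on the second-link idea, here is a minimal back-of-the-envelope sketch. It assumes the two links fail independently, which may not hold if they share a conduit, a pole, or an upstream provider, and the 99% per-link availability is a made-up figure for illustration, not a measurement.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(availabilities):
    """Availability of redundant components, ASSUMING independent failures.

    The set is down only when every component is down at the same time.
    """
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down


def expected_downtime_hours(availability):
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical per-link availability; substitute your ISP's real numbers.
single_link = 0.99
dual_link = combined_availability([single_link, single_link])

print(f"one link:  {expected_downtime_hours(single_link):.1f} h/year down")
print(f"two links: {expected_downtime_hours(dual_link):.1f} h/year down")
```

The independence assumption does most of the work here: correlated failures (a backhoe through a shared trench, a neighborhood power outage) shrink the benefit considerably.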
For a long time, my off-site backup was at my grandmother's house, because it was the furthest place where I could hand someone a box and count on them to leave it plugged into their Internet. ;)