I've been running a web application for the past 6 months and it just crossed the 150,000 page views/month mark. Sure, for others that's not that great, but for me, this is the project showing the biggest potential.
Anyway, the funny thing is, I'm running it on a $2.50/month Vultr VPS. When it crossed 30,000, I got so worried that my site would crash. But it didn't. Then as the views got higher, I optimized further. I learned more about how to make things faster. I refactored my code to squeeze that much more juice out of $2.50 - not because I'm cheap (well, perhaps), but because I wanted to see how far I could push it.
I'll probably cross 200,000 soon, and I think by then I'll need to upgrade to the (yikes, heaven forbid!) $5/month plan.
Recently, just because of all the hype about AWS (in general and also at my work) I wanted to get started with AWS. Then I looked at their ridiculous pricing calculation page and I just closed the browser. I thought my time was better spent working on my project than learning about how to deploy a simple PHP application to AWS.
I don't understand the point of your comment. The parent was simply explaining why they don't need extreme scalability with their scenario. No one is under the impression that multiplying their traffic by 75000 (so you work at Google then?) will not require a big server upgrade.
He's making a judgment on the value of AWS using a use case where it makes no sense to use AWS anyway. He could be doing something similar with Lightsail on AWS at a similar cost if he really wanted to, though.
> He's making a judgment on the value of AWS using a use case where it makes no sense to use AWS,
I think the problem here is that a lot of people use AWS when there's no value in using AWS. It's commonly reached for as a first choice when it's often not a good option if you're looking to run lean.
AWS is extremely popular, but it's probably only cost-efficient for the small percentage of companies that have wildly variable traffic patterns.
Your idea of "running lean" optimizes for the operational expenses of servers, which are across-the-board cheap, rather than the cost of developer time.
I bill between $200 and $250 an hour. (If I had a full-time job I'd be salaried around $90-$100/hour.) Paying me for fewer hours because AWS's tooling, once set up, makes life a hell of a lot easier makes a lot of sense even for fairly small companies.
AWS tooling doesn't save developer time in my experience. In fact, I'd argue that optimizing for it more often than not wastes it.
I've watched someone spend days learning to configure a performant DB server on AWS because of its poor disk I/O performance, and then spend an enormous amount on a high-RAM instance, when a simple SSD-based server with more IOPS trivially available would have worked out of the box.
Everything has a learning cost, though, so perhaps that's a somewhat unfair example, since deep knowledge of data centers and backbone providers doesn't come free either. But I can crank out an Ansible and Docker script and have pretty much all the advantages of AWS, deployed on my own hardware at a fraction of the cost. So I'm not sure it's fair to say that AWS tooling offers anything unique enough to merit its market position.
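To be concrete about the kind of script I mean (this isn't my actual setup, and I've swapped Ansible for Fabric here just to keep it in Python; the host and image names are made up):

    # Minimal deploy sketch: push a Docker image to a plain box over SSH.
    # Assumes Fabric (pip install fabric) and Docker already installed on the host.
    from fabric import Connection

    HOST = "deploy@my-own-box.example.com"       # hypothetical server
    IMAGE = "registry.example.com/myapp:latest"  # hypothetical image

    def deploy():
        c = Connection(HOST)
        c.run(f"docker pull {IMAGE}")
        # Stop/remove any old container; warn=True keeps a missing container from aborting the run.
        c.run("docker stop myapp", warn=True)
        c.run("docker rm myapp", warn=True)
        c.run(f"docker run -d --name myapp -p 80:8080 --restart unless-stopped {IMAGE}")

    if __name__ == "__main__":
        deploy()

That's the whole "platform". Ansible buys you idempotence and inventory management on top, but the point stands: it isn't much code.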
So I'm not trying to big-time you, but I have experience across a wide range of environments and shop sizes (both in clouds and, unfortunately, at shops that bought the "VPSes are fine too" idea in like 2015), and after being in these trenches for a while, "it doesn't improve developer productivity" reads more to me as "we don't know how to leverage AWS for developer productivity." Elasticity is nice; pervasive automation and deep introspective monitoring (which you don't have to create and manage and scale yourself) is not merely nice but required, especially for small teams, because a small team cannot afford to be disrupted by systems that can't take care of themselves without kneecapping its velocity. I believe you're arguing in good faith, but you're telling me it's a pebbly wall when it's actually the whole elephant.
And AWS reduces business risk, too, which needs to be understood and respected. Your VPS universe is not actually repeatable. For more than one of my clients, it takes one line to roll out a full environment. Dev environment? Here ya go, self-serve bootstrapping and management. Prod environment? Infra-as-code guaranteed to match what you pushed to test, at the infrastructure level as well as the deployable level.
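For flavor, "one line to roll out a full environment" looks something like this (a trimmed-down sketch, not any client's actual stack; I'm using AWS CDK in Python here, and the stack and bucket names are invented):

    # app.py - defines dev and prod as identical, code-defined environments.
    # Deploying either one is a single command: `cdk deploy demo-dev` or `cdk deploy demo-prod`.
    import aws_cdk as cdk
    from aws_cdk import aws_s3 as s3

    class DemoStack(cdk.Stack):
        def __init__(self, scope, construct_id, **kwargs):
            super().__init__(scope, construct_id, **kwargs)
            # A real stack would add compute, networking, alarms, etc. here.
            s3.Bucket(self, "AssetsBucket", versioned=True)

    app = cdk.App()
    DemoStack(app, "demo-dev")
    DemoStack(app, "demo-prod")
    app.synth()

Dev and prod come out of the same code path, which is the whole point.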
This is why we are replacing system administrators with developers and why we are replacing hands-on system creation with cloud stacks: because it's a difference of kind and the fears of higher operational expenses are trivialized by being able to use that endless inventory to replace the expensive part of your operation--the people.
You forgot to mention the choice between bad peering and a high-jitter direct connect. Two VLANs on the same cable don't equal redundancy, and neither do two cables running through the same geographical site.
I think you're either focusing on a very small part of the market - very large websites - or you're overestimating the requirements most websites have. Either way, it seems like you're talking about a very different scenario than I am.
Endless inventory and minimized administration, but at a higher upfront and ongoing server cost, is not a benefit when you need to build a website used by under 10k people a day. Nor is it a benefit for internal services moved to the cloud for nonsense reasons. I've seen both done far too frequently. Simple tooling and small virtual machines or single dedicated servers are easily enough for these applications.
I can definitely understand the need for a small team working on a very large, high-hit-count application, though. I'm not saying there are no uses for it - there totally are, and you make valid points - just that the uses are more limited than the hype might suggest.
So... I run one service (gratis, as it's for a friend) that gets about 8K uniques a day. It costs $26 a month on AWS (it was $35, but that's apparently gone down, cool!) and makes about $400/month. That price tag is despite me not being particularly cost-conscious when rolling it out. It does use free-tier AWS resources, but the EC2 instance is not in the free tier right now; we're talking about keeping DynamoDB at free-tier read levels and stuff like that.
And, unlike the still-really-weird-and-ad-hoc VPS world, stuff like DR is a solved problem when, not if, you need it. The vig for AWS is consistently between 20% and 25%, and you implicitly get to leverage all the incredibly useful tooling and systems around you. If 20% to 25% of $26 is going to materially damage your business, you do not have a business.
- Automation/APIs are generally lousy. Some try (Digital Ocean is trying to shed its past and become a real boy--err, cloud), but stuff like Linode is infuriating to work with when you expect to be able to just do something like "declaratively describe your infrastructure and go". Or when you expect to have monitoring and alerting on hand without having to reinvent every wheel yourself (there's a sketch of what I mean after this list).
- Networking is usually real bad. SDNs are your friend. Yeah, learn you an iptables and all, but this is the future, we can do better. (DigitalOcean is almost to the point AWS was at, like, eight years ago with EC2 Classic? Something like that.)
- Geographically-centralized but independent systems are hard to come by and so fault tolerance is a Big Problem. AWS loses an availability zone, my stuff keeps rolling. Can't say the same elsewhere.
- Value-add services. The sibling comment's complaining about RDS, but RDS configuration hits probably the 98% case and You Don't Have To Learn It. I'm a little more hesitant about lock-in services like SQS, SNS, etc., but moral equivalents exist elsewhere for the most part, so you can use them pretty effectively.
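On the monitoring/alerting point above, this is roughly the level of effort on AWS (a hedged sketch, not production code; the instance ID, alarm name, and SNS topic ARN are placeholders):

    # One API call gets you a CPU alarm that pages you via SNS.
    # Assumes boto3 is installed and AWS credentials are already configured.
    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

    cloudwatch.put_metric_alarm(
        AlarmName="demo-high-cpu",                                            # made-up name
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder instance
        Statistic="Average",
        Period=300,
        EvaluationPeriods=2,
        Threshold=80.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],       # placeholder topic
    )

On a bare VPS the equivalent is an agent, a time-series store, an alerting daemon, and a pager integration you run yourself.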
I thought that was the point chx was making, with sideproject giving an example. How many people really do 150,000 hits a second versus how many are using AWS?
Well, I've worked on two now... one was self-hosted by a large services company, which has its own data centers and redundant connections; it also handles a significant portion of DNS for the internet (for good or bad). It worked for them, but it was painful. Getting new servers up took months of work.
I'm working at another services company expecting to clear 5M requests/day, with bursts close to what you're talking about several times a year. We've had so much pain managing colocated servers, and we can't justify the cost of three full data centers for just our load; that doesn't make sense. We're currently moving to a cloud provider to be able to scale out better when peaks really spike.
We had a customer that wanted to do several million requests in a 15-minute window, and we can't currently handle that... We're restructuring/refactoring so that we can.
It may not be sustained, but handling 330k requests/second in bursts is a different way of thinking about problems than anything under 1k/second, which many servers can hit without breaking a sweat - and that's why I'd be more inclined to push for a mid-level VPS like DO or Linode in those cases. Depends on need and expected growth.
Shouldn't be anything incredibly high. I haven't seen any recent traffic info for HN, but a couple of years ago dang posted some numbers here: https://news.ycombinator.com/item?id=9219581
So at that point it was 2.6M views per day, which means HN itself was getting about 30 views per second. If you look at the 200k uniques instead (which might make more sense since any individual person will probably only click through once), that's about 2 unique visitors per second. So even if HN has grown a ton in the time since that post, I'd be surprised if it sent more than about 5 hits/second to anything.
Hah, the fact that HN itself is a single server (behind Cloudflare, these days) should be enough proof that anything linked to by HN is unlikely to need more than a single server behind a caching CDN.
Well, let's also differentiate computationally-intensive hits (e.g., users adding things to a shopping cart, requests to list Pokemon in the area, etc.) from hits to a home page, which should be static or at least cached aggressively. 150,000 index.htmls per second and 150,000 database writes per second are very different.
I used to run a university web hosting platform that currently has, I think, five web server VMs on about as many physical machines, two physical machines for load balancing, and two physical MySQL servers in active/passive replication (i.e., only one gets either reads or writes). We hit the front page of HN fairly frequently—for instance, we host mosh.org—and it hasn't really been a problem. I remember getting paged in ... 2009 or so? ... when a particular website in WordPress got to the front page of Reddit, but we had fewer machines then, and also I think we had not deployed FastCGI for PHP at that point (for complicated shared-hosting reasons), so each WordPress page load was its own PHP process via CGI. If you're optimizing for performance, even if you want to stay on WordPress, step one is to not use plain CGI and step two is to do one of the myriad things you're supposed to do for WordPress caching.
In any case, a handful of physical machines will handle being on the front page of HN just fine. If you have an extremely computationally-intensive process on the first page load and you're worried you might hit HN but you might not, put it on cloud and set up autoscaling; other than that it probably doesn't make sense. If you know you won't scale too much (and a static site on the front page of HN isn't too much), chances are your usage is so low that you're paying a premium for the unused ability to scale up, and you should just pay for two cheap VPSes. And if you know you will scale (e.g., you have a large fixed workload), you're instead paying a premium for the unused ability to scale down, and you should just invest in a datacenter and save in the long term.
All that said, if you've got a static site, by all means stick it on a CDN, which I think is a perfectly defensible use of cloud for sites of all sizes.
I don't remember exactly, but not that many, maybe a couple thousand concurrent users. My brother's webapp has hit the front page of Reddit a few times, but a single dedicated machine was more than enough to handle that.
A hit every two seconds is 30 * (86400 / 2) = 1296000 hits per month.
150000 hits per month is an average of a hit every 86400 * 30 / 150000 = 17.28 seconds.
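For anyone who wants to verify, the same arithmetic in a couple of lines of Python:

    # Hits-per-month vs. seconds-between-hits, both directions.
    SECONDS_PER_MONTH = 86400 * 30        # 2,592,000

    print(SECONDS_PER_MONTH / 2)          # a hit every 2 seconds -> 1,296,000 hits per month
    print(SECONDS_PER_MONTH / 150000)     # 150,000 hits per month -> one hit every 17.28 seconds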
Yes, that's nothing. With a hit every 17 seconds, there is no performance optimization to be done. Therefore I don't understand the concern about performance in the parent comment by 'sideproject'.
Sure, there could be peak times when there is a hit every 100 microseconds and that's what forced the parent commenter to focus on performance optimization but nothing about this was mentioned in the comment.
Details about traffic in such peak times would have made the parent comment by 'sideproject' interesting. But with the details as they stand, the comment is going to leave readers confused about why one would need to discuss performance optimization for a hit every 17 seconds on average.
150,000 views a month is 150000/30/24/60/60 = 0.0578 qps. If there's no heavy processing you shouldn't need to be doing any optimization for the machine's sake at that rate. Slow queries / frontend code is a different story :)
Website traffic can vary heavily during some parts of the day, depending on demographics. Unless you know the traffic profile, it's good practice to inflate your average page views per second by 2-3x over any time period longer than an hour.
In addition to this, one pageview may produce many requests. Both of these need to be profiled before you can reasonably estimate how much traffic a webserver can handle given its current resources.
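A quick back-of-the-envelope version of that, with made-up numbers for requests-per-pageview and the peak factor:

    # Rough requests/second budget from monthly page views.
    PAGE_VIEWS_PER_MONTH = 150_000
    REQUESTS_PER_PAGEVIEW = 15   # assumption: HTML + assets + API calls; profile your own pages
    PEAK_FACTOR = 3              # upper end of the 2-3x rule of thumb above

    avg_rps = PAGE_VIEWS_PER_MONTH * REQUESTS_PER_PAGEVIEW / (86400 * 30)
    peak_rps = avg_rps * PEAK_FACTOR
    print(f"average ~{avg_rps:.2f} r/s, budget for ~{peak_rps:.1f} r/s at peak")

Still tiny numbers at 150k views/month, but the multipliers matter a lot more once you're in the millions.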
There are some good benchmarking tools that will load the entire page, including all its resources, and produce a more accurate load measure in terms of r/s.
As a side note, those $5 Vultr instances can handle a surprising number of static requests per second with nginx.
This is all fine for a personal project. For a company, the cost of the time you spent optimizing would very likely outweigh the cost of scaling the hardware.