I've been running a web application for the past 6 months and it just crossed the 150,000 page views/month mark. Sure, for others that's not that great, but for me, this is the project showing the biggest potential.
Anyway, the funny thing is, I'm running it on a $2.50/month Vultr VPS. When it crossed 30,000, I got so worried that my site would crash. But it didn't. Then as the views got higher, I optimized further. I learned more about how to make things faster. I refactored my code to squeeze that much more juice out of $2.50 - not because I'm cheap (well, perhaps), but because I wanted to see how far I could push it.
I'll probably cross 200,000 soon, and I think by then I'll need to upgrade to the (yikes, heaven forbid!) $5/month plan.
Recently, just because of all the hype about AWS (in general and also at my work) I wanted to get started with AWS. Then I looked at their ridiculous pricing calculation page and I just closed the browser. I thought my time was better spent working on my project than learning about how to deploy a simple PHP application to AWS.
I don't understand the point of your comment. The parent was simply explaining why they don't need extreme scalability with their scenario. No one is under the impression that multiplying their traffic by 75000 (so you work at Google then?) will not require a big server upgrade.
He's making a judgment on the value of AWS using a use case where it makes no sense to use AWS anyway. He could be doing something similar with Lightsail on AWS at a similar cost if he really wanted to, though.
> He's making a judgment on the value of AWS using a use case where it makes no sense to use AWS,
I think the problem here is that a lot of people use AWS when there's no value in using AWS. It's commonly reached for as a first choice when it's often not a good option if you're looking to run lean.
AWS is extremely popular, but it's probably only cost-efficient for the small percentage of companies that have wildly variable traffic patterns.
Your idea of "running lean" optimizes for the operational expenses of servers, which are across-the-board cheap, rather than the cost of developer time.
I bill between $200 and $250 an hour. (If I had a full-time job I'd be salaried around $90-$100/hour.) Paying me for fewer hours because AWS's tooling, once set up, makes life a hell of a lot easier makes a lot of sense even for fairly small companies.
AWS tooling doesn't save developer time in my experience. In fact, I'd argue that optimizing for it more often than not wastes it.
I've watched someone spend days learning to configure a performant DB server on AWS because of its poor disk I/O performance, and then spend an enormous amount on a high-RAM instance, when a simple SSD-based server with more IOPS trivially available would have worked out of the box.
Everything has a learning cost, though, so perhaps that's a somewhat unfair example, since deep knowledge of data centers and backbone providers doesn't come free either. But I can crank out an Ansible and Docker script and have pretty much all the advantages of AWS, deployed on my own hardware at a fraction of the cost. So I'm not sure it's fair to say that AWS tooling offers anything unique enough to merit its market position.
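To be concrete about the kind of script I mean (this isn't my actual setup, and I've swapped Ansible for Fabric here just to keep it in Python; the host and image names are made up):

    # Minimal deploy sketch: push a Docker image to a plain box over SSH.
    # Assumes Fabric (pip install fabric) and Docker already installed on the host.
    from fabric import Connection

    HOST = "deploy@my-own-box.example.com"       # hypothetical server
    IMAGE = "registry.example.com/myapp:latest"  # hypothetical image

    def deploy():
        c = Connection(HOST)
        c.run(f"docker pull {IMAGE}")
        # Stop/remove any old container; warn=True keeps a missing container from aborting the run.
        c.run("docker stop myapp", warn=True)
        c.run("docker rm myapp", warn=True)
        c.run(f"docker run -d --name myapp -p 80:8080 --restart unless-stopped {IMAGE}")

    if __name__ == "__main__":
        deploy()

That's the whole "platform". Ansible buys you idempotence and inventory management on top, but the point stands: it isn't much code.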
So I'm not trying to big-time you, but I have experience across a wide range of environments and shop sizes (both in clouds and, unfortunately, at shops that bought the "VPSes are fine too" idea in like 2015), and after being in these trenches for a while, "it doesn't improve developer productivity" reads more to me as "we don't know how to leverage AWS for developer productivity." Elasticity is nice; pervasive automation and deep introspective monitoring (which you don't have to create and manage and scale yourself) is not merely nice but required, especially for small teams, because a small team cannot afford to be disrupted by systems that can't take care of themselves without kneecapping its velocity. I believe you're arguing in good faith, but you're telling me it's a pebbly wall when it's actually the whole elephant.
And AWS reduces business risk, too, which needs to be understood and respected. Your VPS universe is not actually repeatable. For more than one of my clients, it takes one line to roll out a full environment. Dev environment? Here ya go, self-serve bootstrapping and management. Prod environment? Infra-as-code guaranteed to match what you pushed to test, at the infrastructure level as well as the deployable level.
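For flavor, "one line to roll out a full environment" looks something like this (a trimmed-down sketch, not any client's actual stack; I'm using AWS CDK in Python here, and the stack and bucket names are invented):

    # app.py - defines dev and prod as identical, code-defined environments.
    # Deploying either one is a single command: `cdk deploy demo-dev` or `cdk deploy demo-prod`.
    import aws_cdk as cdk
    from aws_cdk import aws_s3 as s3

    class DemoStack(cdk.Stack):
        def __init__(self, scope, construct_id, **kwargs):
            super().__init__(scope, construct_id, **kwargs)
            # A real stack would add compute, networking, alarms, etc. here.
            s3.Bucket(self, "AssetsBucket", versioned=True)

    app = cdk.App()
    DemoStack(app, "demo-dev")
    DemoStack(app, "demo-prod")
    app.synth()

Dev and prod come out of the same code path, which is the whole point.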
This is why we are replacing system administrators with developers and why we are replacing hands-on system creation with cloud stacks: because it's a difference of kind and the fears of higher operational expenses are trivialized by being able to use that endless inventory to replace the expensive part of your operation--the people.
You forgot to mention the choice between bad peering and a high-jitter direct connect. Two VLANs on the same cable don't equal redundancy, and neither do two cables running through the same geographical site.
I think you're either focusing on a very small part of the market - very large websites - or you're overestimating the requirements most websites have. Either way, it seems like you're talking about a very different scenario than I am.
Endless inventory and minimized administration, but at a higher upfront and ongoing server cost, is not a benefit when you need to build a website used by under 10k people a day. Nor is it a benefit for internal services moved to the cloud for nonsense reasons. I've seen both done far too frequently. Simple tooling and small virtual machines or single dedicated servers are easily enough for these applications.
I can definitely understand the need for a small team working on a very large, high-hit-count application, though. I'm not saying there are no uses for it - there totally are, and you make valid points - just that the uses are more limited than the hype might suggest.
So... I run one service (gratis, as it's for a friend) that gets about 8K uniques a day. It costs $26 a month on AWS (it was $35, but that's apparently gone down, cool!) and makes about $400/month. That price tag is despite me not being particularly cost-conscious when rolling it out. It does use free-tier AWS resources, but the EC2 instance is not in the free tier right now; we're talking about keeping DynamoDB at free-tier read levels and stuff like that.
And, unlike the still-really-weird-and-ad-hoc VPS world, stuff like DR is a solved problem when, not if, you need it. The vig for AWS is consistently between 20% and 25%, and you implicitly get to leverage all the incredibly useful tooling and systems around you. If 20% to 25% of $26 is going to materially damage your business, you do not have a business.
- Automation/APIs are generally lousy. Some try (Digital Ocean is trying to shed its past and become a real boy--err, cloud), but stuff like Linode is infuriating to work with when you expect to be able to just do something like "declaratively describe your infrastructure and go". Or when you expect to have monitoring and alerting on hand without having to reinvent every wheel yourself (there's a sketch of what I mean after this list).
- Networking is usually real bad. SDNs are your friend. Yeah, learn you an iptables and all, but this is the future, we can do better. (DigitalOcean is almost to the point AWS was at, like, eight years ago with EC2 Classic? Something like that.)
- Geographically-centralized but independent systems are hard to come by and so fault tolerance is a Big Problem. AWS loses an availability zone, my stuff keeps rolling. Can't say the same elsewhere.
- Value-add services. The sibling comment's complaining about RDS, but RDS configuration hits probably the 98% case and You Don't Have To Learn It. I'm a little more hesitant about lock-in services like SQS, SNS, etc., but moral equivalents exist elsewhere for the most part, so you can use them pretty effectively.
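On the monitoring/alerting point above, this is roughly the level of effort on AWS (a hedged sketch, not production code; the instance ID, alarm name, and SNS topic ARN are placeholders):

    # One API call gets you a CPU alarm that pages you via SNS.
    # Assumes boto3 is installed and AWS credentials are already configured.
    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

    cloudwatch.put_metric_alarm(
        AlarmName="demo-high-cpu",                                            # made-up name
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder instance
        Statistic="Average",
        Period=300,
        EvaluationPeriods=2,
        Threshold=80.0,
        ComparisonOperator="GreaterThanThreshold",
        AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],       # placeholder topic
    )

On a bare VPS the equivalent is an agent, a time-series store, an alerting daemon, and a pager integration you run yourself.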
I thought that was the point chx was making, with sideproject giving an example. How many people really do 150,000 hits a second versus how many are using AWS?
Well, I've worked on two now... one was self-hosted by a large services company, which has its own data centers and redundant connections; it also handles a significant portion of DNS for the internet (for good or bad). It worked for them, but it was painful. Getting new servers up took months of work.
I'm working at another services company expecting to clear 5M requests/day, with bursts close to what you're talking about several times a year. We've had so much pain managing colocated servers, and we can't justify the cost of three full data centers for just our load; that doesn't make sense. We're currently moving to a cloud provider to be able to scale out better when peaks really spike.
We had a customer that wanted to do several million requests in a 15-minute window, and we can't currently handle that... We're restructuring/refactoring so that we can.
It may not be sustained, but handling 330k requests/second in bursts is a different way of thinking about problems than anything under 1k/second, which many servers can hit without breaking a sweat - and that's why I'd be more inclined to push for a mid-level VPS like DO or Linode in those cases. Depends on need and expected growth.
Shouldn't be anything incredibly high. I haven't seen any recent traffic info for HN, but a couple of years ago dang posted some numbers here: https://news.ycombinator.com/item?id=9219581
So at that point it was 2.6M views per day, which means HN itself was getting about 30 views per second. If you look at the 200k uniques instead (which might make more sense since any individual person will probably only click through once), that's about 2 unique visitors per second. So even if HN has grown a ton in the time since that post, I'd be surprised if it sent more than about 5 hits/second to anything.
Hah, the fact that HN itself is a single server (behind Cloudflare, these days) should be enough proof that anything linked to by HN is unlikely to need more than a single server behind a caching CDN.
Well, let's also differentiate computationally-intensive hits (e.g., users adding things to a shopping cart, requests to list Pokemon in the area, etc.) from hits to a home page, which should be static or at least cached aggressively. 150,000 index.htmls per second and 150,000 database writes per second are very different.
I used to run a university web hosting platform that currently has, I think, five web server VMs on about as many physical machines, two physical machines for load balancing, and two physical MySQL servers in active/passive replication (i.e., only one gets either reads or writes). We hit the front page of HN fairly frequently—for instance, we host mosh.org—and it hasn't really been a problem. I remember getting paged in ... 2009 or so? ... when a particular website in WordPress got to the front page of Reddit, but we had fewer machines then, and also I think we had not deployed FastCGI for PHP at that point (for complicated shared-hosting reasons), so each WordPress page load was its own PHP process via CGI. If you're optimizing for performance, even if you want to stay on WordPress, step one is to not use plain CGI and step two is to do one of the myriad things you're supposed to do for WordPress caching.
In any case, a handful of physical machines will handle being on the front page of HN just fine. If you have an extremely computationally-intensive process on the first page load and you're worried you might hit HN but you might not, put it on cloud and set up autoscaling; other than that it probably doesn't make sense. If you know you won't scale too much (and a static site on the front page of HN isn't too much), chances are your usage is so low that you're paying a premium for the unused ability to scale up, and you should just pay for two cheap VPSes. And if you know you will scale (e.g., you have a large fixed workload), you're instead paying a premium for the unused ability to scale down, and you should just invest in a datacenter and save in the long term.
All that said, if you've got a static site, by all means stick it on a CDN, which I think is a perfectly defensible use of cloud for sites of all sizes.
I don't remember exactly, but not that many, maybe a couple thousand concurrent users. My brother's webapp has hit the front page of Reddit a few times, but a single dedicated machine was more than enough to handle that.
A hit every two seconds is 30 * (86400 / 2) = 1296000 hits per month.
150000 hits per month is an average of a hit every 86400 * 30 / 150000 = 17.28 seconds.
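For anyone who wants to verify, the same arithmetic in a couple of lines of Python:

    # Hits-per-month vs. seconds-between-hits, both directions.
    SECONDS_PER_MONTH = 86400 * 30        # 2,592,000

    print(SECONDS_PER_MONTH / 2)          # a hit every 2 seconds -> 1,296,000 hits per month
    print(SECONDS_PER_MONTH / 150000)     # 150,000 hits per month -> one hit every 17.28 seconds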
Yes, that's nothing. With a hit every 17 seconds, there is no performance optimization to be done. Therefore I don't understand the concern about performance in the parent comment by 'sideproject'.
Sure, there could be peak times when there is a hit every 100 microseconds and that's what forced the parent commenter to focus on performance optimization but nothing about this was mentioned in the comment.
Details about traffic in such peak times would have made the parent comment by 'sideproject' interesting. But with the details as they stand, the comment is going to leave readers confused about why one would need to discuss performance optimization for a hit every 17 seconds on average.
150,000 views a month is 150000/30/24/60/60 = 0.0578 qps. If there's no heavy processing you shouldn't need to be doing any optimization for the machine's sake at that rate. Slow queries / frontend code is a different story :)
Website traffic can vary heavily during some parts of the day, depending on demographics. Unless you know the traffic profile, it's good practice to inflate your average page views per second by 2-3x over any time period longer than an hour.
In addition to this, one pageview may produce many requests. Both of these need to be profiled before you can reasonably estimate how much traffic a webserver can handle given its current resources.
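A quick back-of-the-envelope version of that, with made-up numbers for requests-per-pageview and the peak factor:

    # Rough requests/second budget from monthly page views.
    PAGE_VIEWS_PER_MONTH = 150_000
    REQUESTS_PER_PAGEVIEW = 15   # assumption: HTML + assets + API calls; profile your own pages
    PEAK_FACTOR = 3              # upper end of the 2-3x rule of thumb above

    avg_rps = PAGE_VIEWS_PER_MONTH * REQUESTS_PER_PAGEVIEW / (86400 * 30)
    peak_rps = avg_rps * PEAK_FACTOR
    print(f"average ~{avg_rps:.2f} r/s, budget for ~{peak_rps:.1f} r/s at peak")

Still tiny numbers at 150k views/month, but the multipliers matter a lot more once you're in the millions.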
There are some good benchmarking tools that will load the entire page, including all its resources, and produce a more accurate load measure in terms of r/s.
As a side note, those $5 Vultr instances can handle a surprising number of static requests per second with nginx.
This is all fine for a personal project. For a company, the cost of the time you spent optimizing would very likely outweigh the cost of scaling the hardware.