Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

>At the scale of Netflix or Reddit it very well may make sense to only keep very limited CPU/memory stats on such a massive fleet.

read again what is his argument, that we don't need to store __any__ cpu/mem/storage metrics, other than "business metrics" (or later he crawled back to polling from time to time).

> Look, I have a different opinion as well, but the difference between you and me is I'm not resorting to personal attacks and instead discussing it on the merits.

maybe that's due to difference in culture/region, but i'm unaware where i've attacked him personally. i've just pointed out that what he's saying is to be expected by someone without experience/knowledge "in the field".



I read again his argument.

> Keeping all of your application logs and telemetry forever is expensive, and I can't recall a single time when having more than a day's with of history was ever useful in tracking down an operational issue.

That doesn't say don't store any, that says you can get by storing a 24 hour period. And his broader point is that it should be time bound, that storing these metrics indefinitely isn't useful and can be very expensive.

I'm of the mind that a week or two of fast online access is the right amount myself (with offline "cold" storage of logs for a longer period), but the overall premise that storing logs and infrastructure metrics forever is unnecessary and wasteful.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: