None of this should be even remotely necessary. It’s like being frugal with table salt.
“We’ll show you how to make sure you don’t have even one crystal fall off the plate.”
My personal pet peeve is Azure Application Insights which uses Log Analytics under the hood… at a rate of $2.75 per ingested GB of logs stored for one month. That’s highway robbery.
Let that sink in: They charge $2,800 to store a TB of text that takes a few hundred dollars of overpriced cloud disk and maybe $10 of CPU time for the actual processing. That’s the cost of a serviceable used car or a brand new gaming PC!
But wait! There’s more.
In reality that 1 TB is column compressed down to maybe 100 GB, making it about $30K charged per terabyte stored on disk.
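For anyone who wants to check the arithmetic, here is a rough sketch. The $2.75 per ingested GB is the figure quoted above; the 10x columnar compression ratio is an assumption, chosen because it is roughly what the $30K figure implies.

```python
# Back-of-the-envelope cost check. The compression ratio is an assumption.
PRICE_PER_INGESTED_GB = 2.75   # USD, the rate quoted above
GB_PER_TB = 1024
ASSUMED_COMPRESSION = 10       # assumed ~10x columnar compression

cost_per_ingested_tb = PRICE_PER_INGESTED_GB * GB_PER_TB
# Ten ingested TB shrink to about one TB on disk, so the bill per stored TB is 10x.
cost_per_stored_tb = cost_per_ingested_tb * ASSUMED_COMPRESSION

print(f"per ingested TB:         ${cost_per_ingested_tb:,.0f}")   # $2,816
print(f"per TB actually on disk: ${cost_per_stored_tb:,.0f}")     # $28,160
```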
It doesn’t stop there! Thanks to misaligned incentives, the ingested data format is fantastically inefficient JSON that re-sends static values for every metric sample collected.
Why would anyone ever bother to optimise their only revenue?!
They won’t.
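To make the overhead concrete, here is a small sketch comparing one JSON-encoded sample against the 8 bytes the value itself needs. The payload shape and field names are made up for illustration and are not the actual ingestion schema.

```python
import json
import struct

# Hypothetical sample: every field except "value" and "timestamp" is static,
# yet all of them are re-sent with each sample.
sample = {
    "host": "web-01.example.internal",
    "metricNamespace": "Processor",
    "metricName": "% Processor Time",
    "instance": "_Total",
    "unit": "Percent",
    "timestamp": "2024-01-01T00:00:00Z",
    "value": 12.5,
}

json_bytes = len(json.dumps(sample).encode("utf-8"))
packed_bytes = len(struct.pack("<d", sample["value"]))  # a single float64

print(f"JSON sample: {json_bytes} bytes")
print(f"raw value:   {packed_bytes} bytes")
print(f"overhead:    {json_bytes / packed_bytes:.0f}x")
```

With a fixed sampling interval the timestamp is implied by the array index, which is why only the value is counted on the packed side.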
The reality is that a numeric metric collected once a second (not minute!) is just 21 MB if stored as a simple array. Most metrics are highly compressible, and that would easily pack to 100 KB per metric per month.
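A quick way to get a feel for the compressibility is delta encoding followed by a general-purpose compressor. This is only a sketch on synthetic data: the metric below is an invented gauge that changes about 1% of the time, and real metrics will compress better or worse depending on how noisy they are.

```python
import random
import struct
import zlib

random.seed(0)  # fixed seed so the synthetic data is reproducible

# Synthetic gauge: one sample per second for a day, changing only occasionally.
samples = []
value = 1000
for _ in range(86_400):
    if random.random() < 0.01:
        value += random.randint(-5, 5)
    samples.append(value)

# Plain array of 64-bit integers, as in "stored as a simple array".
raw = struct.pack(f"<{len(samples)}q", *samples)

# Delta encoding: keep the first value, then only the differences.
deltas = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
packed = zlib.compress(struct.pack(f"<{len(deltas)}q", *deltas), 9)

print(f"raw:    {len(raw):,} bytes")
print(f"packed: {len(packed):,} bytes")
print(f"ratio:  {len(raw) / len(packed):.0f}x")
```

Purpose-built time-series encodings should do better than zlib here; the point is only that even a naive approach gets a large reduction on slowly changing data.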
A typical Windows server has about 15,000 performance metrics. We could be collecting these once a second and use a grand total of… 1.5 GB per month. That’s every metric for every process, every device, every error counter, everything.
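The numbers above work out as follows. This sketch assumes a 30-day month and one 8-byte value per sample; the 15,000 metrics and 100 KB per compressed metric are the figures from the comment.

```python
SECONDS_PER_MONTH = 60 * 60 * 24 * 30       # assumes a 30-day month
BYTES_PER_SAMPLE = 8                         # assumes one 64-bit value per sample
METRICS_PER_SERVER = 15_000                  # figure quoted above
COMPRESSED_BYTES_PER_METRIC = 100 * 1000     # 100 KB per metric per month, as quoted

raw_per_metric = SECONDS_PER_MONTH * BYTES_PER_SAMPLE
raw_per_server = raw_per_metric * METRICS_PER_SERVER
compressed_per_server = COMPRESSED_BYTES_PER_METRIC * METRICS_PER_SERVER

print(f"raw, one metric:        {raw_per_metric / 1e6:.1f} MB")         # 20.7 MB
print(f"raw, whole server:      {raw_per_server / 1e9:.0f} GB")         # 311 GB
print(f"compressed, whole host: {compressed_per_server / 1e9:.1f} GB")  # 1.5 GB
```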
Modern server monitoring is inefficient and overpriced by 5 orders of magnitude. It’s that simple.
The fact that your company can exist at all is a testament to that.
Totally agree about the compressibility of metrics and toying with the scraping interval. I started out working for an enterprise monitoring vendor whose proprietary agent already chose sane intervals for emitting metrics. When I learned that Prometheus lets users configure that, to me it just sounded like an expensive foot gun.
My real beef with metrics, at least for app-layer insights, is the waste. I'd much rather have spans/events configured with tail sampling, so you can derive metrics from traces and tie them to logs in a native, contextualized way, instead of doing that correlation on the backend across different systems and query languages. That seems much more efficient and cost-effective. I'm scarred from seeing a zillion "service_name.http_response.p95.average" metrics that are, IMO, useless.
I’m starting to come to the same conclusion, but the point I’m making is a general one: efficient formats would allow finer grained telemetry to be collected without having to be tuned and carefully monitored.
What’s the point of a monitoring system that itself needs babysitting?!