I use InfluxDB with Chronograf and Kapacitor for monitoring and notifications. It's pretty solid. I was excited about InfluxDB 2 since I wouldn't have to run Chronograf and Kapacitor separately.
The notification settings are somewhat odd, though. In my case I want a notification when something transitions into a bad state (Critical), and another when it's back to normal (OK).
In InfluxDB 2, I can only set a notification to fire if the state goes to Critical, OR if it goes to OK. I can't have a single notification rule that covers both cases. It's super annoying to maintain two rules for each item I'm monitoring.
Meanwhile, Grafana's notifications work the way I described by default. I'm not sure if InfluxDB is weird, or if the rest of the world does notifications differently than I do.
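To make the behaviour I'm asking for concrete, here is a minimal sketch of a single rule that notifies on both transitions. It isn't InfluxDB's or Grafana's API; the status names and the function `transitions_to_notify` are made up for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical status levels, mirroring the "Critical" and "OK" states above.
OK = "OK"
CRIT = "CRIT"


def transitions_to_notify(statuses: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (previous, current) pairs for every change into or out of CRIT."""
    notifications: List[Tuple[str, str]] = []
    previous = None
    for current in statuses:
        # Notify only when the status actually changes and CRIT is involved,
        # so one rule covers both OK -> CRIT and CRIT -> OK.
        if previous is not None and current != previous and CRIT in (previous, current):
            notifications.append((previous, current))
        previous = current
    return notifications


if __name__ == "__main__":
    checks = [OK, OK, CRIT, CRIT, OK, OK]
    print(transitions_to_notify(checks))  # [('OK', 'CRIT'), ('CRIT', 'OK')]
```

Repeated checks in the same state produce nothing, so you only hear about it when the state flips.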
One time I accidentally put my personal info on a conference registration, and went around scanning my badge for free stuff. I got calls and emails from every vendor for a few months. Every vendor stopped after that except Influx; they kept contacting me for 3 years. Now I get annoyed whenever I see the name.
I tried InfluxDB 2.0 after reading this, and I am disappointed. It's actually a lot more complicated. Flux is not easier than InfluxQL (could be more powerful, dunno), and monitoring and alerting also seem less straightforward. I am quite disappointed.
Maybe I'm looking at the downloads page [1] linked in the blog post too soon, but all I see is a v2.0.0 RC release (rc-4 docker image), which has been there for weeks now, and not a GA release.
There is a docker image on quay.io [2] tagged v2.0.0 though.
It actually depends on your use case. Mine was to monitor key business metrics as well as JVM performance details, and to send alerts when a metric seems off.
I initially evaluated InfluxDB and was impressed by it, but once I was introduced to Prometheus I did not find any reason to go back to InfluxDB.
1. Prometheus uses a pull-based mechanism for consuming metrics vs. the push-based mechanism of InfluxDB. I see it as a plus point in terms of scalability, especially if you have a high-traffic application.
2. If you are interested in zooming into a single event, e.g. "error thrown at 11:01am", then Prometheus isn't for you. In that case you should evaluate time series databases like InfluxDB, TimescaleDB and others. Prometheus will only give you aggregated stats, i.e. "errors thrown between 11:00am and 11:05am". You can label the events in Prometheus and group by those labels to draw some insights & define alerts.
3. The PromQL query language used to query metrics in Prometheus seemed much more powerful to me than what InfluxDB offers. I find it much more advanced and flexible than any other solution. For example, you can compare a current metric value (say, errors thrown per 15 seconds) to the same value exactly a week ago, which I find very cool. However, PromQL has some learning curve. Luckily people have published common PromQL queries, so it's not that bad.
4. InfluxDB is a commercial product, which means some advanced options like a distributed setup are only available in the paid version. This isn't a negative point, but just putting it out there. In contrast, Prometheus is open source.
5. I liked the deduplication and grouping features in Alertmanager (Prometheus).
6. InfluxDB needs some serious hardware. This is understandable, because it stores each and every event. I was testing on a low-spec machine, so InfluxDB used to crash after running for a few hours, and it consumed insane amounts of RAM (15-16 GB at times). I used the same hardware for Prometheus, published the same metrics under the same load, and I did not see any query performance issues. The RAM & CPU usage was very low (1 GB at max) in comparison. This is mostly because Prometheus stores the aggregated values, i.e. the number of errors in 15 second windows, as opposed to all the errors (which could be hundreds or thousands). So the number of events does not affect performance in Prometheus, only the number of metrics does.
7. Compared to setting up InfluxDB and its supporting components, I find Prometheus pretty straightforward to maintain.
8. The federation feature in Prometheus was very important for us. Basically you can pull data from one Prometheus instance into another Prometheus instance at a lower resolution. This way I could monitor a fleet of 1000 nodes without any complicated setup or high-spec nodes. For example, you can configure a Prometheus instance to pull metrics from 10 nodes at 15 second resolution. Then another Prometheus instance would pull metrics from 10 such leaf-level Prometheus instances at a 1 minute interval. Since you are pulling lower-resolution data, you can use the same hardware spec for a leaf-level Prometheus instance and its parent instance.
When you are looking at this in Grafana, you can first look at the higher-level Prometheus instance's data, then zoom into a subset as necessary. This is something I find easy to configure and manage.
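As a rough model of the federation setup described in point 8: the parent scrapes the leaf (Prometheus exposes a `/federate` endpoint for this) and only sees whatever the latest value is at each of its own scrape times. The sketch below simulates that in plain Python; the function `federate` and the sample data are hypothetical, not Prometheus code.

```python
from typing import List, Tuple

Sample = Tuple[int, float]  # (timestamp in seconds, value)


def federate(leaf_series: List[Sample], interval: int) -> List[Sample]:
    """Simulate a parent scraping a leaf every `interval` seconds.

    At each scrape the parent records the newest leaf sample seen so far.
    Assumes `leaf_series` is sorted by timestamp.
    """
    if not leaf_series:
        return []
    parent: List[Sample] = []
    scrape_time = leaf_series[0][0]
    end = leaf_series[-1][0]
    idx = 0
    latest = leaf_series[0]
    while scrape_time <= end:
        # Advance to the newest leaf sample at or before this scrape.
        while idx < len(leaf_series) and leaf_series[idx][0] <= scrape_time:
            latest = leaf_series[idx]
            idx += 1
        parent.append((scrape_time, latest[1]))
        scrape_time += interval
    return parent


if __name__ == "__main__":
    # Leaf scrapes a counter every 15 s for two minutes (9 samples).
    leaf = [(t, (t // 15) * 3.0) for t in range(0, 121, 15)]
    parent = federate(leaf, interval=60)
    print(len(leaf), len(parent))  # 9 3
    print(parent)  # [(0, 0.0), (60, 12.0), (120, 24.0)]
```

The parent ends up holding a quarter of the points per series, which is why it can sit on the same hardware spec while covering ten leaves.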
This is what I can gather off the top of my head; there could of course be other pros and cons in both solutions.
InfluxDB stores individual events, e.g. an error occurred at 11:01am, whereas Prometheus stores the aggregates, i.e. the number of errors that occurred between 11:01am and 11:02am.
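For the week-over-week comparison mentioned in point 3, PromQL has an `offset` modifier, so a query along the lines of `errors_per_15s / errors_per_15s offset 1w` would do it (the metric name is made up). The same idea in plain Python, as a sketch with hypothetical names and data:

```python
from typing import Dict, Optional

WEEK_SECONDS = 7 * 24 * 60 * 60


def week_over_week_ratio(samples: Dict[int, float], now: int) -> Optional[float]:
    """Ratio of the value at `now` to the value exactly one week earlier.

    `samples` maps unix timestamps to metric values. Returns None when either
    sample is missing or last week's value is zero.
    """
    current = samples.get(now)
    last_week = samples.get(now - WEEK_SECONDS)
    if current is None or last_week is None or last_week == 0:
        return None
    return current / last_week


if __name__ == "__main__":
    now = 1_700_000_000
    samples = {now - WEEK_SECONDS: 4.0, now: 6.0}
    print(week_over_week_ratio(samples, now))  # 1.5
```

A ratio well above 1 is the kind of thing you could alert on, e.g. "errors are 50% higher than at the same time last week".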
This depends on how you use it. It's pretty common to use something like statsd or a prometheus exporter to aggregate data before storing it in InfluxDB.
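A minimal sketch of that kind of pre-aggregation, assuming you just want per-window counts: individual events are bucketed into fixed windows and only the counts get written. The function name and the 15 second window are my own choices here, not statsd's or any exporter's actual behaviour.

```python
from collections import Counter
from typing import Dict, Iterable


def aggregate_events(event_times: Iterable[float], window_seconds: int = 15) -> Dict[int, int]:
    """Count events per fixed window, keyed by the window's start time in seconds."""
    counts: Counter = Counter()
    for ts in event_times:
        # Round the timestamp down to the start of its window.
        window_start = int(ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Six raw error events collapse into three stored points.
    events = [0.5, 3.2, 14.9, 15.0, 29.9, 31.0]
    print(aggregate_events(events))  # {0: 3, 15: 2, 30: 1}
```

Storage then grows with the number of windows and series rather than with the number of raw events, which is the trade-off being discussed above.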
I was interested in using TimescaleDB, but I'd want to use Telegraf (or something as good as it) to feed data into TimescaleDB.
The TimescaleDB folks say that the pull-request for their Telegraf plugin just needs to be merged, but that doesn't seem to be the case. The pull request needs changes and the original submitter is unresponsive. Now the pull request is labeled as "help wanted". [1]
What do people use to send metrics into TimescaleDB?
VictoriaMetrics developer here. While the Flux language looks great from a computer science point of view, it has very low usability for ordinary users compared to other specialized query languages for time series data such as PromQL [1] or MetricsQL [2]. For example, Flux requires writing long multi-line queries for simple cross-measurement math [3]. The same query written in PromQL is much shorter and much easier to manage: