> If you're around I'd love to hear specifically what you mean by this. Internally Logstash is very thread friendly, we only recommend multiple processes when you want either greater isolation or greater fault tolerance.
Right, we considered using multiple Logstash processes, but we really didn't want to run three instances of Logstash, each requiring a relatively heavyweight Java VM. The total memory consumption of a single JVM running Logstash is higher than that of three separate LogZoom instances.
We looked at the Filebeat Redis output as well. First, it didn't seem to support encryption or client authentication out of the box. But what we really wanted was a way to make Logstash duplicate the data into two independent queues so that Elasticsearch and S3 outputs could work independently.
Thanks for the thoughtfully considered response :).
Regarding security with Redis: did you read the docs here? https://www.elastic.co/guide/en/logstash/current/plugins-out... Logstash does support Redis password auth (as does Filebeat). As for encryption, since Redis doesn't support SSL itself, are you using spiped as the official Redis docs recommend?
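For reference, a minimal sketch of what that looks like on the Logstash output side. The host, key, and password values are placeholders, and the localhost endpoint assumes a spiped tunnel forwarding traffic to the real Redis server:

```conf
output {
  redis {
    host      => ["127.0.0.1"]   # local spiped endpoint tunneling to the real Redis
    port      => 6379
    password  => "changeme"      # Redis AUTH password (placeholder)
    data_type => "list"
    key       => "logstash"
  }
}
```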
Regarding the two queues, I'd like to clarify that you can already do this: declare two Logstash Redis outputs in the first 'shipper' Logstash to write to two separate queues, and have the second 'indexer' Logstash read from both.
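A rough sketch of that shipper/indexer layout, assuming both queues live on the same Redis host (the host and queue names here are illustrative):

```conf
# shipper.conf: fan the same events out to two independent Redis lists
output {
  redis { host => ["127.0.0.1"] data_type => "list" key => "logs-es" }
  redis { host => ["127.0.0.1"] data_type => "list" key => "logs-s3" }
}

# indexer.conf: read both queues back
input {
  redis { host => "127.0.0.1" data_type => "list" key => "logs-es" }
  redis { host => "127.0.0.1" data_type => "list" key => "logs-s3" }
}
```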
It is true that if one output is down we will pause processing, but you can use multiple processes for that. It is possible that in the near future we will support multiple pipelines in a single process (which we already do internally in our master branch for metrics, just not in a publicly exposed way yet).
Regarding JVM overhead: that's a fair point about memory. The JVM does have a cost. That said, memory and VMs are cheap these days, and that cost is fixed. One thing to be careful of: we often see people surprised to find a stray 100MB event going through their pipeline due to an application bug. Having that extra memory is a good idea regardless. Many of our users increase their heap size far beyond what the JVM requires simply to handle weird bursts of jumbo logs.
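As an example of that headroom, here is one way to raise the heap on Logstash 5.x+ via `config/jvm.options` (older releases set the `LS_HEAP_SIZE` environment variable instead); 4g is an arbitrary illustrative value, not a recommendation:

```conf
# config/jvm.options: give the pipeline headroom for bursts of jumbo events
-Xms4g
-Xmx4g
```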
Thanks for that information. There's no doubt Logstash can do a lot, and it sounds like the multiple-pipeline feature will make it easier to do what we wanted in a single process.
In the past, we've also been burned by many Big Data solutions running out of heap space, so adding more processes whose stability relied on tuning JVM parameters again did not appeal to us.