
> Heka proved to be the weak link in our logging stack. We get far better performance from fluentd.

I'd be curious to know some more details on this. I guess Go channels do copying rather than sharing, but I'd still expect Heka to perform better.

Personally I've found Heka to be more robust - I've had to fix bugs in plugins for the former two and generally I haven't found them to be architecturally sound.

For example with logstash I've had bugs in plugins which would crash the entire daemon -- that's just inexcusable for a core infrastructure service.

The multi-tiered thing sounds very strange. Why not just hold an on-disk buffer?
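To make the on-disk buffer idea concrete, here is a minimal sketch of what I have in mind. It is not how Heka, Fluentd or Logstash implement buffering; `DiskBuffer`, `append` and `drain` are names made up for illustration, and the "downstream" is just a callable that raises `ConnectionError` when it is unavailable.

```python
import json
import os
import tempfile


class DiskBuffer:
    """Hypothetical append-only buffer: one JSON record per line on disk."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Persist the record before any attempt to ship it downstream.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def drain(self, send):
        """Try to send buffered records in order; keep the unsent rest on disk."""
        if not os.path.exists(self.path):
            return 0
        with open(self.path, encoding="utf-8") as f:
            lines = f.readlines()
        sent = 0
        for line in lines:
            try:
                send(json.loads(line))
            except ConnectionError:
                break  # downstream is choked; stop and retry later
            sent += 1
        # Atomically replace the buffer file with whatever was not delivered.
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            f.writelines(lines[sent:])
        os.replace(tmp, self.path)
        return sent


def downstream_unavailable(record):
    # Stand-in for an output that cannot accept data right now.
    raise ConnectionError("downstream unavailable")


buf = DiskBuffer(os.path.join(tempfile.mkdtemp(), "buffer.jsonl"))
for i in range(3):
    buf.append({"n": i})

sent_while_down = buf.drain(downstream_unavailable)
received = []
sent_after_recovery = buf.drain(received.append)
```

The point is only that a stalled output leaves the records on disk instead of forcing another tier of receivers; a real shipper would also need size limits, fsync policy and rotation.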




Believe me, I wanted Heka to work. We're a big Go shop and I understood Heka better than any of the competing log shippers. We got very deep into Heka and ultimately, it just couldn't do the job. It performed very poorly when there was any sort of bottleneck into Elasticsearch. It would enter an unrecoverable state when it got choked up. We spent a lot of time on the Heka IRC channel talking to the devs. These were known issues. Some of the Heka team are/were working on a replacement written in C. It sounded promising, but the ES output plugin did not exist yet, so we couldn't use it.

The need for a performant and powerful log shipper is still there. I hope to see some new options come around soon that can achieve 1MM+ lines/sec from a single daemon without requiring multiple tiers, receivers, etc.


For the last year we have been working on a lightweight log shipper called Fluent Bit[0]. It was originally made for Embedded Linux, but it is now taking its place in common environments.

It's pretty similar to Fluentd in architecture. Some features are:

- Event-Driven (async network I/O).

- Input / Output plugins.

- Data routing based on Tags.

- Optional SSL/TLS for networking operations when required.
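As a rough illustration of what "data routing based on Tags" means, here is a minimal sketch in Python. This is not Fluent Bit's API (the project itself is in C); the `Router` class, its method names and the use of glob-style patterns are assumptions made only for the example.

```python
from fnmatch import fnmatchcase


class Router:
    """Hypothetical router: delivers a tagged record to every matching output."""

    def __init__(self):
        self._routes = []  # list of (pattern, output callable)

    def add_output(self, pattern, output):
        self._routes.append((pattern, output))

    def emit(self, tag, record):
        delivered = 0
        for pattern, output in self._routes:
            # Glob-style matching on the tag is an assumption of this sketch.
            if fnmatchcase(tag, pattern):
                output(tag, record)
                delivered += 1
        return delivered


es_sink, debug_sink = [], []
router = Router()
router.add_output("app.*", lambda tag, rec: es_sink.append((tag, rec)))
router.add_output("*", lambda tag, rec: debug_sink.append((tag, rec)))

router.emit("app.web", {"msg": "GET /"})
router.emit("cpu", {"usage": 0.42})
```

Here the `app.web` record reaches both sinks, while the `cpu` record only reaches the catch-all one.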

The next major version, 0.9, will come with buffering support (memory/file system). Oh, and it's written entirely in C.

[0] http://fluentbit.io

http://fluentbit.io/documentation/0.8

http://github.com/fluent/fluent-bit


I couldn't find a link to your git repo on the website, btw (repo: https://github.com/fluent/fluent-bit).

I see that your input/output plugins are written in C[0]. I'm guessing this is because of the constraints of the embedded environment, but it really doesn't seem like it would be worth it in a normal one. The Lua sandbox model (e.g. Heka's) just seems highly preferable.

My main problem with Logstash/Fluentd is precisely the fragility and non-robustness of the plugin system.

[0] https://github.com/fluent/fluent-bit/blob/master/plugins/out...


The whole project is in C. Lua support is planned for future versions to help filter/modify records; more news about it in the coming weeks ;)

The reasons for "why C" are flexibility, performance and adaptability (note that it was originally designed for Embedded Linux targets, but it is now going everywhere). To make things easier for output plugins, every time a set of records needs to be flushed through an output plugin, a co-routine is created so the plugin can yield/resume at any time. For example, out_http, out_es and out_forward rely on network I/O; having an event loop and an associated co-routine simplifies plugin development and state management: connect, write, read, etc. This model is the foundation that will allow the next step, integrating scripting, to go more smoothly. For environments without co-routine support (old compilers), a POSIX thread model exists.
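To give an idea of the shape of this model, here is a small sketch using Python generators as stand-ins for co-routines. It is not the Fluent Bit code (which is C); `flush_records` and `run_event_loop` are hypothetical names, and the round-robin scheduler only imitates an event loop that would normally resume a co-routine when its socket becomes ready.

```python
from collections import deque


def flush_records(name, records, log):
    """Hypothetical output flush: one co-routine per batch of records."""
    log.append(f"{name}: connect")
    yield "wait_connect"  # hand control back until the connection is ready
    for rec in records:
        log.append(f"{name}: write {rec}")
        yield "wait_write"  # yield while the write would block
    log.append(f"{name}: read response")
    yield "wait_read"
    log.append(f"{name}: done")


def run_event_loop(coroutines):
    """Round-robin scheduler standing in for a real I/O event loop."""
    ready = deque(coroutines)
    while ready:
        co = ready.popleft()
        try:
            next(co)          # resume until the next yield point
            ready.append(co)  # still has work to do, schedule it again
        except StopIteration:
            pass              # flush finished


log = []
run_event_loop([
    flush_records("out_http", ["a", "b"], log),
    flush_records("out_es", ["c"], log),
])
```

Each plugin is written as straight-line connect/write/read code, while the loop interleaves the two flushes; that is the simplification in state management mentioned above.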

What are the specific "fragility"/concerns you see in Fluentd plugin model?


Same for us: Fluentd was very resource hungry, and Heka brought down utilization significantly. Regarding the buffering, we just added Kafka in between for Graylog/ES scaling/maintenance.



