
It is as complicated as you want or need it to be. You can avoid any magic and stick to a subset that is easy to reason about and brings the most value in your context.

For our team, it is very simple:

* we use a library to send traces, and traces only[0]. They bring the most value for observing applications and can contain all the data the other signal types can contain. Basically hash-maps vs strings and floats.

* we use manual instrumentation as opposed to automatic - we are deliberate in what we observe and have a good understanding of what emits the spans. We have naming conventions that match our code organization (a rough sketch of what this looks like is below).

* we use two different backends - an affordable 3rd party service, and, for local development, an all-in-one Jaeger install (just run one executable or Docker container) that doesn't save the spans to disk. The second is mostly for the peace of mind of team members, so they know they're not going to flood the third-party service.

[0] We have a previous setup to monitor infrastructure, and in our case we don't see a lot of value in ingesting all the infrastructure logs and metrics. I think it is early days for OTel metrics and logs, but the vendors don't tell you this.
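To make the manual-instrumentation point concrete, here's a minimal sketch using the OpenTelemetry JS API. The tracer, span, and attribute names are made up for illustration - the point is just that every span is created explicitly in code you own, with names mirroring your module layout:

  import { trace, SpanStatusCode } from '@opentelemetry/api';

  // Tracer name mirrors the module it lives in (illustrative convention).
  const tracer = trace.getTracer('billing');

  export async function chargeCustomer(customerId: string): Promise<void> {
    // startActiveSpan makes this span current, so spans created inside
    // the callback become its children.
    await tracer.startActiveSpan('billing.chargeCustomer', async (span) => {
      try {
        span.setAttribute('customer.id', customerId);
        // ... the actual work would go here ...
      } catch (err) {
        span.recordException(err as Error);
        span.setStatus({ code: SpanStatusCode.ERROR });
        throw err;
      } finally {
        span.end();
      }
    });
  }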



It's as complicated as you want, but it's not as easy as I want. The floor is pretty high.

I'm still looking for an endpoint I can just send simple one-off metrics to, from parts of the infrastructure that aren't scrapable.


You can just send metrics via JSON to any otlphttp collector: https://github.com/open-telemetry/opentelemetry-proto/blob/v...
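For the "simple one-off metric" case, that looks roughly like the sketch below. It assumes a Collector listening on the default OTLP/HTTP port (4318); the service name, metric name, and value are placeholders, and the payload shape follows the OTLP JSON encoding from that proto:

  // Node 18+ (global fetch). Pushes one hypothetical gauge reading to a
  // local Collector; endpoint and metric name are placeholders.
  async function pushQueueDepth(depth: number): Promise<void> {
    const payload = {
      resourceMetrics: [{
        resource: {
          attributes: [{ key: 'service.name', value: { stringValue: 'cron-job' } }],
        },
        scopeMetrics: [{
          scope: { name: 'manual' },
          metrics: [{
            name: 'queue_depth',
            gauge: {
              dataPoints: [{
                asDouble: depth,
                timeUnixNano: String(BigInt(Date.now()) * 1_000_000n),
              }],
            },
          }],
        }],
      }],
    };

    const res = await fetch('http://localhost:4318/v1/metrics', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    });
    if (!res.ok) throw new Error(`collector rejected metric: ${res.status}`);
  }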


Shame none of this comes up whenever I search for it!

Top Google result, for me, for 'send metrics to otel' is https://opentelemetry.io/docs/specs/otel/metrics/. If I go through to the Language APIs & SDKs, I get a whole bunch more useless junk: https://opentelemetry.io/docs/languages/js/

Compare to the InfluxDB "send data" getting started https://docs.influxdata.com/influxdb/cloud/api-guide/client-... which gives you exactly that in a few lines.


There's an excellent article on how to implement OpenTelemetry Tracing in 200 lines of code.

https://jeremymorrell.dev/blog/minimal-js-tracing/

"It might help to go over a non-exhaustive list of things the offical SDK handles that our little learning library doesn’t:

- Buffer and batch outgoing telemetry data in a more efficient format. Don’t send one-span-per-http request in production. Your vendor will want to have words."

- Gracefully handle errors, wrap this library around your core functionality at your own peril"

You can solve these yourself, of course - if you can.
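For reference, the batching piece is what the official JS SDK's batch span processor does for you. A minimal sketch, assuming a local Collector on the default OTLP/HTTP port; when you hand NodeSDK a traceExporter it wraps it in a BatchSpanProcessor, so spans are buffered and flushed in batches instead of one HTTP request per span:

  import { NodeSDK } from '@opentelemetry/sdk-node';
  import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

  // Spans are buffered and exported in batches by default.
  const sdk = new NodeSDK({
    traceExporter: new OTLPTraceExporter({
      url: 'http://localhost:4318/v1/traces', // assumes a local Collector
    }),
  });
  sdk.start();

  // Flush whatever is still buffered before the process exits.
  process.on('SIGTERM', () => {
    sdk.shutdown().catch(console.error).finally(() => process.exit(0));
  });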


Maybe the confusion here is in comparing different things.

The InfluxData docs you're linking to are similar to Observability vendor docs, which do indeed amount to "here's the endpoint, plug it in here, add this API key, tada".

But OpenTelemetry isn't an observability vendor. You can send to an OpenTelemetry Collector (and the act of sending is simple), but you also need to stand that thing up and run it yourself. There are a lot of good reasons to do that, but if you don't need to run infrastructure right now, it's a lot simpler to just send directly to a backend.
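To be fair, "send directly to a backend" really is just endpoint plus API key at the exporter level - something like this sketch, where the vendor URL and header name are placeholders, not any particular vendor's real values:

  import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';

  // Placeholder vendor endpoint and auth header - your vendor's docs
  // have the real values.
  const exporter = new OTLPTraceExporter({
    url: 'https://otlp.example-vendor.com/v1/traces',
    headers: { 'x-api-key': process.env.VENDOR_API_KEY ?? '' },
  });

  // Pointing at a self-hosted Collector instead is the same one-line change:
  //   url: 'http://localhost:4318/v1/traces' (and no auth header).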

Would it be more helpful if the docs on OTel spelled this out more clearly?


The problem is ecosystem-wide - the documentation starts at an 8/10 difficulty and is written for observability nerds, for whom easy things are hard and hard things are slightly harder.

I understand the role that all the different parts of OTel play in the ecosystem vs InfluxDB, but if you pay attention to that InfluxDB documentation page, it starts off with the easiest thing (here's how you manually send one metric) and then ramps up the capabilities and functionality from there. The OTel docs slam you straight into "here's a complete observability stack for logs, metrics, and traces for your whole k8s deployment".


The equivalent page for this is the Get Started pages in OTel, e.g. https://opentelemetry.io/docs/languages/js/getting-started/n...

However, since OTel is not a backend, there's no pluggable endpoint + API key you can just start sending to. Since you were comparing the relative difficulties of sending data to a backend, that's why I responded in kind.

I do agree that it's more complicated, there's no argument there. And the docs have a very long way to go to highlight easier ways to do things and ramp up in complexity. There's also a lot more to document, since OTel is for a wider audience of people, many of whom have different priorities. A group not talked about much in this thread is ops folks, who are more concerned with getting a base level of instrumentation across a fleet of services, normalizing that data centrally, pulling in from external sources, and making sure all the right keys for common fields are named the right way. OTel has robust tools for (and must document) these use cases as well. And since most of us who work on it do so in our spare time, or in a part-time capacity at work, it's difficult to cover it all.


https://github.com/openobserve/openobserve, more or less.

The first time it takes 5 minutes to set up locally; from then on you just run the command in a separate terminal tab (or a Docker container - they have an image too).


I did not find that manual instrumentation made things simpler. You’re trading a learning curve that now starts way before you can demonstrate results for a clearer understanding of the performance penalties of using this Rube Goldberg machine.

Otel may be okay for a green field project but turning this thing on in a production service that already had telemetry felt like replacing a tire on a moving vehicle.


I've not used otel for anything not greenfield, but I just wanted to say

> felt like replacing a tire on a moving vehicle.

Some people do this as a joke / dare. I mean literally replacing a car tire on a moving vehicle.

You Saudi drift up onto one side, and have people climb out of the side in the air, and then swap the tire while the car is driving on two wheels.

It's pretty insane stuff: https://youtu.be/Str7m8xV7W8?si=KkjBh6OvFoD0HGoh


That was the image I had in my head.

My whole career I’ve been watching people on greenfield projects looking down on devs on already successful products for not using some tool they’ve discovered, missing the fact that their tool only functions if you build your whole product around the exact mental model of the tool (green field).

Wisdom is learning to watch for people obviously working on brownfield projects espousing a tool. Like moving from VMs to Docker. Ansible to Kubernetes (maybe not the best example). They can have a faster adoption cycle and more staying power.


SAS Institute used that exact same analogy & even this video in their talk about implementing ScyllaDB back in 2020 (check out 0:35 in the video):

https://www.scylladb.com/2020/05/28/sas-institute-changing-a...

Seems like moving to OTel might even be a bit more complex for some brownfield folks.


Mind sharing what that affordable 3rd party service is?


Honeycomb


Very sane advice. Most folks will already have something for metrics and logs and unless there's ROI on changing it out, why bother?


>You can avoid any magic and stick to a subset...

... if (and only if) all the libraries you use also stick to that subset, yea. That is overwhelmingly not true in my experience. And the article shows a nice concrete example of why.

For green-field projects which use nothing but otel and no non-otel frameworks, yea. I can believe it's nice. But I definitely do not live in that world yet.



