For anyone interested in learning how to use Open Telemetry for distributed trac...

hendry · on May 27, 2021

The amount of "otel" instrumentation required in your code, measured in LOC, looks ridiculous to me.

Isn't it a better approach to just use logs?

e.g. defer ctx.WithField("path", path).Trace("opening").Stop(&err)

from https://medium.com/@tjholowaychuk/apex-log-e8d9627f4a9a
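The one-liner above works because the deferred call receives a pointer to the function's named error result, so it can log the final outcome and the elapsed time. A minimal standard-library sketch of that pattern is below; `traceEntry`, `trace`, `stop`, and `open` are hypothetical names chosen for illustration, and this is not the apex/log implementation.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// traceEntry is a hypothetical stand-in for a log entry that remembers
// when an operation started.
type traceEntry struct {
	msg    string
	fields map[string]any
	start  time.Time
	logf   func(format string, args ...any) // log sink, swappable in tests
}

// trace logs that the operation began and returns an entry to stop later.
func trace(logf func(string, ...any), msg string, fields map[string]any) *traceEntry {
	logf("level=info msg=%q fields=%v", msg, fields)
	return &traceEntry{msg: msg, fields: fields, start: time.Now(), logf: logf}
}

// stop logs the elapsed time, plus the error if *err is non-nil.
// Taking a pointer lets a deferred call observe the function's final error.
func (e *traceEntry) stop(err *error) {
	d := time.Since(e.start)
	if err != nil && *err != nil {
		e.logf("level=error msg=%q fields=%v duration=%s error=%q", e.msg, e.fields, d, (*err).Error())
		return
	}
	e.logf("level=info msg=%q fields=%v duration=%s", e.msg, e.fields, d)
}

// open is example business logic: one deferred line covers start, end,
// duration, and error reporting.
func open(logf func(string, ...any), path string) (err error) {
	defer trace(logf, "opening", map[string]any{"path": path}).stop(&err)
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	return f.Close()
}

func main() {
	logf := func(format string, args ...any) { fmt.Printf(format+"\n", args...) }
	_ = open(logf, "/definitely/missing/file")
}
```

Note that `trace(...)` runs immediately at the `defer` statement, while only `stop` is postponed until the function returns.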

chrisz4 · on May 28, 2021

Once set up, I found that the amount of code required to create a simple trace via otel and via logging is roughly the same.

Refer to: https://github.com/michaelperel/otel-demo/blob/master/cmd/se...

51: Start a span. Equivalent to one log line.

60: Add an event. Equivalent to one log line.

69: Set an attribute. This retroactively adds the attribute to the whole span, from its start. Logs don't have exactly the same effect, but one log line can be used here.

74: RecordError. Equivalent to one log line.
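To illustrate the four operations listed above, here is a toy span type built only on the standard library. All names (`span`, `startSpan`, `addEvent`, `setAttribute`, `recordError`, `finish`) are hypothetical and this is not the OpenTelemetry Go API; it only sketches the idea that attributes belong to the span as a whole while events are timestamped points inside it.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// event is a timestamped point inside a span.
type event struct {
	name string
	at   time.Time
}

// span is a simplified, hypothetical span: a named time range with
// span-wide attributes and a list of timestamped events.
type span struct {
	name   string
	start  time.Time
	end    time.Time
	attrs  map[string]string
	events []event
}

// startSpan begins a span (compare: one "started" log line).
func startSpan(name string) *span {
	return &span{name: name, start: time.Now(), attrs: map[string]string{}}
}

// addEvent records something that happened at this instant.
func (s *span) addEvent(name string) {
	s.events = append(s.events, event{name: name, at: time.Now()})
}

// setAttribute attaches a key/value to the whole span, no matter how
// late in the span it is called.
func (s *span) setAttribute(key, value string) {
	s.attrs[key] = value
}

// recordError stores a non-nil error as one more event in this toy model.
func (s *span) recordError(err error) {
	if err == nil {
		return
	}
	s.addEvent("exception: " + err.Error())
}

// finish closes the span and returns its duration.
func (s *span) finish() time.Duration {
	s.end = time.Now()
	return s.end.Sub(s.start)
}

func main() {
	s := startSpan("handle-request")
	s.addEvent("cache miss")
	s.setAttribute("user.id", "42")
	s.recordError(errors.New("backend timeout"))
	d := s.finish()
	fmt.Printf("%s took %s attrs=%v events=%d\n", s.name, d, s.attrs, len(s.events))
}
```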

I haven't yet compared the amount of code needed to set up a proper logger connected to the right infrastructure with the amount of code needed to set up otel. Still, I don't think it's going to make much difference.

In general, I wouldn't mind either approach as long as I get a great visualizer. The main reason I would choose OpenTelemetry is that I get trace visualizers for free and can switch to a better visualizer any time I want.

aseipp · on May 27, 2021

Your counterexample is misleading, because a single simplistic function call is what happens in most of the code, e.g. creating spans for an arbitrary point in the code is like, 1 function call[1] after setup. I don't see anything egregious here. Of course, you also have to record where the span ends, but that's part of the game you're playing when you move beyond logs. You have to record that.

What you're probably looking at is all the boilerplate setup, e.g. configuring the provider and backend to point to the right stuff. It reminds me of SLF4J, which isn't actually an insult. It's just boilerplate, because people want a lot out of their logging and tracing systems.

Demo applications like this are often easily misleading because there are only like 50 lines of "business logic" and 50 lines of tracing setup, so it makes it seem like the tracing is excessively hard. But those two things don't scale the same way. In a large application where tracing is really valuable, you'll have 100,000 lines of business logic, and still only 50 (or maybe like 100) lines of tracing setup, per application.[2] Actual usage at the call sites remains only a line or two in most cases, and easy to add as you need, where you need it, just like a logger.

It is also worth noting that in other ecosystems, e.g. when I played with tokio_trace, I found integrating tracing easy, even at the very start. So some of this definitely comes down to the "philosophy" of the client library.

[1] https://github.com/michaelperel/otel-demo/blob/master/cmd/cl...

[2] I guess if the 100,000 LoC running your business is split into 2000 microservices with 50 lines each, then yes, it may be excessive.

bassdropvroom · on May 27, 2021

OpenTelemetry is a lot more than that. It can handle traces across services.

hendry · on May 27, 2021

You can log and use a tool like AWS LogInsights to see the trace across services.

No code changes needed.

bassdropvroom · on May 27, 2021

Great, so now link each request across services, combined with the timings of each individual component within each service. LogInsights is for logging, not tracing. Logging is not a replacement for tracing.
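As a rough illustration of what linking requests across services involves, the sketch below propagates context using the W3C `traceparent` header format (`version-traceid-parentid-flags`), which OpenTelemetry supports through its propagators. Every span in a request shares the trace ID, and each hop mints a new span ID while remembering its parent. The type and function names are hypothetical, validation is simplified, and a real SDK handles this for you.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// spanContext holds the identifiers that tie spans together.
type spanContext struct {
	traceID string // 32 hex chars, shared by every span in the request
	spanID  string // 16 hex chars, unique to one span
	flags   string // 2 hex chars, e.g. "01" for sampled
}

// parseTraceparent reads a header value such as
// "00-<32 hex>-<16 hex>-01". Only version "00" is accepted here.
func parseTraceparent(h string) (spanContext, bool) {
	p := strings.Split(h, "-")
	if len(p) != 4 || p[0] != "00" || len(p[1]) != 32 || len(p[2]) != 16 || len(p[3]) != 2 {
		return spanContext{}, false
	}
	// Simplified check: all three fields must be valid hex.
	if _, err := hex.DecodeString(p[1] + p[2] + p[3]); err != nil {
		return spanContext{}, false
	}
	return spanContext{traceID: p[1], spanID: p[2], flags: p[3]}, true
}

// newSpanID returns 8 random bytes as 16 hex characters.
func newSpanID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// childOf starts a child span: same trace ID, fresh span ID. The parent's
// span ID is returned so the child can be stored with its parent link.
func childOf(parent spanContext) (child spanContext, parentSpanID string) {
	child = spanContext{traceID: parent.traceID, spanID: newSpanID(), flags: parent.flags}
	return child, parent.spanID
}

// header formats the context for the next outgoing request.
func (c spanContext) header() string {
	return "00-" + c.traceID + "-" + c.spanID + "-" + c.flags
}

func main() {
	incoming := "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
	parent, ok := parseTraceparent(incoming)
	if !ok {
		fmt.Println("invalid traceparent")
		return
	}
	child, parentID := childOf(parent)
	fmt.Println("trace:", child.traceID, "span:", child.spanID, "parent:", parentID)
	fmt.Println("outgoing traceparent:", child.header())
}
```

The parent link is what lets a backend rebuild the tree of calls and show each component's timing inside it, rather than a flat list of lines sharing one ID.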

hendry · on May 28, 2021

Link requests with X-Amzn-Trace-Id (https://docs.aws.amazon.com/elasticloadbalancing/latest/appl...), and yes, you will also have timings in the structured log.

I am effectively getting trace level insights from my logs. I must be doing something wrong!
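The log-based approach described here can be sketched as an HTTP middleware that writes one structured log line per request, carrying the `Root` field of the `X-Amzn-Trace-Id` header and the handler's duration. This is a minimal sketch using only the standard library; `rootTraceID`, `withTraceLog`, and the JSON field names are assumptions made for illustration, not part of any AWS library.

```go
package main

import (
	"encoding/json"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"strings"
	"sync"
	"time"
)

// rootTraceID extracts the Root field from an X-Amzn-Trace-Id value such
// as "Self=1-...;Root=1-...". It returns "" if no Root field is present.
func rootTraceID(header string) string {
	for _, part := range strings.Split(header, ";") {
		k, v, ok := strings.Cut(strings.TrimSpace(part), "=")
		if ok && k == "Root" {
			return v
		}
	}
	return ""
}

// withTraceLog wraps a handler and writes one JSON log line per request
// containing the trace ID and how long the handler took.
func withTraceLog(service string, out io.Writer, next http.Handler) http.Handler {
	var mu sync.Mutex // serializes writes from concurrent requests
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next.ServeHTTP(w, r)
		line, err := json.Marshal(map[string]any{
			"service":     service,
			"path":        r.URL.Path,
			"trace_id":    rootTraceID(r.Header.Get("X-Amzn-Trace-Id")),
			"duration_ms": time.Since(start).Milliseconds(),
		})
		if err != nil {
			return
		}
		mu.Lock()
		defer mu.Unlock()
		out.Write(append(line, '\n'))
	})
}

func main() {
	h := withTraceLog("checkout", os.Stdout, http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {
			w.WriteHeader(http.StatusNoContent)
		}))
	// Simulate a request that arrived through a load balancer.
	req := httptest.NewRequest(http.MethodGet, "/pay", nil)
	req.Header.Set("X-Amzn-Trace-Id", "Root=1-67891233-abcdef012345678912345678")
	h.ServeHTTP(httptest.NewRecorder(), req)
}
```

Querying the logs for a single `trace_id` then returns every service's line for that request. What this sketch does not record is which call was the parent of which.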

avinassh · on May 27, 2021

How many changes do I need to make if I want to add OpenTelemetry to my existing application?