Ideally, you should be using distributed tracing to trace requests through your system, but Kafka decouples producers and consumers, which means there are no direct transactions to trace between them. Kafka also uses asynchronous processes, which have implicit, not explicit, dependencies. That makes it challenging to understand how your microservices are working together.
However, it is possible to monitor your Kafka clusters with distributed tracing and OpenTelemetry. You can then analyze and visualize your traces in an open-source distributed tracing tool like Jaeger or a full observability platform like New Relic. In this post, I will leverage a simple application to show how you can achieve this.
OpenTelemetry typically comes in two flavors. When I talk about these flavors, I like to use a cake analogy: you can either buy a ready-made cake and enjoy it, or buy all the ingredients and bake the cake yourself. With OpenTelemetry, the approach is very similar, and the flavors are:

- Zero-code (automatic) instrumentation, where an agent instruments your application for you without any changes to your source code.
- Manual (code-based) instrumentation, where you use the OpenTelemetry SDK and API in your own code.
The sample application (available in this public GitHub repository) that I am using in this blog is based on this high-level architecture:
It contains these components:

- A Kafka producer service (kafka-java-producer) that publishes messages to a Kafka topic.
- A Kafka consumer service that picks up those messages.
- A downstream service that the consumer calls.
- The Kafka broker that sits between producer and consumer.
Let's start with zero-code instrumentation, aka automatic instrumentation.
Each of the different services contains a `run.sh` script to get the service up and running.
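Stripped down to its essentials, the script looks something like this (a minimal sketch; the agent path, OTLP endpoint, and license key are placeholders rather than the repository's exact values):

```bash
#!/bin/bash
# Attach the OpenTelemetry Java agent to the JVM (agent path is a placeholder)
export JAVA_TOOL_OPTIONS="-javaagent:./opentelemetry-javaagent.jar"

# Export each telemetry signal via OTLP
export OTEL_TRACES_EXPORTER="otlp"
export OTEL_METRICS_EXPORTER="otlp"
export OTEL_LOGS_EXPORTER="otlp"

# Service identity, OTLP endpoint, and New Relic license key header (placeholder values)
export OTEL_SERVICE_NAME="kafka-java-producer"
export OTEL_EXPORTER_OTLP_ENDPOINT="https://otlp.nr-data.net:4317"
export OTEL_EXPORTER_OTLP_HEADERS="api-key=<YOUR_NEW_RELIC_LICENSE_KEY>"

# Start the service (the repository may start it differently)
./mvnw spring-boot:run
```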
The key line is the JAVA_TOOL_OPTIONS export, where the `-javaagent` flag points to the location of the OpenTelemetry Java agent.
The three OTEL_*_EXPORTER lines configure how we want to handle the different telemetry signals. In our case, I define traces, metrics, and logs to be exported via the OpenTelemetry Protocol (OTLP).
There are three additional environment variables that are quite important to configure: OTEL_SERVICE_NAME, which sets the name the service reports under; OTEL_EXPORTER_OTLP_ENDPOINT, which defines where the OTLP data is sent (here, the New Relic OTLP endpoint); and OTEL_EXPORTER_OTLP_HEADERS, which carries the api-key header with the New Relic license key.
This is basically all we need to configure. Everything else is handled by the OpenTelemetry Java agent; there is no need to change anything in our source code.
Let's see what level of visibility into the services we can achieve from zero-code instrumentation.
When navigating to my New Relic account, I can see all of the services reporting in as separate entities.
Let's start by exploring the kafka-java-producer service.
The Summary view offers a great overview of all the most important telemetry and metrics I should be focusing on.
As part of this blog, I am mostly interested in the Distributed Tracing section, so let's dive deeper into this area.
Looking at a single trace lets me see in detail how long this specific trace took to execute and where the time was spent.
We also automatically draw an Entity map of all the different services involved in a given trace.
The area I want to draw your attention to is the trace and span breakdown. You can see how the trace is initiated on the producer, how the consumer then picks up the message, and how the consumer makes two separate calls to the downstream service.
What is interesting here is the span that says "Uninstrumented time". This is code in the consumer where the agent was not able to capture more detailed information about what is going on in its internal methods.
This already shows the limits of zero-code instrumentation. By default, the agent does not instrument every method in your source code; it stops, by design, at a certain level, so you don't get deeper visibility into your own code.
In the previous section, you saw how zero-code instrumentation has some limits when it comes to visibility into your application. This is exactly where manual instrumentation comes into play.
I have set up the same application, but this time no agent at all is attached when starting the application.
I simply use the Maven wrapper to run the application.
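Assuming the standard Spring Boot Maven plugin setup, that amounts to nothing more than:

```bash
# No OpenTelemetry Java agent and no JAVA_TOOL_OPTIONS; just start the app
./mvnw spring-boot:run
```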
The other configuration details then live in my application.properties file.
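A minimal sketch, assuming these property names and placeholder values (the exact entries in the repository may differ):

```properties
# Property names and values are illustrative placeholders
otel.service.name=kafka-java-consumer
otel.exporter.otlp.endpoint=https://otlp.nr-data.net:4317
otel.exporter.otlp.api-key=<YOUR_NEW_RELIC_LICENSE_KEY>
```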
These properties are then used in my Spring Boot application code to configure OpenTelemetry for traces, metrics, and logs.
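For illustration, here is a minimal sketch of how such a configuration bean could look for the traces signal (metrics and logs follow the same pattern with their own OTLP exporters). The class name, bean layout, and property names are assumptions, not necessarily the exact code from the repository:

```java
import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.api.common.AttributeKey;
import io.opentelemetry.api.common.Attributes;
import io.opentelemetry.api.trace.propagation.W3CTraceContextPropagator;
import io.opentelemetry.context.propagation.ContextPropagators;
import io.opentelemetry.exporter.otlp.trace.OtlpGrpcSpanExporter;
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.resources.Resource;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.BatchSpanProcessor;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class OpenTelemetryConfig {

    @Bean
    public OpenTelemetry openTelemetry(
            @Value("${otel.exporter.otlp.endpoint}") String endpoint,
            @Value("${otel.exporter.otlp.api-key}") String apiKey,
            @Value("${otel.service.name}") String serviceName) {

        // Identify the service so it shows up as its own entity
        Resource resource = Resource.getDefault().merge(
                Resource.create(Attributes.of(AttributeKey.stringKey("service.name"), serviceName)));

        // Export spans via OTLP over gRPC, authenticating with the license key header
        OtlpGrpcSpanExporter spanExporter = OtlpGrpcSpanExporter.builder()
                .setEndpoint(endpoint)
                .addHeader("api-key", apiKey)
                .build();

        SdkTracerProvider tracerProvider = SdkTracerProvider.builder()
                .setResource(resource)
                .addSpanProcessor(BatchSpanProcessor.builder(spanExporter).build())
                .build();

        // W3C trace context propagation keeps producer and consumer spans in one trace
        return OpenTelemetrySdk.builder()
                .setTracerProvider(tracerProvider)
                .setPropagators(ContextPropagators.create(W3CTraceContextPropagator.getInstance()))
                .build();
    }
}
```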
Before I jump into the details of how I implemented some manual instrumentation, let's have a look at the result first.
Do you notice how the span that was previously called out as "Uninstrumented time" now shows much more detailed information? I can now see additional spans for the consumer's internal methods.
The one that says "WhyTheHeckDoWeSleepHere" seems to be taking the most time. No wonder, as the name suggests.
Let's have a look at the source code to reveal the manual instrumentation I put in place.
In the method named ExecuteLongRunningTask, I have created a new span on the current tracer by using the spanBuilder() method.
In addition to that, you may also notice that, just for the fun of it, I created another span called "WhyTheHeckDoWeSleepHere" that contains an artificial unit of work, or rather a sleep instruction on the current thread.
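A minimal sketch of how this can look in code (the class name, the business-logic placeholder, and the sleep duration are illustrative assumptions; only ExecuteLongRunningTask and "WhyTheHeckDoWeSleepHere" come from the actual application):

```java
import io.opentelemetry.api.OpenTelemetry;
import io.opentelemetry.api.trace.Span;
import io.opentelemetry.api.trace.Tracer;
import io.opentelemetry.context.Scope;

public class LongRunningTaskService {

    private final Tracer tracer;

    public LongRunningTaskService(OpenTelemetry openTelemetry) {
        // Acquire a tracer from the manually configured OpenTelemetry instance
        this.tracer = openTelemetry.getTracer("kafka-java-consumer");
    }

    public void ExecuteLongRunningTask() throws InterruptedException {
        // Create a span for the whole task using spanBuilder()
        Span taskSpan = tracer.spanBuilder("ExecuteLongRunningTask").startSpan();
        try (Scope scope = taskSpan.makeCurrent()) {
            doSomeWork();

            // A second span wrapping an artificial unit of work: a sleep on the current thread
            Span sleepSpan = tracer.spanBuilder("WhyTheHeckDoWeSleepHere").startSpan();
            try (Scope sleepScope = sleepSpan.makeCurrent()) {
                Thread.sleep(2000); // assumed duration, purely for demonstration
            } finally {
                sleepSpan.end();
            }
        } finally {
            taskSpan.end();
        }
    }

    private void doSomeWork() {
        // Placeholder for the actual business logic
    }
}
```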
Leveraging the OpenTelemetry SDK in this way allows me to get much more specific insights into my application and source code. But, as you can imagine, it also comes with the caveat that I need to add dependencies and custom code to my source code.
I hope I was able to show you how easy it can be to leverage OpenTelemetry to get insights into your application and services. We looked at zero-code instrumentation, which lets you get started without any code changes, but the level of detail may be limited. We then looked at manual instrumentation, which allowed us to be more specific and customize the instrumentation, but the effort to get started is a little higher.
I encourage you to have a look into OpenTelemetry and its fascinating capabilities. Let me know your thoughts and please get in touch if you have any questions or need further information.
Happy coding!