Splunk Inc.

11/08/2024 | News release | Archived content

Monitoring Splunk Enterprise Deployments with OpenTelemetry

Are you tired of flying blind when it comes to monitoring your Splunk Enterprise deployment? Do you wish there was an easier way to ensure that your indexing and search operations are running smoothly? Well, buckle up, because we're about to dive into how the Splunk Enterprise Receiver for OpenTelemetry can turn your monitoring woes into monitoring wows! (Okay, yeah - that was terrible.)

The Splunk Enterprise Receiver: An Overview

Monitoring your Splunk Enterprise deployment can be like herding cats without the right tools. Maintaining a historical perspective on metrics coming from within your on-premises Splunk Enterprise environments can be especially daunting. In many cases this requires relying on the built-in Monitoring Console, a separate instance of Splunk to monitor your Splunk ("yo dawg…"), or a home-rolled solution polling the APIs and running custom searches. Enter the Splunk Enterprise receiver, a superhero written by a great couple of contributors to the OpenTelemetry Collector Contributions repository. This receiver swoops in to collect and report valuable metrics and attributes (indexer queuing and K-V store status, anyone?), giving you a crystal-clear view of your Splunk environment.

The receiver retrieves many of the same readings provided in the Splunk Monitoring Console and sends them as OTLP-compatible time series metrics, so they can be visualized in your favorite OpenTelemetry-compatible observability backend. Which must be Splunk Observability Cloud (right?). Having these metrics available alongside the metrics for the applications that are likely sending their logs to that Splunk Enterprise deployment can help you more quickly correlate increases in log ingestion or indexing with log creation following issues with applications in production. Additionally, in the unlikely event your Splunk Enterprise instances were to go down completely, the historical time series data will be invaluable while troubleshooting what happened just before the failure.

The receiver has just been released into alpha status and currently provides details on indexers, search scheduling, queuing, and the K-V store, with much more to come! We've even begun dogfooding the receiver internally at Splunk to help introspect our own Enterprise instances. Let's take a look at some of the current data the Splunk Enterprise receiver provides and how you can get value out of it!

Key Metrics and Attributes

The Splunk Enterprise receiver serves up a bonanza of metrics and attributes that cover different aspects of your Splunk deployment. Here are some of the key ones:

Metrics:

  • splunk.license.index.usage: Tracks the indexed license usage per index. No more surprises - a constant running total of index license usage can keep you out of the dark when it comes to provisioning!
  • splunk.scheduler.avg.execution.latency: Measures the average execution latency of scheduled searches. Check in on some of those scheduled searches that are not yet speedy Splunk admin approved.
  • splunk.scheduler.completion.ratio: Calculates the ratio of completed to skipped scheduled searches. Know when searches fail, and start investigating why those failures suddenly began increasing.
  • splunk.indexer.avg.rate: Monitors the average rate of indexed data. Splunk admins know that the data must flow! Keeping tabs on indexer rates lets you know when that flow may be impeded.
  • splunk.parse.queue.ratio, splunk.typing.queue.ratio, splunk.aggregation.queue.ratio: Track your queues and keep an eye on where data may be backing up. If your typing queue is backing up, for instance, you may be having trouble with regex or other transforms.
  • splunk.kvstore.status, splunk.kvstore.replication.status, splunk.kvstore.backup.status: The kvstore is the heart of how you use your data in Splunk. These metrics help you determine at a glance if and when issues may have started.

Attributes:

  • splunk.host: Name of the Splunk host.
  • splunk.index.name: Name of the index reporting a specific KPI.
  • splunk.indexer.status: Status message reported for a specific object.
  • splunk.indexer.searchable: Searchability status reported for a specific object.
  • splunk.bucket.dir: The bucket super-directory (home, cold, thawed) for each index.

Figure 1.1 Splunk Enterprise metrics can be viewed as OpenTelemetry-compatible metrics in your favorite observability tool. This places your Splunk monitoring data alongside the monitoring data for applications sending their logs to Splunk.

Why IT Operations, DevOps, and Software Development Teams Should Care

  • Splunk Administrators: As a Splunk admin, you live and breathe data. The Splunk Enterprise receiver is your new best friend, providing a treasure trove of metrics and attributes to help you keep your deployment running like a well-oiled machine. By monitoring key aspects such as license usage, index rates, and search latencies, you'll be able to spot issues before they become problems, optimize performance, and ensure everything is running smoothly. It's like having a magic dashboard that shows you everything you need to know!
  • IT Operations / Support Analysts: As a member of the IT Operations team, using the Splunk Enterprise receiver allows you to proactively monitor the health and performance of your Splunk deployment. This helps in quickly identifying and resolving issues, ensuring smooth operations. Less firefighting, more high-fiving!
  • DevOps / SRE: For DevOps and SRE teams, integrating the Splunk Enterprise receiver with OpenTelemetry provides valuable metrics that aid in maintaining system reliability and performance. By monitoring key metrics, you can optimize your deployment and prevent potential bottlenecks. It's like having a crystal ball for your logging infrastructure.
  • Software Developers: Developers can benefit from the insights provided by the Splunk Enterprise receiver to ensure that their applications are logging and indexing data efficiently. This helps in maintaining application performance and reliability. Code more. Worry less!

Practical Usage and Integration

Integrating the Splunk Enterprise receiver into your OpenTelemetry setup is easy. Simply define the receiver along with a basicauth extension holding your username and password, then add the receiver to your metrics pipeline and the extension to your service. Once the receiver is configured, you can start collecting and visualizing metrics in your monitoring tools. This provides a centralized view of your Splunk deployment's performance, making it easier to manage and optimize. To see just how easy the receiver is to use, check out this sample configuration:

```
extensions:
  basicauth/indexer:
    client_auth:
      username: admin
      password: securityFirst
  basicauth/cluster_master:
    client_auth:
      username: admin
      password: securityFirst

receivers:
  splunkenterprise:
    indexer:
      auth:
        authenticator: basicauth/indexer
      endpoint: "https://localhost:8089"
      timeout: 45s
    cluster_master:
      auth:
        authenticator: basicauth/cluster_master
      endpoint: "https://localhost:8089"
      timeout: 45s

exporters:
  debug:
    verbosity: basic

service:
  extensions: [basicauth/indexer, basicauth/cluster_master]
  pipelines:
    metrics:
      receivers: [splunkenterprise]
      exporters: [debug]
```
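
The sample above prints metrics to the Collector's own output through the debug exporter and keeps credentials inline. As a rough sketch of what a next step might look like, the variant below reads the credentials from environment variables using the Collector's `${env:...}` substitution and forwards the metrics over OTLP instead. The environment variable names and the exporter endpoint are placeholders invented for illustration, not values from the receiver's documentation, so swap in whatever your own environment and backend expect.

```yaml
extensions:
  basicauth/indexer:
    client_auth:
      # Hypothetical variable names; set these in the Collector's environment.
      username: ${env:SPLUNK_INDEXER_USERNAME}
      password: ${env:SPLUNK_INDEXER_PASSWORD}
  basicauth/cluster_master:
    client_auth:
      username: ${env:SPLUNK_CM_USERNAME}
      password: ${env:SPLUNK_CM_PASSWORD}

receivers:
  splunkenterprise:
    indexer:
      auth:
        authenticator: basicauth/indexer
      endpoint: "https://localhost:8089"
      timeout: 45s
    cluster_master:
      auth:
        authenticator: basicauth/cluster_master
      endpoint: "https://localhost:8089"
      timeout: 45s

exporters:
  otlp:
    # Placeholder address for an OTLP/gRPC-capable backend or gateway Collector.
    endpoint: "otel-gateway.example.com:4317"

service:
  extensions: [basicauth/indexer, basicauth/cluster_master]
  pipelines:
    metrics:
      receivers: [splunkenterprise]
      exporters: [otlp]
```

Keeping the debug exporter in the pipeline next to the OTLP exporter while you test can be handy, since it lets you confirm the receiver is producing metrics before you go looking for them in your backend.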

For a detailed guide on setting up the Splunk Enterprise Receiver - or if you'd like to contribute! - visit the OpenTelemetry-Collector-Contrib GitHub repository.

Next Steps

Ready to take your Splunk Enterprise monitoring to the next level and start combining your Splunk Enterprise monitoring with the monitoring of applications sending logs to it? You can sign up to start a free trial of the Splunk Observability Cloud suite of products today!

Monitoring your Splunk Enterprise deployment doesn't have to be a Herculean task. By leveraging the Splunk Enterprise receiver from OpenTelemetry, you gain access to a treasure trove of metrics and attributes that provide deep insights into your system's performance. Whether you're part of the IT Operations, DevOps, or Software Development team, this receiver can help you ensure that your deployment is running smoothly and efficiently.

This blog post was authored by Jeremy Hicks (GitHub: Greatestusername) and Sam Halpern (GitHub: Shalper2), co-owners of the Splunk Enterprise Receiver for OpenTelemetry.