Here at groundcover, we're big fans of open source monitoring solutions, log analysis tools and performance optimization platforms, which are highly flexible and extensible. They’re also (usually) free of cost. All in all, it's hard not to love open source observability tools.
We know: That may sound a bit strange, given that not all of the software we build at groundcover is open source (although we do maintain a number of open source repositories). But the fact is that, even though open source isn't always the right way to meet your observability or other needs, it's often a good starting point.
With that fact in mind, we'd like to walk through the various open source observability tools on the market today. We'll also discuss how to choose between them – and when it makes sense to opt for a non-open source observability solution.
What is observability and why does it matter to IT operations?
Before diving into an analysis of open source observability tools, let's discuss what observability means in the first place and why it's so important.
As you probably know if you're an acolyte of IT buzzwords, observability is the interpretation of the internal state of a complex system based on its external outputs. In other words, when you observe an IT system – as opposed to simply monitoring it – you do more than just collect data. You also compare and correlate the various types of information available to you in an effort to gain a holistic understanding of what's happening deep inside the system.
The observability concept has been around for decades, but it didn't catch on in IT until about five years ago. At that time, the proliferation of complex, distributed systems – like microservices apps running atop sprawling Kubernetes clusters – called for ways of understanding the state of those systems that went beyond conventional monitoring, data analysis and application performance monitoring. That's why observability has become a key focus of modern IT operations.
Types of observability tools
Broadly speaking, observability tools fall into four main categories:
• Log management and analysis: Some observability platforms are designed to help you ingest, aggregate, analyze and manage logs, which store critical insights for application performance management.
• Metrics and visualizations: Metrics are another critical source of observability insights, and certain observability tools are designed to help you collect and visualize metrics data.
• Application-level tracing: In distributed, cloud-native apps, it's often helpful to trace how requests flow within the app to discover the root cause of performance issues. Some observability tools focus on this need.
• Kubernetes infrastructure monitoring: Because Kubernetes infrastructure is unique in certain key respects, some observability tools cater to Kubernetes monitoring specifically.
You can find observability platforms that attempt to cover multiple areas of functionality. But in the open source observability tool ecosystem, most solutions have a narrower focus. Embodying the old Unix mantra that every tool should "do one thing, and do it well," they address a specific observability need.
Features of observability tools
But just because open source observability tools tend to have narrow areas of focus doesn't mean they have little in common. On the contrary – from a feature perspective, most observability tools provide the same core types of functionality:
• Data collection: They let you ingest data from whichever sources will enable you to observe a system.
• Data analysis: They help you analyze the data you've collected to identify patterns or anomalies that can tip you off to performance problems or risks.
• Root cause analysis: Knowing that a performance problem exists is only half the battle, which is why most observability platforms also provide features designed to help pinpoint the root cause of an issue.
• Data management: Some observability tools offer features to help manage the data they ingest and store. For example, they might assist with log rotation or the archiving of data after you've completed analysis of it.
Where the various types of observability tools differ is in the use cases they support. They also tend to vary with regard to the types of data they can collect and analyze, but that's because different data sources align with different use cases.
Top open source observability tools
IT tool developers have responded to the observability craze by building a number of solutions, including several fantastic open source observability tools. Here's a look at what we consider to be the top four contenders in the open source observability market.
As you'll see, although these tools overlap a bit functionality-wise in some cases, each solution aligns with a different type of observability need. So rather than thinking of these as either-or open source observability tools, think of them as a set of tools that, when combined, can form the foundation for a modern observability strategy.
1 | ELK stack (or OpenSearch) for log analysis
The so-called ELK stack consists of three components:
- Elasticsearch, a distributed analytics engine that can run queries for a variety of use cases, including log analysis.
- Logstash, a data processing pipeline that can support virtually any type of data source.
- Kibana, which provides data visualization to help interpret complex sets of information.
By combining these tools together, you get a more or less open source tooling stack that lets you ingest log data, search it and visualize it using a unified set of tools. We say "more or less" because Elasticsearch and Kibana aren't officially open source at present; since 2021 they have been "open code" per Elastic, the company that maintains them. We won't get into the politics surrounding that choice of label or the differences between open source and open code, but you can read more about the status of the ELK stack and the debates it spurred within the open source community if you'd like. (Logstash is unequivocally open source, for the record.)
We'll also note that if you're not comfortable with the open code status of parts of the ELK stack, you might be interested in OpenSearch, which is a fully open source solution derived from the ELK stack. Again, there are some politics and history here that we won't get into, but suffice it to say that OpenSearch was basically created to give folks a 100 percent open source version of ELK.
Allow us to note, too, that there are variants on the ELK stack. For example, you can swap out Logstash for an alternative open source log collector, like Fluentd, in which case you'd have an EFK stack instead of an ELK stack.
Discussing the pros and cons of different log collectors is beyond the scope of this article, but we note the flexibility here because the ability to customize your ELK stack based on your tooling preferences is part of what makes ELK (or whatever acronym aligns with your tool choices) so powerful.
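To make the ingest-then-search idea a bit more concrete, here's a minimal Python sketch of the kind of work a log pipeline does before data becomes searchable. It doesn't use Logstash, Fluentd or Elasticsearch at all; the log line format, the field names and the parse_log_line and search helpers are all assumptions made up for illustration.

```python
import re
from datetime import datetime, timezone
from typing import List, Optional

# Hypothetical log line format, assumed for illustration only:
#   2024-05-01T12:00:03Z ERROR checkout payment gateway timeout after 30s
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+(?P<message>.*)$"
)


def parse_log_line(line: str) -> Optional[dict]:
    """Turn one raw log line into a structured document, or None if it can't be parsed."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    try:
        parsed = datetime.strptime(fields.pop("timestamp"), "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None
    # Store the timestamp as an ISO 8601 string in UTC.
    fields["@timestamp"] = parsed.replace(tzinfo=timezone.utc).isoformat()
    return fields


def search(docs: List[dict], level: str, text: str) -> List[dict]:
    """Naive in-memory search: match on log level and a substring of the message."""
    return [d for d in docs if d["level"] == level and text in d["message"]]


if __name__ == "__main__":
    raw_lines = [
        "2024-05-01T12:00:03Z ERROR checkout payment gateway timeout after 30s",
        "2024-05-01T12:00:04Z INFO cart item added",
        "not a structured line",
    ]
    parsed_docs = [parse_log_line(line) for line in raw_lines]
    docs = [d for d in parsed_docs if d is not None]
    for hit in search(docs, "ERROR", "timeout"):
        print(hit["service"], hit["@timestamp"], hit["message"])
```

A real stack replaces the regular expression with collector configuration and the list comprehension with an indexed search engine, but the shape of the work (parse, normalize, query) is roughly the same.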
2 | Prometheus and Grafana stack for metrics and performance optimization
The ELK stack is a great way to construct an open source solution for log analysis, but what about working with metrics that aren't stored in logs?
That's where Prometheus and Grafana come in. Prometheus is an open source monitoring tool that lets you collect time-series metrics from a variety of different applications and environments – including modern, cloud-native apps. Grafana provides visualization and analysis functionality so that you can make sense of your metrics data.
Prometheus and Grafana integrate with each other basically out-of-the-box, so it's very easy to feed metrics collected by Prometheus into Grafana in order to visualize and analyze them.
3 | OpenTelemetry and Jaeger stack for distributed tracing
Logs and metrics are two of the so-called pillars of observability. The third is distributed tracing, which lets you track how requests move between the components of a distributed application so that you can pinpoint the source of errors and slowdowns.
If you want to run distributed traces using open source observability tools, the go-to solutions are OpenTelemetry and Jaeger. OpenTelemetry is a set of APIs and SDKs that allow you to expose observability data from within an application in a standardized, efficient way, and Jaeger is a distributed tracing platform that helps you monitor and analyze how requests travel between the interdependent components of an application.
So, when you pair OpenTelemetry with Jaeger, you get a complete toolchain for collecting and analyzing traces within your cloud-native microservices apps.
Lest we leave readers with the impression that distributed tracing is the only thing OpenTelemetry is good for, we should note that it's not. You can use OpenTelemetry to collect virtually any type of observability data, not just traces, and connect it to a variety of tools, not just Jaeger. But distributed tracing is the use case you'd target if you chose to deploy OpenTelemetry and Jaeger together.
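If the idea of a trace still feels abstract, here's a minimal Python sketch of the underlying data model: spans that share a trace ID and point back to a parent span. It doesn't use the OpenTelemetry SDK or Jaeger. The Span class, the helper functions and the service names are hypothetical, and the self-time calculation assumes that child spans run one after another without overlapping.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    trace_id: str             # shared by every span in the same request
    span_id: str              # unique per operation
    parent_id: Optional[str]  # None for the root span
    name: str
    duration_ms: float


def new_root(name: str, duration_ms: float) -> Span:
    # 16 random bytes -> 32 hex chars for the trace ID, 8 bytes -> 16 for the span ID.
    return Span(secrets.token_hex(16), secrets.token_hex(8), None, name, duration_ms)


def child_of(parent: Span, name: str, duration_ms: float) -> Span:
    return Span(parent.trace_id, secrets.token_hex(8), parent.span_id, name, duration_ms)


def traceparent(span: Span) -> str:
    """Header value in the W3C Trace Context layout: version-traceid-spanid-flags."""
    return f"00-{span.trace_id}-{span.span_id}-01"


def self_times(spans: List[Span]) -> Dict[str, float]:
    """Time spent in each span, excluding its direct children (assumed non-overlapping)."""
    child_total: Dict[str, float] = {}
    for s in spans:
        if s.parent_id is not None:
            child_total[s.parent_id] = child_total.get(s.parent_id, 0.0) + s.duration_ms
    return {s.span_id: s.duration_ms - child_total.get(s.span_id, 0.0) for s in spans}


def slowest_span(spans: List[Span]) -> Span:
    times = self_times(spans)
    return max(spans, key=lambda s: times[s.span_id])


if __name__ == "__main__":
    root = new_root("GET /checkout", 120.0)
    cart = child_of(root, "cart-service", 30.0)
    pay = child_of(root, "payment-service", 80.0)
    db = child_of(pay, "db-query", 70.0)
    print("propagated header:", traceparent(pay))
    print("most time spent in:", slowest_span([root, cart, pay, db]).name)
```

Passing the trace ID along with each outgoing call is what lets a tracing backend stitch spans from separate services into a single request timeline.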
4 | OpenLens for Kubernetes infrastructure monitoring
It's possible to leverage some of the observability tools we've already discussed, such as the ELK stack and Prometheus, to monitor the infrastructure that powers your Kubernetes clusters. However, Kubernetes is its own special beast from an infrastructure perspective. It relies on a unique set of infrastructure components and concepts – control plane nodes, worker nodes, an API server, an etcd store and so on – and observing them with tools not designed specifically for Kubernetes doesn't always go as smoothly as one desires.
OpenLens was built with this need in mind. It lets you monitor the health of the various components of your Kubernetes infrastructure to ensure that your workloads have the resources they need to perform at their best.
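As an illustration of the kind of infrastructure health signal involved, here's a short Python sketch that inspects the conditions Kubernetes reports for a node. It isn't how OpenLens works internally. The node_problems helper is hypothetical, and the input is assumed to be a dictionary shaped like a Node object from the Kubernetes API (for example, one item from the output of kubectl get nodes -o json).

```python
from typing import List


def node_problems(node: dict) -> List[str]:
    """Return human-readable problems found in a node's reported conditions."""
    problems = []
    seen_ready = False
    for condition in node.get("status", {}).get("conditions", []):
        ctype = condition.get("type")
        status = condition.get("status")
        if ctype == "Ready":
            seen_ready = True
            # A healthy node reports Ready as "True".
            if status != "True":
                problems.append(f"Ready={status}")
        elif status != "False":
            # Pressure-style conditions should be "False" on a healthy node.
            problems.append(f"{ctype}={status}")
    if not seen_ready:
        problems.append("Ready condition missing")
    return problems


if __name__ == "__main__":
    # Sample data invented for this example.
    sample_node = {
        "metadata": {"name": "worker-1"},
        "status": {
            "conditions": [
                {"type": "MemoryPressure", "status": "True"},
                {"type": "DiskPressure", "status": "False"},
                {"type": "Ready", "status": "True"},
            ]
        },
    }
    print(sample_node["metadata"]["name"], node_problems(sample_node))
```

A Kubernetes-aware tool surfaces signals like these alongside resource usage, so you don't have to poll the API and interpret conditions yourself.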
How to choose the right observability platform – and why open source isn't always best
Now that we've told you which observability platform you should choose if you want to take a fully open source approach to observability, let's talk about why you should or shouldn't use the solutions we've just discussed, depending on your priorities.
The key benefit of open source observability tools is that they're typically free of direct cost because they don't usually require you to pay for a license or subscription. They also tend to be flexible and not tied to any particular vendor's ecosystem.
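That said, open source observability tools can also present some challenges, chiefly the time and expertise it takes to deploy and maintain them and the lack of dedicated support when something goes wrong.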
What this means is that you should choose open source observability tools if you have the in-house staff expertise and availability to handle the rough edges of open source solutions. Otherwise, choosing open source may mean trading one set of downsides (like paying for software) for another (like going insane when your software doesn't work and the only people available to help are randos on Stack Exchange who may or may not know what they're talking about).
Making the most of observability tools, no matter where they come from
If you've decided that open source observability tools are the right fit for your needs – either because you actually weighed the advice we gave you above, or because you treat everything Linus Torvalds says as an absolute truth that no one should ever question – you now know which tools to look for.
Alternatively, if you want an observability solution that is less of a hassle, scales with your needs and was built for modern apps, check out groundcover.