If eBPF were peanut butter, service meshes would most definitely not be jelly. But even though eBPF and service meshes don’t always go hand-in-hand, they can be excellent complements to each other under the right circumstances.
Keep reading to find out why as we unpack everything you need to know about eBPF and service meshes – including how each technology works, how they can be used together, and why you may (or may not) want to deploy an eBPF service mesh.
Introduction to eBPF and service mesh
Let’s get started by defining what eBPF and service meshes are.
What is eBPF?
eBPF is a technology built into the Linux kernel that makes it possible to run custom code in kernel space – which means the code runs as part of the operating system, rather than as a standard application (which runs in what’s known as user space). This ability makes eBPF programs both highly efficient and highly secure. It also allows them to access virtually any information available to the operating system, since they run directly within the kernel.
A complete discussion of how eBPF works and why you should be so excited about it is beyond the scope of this article (although you can read all about eBPF elsewhere on our website if you’d like). But suffice it to say that, in essence, eBPF offers a means of collecting performance and observability data in a hyper-efficient way. Instead of relying on traditional approaches to collecting observability data – such as agents running in sidecar containers – eBPF makes it possible to pull the data directly from Linux kernel space.
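To make that a bit more concrete, here’s a minimal, hypothetical sketch of what kernel-side eBPF code looks like – a libbpf-style C program that counts TCP connection attempts per process. The map name and attach point are our own illustrative choices, not code from any particular tool:

```c
// Illustrative eBPF program (libbpf-style C). Assumes the common
// vmlinux.h + bpf_helpers.h toolchain; all names here are hypothetical.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

// A hash map shared with user space: process ID -> connection attempts.
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, u32);
    __type(value, u64);
} connect_counts SEC(".maps");

// Attach to the kernel's tcp_connect function. Every outbound TCP
// connection attempt on the node passes through here – no per-Pod
// agent or sidecar required.
SEC("kprobe/tcp_connect")
int count_tcp_connects(struct pt_regs *ctx)
{
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 one = 1;

    u64 *count = bpf_map_lookup_elem(&connect_counts, &pid);
    if (count)
        __sync_fetch_and_add(count, 1);
    else
        bpf_map_update_elem(&connect_counts, &pid, &one, BPF_ANY);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

A user-space process can then read connect_counts periodically to turn those kernel-side counters into metrics – no agent ever sits in the data path.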
What is a service mesh?
A service mesh is a software tool that controls and monitors services within a distributed application. The main purpose of a service mesh is to help manage communications between the various microservices within modern, cloud-native applications. Service meshes can also help to manage microservice performance and observability by monitoring traffic flowing between microservices and balancing load across microservice instances. They can assist with security, too, by enforcing security rules that govern microservice communication.
Most service meshes include two main components: a control plane, which centrally monitors and manages microservice communications, and a data plane, which is the set of software agents or proxies that tell individual microservices what to do.
You don’t strictly need a service mesh to deploy a microservices app. If you wanted, you could include logic within your app that tells the microservices how to discover and communicate with each other on their own. You could also create logic to collect observability data directly for the app. However, using a service mesh allows you to outsource these and other critical tasks to a dedicated infrastructure layer, making it easier to deploy apps that include multiple services.
The eBPF-based service mesh model
Now, let’s talk about what eBPF and service meshes have to do with each other.
Sidecar proxy model
The first thing to know about eBPF-based service meshes is that they’re a relatively new idea. Traditionally, most service meshes did not use eBPF. Instead, they relied on what’s known as the sidecar model to manage microservices apps.
Architecture and functionality of sidecar proxies
The architecture of a sidecar proxy involves deploying service mesh proxies inside sidecar containers. Sidecar containers run alongside application containers (hence the name sidecar) within the same Kubernetes Pod, allowing them to interact with and monitor microservices.
Examples and performance considerations
There’s nothing wrong with using sidecar containers to power your service mesh if doing so floats your boat. But there are some drawbacks, especially in the realm of performance. Running sidecar containers alongside your “main” application containers increases the number of containers you have running, and with it, the amount of CPU and memory that your workloads consume. As a result, the sidecar proxy model leaves fewer resources available to applications, potentially leading to lower levels of performance. Sidecar containers also have to work harder to collect data than eBPF programs because they run in user space, and therefore don’t have direct access to kernel-level resources.
Despite these potential drawbacks, the sidecar model is the one that most service meshes, such as Istio, have traditionally relied on – which is understandable, because those service meshes emerged before eBPF entered widespread use. (eBPF and its predecessor, BPF, have been around for decades, but eBPF didn’t really become mature enough for production use until the late 2010s – by which point the sidecar proxy model was already well-established.)
Enhancing service mesh with eBPF
As an alternative to sidecar service meshes, you can use eBPF programs to manage your microservices. Because eBPF can run inside the Linux kernel of each node that hosts microservices, eBPF programs are able to track and monitor microservices, and manage communications between them, without requiring the deployment of sidecar containers.
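To illustrate what kernel-level traffic management can look like, here’s a hedged sketch of an XDP program that enforces a simple policy – dropping inbound TCP traffic to a hypothetical blocked port – entirely in the kernel, with no proxy in the data path. The port number and program name are illustrative, not taken from any real service mesh:

```c
// Illustrative XDP program (libbpf-style C) that drops inbound TCP
// packets destined for a hypothetical blocked port. Real service meshes
// apply far richer, identity-aware policies; this only shows the idea.
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

#define ETH_P_IP     0x0800  /* IPv4 ethertype */
#define PROTO_TCP    6       /* TCP protocol number */
#define BLOCKED_PORT 8080    /* hypothetical port to block */

SEC("xdp")
int drop_blocked_port(struct xdp_md *ctx)
{
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    // Bounds-checked parsing keeps the eBPF verifier happy.
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end || eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end || ip->protocol != PROTO_TCP)
        return XDP_PASS;

    // Skip the (variable-length) IP header; options edge cases ignored.
    struct tcphdr *tcp = (void *)ip + ip->ihl * 4;
    if ((void *)(tcp + 1) > data_end)
        return XDP_PASS;

    // Enforce the policy entirely in kernel space.
    if (tcp->dest == bpf_htons(BLOCKED_PORT))
        return XDP_DROP;

    return XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";
```

In practice, an eBPF-based mesh like Cilium manages this kind of kernel-level datapath for you: you declare policies at a high level rather than writing eBPF by hand.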
The hyper-efficient nature of eBPF code means that an eBPF service mesh typically performs better than a sidecar-based alternative. In addition, the eBPF model gives you more granular control than you’d typically get from sidecars, because with eBPF, there are almost no limits on the types of data you can collect for observability purposes or the types of controls you can place upon traffic flows between microservices.
This is why service meshes like Cilium have adopted eBPF as the basis for monitoring and managing microservices. In addition, traditional service meshes like Istio (which did not originally use eBPF) now offer optional eBPF-based capabilities.
Comparing eBPF-based and sidecar proxy models
From the perspective of service mesh functionality, eBPF-based service meshes and sidecar service meshes do the same things: Both types of solutions support service discovery, monitoring, security policy enforcement, and so on.
Under the hood, however, they work quite differently. The main distinctions include:
- Data plane deployment model: In a sidecar service mesh, the data plane is composed of a set of sidecar containers running alongside the main application containers in user space. In an eBPF service mesh, the data plane consists of eBPF programs running directly in each node’s operating system kernel.
- Overhead: As noted above, a sidecar service mesh has more overhead (in terms of CPU and memory consumption) than an eBPF service mesh, due to the resource requirements of sidecar containers running in user space. In addition, an eBPF service mesh only has to run one instance of the data plane software per node, whereas in a sidecar service mesh, you have to run one sidecar for each microservice instance (typically one per Pod), which can also contribute to higher overhead.
- Granularity: Because eBPF programs can access virtually any resource on the host, they enable very flexible and granular control over microservices. Sidecar service meshes are fairly flexible, too, because you can customize the logic inside sidecar containers. But because sidecars run inside Pods and don’t have kernel-level access to the host system, they lack the granular service mesh functionality of eBPF service meshes.
- Ease of deployment: Sidecar service meshes are arguably easier to deploy because you can easily include a sidecar container in your Pod specs. Writing and running your own eBPF code to power a service mesh is considerably more complicated and requires specialized knowledge of eBPF. That said, if you use a service mesh like Cilium, the eBPF code you need is already built into the service mesh, so most of the hard work is done for you.
- Supported environments: Sidecar proxies can run virtually anywhere, but eBPF service meshes only work on nodes whose operating systems support eBPF – which means nodes running Linux kernel version 5.8 or higher. (The Cilium service mesh offers limited support for Windows thanks to interesting efforts by Microsoft to port eBPF to that operating system, but it wouldn’t be accurate to say that eBPF service meshes are Windows-compatible.)
Advantages and disadvantages of each model
Because sidecar proxy models and eBPF service meshes work in different ways, each offers a different set of pros and cons.
The main advantages of a sidecar service mesh are:
- Ease of deployment.
- Compatibility with virtually any environment.
But the major drawbacks of this type of service mesh include:
- High overhead, which may lead to lower overall performance.
- Less control over how the service mesh operates.
As for eBPF service meshes, the primary advantages include:
- High efficiency, which translates to higher levels of performance.
- Extensive control and granularity over service mesh operations.
The major disadvantages of using an eBPF service mesh are:
- They only support nodes running modern versions of Linux.
- There are fewer eBPF-based service mesh options available.
- Customizing eBPF programs to fine-tune service mesh behavior is very complex.
Better together? Using eBPF and sidecar proxies at the same time
Sidecar service meshes and eBPF service meshes don’t have to be an either-or proposition. It’s possible to use both models together via a hybrid strategy.
Integrating eBPF with sidecar proxies
A hybrid approach would mean running eBPF programs to perform some service mesh functions, while relying on sidecar containers to handle others.
For instance, because eBPF can inspect network traffic directly from kernel space instead of user space, you might choose to use eBPF to support network observability, since it can handle this task more efficiently than a sidecar proxy that doesn’t have kernel-level access to network traffic. Meanwhile, you could use sidecar containers to support service discovery and authentication for your microservices, since these functions are simple to implement at the Pod level.
Currently, most major service meshes are designed to use either sidecars exclusively or eBPF exclusively – so in most cases there is no easy way to set up a hybrid deployment. That said, the open source Merbridge project does offer a way to deploy eBPF programs that handle certain Istio service mesh operations, while relying on traditional Istio sidecars for the rest.
According to the Merbridge documentation, you can deploy this service mesh functionality on an existing Istio cluster with just one command:
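```bash
# One-line Merbridge install on an Istio cluster, as shown in the
# Merbridge docs (check the project's README for the current manifest URL):
kubectl apply -f https://raw.githubusercontent.com/merbridge/merbridge/main/deploy/all-in-one.yaml
```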
Merbridge support is also available for other popular sidecar service meshes, such as Linkerd.
If you want more fine-tuned control over how eBPF interacts with a service mesh that is otherwise based on sidecars, you’ll likely need to deploy custom eBPF programs. But Merbridge is an easy way to create a hybrid setup that offloads some of the most resource-intensive service mesh operations to eBPF, while keeping sidecars in place for everything else.
groundcover’s role in eBPF and service mesh
If you use groundcover, you can benefit from eBPF no matter which type of service mesh you deploy. groundcover offers built-in support for eBPF observability, which means it uses eBPF under the hood to collect observability data from nodes, Pods, and clusters – so no matter which type of service mesh you use to help manage microservices, you can collect observability metrics and perform eBPF tracing in a hyper-efficient way.
groundcover isn’t a service mesh, so you’ll still want to weigh the pros and cons of each type of service mesh before deciding which option to go with. But by choosing groundcover, you’ll benefit from the high performance of eBPF-powered observability either way.
The future of eBPF and service mesh
If you asked us to predict the future, we’d say there’s a good chance that eBPF will become more and more central to the service mesh scene. Although most of the popular service meshes available today don’t primarily rely on eBPF (with Cilium being the most notable exception), some, like Istio, now offer limited support for eBPF, as we noted above.
This isn’t to say that sidecars are likely to disappear entirely. They offer some important advantages, as noted above. But on the whole, there is a lot of momentum in the service mesh world surrounding eBPF, and it’s hard to imagine that this technology won’t play an increasingly important role in microservices management going forward.