eBPF is kind of like avocado toast: The ingredients on which it's based have been around for a long time. And yet, it has been only within the past couple of years that eBPF – or the extended Berkeley Packet Filter, as it's formally known – suddenly emerged as one of the latest, greatest buzzwords in IT.
We can't explain how avocado toast became a culinary sensation across global hipsterdom. But we can tell you why eBPF has become such a big deal for revolutionizing Kubernetes observability, among other tasks. Let’s look at the history of eBPF, how it works, which problems it solves and why you – yes, you – should start taking advantage of it.
What is eBPF?
eBPF, which is short for extended Berkeley Packet Filter, is a Linux kernel feature that makes it possible to run sandboxed programs within kernel space. eBPF extends the functionality of the operating system in a safe and controlled manner, taking advantage of the kernel's access to resources and system data without compromising on security or efficiency.
As its name suggests, eBPF is an extension of an earlier technology – the Berkeley Packet Filter, or BPF – that was introduced in 1993 on BSD Unix and later adopted by the Linux kernel as a way of viewing, controlling and filtering network traffic via the system call interface. BPF allowed developers to install packet filters dynamically from user space, an approach to kernel programming that was new and exciting.
eBPF expands on the concepts of BPF in several ways (which makes it almost an understatement to call eBPF a mere "extension" of BPF):
- Unlike its predecessor, eBPF is not limited to the Linux kernel networking subsystem. eBPF programs can attach to hooks throughout the kernel, including system calls, tracepoints, kernel and user space function entry points, and network events.
- Whereas BPF supported only relatively simple, network-oriented operations, eBPF gives programs a richer instruction set, kernel-provided helper functions and shared data structures (maps) that extend what a program can do.
- A massive community effort has arisen around eBPF, resulting in a number of Software Development Kits (SDKs) and tools that ease the process of developing eBPF programs.
As we explain below, you can use eBPF's dynamic functionality to support a wide range of use cases, from performance monitoring, to networking observability and security and beyond.
How does eBPF work?
eBPF works by allowing developers to execute custom code in kernel space by following this process:
- They write an eBPF program.
- They load the program into the kernel. Typically, they do this with a user space tool, such as bpftool, that allows interaction with the eBPF framework.
- The program undergoes a verification process, which checks for issues like whether the program might attempt to access memory beyond its designated memory regions. eBPF program verification helps ensure that the program won't destabilize the system.
- If the program passes the verification check, the kernel attaches it to the hook the developer chose (a system call, tracepoint or network event, for example) and runs it each time that hook fires. Developers can view the output wherever the program was designed to expose it.
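The verification step in the list above can be pictured with a toy model. The sketch below is not the kernel's verifier and does not use real eBPF bytecode: the `Insn` instruction format and the `verify` function are invented for illustration. It only mimics two kinds of checks, assuming a 512-byte stack (the real eBPF stack limit) and a simplified rule that jumps must go forward and stay inside the program.

```python
from __future__ import annotations

from dataclasses import dataclass

STACK_SIZE = 512  # eBPF programs get a 512-byte stack


@dataclass(frozen=True)
class Insn:
    """A made-up instruction format (hypothetical, not real eBPF bytecode)."""
    op: str          # "load", "store", "jump" or "exit"
    offset: int = 0  # stack offset relative to the frame pointer (negative)
    size: int = 0    # number of bytes touched by a load or store
    target: int = 0  # index of the instruction a jump goes to


def verify(program: list[Insn]) -> list[str]:
    """Return the reasons for rejection; an empty list means 'accepted'."""
    errors = []
    if not program or program[-1].op != "exit":
        errors.append("program must end with exit")
    for i, insn in enumerate(program):
        if insn.op in ("load", "store"):
            # Valid stack bytes live in [-STACK_SIZE, 0) relative to the frame pointer.
            in_bounds = insn.size > 0 and insn.offset >= -STACK_SIZE and insn.offset + insn.size <= 0
            if not in_bounds:
                errors.append(f"insn {i}: stack access out of bounds")
        elif insn.op == "jump":
            # Toy rule: only forward jumps that land inside the program.
            if not (i < insn.target < len(program)):
                errors.append(f"insn {i}: invalid jump target")
        elif insn.op != "exit":
            errors.append(f"insn {i}: unknown opcode {insn.op!r}")
    return errors


if __name__ == "__main__":
    accepted = [Insn("store", offset=-8, size=8), Insn("load", offset=-8, size=8), Insn("exit")]
    rejected = [Insn("store", offset=-516, size=8), Insn("exit")]
    print("accepted program:", verify(accepted))
    print("rejected program:", verify(rejected))
```

The real verifier does far more (it tracks register types and value ranges along every path), but the principle is the same: the program is analyzed before it runs, and anything that cannot be proven safe is refused.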
eBPF Architecture: the Linux Kernel, User Space, and the System Calls
We just gave you an overview of how eBPF works. To take a deeper dive – and to explain fully which key benefits eBPF offers – let's explore the eBPF architecture and core concepts.
If you know much about operating systems, you’re probably familiar with the concepts of user space and kernel space. User space – also known as user land – is where your standard applications – those controlled by non-privileged users – run. Kernel space, in contrast, is reserved for processes and programs associated with the kernel.
This bifurcated operating system architecture exists to isolate ordinary programs from the kernel and help ensure that a buggy or malicious user land program won't crash the entire system. It also means that if programs that run in user space (as a traditional monitoring agent would, for example) want to access low-level system data – like information about system calls – they have to ask the kernel for that data, because only the kernel can access it directly. Each request crosses the boundary between user space and kernel space, which is slow and, compared to the eBPF approach, consumes a significant amount of memory and CPU resources.
But since eBPF programs run directly in the kernel, they can access kernel-level resources directly. For example, if you want to inspect every call to the execve() system call (which a process uses to start running a new program), you can use eBPF to do so. (Check out the code for execsnoop for an example of how you'd implement this.)
This approach provides several critical advantages over relying on traditional applications to provide visibility into what's happening deep inside the operating system.
Flexibility and programmability of eBPF
Because eBPF can inspect system calls using custom code, it provides an enormous degree of flexibility for addressing a variety of use cases in nuanced ways. You can use eBPF to read a broad range of user land and kernel space memory and resources, within the limits that the verifier enforces.
This makes eBPF different from other tools that let you access some system data, but only in rigid, predefined ways. For example, you can use dmesg to print the contents of the kernel buffer, which can provide some visibility into kernel events. But with dmesg, you're limited to the information stored in the kernel buffer. You can't customize the tool to display other types of information.
In contrast, using eBPF, the sky is the limit when it comes to the types of data you can view and how you can view it. eBPF's programmability also makes it possible to control the way output is exposed and structured, making the technology even more customizable.
Efficient resource utilization and performance monitoring
The ability to run programs in kernel space enables massive efficiency gains. This is because, again, eBPF doesn't require you to waste resources having user space applications request data from the kernel. eBPF programs can access the data directly.
Enhanced security and kernel-level visibility
Because eBPF programs run in sandboxed environments and have to pass a verification process before they can run, eBPF minimizes the risk of introducing security or stability problems into the system.
This is advantageous compared to approaches like Linux kernel modules. Kernel modules are another way of inserting custom code into the kernel, but they can cause the system to crash if they turn out to be buggy, and insecure modules could lead to breaches that spread to other parts of the system.
Dynamic tracing and observability capabilities
The ability to insert custom code into the Linux kernel on demand makes eBPF very dynamic for tracing and observability purposes. To run an eBPF program, modifying the kernel source code is not necessary. You can load and execute the program on demand within a running kernel.
This flexibility means you can dynamically track a "living and breathing" system without having to change anything in it.
Now that you know how eBPF came to be and why it's so powerful, let's talk about how you actually deploy an eBPF program.
In general, to deploy an eBPF program, you need to do these things:
Write and compile your code
eBPF code is most often written in a restricted subset of C, then compiled into eBPF bytecode. Clang is the de facto standard for compilation.
When writing your code and defining a code path, you can call BPF helper functions provided by the kernel to perform a variety of common operations, like copying memory, retrieving the current PID and timestamp, and communicating with other applications (using eBPF maps, which define eBPF data structures). So, you don't typically have to write a lot of custom code from scratch. The customization is limited to whichever specific functionality you want to implement.
Verification and loading
To deploy your compiled eBPF program, first call the bpf() syscall, which passes the bytecode into the kernel verifier. The kernel verifier's job is to make sure the program won't cause issues with the kernel. If verification succeeds, the kernel JIT compiler will turn it into machine code ready to be executed.
Once verified and loaded, your program is ready to execute. It will monitor whichever code flow you attached it to – whether in kernel space, user space or both. Once it runs, you can access program input or output using eBPF maps or predefined file descriptors.
The following (partial) snippets from Cilium’s Golang eBPF framework should help illustrate the procedure (on the top is the eBPF code, below it is the user space application which loads and communicates with the eBPF program).
Upon compiling and running the application, it will count the number of new programs being executed on your system. Pretty neat!
Key features and benefits of eBPF
The eBPF approach allows you to do some amazing things, especially in the context of monitoring and observability:
- Run custom programs on demand without having to modify the kernel source code or deploy special user space applications (beyond the tools required to load eBPF programs).
- Reduce performance overhead. Thanks to the minimal CPU and memory consumption of eBPF programs, eBPF leaves more resources for your actual workloads to use.
- Adopt an approach to monitoring and observability that is standardized across modern Linux releases. Whichever distribution you run, eBPF is built into the kernel source code of any reasonably recent kernel.
eBPF use cases
The ability to inspect all network events and network traffic flows into and out of a system and determine which processes are interacting with them helps admins troubleshoot networking issues. It also provides critical context that can be useful when you're trying to figure out whether a performance issue with an application is triggered by a network event or the application itself. And you can do it all without the overhead of an Istio service mesh or similar solution.
With eBPF, you can calculate performance metrics for individual applications, microservices and processes in a highly granular way. This makes it possible to establish performance profiles and track them on a continuous basis to detect anomalies.
Although you could also do these things with traditional performance monitoring software, eBPF allows you to obtain this visibility in a much more efficient way, because you don't have to deploy and configure agents for tracking each application or process that you want to monitor.
You can use eBPF tracing to collect traces and contexts from running applications. This is valuable if you need to determine which part of an application is causing a bottleneck when processing requests.
eBPF profiling uses the eBPF framework to gain deep insight into the resource utilization of individual applications or processes. Whereas most monitoring and observability tools focus on aggregate, system-wide resource consumption, eBPF profiling shows in fine detail how much CPU, memory and network capacity a specific application or process is consuming. This data is invaluable for pinpointing resource-intensive applications, identifying potential optimizations and making informed decisions about workload migration.
eBPF makes it possible to detect and investigate anomalies in privileged operations. For instance, you can identify unusual processes that have launched on a system and track their behavior to determine which resources they are trying to access or which data they are transmitting over the network. In this way, eBPF complements and helps support other processes that you have in place as part of the security systems protecting your server.
Getting started with eBPF in Kubernetes
Although eBPF is built into the kernel source code of modern versions of Linux, using it to observe environments like Kubernetes requires some setup.
Requirements and prerequisites for using eBPF
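First, you must ensure that your kernel version supports eBPF. The bpf() system call was introduced with Linux 3.18, so earlier versions won't work, and many of the features that observability tools depend on only arrived in later 4.x and 5.x releases. In general, more modern kernels will offer more eBPF features, since eBPF has evolved over time.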
Thus, as a first step in using eBPF on Kubernetes, check the kernel version of your nodes (which you can do by running the uname -r command) to verify that they support the eBPF functionality you need.
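If you want to automate that check, a small script can parse the release string that uname -r prints and compare it against a minimum. The sketch below is one way to do it; the function names are hypothetical, and the `MIN_KERNEL` threshold is only an illustrative value that you should replace with whatever your eBPF tooling documents as its requirement.

```python
from __future__ import annotations

import platform
import re

# Illustrative threshold only (an assumption); use the minimum your tooling requires.
MIN_KERNEL = (5, 8)


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a `uname -r` string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_supported(release: str, minimum: tuple[int, int] = MIN_KERNEL) -> bool:
    """Compare numerically, so that 4.9 correctly sorts below 4.18."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    release = platform.release()  # the same string `uname -r` prints on Linux
    try:
        verdict = "ok" if kernel_is_supported(release) else "too old"
    except ValueError:
        verdict = "unknown"
    print(f"kernel {release}: {verdict}")
```

Treat the result as a first filter rather than a guarantee: some distributions backport eBPF features into older kernel versions, and others disable features that the upstream kernel of the same version supports.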
Setting up and configuring eBPF in a Kubernetes cluster
The way you actually use eBPF on Kubernetes will depend on which tools you choose for interacting with eBPF. Although eBPF is built into the kernel source code, you need user space utilities – like those provided by bpftrace – to load and execute eBPF programs.
After choosing your tools, you'll need to deploy them on each node. You can do this manually by logging into each node, but a more efficient approach is to set up a DaemonSet that loads the eBPF programs onto each node through the Kubernetes scheduler.
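To give a sense of what such a DaemonSet looks like, here is a minimal sketch that builds the manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The function name, the agent name and the image URL are placeholders, not a real project. Running the container as privileged with hostPID is a blunt assumption that keeps the example short; a production setup would usually grant a narrower set of capabilities.

```python
import json


def ebpf_agent_daemonset(name: str, image: str, namespace: str = "default") -> dict:
    """Build a DaemonSet manifest that runs one agent pod on every node."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            # The selector must match the pod template's labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # See host processes, not just those in the pod (assumption).
                    "hostPID": True,
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Broad privileges so the agent may load eBPF programs.
                            "securityContext": {"privileged": True},
                        }
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    # Placeholder name and image; substitute your own agent.
    manifest = ebpf_agent_daemonset("ebpf-agent", "registry.example.com/ebpf-agent:0.1.0")
    print(json.dumps(manifest, indent=2))
```

Saved as, say, daemonset.py, the output can be piped straight into the cluster with `python daemonset.py | kubectl apply -f -`.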
eBPF best practices
To get the most out of eBPF in Kubernetes and other environments, consider the following best practices:
- Keep up to date: eBPF is evolving quickly, with new utilities and frameworks appearing regularly. Follow development communities to keep up to date.
- Stay positive: Frankly, learning to write eBPF programs that pass the verification process can be frustrating. To conquer this learning curve, keep a positive attitude and strive to learn from your mistakes so that your programs get better over time.
- Leverage multiple distributions: For best results, test eBPF code across multiple Linux distributions. Although in theory eBPF is standardized across all Linux releases, in reality small details can break a program on an untested kernel, so don't assume that a program that you can run on one node will also work flawlessly on another.
Leveraging eBPF SDKs
Once upon a time – which is to say, in the mid-2010s – writing and loading eBPF code was a lot of work, because the tooling surrounding eBPF was not yet mature.
That has changed over the past few years, thanks to the introduction of more tools to simplify eBPF usage, alongside continuous improvements in bpf_helper functions and eBPF maps.
At the same time, external toolchains help to simplify eBPF bootstrapping and development. Key examples include BCC and libbpf (which is now being maintained as part of the Linux kernel, and is therefore starting to become the de facto option). There are also eBPF-friendly compilers like Clang, as noted above.
And, for those of you who wish to step up your eBPF game a notch by developing in modern languages, there are projects that make it possible to write user space code that interacts with eBPF programs in languages like Python, Golang and Rust.
Overall, the eBPF toolchain ecosystem is fast-changing, and it remains a bit too early to say exactly which tools will end up gaining widespread adoption. But we can definitely say that the tooling surrounding eBPF is increasingly mature, and that it gives developers less and less reason to shy away from taking advantage of eBPF. Thanks to the great tools out there, even the most, well, motivationally-challenged coders among us can write and load eBPF code!
The downsides of eBPF
For balance, it's worth noting that eBPF doesn't solve every problem under the sun, and that it's subject to certain limitations that will probably never go away.
One is that writing eBPF code to comply with the kernel verifier can be tricky, especially for newcomers. And if your program is rejected, the verifier doesn't always bother to explain why. Tools that will inspect your program independently of the verifier help to address this challenge, but they don't eliminate the risk that your program will be rejected, and you will become very frustrated trying to figure out why. And because verification doesn't happen until the program is loaded into a running kernel, you face the risk that one kernel will accept your program but a different version of the kernel will reject it.
Another limitation of eBPF is that eBPF programs are limited in stack space (512 bytes), which makes development more difficult and less intuitive. (They were also previously limited to 4,096 instructions, but that limit was raised to one million in kernel 5.2, which effectively removed it.) You need to learn to be efficient in writing eBPF code to make it work at scale.
A third issue is that, although eBPF is implemented in the Linux kernel, differences between kernel versions and customizations across different Linux distributions mean that eBPF programs are not always as portable as you might expect them to be. If you have some nodes running, say, Alpine Linux and others running Ubuntu, you could discover the hard way that your program doesn't work across all nodes. Work is underway to improve eBPF portability, but it's still not as seamless as you might wish.
eBPF security: How to maximize eBPF safety
eBPF is safe by design – in fact, it deserves accolades for how carefully it works to mitigate risks via program verification and sandboxing. Still, the verification process won't always catch every potential issue, and creative attackers could theoretically find ways to break out of a program sandbox.
That's why it's important to take extra steps to ensure the safety and security of eBPF, just as you would with any technology. One way to do this is using CAP_BPF, a Linux capability that lets you grant a process the right to work with eBPF without handing it the much broader CAP_SYS_ADMIN.
More generally, you should be sure to keep your system patched and up-to-date so that it's secure against any vulnerabilities impacting eBPF or the utilities you rely on to interact with it.
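To see which of these privileges a process actually holds, you can decode the effective capability mask that Linux exposes in /proc/<pid>/status. The sketch below does this for the current process. The helper names are hypothetical; the bit numbers for CAP_BPF (39) and CAP_SYS_ADMIN (21) come from the kernel's linux/capability.h header.

```python
CAP_SYS_ADMIN = 21  # bit numbers as defined in linux/capability.h
CAP_BPF = 39


def effective_caps(status_text: str) -> int:
    """Parse the CapEff bitmask out of the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    raise ValueError("no CapEff line found")


def has_cap(mask: int, cap: int) -> bool:
    """Test whether the capability's bit is set in the mask."""
    return bool((mask >> cap) & 1)


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as status_file:
            mask = effective_caps(status_file.read())
    except (OSError, ValueError):
        print("could not read capabilities (is this a Linux system?)")
    else:
        print("CAP_BPF:", has_cap(mask, CAP_BPF))
        print("CAP_SYS_ADMIN:", has_cap(mask, CAP_SYS_ADMIN))
```

A process that reports CAP_BPF without CAP_SYS_ADMIN is closer to the least-privilege setup described above. Keep in mind that many eBPF program types need additional capabilities, such as CAP_PERFMON or CAP_NET_ADMIN, on top of CAP_BPF.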
Using Flora as your eBPF-based observability agent
We explained above how you can interact with eBPF using various open source tools. But if you want a simpler and more powerful way to take advantage of eBPF for observability (and who doesn't?), we recommend Flora, an eBPF observability agent created by groundcover.
Flora allows you to collect all of the data you need to observe your system and workloads in a lightweight, efficient way. Flora covers the entire stack, and you can use it without the trouble of having to set up a toolchain for working with eBPF from scratch.
What is the future of eBPF?
We don't know if folks will still be eating avocado toast a decade or two from now. But we are very confident that Kubernetes developers and admins will be leveraging eBPF to help understand what is happening inside nodes and pods.
That's especially true given that the eBPF ecosystem is becoming increasingly organized, thanks to groups like the eBPF Foundation. And as Thomas Graf notes of eBPF development, we're starting to see "large companies like Google and Facebook maintaining this and driving this forward."
So, if you've been resisting the eBPF revolution, now's the time to surrender. The future of Kubernetes observability – among other things – hinges on eBPF, and you may as well start learning to use it.