What is eBPF, anyway, and why should Kubernetes admins care?
Discover the ins and outs of eBPF and why it is particularly exciting when it comes to observing your containers and Kubernetes clusters


eBPF is kind of like avocado toast: The ingredients on which it's based have been around for a long time. And yet, it has been only within the past couple of years that eBPF suddenly emerged as one of the latest, greatest buzzwords in IT.
We can't explain how avocado toast became a culinary sensation across global hipsterdom. But we can tell you why eBPF has become such a big deal for revolutionizing Kubernetes observability, among other tasks. Let’s look at the history of eBPF, how it works, which problems it solves and why you – yes, you – should start taking advantage of it.
What is eBPF? A brief history...
You can't understand the history of eBPF – which stands for "extended Berkeley Packet Filter" – unless you understand the plain-old Berkeley Packet Filter, or BPF tout court.
BPF was introduced in 1993, in a paper by Steven McCanne and Van Jacobson, as a programmable, highly efficient in-kernel virtual machine that could filter network traffic. It originated on BSD Unix (hence the name) and was later adopted by the Linux kernel, where it gave packet capture tools like tcpdump a fast way to pick out only the traffic they cared about.
That said, although there was some early buzz around BPF in the 1990s, it more or less faded into obscurity shortly afterward. Kernel-level programmable control over network traffic simply didn't turn out to be that important for most people in a world where workloads all ran on bare metal or VMs, and traffic could be managed well enough through firewalls and hypervisors.
That all began to change in 2013, when Docker came along and containers suddenly got their time to shine. (Incidentally, Docker is akin to eBPF in that containers have actually existed for decades, so what Docker did wasn't fundamentally new; instead, Docker's real achievement was that it managed to make containers popular for the first time, thanks mostly to better tooling.) Combined with Kubernetes, which appeared a year later, Docker led to a world in which the ability to filter and control traffic in a very granular, container-by-container way became very valuable.
Enter: "eBPF"
Hence the introduction in 2014 of eBPF, which expands on BPF's original architecture so that general-purpose, sandboxed programs (not just packet filters) can run directly in Linux kernel space.
We'll get to why that's key in the context of containers and Kubernetes in a moment. But first, let's step back and explain what running in kernel space means.
Basically, it means that the code is executed inside the kernel itself, as opposed to running in "user space" like standard applications. That's important for three main reasons:
- It allows the code to run very efficiently.
- It allows the code to access low-level kernel resources that would otherwise be complicated and costly (in terms of resource overhead) to access from within user space.
- It lets you observe any and all programs running in user space – which is hard to do when relying on observability tools that operate in user space themselves.
This is why people like Brendan Gregg call eBPF an "invaluable technology" and compare it to JavaScript:
"So instead of a static HTML website, JavaScript lets you define mini programs that run on events like mouse clicks, which are run in a safe virtual machine in the browser. And with eBPF, instead of a fixed kernel, you can now write mini programs that run on events like disk I/O, which are run in a safe virtual machine in the kernel."
If you're a die-hard Linux geek, you may be thinking: "OK, but what does eBPF do that I couldn't do before using kernel modules?"
That's a legitimate question to ask. It's true that it has long been possible to execute code in kernel space using kernel modules. But the problem with those modules is that they have to be inserted into the kernel, which makes them more complicated to deploy. They also tend to have complex dependencies, adding to the deployment headache. And they are not particularly safe, because they require you to trust the user to insert only modules that are both stable and secure.
You could also, of course, theoretically modify the Linux kernel itself to run whichever code you want in kernel space. But one does not simply modify the Linux kernel – unless one is a kernel programmer (and there aren't too many of them, comparatively speaking), at least, or one wants to deal with maintaining some kind of customized fork of the kernel, which would be a nightmare to manage.

eBPF solves all of these issues by allowing custom programs to run in isolated kernel-level virtual machines. You can (with the help of various tools, as we'll see below) easily deploy code of your choosing, without having to deal with kernel module dependencies, touch kernel source code or even necessarily have root privileges. Goodbye, modprobe; hello, eBPF bytecode!
eBPF architecture, explained
Now that you know how eBPF came to be and why it's so powerful, let's talk about how it actually works.

In general, to deploy an eBPF program, you need to do these things:
Write and compile your code
eBPF code is most often written in "restricted C" (a subset of C that the kernel verifier can accept), then compiled into eBPF bytecode. Clang/LLVM is the de facto standard for compilation.
When writing your code, you can call BPF helper functions to perform a variety of common operations, like copying memory, retrieving the current PID or a timestamp, and communicating with user space applications (using eBPF maps, the key-value data structures that eBPF programs and user space share). So, you don't typically have to write a lot of custom code from scratch. The customization is limited to whichever specific functionality you want to implement.
Verification and loading
To deploy your compiled eBPF program, call the bpf() syscall with the BPF_PROG_LOAD command, which hands the bytecode to the kernel verifier. The verifier's job is to make sure the program can't harm the kernel: for example, that it always terminates and never accesses memory it isn't allowed to touch. If verification succeeds, the kernel's JIT compiler (where enabled) turns the bytecode into native machine code ready to be executed.
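To make the loading step a little more concrete, here is a minimal sketch in C that hands the smallest possible program ("set r0 to 0, then exit") to the verifier through the raw bpf() syscall. It assumes a Linux system with kernel headers installed; the function names (sys_bpf, build_trivial_prog, load_trivial_prog) are made up for this illustration, and the socket-filter program type is just an arbitrary choice. In practice you would normally let a library such as libbpf or Cilium's Go package do this for you.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc does not export a bpf() wrapper, so go through syscall(2). */
static long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
{
    assert(attr != NULL);
    return syscall(__NR_bpf, cmd, attr, size);
}

/* Hypothetical helper: fill insns with "r0 = 0; exit".
 * Returns the number of instructions written. */
static int build_trivial_prog(struct bpf_insn insns[2])
{
    memset(insns, 0, 2 * sizeof(insns[0]));
    insns[0].code = BPF_ALU64 | BPF_MOV | BPF_K; /* r0 = imm */
    insns[0].dst_reg = BPF_REG_0;
    insns[0].imm = 0;
    insns[1].code = BPF_JMP | BPF_EXIT;          /* return r0 */
    return 2;
}

/* Hypothetical helper: ask the kernel to verify and load the program.
 * Returns a program file descriptor, or -1 with errno set.
 * If log is non-NULL and at least 128 bytes long, the verifier's
 * messages are written into it. */
static int load_trivial_prog(char *log, unsigned int log_size)
{
    struct bpf_insn insns[2];
    union bpf_attr attr;
    int n = build_trivial_prog(insns);

    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER; /* arbitrary choice */
    attr.insns = (uint64_t)(uintptr_t)insns;
    attr.insn_cnt = (uint32_t)n;
    attr.license = (uint64_t)(uintptr_t)"GPL";
    if (log != NULL && log_size >= 128) {
        attr.log_buf = (uint64_t)(uintptr_t)log;
        attr.log_size = log_size;
        attr.log_level = 1;
    }
    return (int)sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
}
```

A caller would check the return value of load_trivial_prog(): a non-negative file descriptor means the verifier accepted the program, while -1 means it was rejected or the call wasn't permitted. On many systems an unprivileged caller gets EPERM here, so expect to need root or the relevant capabilities when trying this out.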
Runtime
Once verified and loaded, your program is ready to execute. It runs whenever the hook you attached it to is hit, whether that hook sits in kernel space, user space or both. While it runs, you can exchange input and output with it using eBPF maps or predefined file descriptors.
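As a rough illustration of the map side of this, the sketch below creates a one-slot array map from user space, writes a value into it and reads it back, all through the raw bpf() syscall. It assumes Linux kernel headers are available, and the counter_map_* names are invented for this example. In a real deployment the eBPF program in the kernel would be the one updating the counter, and user space would only read it; here user space plays both roles to keep the example short.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc does not export a bpf() wrapper, so go through syscall(2). */
static long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
{
    assert(attr != NULL);
    return syscall(__NR_bpf, cmd, attr, size);
}

/* Hypothetical helper: create an array map with a single slot
 * (key: u32 index, value: u64 counter).
 * Returns a map file descriptor, or -1 with errno set. */
static int counter_map_create(void)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_type = BPF_MAP_TYPE_ARRAY;
    attr.key_size = sizeof(uint32_t);
    attr.value_size = sizeof(uint64_t);
    attr.max_entries = 1;
    return (int)sys_bpf(BPF_MAP_CREATE, &attr, sizeof(attr));
}

/* Hypothetical helper: store value in slot 0. Returns 0 on success. */
static int counter_map_write(int map_fd, uint64_t value)
{
    uint32_t key = 0;
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_fd = (uint32_t)map_fd;
    attr.key = (uint64_t)(uintptr_t)&key;
    attr.value = (uint64_t)(uintptr_t)&value;
    attr.flags = BPF_ANY; /* create or overwrite */
    return (int)sys_bpf(BPF_MAP_UPDATE_ELEM, &attr, sizeof(attr));
}

/* Hypothetical helper: read slot 0 into *out. Returns 0 on success. */
static int counter_map_read(int map_fd, uint64_t *out)
{
    uint32_t key = 0;
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.map_fd = (uint32_t)map_fd;
    attr.key = (uint64_t)(uintptr_t)&key;
    attr.value = (uint64_t)(uintptr_t)out;
    return (int)sys_bpf(BPF_MAP_LOOKUP_ELEM, &attr, sizeof(attr));
}
```

The map is addressed purely through its file descriptor, which is why a loader application can keep polling a counter long after the eBPF program has been attached. As with program loading, creating a map may fail with EPERM for unprivileged callers, depending on how the system is configured.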
The following (partial) snippets from Cilium's Go eBPF framework should help illustrate the procedure: the eBPF code comes first, followed by the user space application that loads and communicates with the eBPF program.
Upon compiling and running the application, it will count the number of new programs being executed on your system. Pretty neat!