Goodbye Sidecars: Could eBPF Steal Istio Service Meshes' Thunder?
From the basics of the Istio service mesh to its benefits and drawbacks, this article covers what you need to know about Istio and where eBPF comes into play
The concept of sidecars has become so commonplace in the world of containers and microservices that it's easy to think of sidecars as a natural, healthy part of cloud-native technology stacks.
But if you step back and think about it, you realize that sidecars are not necessarily something to celebrate. After all, they're named for the literal sidecars you can affix to a motorcycle when you want to carry things that won't fit on the cycle itself. A sidecar solves the problem of limited capacity, but it also slows the motorcycle down considerably and makes it harder to maneuver. There's a reason your typical biker dude doesn't have a sidecar dragging him down.
Fortunately, microservices applications also no longer have to have the wind taken out of their sails by sidecars. Thanks to technologies like eBPF, it's becoming possible to do the things that sidecars used to do in a distributed app, without the downsides they introduce.
To explain why, let's walk through the role of sidecars – and service meshes like Istio, of which they are a part – within cloud-native applications. Then, we'll look at how eBPF presents a simpler, more efficient alternative to Istio and traditional service mesh architectures.
What is a service mesh?
A service mesh is a layer within a technology stack that helps to connect, secure and monitor the various components of a distributed application.
You typically wouldn’t use a service mesh if your application is a monolith, meaning it runs as a single process without a complex web of dependencies and interprocess communication. But when you move to a microservices architecture, you run into the challenge of having to manage communications between discrete microservices. You also need to ensure that microservices transactions are secure, and you need an efficient way to collect observability data from each microservice.
You could, if you wanted, handle these requirements by instrumenting them within the code of your microservices themselves. But that would not make your developers very happy, because it would mean they'd spend loads of time tediously writing and maintaining custom code in each and every microservice to handle connectivity, security and telemetry.
Service meshes solve this problem by providing a centralized means of governing services. In essence, service meshes allow developers to outsource most of the work required to manage microservices connectivity, security and observability to a dedicated infrastructure layer, rather than having to handle these tasks within the microservices themselves. In this way, service meshes help to simplify and standardize the way microservices are managed.
Sidecars and service meshes
Of course, service meshes can't magically talk to or integrate with microservices. They need a means of connecting to them. Traditionally, that means involving what's known as the "sidecar pattern."
Under the sidecar pattern, you deploy a special container – known as the sidecar – alongside the primary application container that hosts the business logic for each microservice. The sidecar hosts a service mesh agent, which is responsible for managing the microservice. If you run the sidecar and the main container inside the same Pod, the sidecar container can integrate with the main container to enforce whichever governance rules you define within your service mesh.
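As a sketch, the pattern looks like this in a Kubernetes Pod spec. The service name and image tags here are hypothetical, and in practice a mesh like Istio injects the proxy container automatically rather than you writing it by hand:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: orders              # hypothetical microservice
spec:
  containers:
    - name: app             # the primary container holding the business logic
      image: example.com/orders:1.0
      ports:
        - containerPort: 8080
    - name: istio-proxy     # the sidecar: a mesh-managed Envoy proxy
      image: istio/proxyv2:1.20.0   # illustrative tag
      # Because it shares the Pod's network namespace with the app
      # container, the proxy can intercept traffic in and out of the
      # app and enforce the mesh's routing, mTLS and telemetry rules.
```

The key detail is that both containers share one Pod, and therefore one network namespace, which is what lets the sidecar sit in the traffic path without the application knowing about it.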
The sidecar pattern has historically made a lot of sense for managing microservices within distributed apps that are deployed as containers and orchestrated using Kubernetes. In the absence of a better technique for connecting the service mesh to individual application containers, deploying sidecar containers alongside the actual microservices was a simple and straightforward way of weaving the service mesh into a microservices architecture.
Why everyone loves Istio
There are a number of service meshes out there today, such as Linkerd and Traefik Mesh. But the most popular is probably Istio, an open source service mesh designed especially for Kubernetes-centric stacks.
Istio implements a service mesh by providing two main components:
- A data plane, which relies on sidecar containers running the Envoy proxy to interface with individual microservices.
- A control plane, which runs as a centralized process to provide service discovery, enforce configurations and secure traffic.
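In practice, the data plane is usually wired up through Istio's automatic sidecar injection, which you switch on with a namespace label. A minimal manifest looks like this (the namespace name is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: shop                   # hypothetical application namespace
  labels:
    istio-injection: enabled   # tells Istio to inject an Envoy sidecar
                               # into every Pod created in this namespace
```

With that label in place, every Pod scheduled into the namespace automatically gets an Envoy proxy container alongside its application container.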
Istio's open source nature and Kubernetes-friendly design have made the tool a core part of thousands upon thousands of cloud-native hosting stacks to date.
The dark side of Istio
Istio and other service meshes that depend on the sidecar pattern solve real problems, and you certainly can't blame anyone for using them – especially when there weren't real alternatives available.
But if you were to design a perfect solution for connecting, securing and observing a distributed application, a service mesh like Istio would not be it. Istio and similar meshes suffer from two key problems: high resource consumption and degraded performance.
Having to run a sidecar container alongside each and every microservice in your distributed hosting environment doubles the total number of containers you have running. That means that your application ends up consuming more resources. In addition to the resources consumed by the sidecar containers themselves, the orchestrator has to work harder to manage the sidecars, and you consume more network bandwidth deploying and updating the sidecars.
When you tie up resources running sidecars, there are fewer resources available to your actual application, which can translate to lower performance during times of peak demand. It may also lead to higher hosting costs, since you'll ultimately need more nodes (or more expensive nodes with higher resource allocations) to handle your workload.
Performance and latency
Beyond the cost of hosting sidecars, the fact that sidecar containers insert themselves in the middle of network traffic as it flows into and out of each microservice can also create a drag on performance. Every packet has to pass through the sidecar before your application can receive and respond to requests, which increases latency and may negatively impact your user experience.
Istio performance downsides, by the numbers
In case you're wondering whether the performance overhead of sidecar containers is actually more than negligible, let's take a look at the figures that Istio itself documents about performance.
While the overhead varies depending on exactly what you configure Istio to do (the more features you use, the higher the overhead), Istio's own documentation says that each Envoy proxy consumes 0.35 vCPU and 40 MB of memory per 1,000 requests per second passing through it.
So, if you have ten microservices, you deploy an Envoy sidecar for each one, and each proxy handles around 1,000 requests per second, you'll need an extra 3.5 vCPUs and 400 MB of memory just to host the sidecars. That could easily translate to the equivalent of an extra VM instance. (We won't even mention the control plane, which uses another 1 vCPU and 1.5 GB of memory, according to Istio.)
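The arithmetic is easy to sanity-check. A minimal sketch, where the ten-service deployment and its traffic rate are hypothetical and the per-proxy figures come from Istio's performance documentation:

```python
# Back-of-the-envelope sidecar overhead for a hypothetical ten-service
# deployment, using the per-proxy figures from Istio's documentation.
ENVOY_VCPU_PER_1K_RPS = 0.35  # vCPU per proxy at 1,000 requests/second
ENVOY_MEM_MB_PER_1K_RPS = 40  # MB per proxy at 1,000 requests/second
ISTIOD_VCPU = 1.0             # control plane (Istiod) CPU
ISTIOD_MEM_GB = 1.5           # control plane (Istiod) memory

services = 10  # hypothetical: one Envoy sidecar per microservice

sidecar_vcpu = round(services * ENVOY_VCPU_PER_1K_RPS, 2)  # 3.5 vCPU
sidecar_mem_mb = services * ENVOY_MEM_MB_PER_1K_RPS        # 400 MB

print(f"Sidecar overhead: {sidecar_vcpu} vCPU, {sidecar_mem_mb} MB")
print(f"Control plane:    {ISTIOD_VCPU} vCPU, {ISTIOD_MEM_GB} GB")
```

Note that the overhead scales with both the number of services and the traffic each proxy carries, so a busier or more fine-grained architecture pays proportionally more.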
Note, too, that Istio says each proxy adds 2.65 milliseconds to the 90th percentile latency, on average. So you'll slow down responses by that amount every time they pass through a sidecar. Admittedly, 2.65 ms is not huge, but it can be disruptive in a world where every millisecond counts, especially for applications that need to respond in true real time.
Goodbye, sidecars; hello, eBPF!
Historically, developers and IT teams typically saw the performance and latency costs incurred by sidecar containers as a necessary evil. Using a service mesh with a sidecar pattern was a lot easier than not using one and having to instrument governance within each microservice, so they were happy to pay a little more for hosting and/or accept a performance hit in order to centralize microservices management within a service mesh.
Today, however, a better world has become possible – thanks to eBPF, which makes it possible to run hyper-efficient, hyper-secure, dynamic code directly within the Linux kernel, without having to deal with kernel modules or modify the kernel sources.
What that means for engineers who need a service mesh is that, using eBPF, the microservice governance that has traditionally been implemented using sidecar containers could instead be handled in the kernel via eBPF programs. Since the eBPF programs can run on every (Linux-based) node in a Kubernetes cluster, they could manage microservice connectivity, security and observability from right within the kernel, instead of having to operate as separate sidecars.
This approach would solve several challenges associated with traditional service meshes like Istio:
- Performance: Because eBPF programs consume minimal resources, they would dramatically reduce the overhead of running a service mesh, as compared to using a sidecar architecture.
- Simplicity: An eBPF-based service mesh would eliminate the need to deploy and manage a suite of sidecar containers.
- Visibility and control: By running directly within the kernel, eBPF programs have virtually unlimited scope in terms of the data they can access from containers and the control they can exert over them. In this respect, eBPF-based meshes would be more powerful than those that depend on sidecar containers.
eBPF has many use cases, and leveraging it to solve the shortcomings of traditional service meshes remains a relatively new idea. However, developers are devoting increasing attention to this strategy, which has already been implemented by Cilium.
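To make this concrete, Cilium expresses service-to-service rules as policies enforced by eBPF in the kernel rather than as proxy configuration. A minimal CiliumNetworkPolicy might look like this (the policy name and labels are hypothetical):

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-orders   # hypothetical policy name
spec:
  endpointSelector:
    matchLabels:
      app: orders                  # the service being protected
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend          # only the frontend may connect
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
```

The enforcement happens in eBPF programs attached in the kernel on each node, so no per-Pod proxy container is needed to apply the rule.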
eBPF's bright future
So, in case you were looking for yet another reason why eBPF is revolutionizing the way developers approach security, observability and management across all layers of the stack, add eBPF's potential as a service mesh solution to your list. In addition to making it easier to collect rich observability data and to secure data as it moves within and between containers, eBPF just may take the crown from sidecar-based meshes like Istio as a simpler, more effective and less resource-hungry way to manage interactions between microservices.
This isn't to say that Istio or its equivalents will totally disappear. We can imagine a world where the Istio control plane remains, but where the data plane is driven by eBPF programs instead of Envoy proxies running in sidecar containers. Istio has developed a lot of powerful technology for service discovery and configuration management, and that functionality will remain relevant in an eBPF-based service mesh.
But we expect that sidecar containers will look increasingly dated over the coming years – just like the sidecars attached to motorcycles. Those who prioritize speed and efficiency will turn to eBPF as a means of freeing themselves from the limitations of sidecars.