One of the great things about Kubernetes is that it lets you define resources using code, then deploy them with a single kubectl apply command.

But if there’s something wrong with your code – if it’s missing an important value, for example, or contains an insecure setting – Kubernetes won’t catch the problem on its own. Beyond basic schema validation, it will happily apply whatever you tell it to apply.

Fortunately, there’s a convenient way to validate Kubernetes resources: Open Policy Agent (OPA). OPA lets you define and execute Kubernetes admission control checks that automatically evaluate whether resource definitions meet admin-defined criteria, helping teams ensure consistency and avoid errors in their Kubernetes environments.

Read on for the details on how OPA works, how to deploy and work with OPA in Kubernetes, and best practices for getting the most from OPA.

What is Open Policy Agent (OPA)?

Open Policy Agent, or OPA, is an open source policy engine that uses a policy-as-code approach to defining and validating policies.

Put another way, OPA lets you write code that defines policies, meaning conditions or rules that you want software resources to conform with. Then, you can use OPA to compare the actual state of your resources with the rules you configured in your policies and identify any deviations between them. OPA policies are written in a declarative query language called Rego.
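
For a flavor of what policy-as-code looks like, here’s a minimal Rego sketch (the input field replicas is hypothetical, not tied to any particular system) that denies any configuration requesting more than five replicas:

package example

# Deny any input that requests more than five replicas.
deny[message] {
  input.replicas > 5
  message := sprintf("replica count %d exceeds the maximum of 5", [input.replicas])
}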

OPA is a general-purpose policy engine that is not specific to Kubernetes; you can use it to validate policies on virtually any system whose resources are defined using code. As we’ll see below, however, OPA can be especially useful as a way of checking resource configurations in Kubernetes.

How does OPA work with Kubernetes?

Most Kubernetes resources (such as nodes, pods, and application deployments) are configured using code – usually in the form of YAML or JSON. OPA works with Kubernetes by evaluating the YAML code of Kubernetes resources and checking whether the Kubernetes resource descriptions align with policies configured in OPA.

To be clear, this doesn’t mean that you use OPA to configure Kubernetes resources directly. You still do that with YAML. Rather, it means that you can use policy rules configured in OPA to validate Kubernetes resource definitions.

For example, imagine that you want to check whether the containers in your cluster have CPU and memory limits before deploying them. To do this, you could write an OPA policy like the following:

package kubernetes.admission

# Deny pods where any container lacks a CPU limit.
deny[message] {
  input.request.kind.kind == "Pod"
  container := input.request.object.spec.containers[_]
  not container.resources.limits.cpu
  message := "CPU limit is required for all containers"
}

# Deny pods where any container lacks a memory limit.
deny[message] {
  input.request.kind.kind == "Pod"
  container := input.request.object.spec.containers[_]
  not container.resources.limits.memory
  message := "Memory limit is required for all containers"
}

Now, imagine you used this policy to evaluate a pod configuration like the following (note that this configuration includes a CPU limit, but no memory limit):

request:
  kind:
    kind: Pod
  object:
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          resources:
            limits:
              cpu: "500m"

An OPA evaluation in this case would generate a message stating that the pod configuration violates the policy because the container has no memory limit.
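
To satisfy both rules, the container would also need a memory limit. A minimal fix looks like the following (the 128Mi value is just illustrative):

resources:
  limits:
    cpu: "500m"
    memory: "128Mi"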

Benefits of using OPA with Kubernetes

While there is no requirement to use OPA with Kubernetes, doing so opens the door to several important benefits:

  • Enhanced security: OPA can help prevent security oversights, such as workloads with insecure network settings or container images downloaded from untrusted registries, by automatically blocking them through Kubernetes admission control rules.
  • Compliance assurance: Along similar lines, OPA helps provide confidence that Kubernetes resources conform with compliance requirements as they are defined in OPA policies.
  • Scalable policy evaluations: OPA’s policy-as-code approach, combined with its ability to evaluate Kubernetes resource definitions automatically, makes it a scalable, efficient way to validate configurations and enforce them through automated admission control.
  • Centralized policy management: With OPA, you can manage all of your policies through a central tool, making it easy to keep track of which policies you have in place and to update them as needed.

OPA deployment options for Kubernetes: Sidecars vs. Gatekeeper

OPA provides a command-line tool, called opa, that you can use to evaluate individual Kubernetes resource definition files against OPA policies manually. But that approach doesn’t scale. It also doesn’t allow you to validate resource configurations automatically and block those that violate policies before the resources are deployed.
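
For instance, assuming you saved the admission policy above as policy.rego and a captured admission request as input.json (hypothetical file names), you could evaluate it manually like this:

# Evaluate the deny rules against a saved admission input.
opa eval -d policy.rego -i input.json "data.kubernetes.admission.deny"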

To do these things, you need to deploy OPA in such a way that it integrates directly with the Kubernetes API server and automatically checks resource definitions whenever they appear or change (regardless of the Kubernetes deployment strategies you choose).

There are two viable approaches for deploying OPA in a Kubernetes cluster:

  1. Sidecar containers: This approach uses OPA’s built-in admission controller, which integrates with Kubernetes via a ValidatingWebhookConfiguration. OPA runs as a service inside your cluster alongside kube-mgmt, a sidecar container that loads policies into OPA and replicates Kubernetes resources into it as data. Kube-mgmt watches the Kubernetes API server for changes to resources so that OPA can evaluate them against its policies.
  2. Gatekeeper: Gatekeeper is a Kubernetes add-on that embeds OPA and manages policies through native Kubernetes custom resource definitions (CRDs). It also interfaces with the Kubernetes API server to process admission requests, but it doesn’t require sidecar containers.

The first option is more straightforward because it doesn’t require the implementation of CRDs. However, Gatekeeper is generally viewed as a better way of deploying OPA in Kubernetes because it provides native integration. It also avoids the overhead of sidecar containers.

Sidecars vs. Gatekeeper

| | Sidecars | Gatekeeper |
|---|---|---|
| Deployment method | Runs OPA and the kube-mgmt sidecar containers behind OPA’s admission controller. | Uses custom resource definitions (CRDs). |
| Overhead | Higher. | Lower. |
| Extensibility | Limited. | High (offers options like external data support). |

How to deploy OPA Gatekeeper in Kubernetes

Gatekeeper supports two main deployment approaches: using a YAML manifest or using a Helm chart. Both are relatively simple.

Using YAML

To deploy OPA Gatekeeper in a Kubernetes cluster using YAML, simply apply the Gatekeeper manifest, which includes a Gatekeeper OPA container image and relevant deployment metadata:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/v3.19.1/deploy/gatekeeper.yaml

Be sure that you have cluster admin permissions when running the command.

Using Helm

To deploy OPA Gatekeeper with Helm, simply add the Gatekeeper repository, then install the Helm chart:

helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm install gatekeeper/gatekeeper --name-template=gatekeeper --namespace gatekeeper-system --create-namespace

Again, make sure you have cluster admin permissions.
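
With either installation method, you can confirm that the Gatekeeper pods are running before proceeding:

kubectl get pods -n gatekeeper-system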

Installing constraint templates and constraints

After deploying OPA, you’ll need to set up a ConstraintTemplate. A constraint template defines the conditions that you want to be present in Kubernetes resources (although it doesn’t include the actual values, which are defined in constraints, as we’ll explain in a moment).

For instance, here’s a simple ConstraintTemplate that prevents the use of container images from non-approved registries:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        openAPIV3Schema:
          properties:
            repos:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos

        # Reject containers whose image doesn't start with any allowed repo prefix.
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          valid := [good | repo := input.parameters.repos[_]; good := startswith(container.image, repo)]
          not any(valid)
          msg := sprintf("container image '%s' is not from an allowed registry", [container.image])
        }

        # Apply the same check to initContainers.
        violation[{"msg": msg}] {
          container := input.review.object.spec.initContainers[_]
          valid := [good | repo := input.parameters.repos[_]; good := startswith(container.image, repo)]
          not any(valid)
          msg := sprintf("initContainer image '%s' is not from an allowed registry", [container.image])
        }

To install the constraint template, save it as a file and apply it with kubectl:

kubectl apply -f registry-constraint-template.yaml

At this point, the ConstraintTemplate is deployed, but OPA is not yet enforcing it through the admission controller. To tell OPA to begin enforcing the rule defined in the template, you need to create and install a constraint that matches the template, such as the following:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: allowed-repos
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    repos:
      - "gcr.io/your-approved-repo"
      - "docker.io/library"

This constraint defines the specific values (in this case, the image prefixes for approved container registries) that OPA will evaluate.

To install the constraint, save the code as a file and apply it with kubectl:

kubectl apply -f registry-constraint.yaml

To check whether your constraint was successfully deployed, use the following command, which lists active constraints:

kubectl get constraints
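
To see enforcement in action, try creating a pod from a registry that isn’t on the allow list. Given the constraint above (the image name here is hypothetical), the API server should reject the request and surface the violation message from the template:

# Denied: quay.io is not among the allowed repos in this example.
kubectl run test-pod --image=quay.io/example/app:latest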

Writing OPA policies with Rego

As we mentioned, OPA uses the Rego declarative query language to write policy code. Thus, you’ll use Rego to spell out the rules that you want to enforce via OPA.

Rego design principles

Rego is designed to serve two key goals:

  1. Ensuring that policies are unambiguous, such that an evaluation results in either true or false.
  2. Keeping policy definitions and other OPA data readable to humans.

To achieve these goals, Rego uses a declarative approach in which admins define what a query should return, rather than defining how to execute the query.
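
For example, the [_] operator in Rego declares “for some element” rather than spelling out a loop; OPA searches for variable bindings that satisfy the rule body. A minimal sketch (the input shape is hypothetical):

package example

# True if some container in the input runs an image tagged "latest".
uses_latest_tag {
  container := input.spec.containers[_]
  endswith(container.image, ":latest")
}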

Rego policy examples

To illustrate Rego code in action, here’s a look at policies that support some common use cases for OPA in Kubernetes.

First, let’s examine an example policy that requires the use of the labels app and team for Kubernetes resources:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        missing_labels[label] {
          required := input.parameters.labels[_]
          not input.review.object.metadata.labels[required]
          label := required
        }

        violation[{"msg": msg}] {
          count(missing_labels) > 0
          msg := sprintf("Missing required labels: %v", [missing_labels])
        }

To enforce this policy (which means using the admission controller to validate resources before allowing them), you would apply a constraint like the following:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: require-app-and-team-labels
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
      - apiGroups: ["apps"]
        kinds: ["Deployment", "StatefulSet", "DaemonSet"]
  parameters:
    labels:
      - "app"
      - "team"

Second, here’s a sample policy that restricts the use of image registries other than those that are specifically approved. (This is the same example we used above when looking at how to define and install ConstraintTemplates and constraints.)

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        openAPIV3Schema:
          properties:
            repos:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos

        # Reject containers whose image doesn't start with any allowed repo prefix.
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          valid := [good | repo := input.parameters.repos[_]; good := startswith(container.image, repo)]
          not any(valid)
          msg := sprintf("container image '%s' is not from an allowed registry", [container.image])
        }

        # Apply the same check to initContainers.
        violation[{"msg": msg}] {
          container := input.review.object.spec.initContainers[_]
          valid := [good | repo := input.parameters.repos[_]; good := startswith(container.image, repo)]
          not any(valid)
          msg := sprintf("initContainer image '%s' is not from an allowed registry", [container.image])
        }

The following constraint would enforce this policy:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: allowed-repos
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    repos:
      - "gcr.io/your-approved-repo"
      - "docker.io/library"

Finally, here’s a sample Rego policy that enforces memory and CPU limits:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredresourcelimits
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredResourceLimits
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredresourcelimits

        # Collect the limit types (memory, cpu) that a container lacks.
        missing_limits(container) = missing {
          required := ["memory", "cpu"]
          missing := [limit | limit := required[_]; not container.resources.limits[limit]]
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          missing := missing_limits(container)
          count(missing) > 0
          msg := sprintf("Container '%s' is missing resource limits: %v", [container.name, missing])
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.initContainers[_]
          missing := missing_limits(container)
          count(missing) > 0
          msg := sprintf("InitContainer '%s' is missing resource limits: %v", [container.name, missing])
        }

You can enforce this policy with the following constraint:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredResourceLimits
metadata:
  name: enforce-container-resource-limits
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
      - apiGroups: ["apps"]
        kinds: ["Deployment", "StatefulSet", "DaemonSet", "ReplicaSet"]

OPA Kubernetes best practices for scalable policy management

To get the most out of OPA in Kubernetes (and, indeed, OPA in general), consider the following best practices:

  • Employ modular policy design: Although it’s possible to write Rego code that will test for multiple conditions (like check labels while also checking CPU and memory limits), it’s a best practice to keep policies modular by creating a separate policy for each condition that you want to evaluate. This makes it easier to apply policies granularly, and to modify policy specifications on a policy-by-policy basis.
  • Use descriptive error messages: Instead of configuring policies to generate generic messages like “true” or “false,” consider including descriptive errors to help admins understand exactly what was evaluated (see the sketch after the table below).

  • Reuse policies across environments: In general, it’s best to reuse policies across namespaces and resources as much as possible. Trying to maintain separate policies for each namespace or team leads to clutter and management challenges.

| Practice | Description |
|---|---|
| Modular policy design | Design each policy to evaluate just one condition. |
| Descriptive error messages | Make error messages descriptive, instead of producing simple messages like "true" or "false". |
| Policy reuse | Reuse policies across environments to the extent possible, rather than trying to create different policies for each environment. |
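
As an illustration of the second practice, a violation message built with sprintf that names the offending resource is far more actionable than a generic one. A quick sketch:

# Vague: tells the admin nothing about what failed or where.
msg := "policy violation"

# Descriptive: names the container and the missing field.
msg := sprintf("Container '%s' is missing a memory limit", [container.name])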

How groundcover enhances OPA in Kubernetes

While OPA will tell you when something is wrong with a Kubernetes resource configuration before you deploy it, groundcover tells you when something goes wrong with a running workload. It also provides the powerful Kubernetes monitoring and Kubernetes observability you need to get to the root of even the most complex Kubernetes performance challenges.

By combining OPA with groundcover, you get double assurance: The confidence of knowing that your resources conform with policy rules defined in OPA, and the confidence of being able to identify and troubleshoot performance issues quickly and efficiently, using groundcover’s unique, eBPF-based approach to observability.

OPA for the win

If OPA didn’t exist, confirming that Kubernetes resources are actually configured in the way they should be configured would be tough, to put it mildly. You’d have to review your configurations by hand, and hope that you didn’t miss any important details.

Fortunately, Kubernetes admins’ lives are easier thanks to OPA, which automates and scales policy enforcement across your cluster.
