Deep Dive into Policy Controllers and their impact on Cluster Management

Author: Andreas Ttofi | Posted on: July 3, 2024



What is a policy engine in Kubernetes?

In this article, we are going to look at how Policy Engines can help enforce organisational standards in Kubernetes. We will do a deep dive into the most popular available open-source solutions and compare their approaches and strengths, to enable readers to more rapidly make informed decisions when faced with the need for enforcing standards.

In every organisation, there are standards that must be followed when deploying resources in a Kubernetes cluster. Such standards may govern things like resource naming, admission criteria, and object validation or evaluation. Policies offer a solution: they can validate, or even mutate, resource requests to enforce the standards your organisation has established.


How does a policy engine work?

Kubernetes policy engines are designed to enforce specific rules and guidelines for the creation, configuration, and management of Kubernetes objects within a cluster. These engines enable administrators to implement and automate governance and compliance requirements across their Kubernetes environments. Here is an overview of how a policy engine works:

Admission Control

Kubernetes policy engines often integrate with the Kubernetes admission controllers. Admission controllers are part of the Kubernetes API server and intercept requests to the Kubernetes API after they have been authenticated and authorised but before the objects are persisted. Policy engines use this mechanism to evaluate the requests against the defined policies.

Policy Evaluation

When a request is made to create, update, or delete a Kubernetes object, the policy engine evaluates the request against the set of defined policies. This evaluation process involves checking the attributes of the Kubernetes objects in the request against the conditions specified in the policies.

Enforcement

Depending on the outcome of the policy evaluation, the policy engine enforces the policies by doing one of the following:

  • Allowing the request to proceed if it complies with all the defined policies.
  • Denying the request if it violates any policy, optionally returning an error message explaining the reason for the denial.
  • Mutating the request by altering its content to make it compliant with the policies before it is processed by the Kubernetes API server.

Reporting and Auditing

Policy engines often provide reporting and auditing capabilities, allowing administrators to review which requests were allowed or denied and why. This helps in compliance reporting and in identifying potential issues in the policy definitions or cluster usage.
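The evaluation and enforcement steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular engine: the `Decision` type, the `admit` function and the two sample policies are hypothetical names. The sketch applies mutations before validations, which mirrors the order in which Kubernetes calls mutating and validating admission webhooks.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Decision:
    """Outcome of admitting one object (hypothetical type for illustration)."""
    allowed: bool
    obj: dict
    messages: list = field(default_factory=list)


def admit(obj: dict, mutators: list, validators: list) -> Decision:
    """Apply every mutation first, then validate the resulting object."""
    for mutate in mutators:
        obj = mutate(obj)
    # Each validator returns a violation message, or None when satisfied.
    messages = [msg for check in validators if (msg := check(obj)) is not None]
    return Decision(allowed=not messages, obj=obj, messages=messages)


def add_team_label(obj: dict) -> dict:
    """Mutation: default the 'team' label without modifying the input."""
    metadata = obj.get("metadata", {})
    labels = {"team": "unassigned", **metadata.get("labels", {})}
    return {**obj, "metadata": {**metadata, "labels": labels}}


def forbid_latest_tag(obj: dict) -> Optional[str]:
    """Validation: reject containers whose image uses the 'latest' tag."""
    for container in obj.get("spec", {}).get("containers", []):
        if container["image"].endswith(":latest"):
            return f"container {container['name']!r} uses the 'latest' tag"
    return None
```

Calling `admit(pod, [add_team_label], [forbid_latest_tag])` returns a `Decision` whose `allowed` flag, possibly mutated `obj` and `messages` correspond to the allow, mutate and deny outcomes listed above.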

Policy Definition

The first step involves defining the policies that need to be enforced. Policies can cover a wide range of aspects, such as security practices, resource constraints, naming conventions, and network configurations. These policies are typically defined declaratively, in YAML, JSON, or a dedicated policy language, and specify the rules and the actions to be taken when those rules are violated.
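To make "declarative" concrete, here is a small sketch in which the policy is plain data and a generic function interprets it. The JSON schema (`match`, `require`, `action`, `message`) is invented for this example and is not the format of any real engine.

```python
import json

# A made-up declarative policy format, used only for illustration.
POLICY_JSON = """
{
  "name": "require-owner-label",
  "match": {"kinds": ["Pod", "Deployment"]},
  "require": {"labels": ["owner"]},
  "action": "deny",
  "message": "every workload must carry an 'owner' label"
}
"""


def evaluate(policy: dict, obj: dict) -> tuple:
    """Return (action, message); the action is 'allow' when the rule is satisfied."""
    if obj.get("kind") not in policy["match"]["kinds"]:
        return ("allow", "")  # the policy does not apply to this kind
    labels = obj.get("metadata", {}).get("labels", {})
    missing = [key for key in policy["require"]["labels"] if key not in labels]
    if missing:
        return (policy["action"], policy["message"])
    return ("allow", "")


policy = json.loads(POLICY_JSON)
```

The rule and the action taken on violation both live in the data, so changing the policy does not require changing the evaluation code.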

Policy Storage

Once defined, policies are stored in a centralised location or within the Kubernetes cluster itself. This allows for easy management, versioning, and distribution of policies across the cluster or multiple clusters.

Integration with DevOps Practices

Kubernetes policy engines can also be integrated into DevOps practices, particularly within Continuous Integration/Continuous Deployment (CI/CD) pipelines. This allows teams to evaluate and enforce compliance of Kubernetes manifests against defined policies before these manifests are deployed to the cluster. This proactive approach, often referred to as shift-left security, ensures that only configurations that meet the organisation’s policy requirements are applied, significantly reducing the risk of deploying non-compliant or potentially insecure resources.
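As a rough illustration of such a shift-left check, the sketch below validates manifests before deployment and returns a non-zero exit code on violations, which is what makes a pipeline step fail. The rule (every container must declare resource limits) and the function names are assumptions made for this example. Manifests are read as JSON to keep the sketch within the standard library; real pipelines usually use YAML and the policy engine's own CLI.

```python
import json
import sys


def find_violations(manifest: dict) -> list:
    """Example rule: every container must declare resource limits."""
    violations = []
    pod_spec = manifest.get("spec", {})
    # Deployments and similar workloads nest the pod spec under spec.template.spec.
    pod_spec = pod_spec.get("template", {}).get("spec", pod_spec)
    name = manifest.get("metadata", {}).get("name", "<unnamed>")
    for container in pod_spec.get("containers", []):
        if "limits" not in container.get("resources", {}):
            violations.append(
                f"{name}: container {container['name']!r} has no resource limits"
            )
    return violations


def main(paths: list) -> int:
    """Check each manifest file; return 1 if any violation was found, else 0."""
    failures = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            failures.extend(find_violations(json.load(handle)))
    for failure in failures:
        print(failure, file=sys.stderr)
    return 1 if failures else 0
```

A pipeline step would call `sys.exit(main(sys.argv[1:]))` so that a non-compliant manifest stops the deployment.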


Example of a policy

Here is a simple example of how a policy can validate a resource. A user tries to create a pod in their namespace. Once the request is made to the API server, the admission controller captures it and compares it against the policy. If the request for the new resource (the pod) complies with the policy rules in place, the pod gets created. If the request does not comply, the validation fails and the pod is not created.

[Figure: a pod creation request validated against a policy by the admission controller]


Deep Dive to each solution

In the sections that follow, we take a deep dive into the most popular open-source solutions: OPA Gatekeeper, Kyverno, Kubewarden, and JsPolicy. We will start with OPA Gatekeeper, an extension of the Open Policy Agent that provides a comprehensive framework for policy enforcement in Kubernetes, focusing on its declarative policy language, Rego, and how it integrates with Kubernetes' Admission Controllers. Next, we’ll examine Kyverno, known for its Kubernetes-native approach and simplicity: policies are defined as Kubernetes resources in YAML, without the need to learn a new language. Following that, we will examine Kubewarden, which uses WebAssembly to write and enforce policies, offering a high degree of flexibility and security. Lastly, we’ll examine JsPolicy, which leverages JavaScript or TypeScript for policy definition, appealing to organisations familiar with these languages.


General Comparison of Solutions

[Figure: general comparison of the solutions]


OPA (Open Policy Agent) Gatekeeper

Our first deep dive is into the most mature and widely used policy engine, OPA Gatekeeper. OPA is a general-purpose policy engine, which means its capabilities are not limited to Kubernetes clusters. Its use cases include microservice authorization, infrastructure, data source filtering, CI/CD pipeline policies and, of course, Kubernetes admission control. OPA Gatekeeper, on the other hand, is a specialised project that provides integration between OPA and Kubernetes.

Gatekeeper integrates with the Kubernetes API server through the dynamic admission control mechanism, specifically by registering itself as a Validating and Mutating Admission Webhook. This integration is critical for Gatekeeper to enforce custom policies on resources as they are created or updated within the Kubernetes cluster.

Here is a summary of how this process works:

Admission Webhook Registration

When Gatekeeper is installed in a Kubernetes cluster, it registers itself with the API server as an admission webhook. This registration includes specifying the operations (e.g., CREATE, UPDATE), and the types of resources (e.g., Pods, Services) Gatekeeper should intercept. This is configured in the webhook configuration object, which tells the Kubernetes API server to send certain requests to Gatekeeper for evaluation before processing them.

Request Interception

Once Gatekeeper is registered, the Kubernetes API server forwards relevant API requests to it before they are persisted in etcd. This happens after the request has been authenticated and authorized, but before it is executed. The request is sent to Gatekeeper as an AdmissionReview object, which includes the resource being created or modified and the operation being performed.
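For orientation, the sketch below shows a trimmed AdmissionReview request in the `admission.k8s.io/v1` format and pulls out the fields a policy typically inspects. The uid, user, names and image are made-up sample values, and `summarise` is a hypothetical helper, not part of Gatekeeper.

```python
import json

# Trimmed AdmissionReview request as an API server would send it to a webhook.
# All concrete values below are sample data.
ADMISSION_REVIEW = json.loads("""
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "request": {
    "uid": "705ab4f5-6393-11e8-b7cc-42010a800002",
    "kind": {"group": "", "version": "v1", "kind": "Pod"},
    "namespace": "team-a",
    "operation": "CREATE",
    "userInfo": {"username": "jane"},
    "object": {
      "apiVersion": "v1",
      "kind": "Pod",
      "metadata": {"name": "web", "namespace": "team-a"},
      "spec": {"containers": [{"name": "app", "image": "nginx:1.27"}]}
    }
  }
}
""")


def summarise(review: dict) -> dict:
    """Extract the operation, kind, requester and container images of a Pod request."""
    request = review["request"]
    return {
        "uid": request["uid"],
        "operation": request["operation"],
        "kind": request["kind"]["kind"],
        "namespace": request.get("namespace", ""),
        "user": request["userInfo"]["username"],
        "images": [c["image"] for c in request["object"]["spec"]["containers"]],
    }
```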

Policy Evaluation

Upon receiving an AdmissionReview request, Gatekeeper evaluates the request against its loaded policies (defined as ConstraintTemplates and instantiated through Constraints). Gatekeeper uses the Rego language to define these policies, allowing for complex logic and evaluation against the attributes of the resources included in the request.
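The split between ConstraintTemplates and Constraints can be pictured with a Python analogy. This is only an analogy: real ConstraintTemplates embed Rego and real Constraints are Kubernetes custom resources, and every name below is hypothetical. The template holds reusable logic, and each constraint instantiates it with its own parameters.

```python
from typing import Callable


def required_labels_template(parameters: dict) -> Callable[[dict], list]:
    """'Template': reusable logic whose parameters are supplied later."""
    required = parameters["labels"]

    def violations(obj: dict) -> list:
        present = obj.get("metadata", {}).get("labels", {})
        return [
            f"missing required label: {name}"
            for name in required
            if name not in present
        ]

    return violations


# 'Constraints': two instances of the same template with different parameters.
require_owner = required_labels_template({"labels": ["owner"]})
require_cost_tags = required_labels_template({"labels": ["cost-centre", "environment"]})
```

The same logic is written once and reused, while each constraint decides what it actually enforces.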

Admission Response

After evaluating the request against the defined policies, Gatekeeper constructs an AdmissionResponse object. This response indicates whether the request should be allowed or denied based on the policy evaluation. If the request is to be denied, the AdmissionResponse includes an error message explaining which policy was violated and why.

Enforcement by the API Server

The Kubernetes API server receives Gatekeeper’s AdmissionResponse and acts accordingly. If the response is an “allow,” the API server proceeds with processing the request, eventually persisting the resource in etcd. If the response is a “deny,” the API server rejects the request and returns the error message from Gatekeeper to the user, preventing the resource from being created or updated.
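The response a validating webhook returns can be sketched as follows. The field names follow the `admission.k8s.io/v1` AdmissionReview format, while `build_response` itself is a hypothetical helper and not Gatekeeper code.

```python
def build_response(request_uid: str, violations: list) -> dict:
    """Build an AdmissionReview response that allows or denies a request.

    The uid must echo the uid of the incoming request so the API server
    can match the response to it.
    """
    response = {"uid": request_uid, "allowed": not violations}
    if violations:
        # status.message is the text returned to the user who made the request.
        response["status"] = {"code": 403, "message": "; ".join(violations)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

An empty list of violations yields an "allow" response, and any violation yields a "deny" with the messages joined into a single explanation.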

Mutating Admission Webhook

In addition to validating requests, Gatekeeper can also be configured as a mutating admission webhook, which allows it to modify requests to make them compliant with certain policies before they are processed by the API server. This is a more advanced and less commonly used feature, as it requires careful policy design to prevent unintended modifications.
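At the webhook level, a mutation is expressed as a JSON Patch (RFC 6902) that is base64-encoded into the admission response. The sketch below shows that general mechanism; it is not Gatekeeper's own mutation API, and `mutating_response` and the sample operation are hypothetical.

```python
import base64
import json


def mutating_response(request_uid: str, patch_ops: list) -> dict:
    """Allow the request and attach a base64-encoded JSON Patch."""
    encoded = base64.b64encode(json.dumps(patch_ops).encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request_uid,
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": encoded,
        },
    }


# Sample operation: add a label. It assumes /metadata/labels already exists;
# if it does not, the patch has to create the labels map first.
ops = [{"op": "add", "path": "/metadata/labels/team", "value": "platform"}]
```

The comment on the sample operation hints at why mutation needs careful design: a patch written against an assumed object shape can fail or produce unintended results when the object looks different.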


Architecture

[Figure: Gatekeeper architecture]


Pros & Cons of Gatekeeper

Pros

Fine-Grained Policies

  • Gatekeeper allows for the definition of fine-grained policies using the Rego language, providing high flexibility and precision in policy enforcement.

Native Kubernetes Integration

  • As a Kubernetes-native solution, Gatekeeper seamlessly integrates with the Kubernetes ecosystem, leveraging existing components like the Admission Controller for policy enforcement.

Declarative Management

  • Policies in Gatekeeper are defined declaratively, which aligns well with Kubernetes' overall design philosophy and simplifies the management of policies as code.

Extensibility

  • Through the use of ConstraintTemplates, Gatekeeper offers a highly extensible framework, allowing organisations to define custom policies that cater to their specific requirements.

Community Support

  • Being part of the Open Policy Agent (OPA) project, Gatekeeper benefits from strong community support and ongoing development efforts, ensuring that it stays up-to-date with the latest in policy management and Kubernetes security trends.

Audit Capability

  • Gatekeeper provides auditing features, allowing administrators to assess historical compliance and identify resources that violate the defined policies, enhancing security and governance.

Cons

Learning Curve

  • The need to learn Rego for policy definition can present a learning curve for users not already familiar with this language, potentially slowing down initial adoption and policy development.

Maintenance Overhead

  • The complexity of the Rego language adds management overhead, as the team must ensure that enough of its members know the language.

Complexity in Policy Management

  • While the declarative nature of Gatekeeper’s policies is helpful, managing a large number of complex policies can become challenging, requiring careful organisation and governance.

Exceptions support

  • Exceptions have to be defined within the ConstraintTemplates and Constraints, which often requires more complex Rego policies.

Kyverno

Kyverno is a policy engine designed specifically for Kubernetes, developed by Nirmata. It is recognized as an incubating project under the Cloud Native Computing Foundation (CNCF), highlighting its growing importance and adoption within the cloud-native ecosystem. Kyverno stands out for its Kubernetes-native approach to policy management, allowing cluster administrators to define, manage, and enforce policies directly on Kubernetes resources without the need to write complex code.

Kyverno works by integrating directly with the Kubernetes API to apply policies as Kubernetes resources themselves, which simplifies the process of policy management in Kubernetes environments. Here’s an overview of how Kyverno operates within a Kubernetes cluster:

Dynamic Admission Control

  • Kyverno registers itself as a dynamic admission controller with the Kubernetes API server. This allows Kyverno to intercept API requests, such as the creation, update, or deletion of Kubernetes resources, before they are processed and persisted in the cluster’s etcd database.

Policies as Kubernetes Resources

  • Administrators define policies using Kyverno custom resources, such as ClusterPolicy (non-namespaced) and Policy (namespaced). These policies are written in YAML, similar to other Kubernetes resources, making them accessible to users familiar with Kubernetes. The policies specify rules that can validate, mutate, or generate Kubernetes resources.

Validation Rules

  • Validation rules check the configuration of resources to ensure they meet certain criteria before being allowed in the cluster. If a resource fails validation, the request is rejected, and an error is returned to the user.

Mutation Rules

  • Mutation rules automatically modify resources as they are created or updated, either to ensure they comply with organisational standards or to inject certain configurations.

Generation Rules

  • Generation rules create additional resources based on the presence or configuration of other resources. This can be useful for ensuring related resources are always deployed together.

Policy Matching

  • When Kyverno receives an admission review request from the Kubernetes API server, it evaluates the request against all applicable policies. Policies can be scoped to apply to specific kinds of resources, namespaces, or even specific resource names, and can use label selectors for finer granularity.
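The scoping described above can be sketched as a small matching function. The `match` dictionary below is a simplified, hypothetical shape inspired by the criteria listed (kinds, namespaces, names, label selectors); Kyverno's real schema is richer and is not reproduced here.

```python
from fnmatch import fnmatch


def matches(match: dict, obj: dict) -> bool:
    """True when the object satisfies every criterion present in the match block."""
    meta = obj.get("metadata", {})
    if "kinds" in match and obj.get("kind") not in match["kinds"]:
        return False
    if "namespaces" in match and meta.get("namespace") not in match["namespaces"]:
        return False
    # Names may use shell-style wildcards such as "web-*".
    if "names" in match and not any(
        fnmatch(meta.get("name", ""), pattern) for pattern in match["names"]
    ):
        return False
    labels = meta.get("labels", {})
    wanted = match.get("selector", {}).get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())
```

Criteria that are absent from the match block are simply not checked, so an empty block matches every resource in this sketch.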

Policy Enforcement

  • For each matching policy, Kyverno applies the rules defined within that policy to the resource in the admission review request.
  • If a validation rule fails, Kyverno rejects the request and returns an error explaining the violation.
  • If a mutation rule applies, Kyverno modifies the resource in the request according to the rule and forwards the modified version to the API server for processing.
  • If a generation rule is triggered, Kyverno creates the specified resources automatically.
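The three rule types can be contrasted in a short sketch: a validation rule can reject, a mutation rule returns a changed object, and a generation rule yields an additional resource. The rule dictionaries, the helper functions and the choice to discard generated resources when a request is rejected are assumptions made for this illustration, not Kyverno's implementation.

```python
import copy


def require_owner(ns: dict) -> bool:
    return "owner" in ns["metadata"].get("labels", {})


def add_managed_label(ns: dict) -> dict:
    patched = copy.deepcopy(ns)
    patched["metadata"].setdefault("labels", {})["managed-by"] = "policy"
    return patched


def default_deny(ns: dict) -> dict:
    """Build a NetworkPolicy that denies all ingress in the new namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny", "namespace": ns["metadata"]["name"]},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }


# Hypothetical rule shapes for illustration only.
RULES = [
    {"type": "validate", "check": require_owner, "message": "namespace needs an 'owner' label"},
    {"type": "mutate", "patch": add_managed_label},
    {"type": "generate", "build": default_deny},
]


def apply_rules(rules: list, obj: dict) -> dict:
    """Run each rule in order and collect the combined outcome."""
    result = {"allowed": True, "errors": [], "object": copy.deepcopy(obj), "generated": []}
    for rule in rules:
        if rule["type"] == "validate" and not rule["check"](result["object"]):
            result["allowed"] = False
            result["errors"].append(rule["message"])
        elif rule["type"] == "mutate":
            result["object"] = rule["patch"](result["object"])
        elif rule["type"] == "generate":
            result["generated"].append(rule["build"](result["object"]))
    if not result["allowed"]:
        result["generated"] = []  # assumption: a rejected request generates nothing
    return result
```

Applying `RULES` to a new Namespace that has an `owner` label returns the labelled Namespace together with a generated default-deny NetworkPolicy, which is the "related resources deployed together" case mentioned above.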

Audit Policy

  • Kyverno generates policy reports that provide feedback on policy violations, mutations, and generated resources. These reports help administrators understand the impact of policies and identify compliance issues within the cluster.

Architecture

[Figure: Kyverno architecture]


Pros & Cons of Kyverno

Pros

No Additional Language Required

  • Policies in Kyverno are defined using YAML, which means there’s no need to learn a new language for policy creation. This lowers the barrier to entry for teams already familiar with Kubernetes resources and configurations.

Automated Resource Generation

  • Kyverno's generation rules allow for the automatic creation of Kubernetes resources based on policy triggers.

Granular Exceptions

  • Kyverno offers detailed controls for exemptions, allowing for precise and flexible policy application tailored to specific needs or exceptions within the Kubernetes environment.

Declarative Policy Management

  • Kyverno policies are managed as Kubernetes resources, which fits well with the Kubernetes philosophy of declarative configuration. This approach enables version control of policies alongside your application code, enhancing auditability and governance.

Exceptions Support

  • Allows for straightforward definition of exceptions within policies. For example, you can specify that a particular policy should not be applied to resources with a certain label or in a certain namespace.

Multi Tenant Support

  • Kyverno is highly flexible and supports the deployment of policies at a namespace level, allowing tenants to define their own policies.

Cons

Dependent on Kubernetes Versions

  • Since Kyverno relies on Kubernetes APIs and features, there may be dependencies on specific Kubernetes versions. This could potentially limit the use of certain Kyverno features based on the Kubernetes version deployed.

Potential for Complex Policies

  • As with any large system, there’s potential to create overly complex policies that are challenging to understand and maintain. This can make troubleshooting harder and increase the risk of misconfigurations, especially in the case of Kyverno, which does not use a programming language.

Limited to Kubernetes

  • As Kyverno is specifically designed for Kubernetes, its use is limited to environments where Kubernetes is the orchestration tool in use. This could be a limitation for organisations using multiple or different orchestration systems.
  • Its policies are built directly on the Kubernetes admission APIs.

Kubewarden

Third in our comparison list is Kubewarden. It is currently a CNCF sandbox project and was originally developed by Rancher by SUSE. Unlike traditional policy engines that rely on domain-specific languages, Kubewarden leverages the flexibility and security of WebAssembly (Wasm) to execute policies, allowing developers to write policies in familiar programming languages before compiling them into Wasm modules. This approach not only provides flexibility in policy creation, enabling more contributors within an organization, but also ensures a high level of isolation and safety due to the sandboxed nature of WebAssembly. How exactly does Kubewarden work?

Kubewarden operates as a dynamic admission controller within Kubernetes. This means it intercepts API requests to create, update, or delete Kubernetes resources before they are persisted by the API server.

Policy Evaluation

For each intercepted request, the Kubewarden controller invokes the relevant Wasm policy modules to evaluate the request. The policies inspect the request details (such as the resource kind, metadata, and specifications) and determine whether it complies with the defined rules.

Admission Decision

Based on the evaluation, the Wasm modules return a decision to the Kubewarden controller. If any policy is violated, the controller rejects the request, and it’s prevented from proceeding, ensuring that only compliant resources are deployed or modified in the cluster. If the request passes all policy checks, it’s allowed to proceed.

Kubewarden can also mutate newly created resources, rewriting the request so that it complies with the standards. A mutation can be as simple as adding a label.

Finally, Kubewarden supports context-aware policies, which can fetch the current state of the cluster and compare the incoming resource against it. A simple example is rejecting a resource that duplicates one already deployed in the cluster.
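A context-aware check of this kind can be sketched as follows. In a real engine the current cluster state is supplied by the policy runtime; here it is passed in as a plain list, and the function name and the chosen rule (no two Ingress objects may claim the same host) are assumptions for illustration. Real Kubewarden policies are compiled to Wasm rather than written in Python.

```python
def duplicate_hosts(new_ingress: dict, existing_ingresses: list) -> list:
    """Return the hosts of the new Ingress that another Ingress already claims."""

    def hosts(ingress: dict) -> set:
        rules = ingress.get("spec", {}).get("rules", [])
        return {rule["host"] for rule in rules if "host" in rule}

    def key(ingress: dict) -> tuple:
        meta = ingress["metadata"]
        return (meta.get("namespace"), meta["name"])

    taken = set()
    for ingress in existing_ingresses:
        # Skip the object itself so that an update is not flagged as a duplicate.
        if key(ingress) != key(new_ingress):
            taken |= hosts(ingress)
    return sorted(hosts(new_ingress) & taken)
```

A non-empty result would lead the policy to reject the request, since the decision depends on what is already deployed rather than on the incoming object alone.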

Policy Development

Developers or security engineers write policies according to the organization’s requirements. These policies can be about anything that needs to be enforced in the Kubernetes environment, such as ensuring that only images from a trusted registry are used, or that all Pods have resource limits set. Thanks to Kubewarden's flexibility, these policies can be written in languages like Rust, Go, or others that compile to Wasm, and Kubewarden can also run policies written in Rego, the language used by Gatekeeper.

Compilation

Once the policies are written, they are compiled into WebAssembly (Wasm) modules. This compilation step transforms the high-level policy code into a binary format that can be executed in a sandboxed environment, providing security and portability.

Feedback and Reporting

Kubewarden provides detailed feedback on its policy decisions, which can be used for auditing and compliance purposes. This feedback includes information on which policies were evaluated, their decisions, and any messages returned by the policies explaining their decisions. A further component of Kubewarden is the audit-scanner, which checks the resources already in the cluster and flags those that do not adhere to the deployed Kubewarden policies.
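The idea behind an audit scan, as opposed to admission-time enforcement, can be sketched like this: existing resources are checked against the policies and violations are reported, but nothing is blocked. The `audit` function, the report shape and the sample policy are hypothetical and do not reflect the audit-scanner's actual output format.

```python
from collections import defaultdict


def no_privileged(resource: dict):
    """Sample policy: return a message if any container runs privileged."""
    for container in resource.get("spec", {}).get("containers", []):
        if container.get("securityContext", {}).get("privileged"):
            return f"container {container['name']!r} is privileged"
    return None


def audit(resources: list, policies: dict) -> dict:
    """Check existing resources against policies without blocking anything.

    `policies` maps a policy name to a function that returns a violation
    message or None. The result maps each violated policy name to a list
    of (resource id, message) pairs.
    """
    report = defaultdict(list)
    for resource in resources:
        meta = resource["metadata"]
        resource_id = f"{resource['kind']}/{meta.get('namespace', '-')}/{meta['name']}"
        for name, check in policies.items():
            message = check(resource)
            if message is not None:
                report[name].append((resource_id, message))
    return dict(report)
```

Such a scan is useful for resources that were created before a policy existed and therefore never passed through it at admission time.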


Architecture

[Figure: Kubewarden architecture]


Pros & Cons of Kubewarden

Pros

Flexibility in Policy Development

  • Kubewarden allows policies to be written in any programming language that compiles to WebAssembly, offering flexibility and leveraging existing developer skills, without the need to learn a new domain-specific language.

Strong Isolation with WebAssembly

  • Policies are executed in a sandboxed environment provided by WebAssembly, enhancing security by isolating the policy execution from the Kubernetes cluster and reducing the risk of malicious exploits.

Performance Efficiency

  • WebAssembly is designed for high performance and low overhead, making Kubewarden an efficient solution for policy evaluation, even in high-load environments.

Cross-Platform Portability

  • WebAssembly ensures that policies run consistently across different environments and Kubernetes distributions, thanks to its platform-agnostic design.

Community and Ecosystem

  • As a CNCF sandbox project, Kubewarden benefits from community contributions, shared knowledge, and a growing ecosystem of integrations and tooling.

Exceptions Support

  • Allows for straightforward definition of exceptions within policies. For example, you can specify that a particular policy should not be applied to resources with a certain label or in a certain namespace.

Multi Tenant Support

  • Tenants can deploy their own policies using Kubewarden’s PolicyServers, which can be configured to enforce policies within specific namespaces or sets of namespaces.

Cons

Maturity

  • As a relatively new project, Kubewarden might not have the same level of maturity, widespread adoption, or extensive documentation as more established Kubernetes policy engines, potentially leading to challenges in troubleshooting and community support.

Tooling and Integration

  • Given its unique approach using WebAssembly, the ecosystem of tools specifically designed for developing, testing, and debugging Kubewarden policies is still evolving. This might limit some capabilities compared to more mature solutions with extensive tooling support.

JsPolicy

JsPolicy, developed by Loft Labs, is an open-source policy engine tailored for Kubernetes that uses the familiar and versatile JavaScript language to define and enforce cluster policies. Because policies are scripted in JavaScript, policy creation becomes accessible to a wider range of developers and administrators, who can apply their existing JavaScript expertise to security and compliance within Kubernetes environments. This approach not only lowers the learning curve for policy management but also harnesses the dynamic capabilities of JavaScript. It offers a flexible and user-friendly solution for managing complex Kubernetes policies, all while benefiting from the collaborative and transparent nature of its open-source community.

JsPolicy operates through three main parts: the Webhook Manager, the V8 JavaScript Sandbox Pool and the Policy Compiler. Here is a breakdown of what happens in each of them:

Webhook Manager Registration

Webhook Manager registers with the Kubernetes API server as a dynamic admission webhook, which allows it to intercept requests to the API server, such as creating, updating, or deleting Kubernetes resources. When a request is intercepted, the webhook manager examines the request and determines which JsPolicy policies are applicable based on the type of resource and the operation being performed. The webhook manager then forwards the request to the appropriate JavaScript policy for evaluation, waiting for the policy’s decision on whether to allow, deny, or mutate the request.

V8 JavaScript Sandbox Pool

For executing JavaScript policies, JsPolicy uses the V8 JavaScript engine, which is known for its performance. Each JavaScript policy runs inside its own isolated sandbox environment provided by the V8 engine. This sandbox is essential for security, ensuring that policy code execution is isolated from the host environment and from other policies, preventing any malicious code within a policy from affecting the Kubernetes cluster or the JsPolicy system itself. The sandbox pool refers to the collection of these isolated environments, where multiple policies can be executed concurrently in their sandboxes, ensuring efficiency and scalability.

Policy Compiler

The policy compiler is responsible for transforming the JavaScript policy code written by developers into an optimized form that can be executed by the V8 engine within the sandbox environments. This compilation process includes parsing the JavaScript code, performing various optimizations, and compiling it into a lower-level representation that the V8 engine can execute more efficiently. The policy compiler ensures that the JavaScript policies are ready to be evaluated quickly and efficiently each time a relevant Kubernetes API request is intercepted, minimizing the performance impact on the cluster.

So the workflow looks like this:

  • A request is made to the Kubernetes API (e.g., to create a Pod).
  • The webhook manager intercepts this request as part of the dynamic admission control process.
  • The webhook manager identifies which JsPolicy policies apply to this request.
  • The applicable JavaScript policies are then invoked within their isolated V8 sandboxes for evaluation.
  • Each policy evaluates the request, making decisions based on the policy logic written in JavaScript.
  • Decisions to allow, deny, or mutate the request are collected and acted upon. If any policy denies the request, it is rejected. Mutation policies can alter the request before it proceeds.
  • The Kubernetes API server completes the request based on the policies' decisions, enforcing the organization’s rules and standards.

Architecture

[Figure: JsPolicy architecture]


Pros & Cons of JsPolicy

Pros

Familiarity of JavaScript

  • By using JavaScript, JsPolicy allows more team members to contribute without the need to learn a new language.

Dynamic and Flexible Scripting

  • JavaScript enables complex logic and flexibility in policy definitions, allowing for sophisticated control over Kubernetes resources and operations.

Fast Execution with V8 Engine

  • JsPolicy uses the V8 JavaScript engine, known for its high performance, ensuring policies are evaluated quickly and efficiently, minimizing the impact on cluster operations.

Secure Sandbox Execution

  • Policies are executed in isolated sandbox environments, enhancing security by preventing malicious or buggy code from affecting the Kubernetes cluster or the host system.

Ease of Integration

  • Being Kubernetes-native and leveraging CRDs (Custom Resource Definitions) for policy management, JsPolicy integrates seamlessly with Kubernetes ecosystems, fitting well into existing workflows.

Exceptions Support

  • Allows for straightforward definition of exceptions within policies. For example, you can specify that a particular policy should not be applied to resources with a certain label or in a certain namespace.

Multi Tenant Support

  • Tenants can have their own unique policies applied to their resources and, similar to Kyverno, JsPolicy supports defining policies that apply only to certain namespaces or objects.

Cons

Performance Overhead

  • While the V8 engine is fast, running JavaScript policies for every admission request can introduce latency, especially in high-load environments or with complex policies, potentially affecting cluster performance.

JavaScript Limitations

  • The dynamic nature of JavaScript, while flexible, can also lead to runtime errors and harder-to-debug issues compared to statically typed languages, potentially increasing the maintenance burden for complex policies.

Maturity and Ecosystem

  • JsPolicy is a newer entrant in the Kubernetes policy space, so its ecosystem, community support, and available integrations may not be as mature or extensive as those of more established policy engines. This also extends to its documentation and usage.

Final Thoughts

Choosing the right Kubernetes policy engine is a critical decision that can significantly impact the security, compliance, and operational efficiency of your Kubernetes environments. As we’ve explored the aspects of OPA/Gatekeeper, Kyverno, Kubewarden, and JsPolicy, it is evident that each policy engine offers a unique blend of features, design philosophies, and operational implications. From the flexibility and security of WebAssembly-based policy execution in Kubewarden to the widespread familiarity of JavaScript with JsPolicy, and from the declarative power of Rego in OPA/Gatekeeper to the Kubernetes-native approach of Kyverno, each engine caters to different organizational needs, skill sets, and use cases.

In conclusion, the decision to adopt a particular Kubernetes policy engine should not be taken lightly. It requires a comprehensive evaluation of various factors, including organizational needs, technical compatibility, team expertise, performance considerations, future scalability, and security requirements. A thorough comparison ensures that the selected policy engine not only meets the current needs but is also a strategic fit for the organization’s long-term Kubernetes strategy, enhancing governance, security, and operational efficiency across your clusters.