Kubeadmiral, Karmada, and Multi-cluster Federation Standards
A client recently asked us for a comparison of two multi-cluster Kubernetes management technologies, specifically Kubeadmiral and Karmada. In this post we’ll introduce these technologies, and the Kubernetes standards that influenced their architecture, to explain how they enable multi-cluster workload orchestration. The focus will be entirely on managing the manifests inside existing Kubernetes clusters, specifically excluding standards such as the Cluster API, which are concerned with provisioning new clusters from scratch.
Introduction
When we talk about multi-cluster orchestration (also known as federation), we are referring to having multiple Kubernetes clusters, each of which is managed by some centralized authoritative source of truth for what manifests should be deployed. For clarity, we will refer to them as hub and spoke clusters:
- Hub clusters are the authoritative single source of truth that defines what manifests should be deployed to all managed clusters
- Spoke clusters are associated with a hub cluster; the local state on a spoke cluster is constantly reconciled towards the desired state defined by the hub cluster
Two approaches exist for synchronizing this configuration between the hub and spoke clusters, the push and pull models:
- In the push model, the hub cluster proactively manages manifests deployed on the spoke clusters (e.g., by directly interacting with its Kubernetes API), ensuring that they stay reconciled against the defined configurations.
- In contrast, the pull model moves the reconciliation into spoke clusters themselves. Each spoke cluster monitors the hub cluster, autonomously pulling updates and reconciling local state as needed.
In either case, this hub-and-spoke model simplifies how platform engineering and application teams can deploy workloads consistently across an estate of managed clusters. By aligning all spoke clusters to a single source of truth, the system can automatically detect configuration drift and trigger reconciliation.
Crucially, this approach also leverages a Kubernetes-native approach to define, manage, and automate the federated rollout process, aligning with the broader cloud-native ecosystem for a familiar user experience.
Evaluation
We’ll evaluate the standards and solutions in terms of:
Capability | Reasoning |
---|---|
Dynamic Placement | Can manifests be dynamically placed on a subset of spoke clusters? Can replicas be dynamically spread? |
Dynamic Configuration | Can Kubernetes resources of all types be synchronized to spoke clusters? Can Kubernetes resources have per-cluster overrides applied to them when synchronized? |
Operational Complexity | How difficult/complex are the tools to deploy? How difficult/complex are the tools to run? How are the solutions architected and designed? |
Operational Mode Support | Is push-based management supported? Is pull-based management supported? |
Before we get into evaluating Kubeadmiral, we’ll take a bit of a detour to look at the first set of Kubernetes working group APIs targeting the use case of multi-cluster federation: Kubefed.
Standards: Kubefed
Across its v1 and v2 incarnations, Kubefed was a Kubernetes SIG Multicluster standard that introduced the terminology of a “federation control plane” (i.e., the hub cluster) to manage resources across multiple federated Kubernetes clusters (i.e., the spoke clusters). The hub is made aware of other clusters using Kubernetes objects: in v1 this was the Cluster object, while in v2 the FederatedCluster object served the same purpose. The federation control plane acts as a central manager, ensuring resources are synced and distributed across these clusters following the push model.
Kubefed v1 provided a predefined set of objects such as FederatedNamespace, FederatedDeployment, FederatedReplicaSet, and FederatedIngress, which were automatically synchronized across all spoke clusters. While functional, this version was rigid: it only supported a fixed set of resource types, and applied them identically across all known clusters in the federation.
```yaml
apiVersion: federation/v1
kind: Cluster
metadata:
  name: cluster-a
spec:
  # ... cluster endpoint and credentials ...
---
apiVersion: federation/v1
kind: Cluster
metadata:
  name: cluster-b
spec:
  # ... cluster endpoint and credentials ...
---
apiVersion: extensions/v1beta1
kind: FederatedDeployment
metadata:
  name: nginx-deployment
  namespace: nginx
spec:
  # ... deployment template ...
```
The above example shows YAML defining two spoke clusters (cluster-a and cluster-b), and a FederatedDeployment (to deploy nginx). Under Kubefed v1, the spec of the FederatedDeployment would be identically applied across all known spoke clusters (a and b).
Kubefed v2 iterated on three particular pain points identified in v1:
- Only being able to federate a restricted set of types (e.g., Deployment, Ingress, ConfigMap)
- The inability to selectively target specific clusters for resource deployment
- The inability to do cluster-specific overrides on the configuration of manifests that are federated
To solve the first issue, v2 introduced the FederatedTypeConfig object, which allowed Kubernetes administrators to specify any resource type they wanted to federate. For example, by creating a FederatedTypeConfig for a resource like apps/v1/Deployment, a corresponding FederatedDeployment type would be automatically generated, enabling it to be synchronized across clusters.
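To illustrate, the sketch below shows roughly what such a FederatedTypeConfig looked like in the Kubefed v2 reference implementation; it mirrors the kind of object kubefedctl would generate when enabling a type, and the exact group, version, and namespace values should be treated as illustrative rather than authoritative:

```yaml
# Illustrative FederatedTypeConfig enabling federation of apps/v1 Deployments.
# Kubefed's controllers would generate a corresponding FederatedDeployment CRD from this.
apiVersion: core.kubefed.io/v1beta1
kind: FederatedTypeConfig
metadata:
  name: deployments.apps
  namespace: kube-federation-system
spec:
  propagation: Enabled
  targetType:                  # The existing Kubernetes type to federate
    group: apps
    version: v1
    kind: Deployment
    pluralName: deployments
    scope: Namespaced
  federatedType:               # The generated Federated* wrapper type
    group: types.kubefed.io
    version: v1beta1
    kind: FederatedDeployment
    pluralName: federateddeployments
    scope: Namespaced
```

Once generated, the FederatedDeployment wrapper type could carry the template, placement, and overrides described next.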
The second and third issues were solved by adding placement and overrides support to all generated Federated* object types, which allow configuration of which clusters resources are federated to, and any cluster-specific overrides to be applied before synchronization.
```yaml
apiVersion: types.kubefed.io/v1beta1
kind: FederatedDeployment
metadata:
  name: nginx-deployment
  namespace: nginx
spec:
  template:
    # ... deployment template ...
  placement:
    clusters:
      - name: cluster1
      - name: cluster2
  overrides:
    - clusterName: cluster1
      clusterOverrides:
        - path: "/spec/replicas"
          value: 5
```
This example shows the placement and overrides properties being used to explicitly state which clusters the deployment should be federated across (cluster1 and cluster2), and that the spec.replicas field should be set to 5 on the resource when it is synchronized to cluster1.
Despite the advancements in v2, Kubefed failed to gain significant community traction, being seen as complex, lacking popular community-supported solutions, and competing against emerging alternatives such as Google Anthos. These factors culminated in the archival and abandonment of the Kubefed standard and reference implementations.
Technology: Kubeadmiral
The first popular federation solution we’ll look at is Kubeadmiral, a system inspired by Kubefed v2, but not a direct implementation of it. It follows the push model of a hub cluster managing the resources across multiple spoke Kubernetes clusters.
At a high level, Kubeadmiral allows you to define PropagationPolicy objects which target arbitrary Kubernetes objects and mark them for propagation to federated clusters, which are defined via FederatedCluster objects. PropagationPolicy objects allow you to target:
- All Kubernetes objects of a given group, version, and kind (e.g., apps/v1/Deployment)
- A subset of objects of a given group, version, and kind, using typical matchLabels and matchExpressions
- A specifically named object of a given group, version, and kind, explicitly by name

NOTE: PropagationPolicy objects are namespace-scoped, so the objects they target must be in the same namespace to be synchronized into federated clusters. A ClusterPropagationPolicy type exists that allows you to manage cluster-wide syncing of Kubernetes objects.
Kubeadmiral supports the placement and overrides functionality of Kubefed v2, where you can explicitly list a subset of spoke clusters to synchronize resources across, but it also adds more sophisticated capabilities, notably:
- clusterAffinity, clusterSelector & tolerations – Dynamically deploying federated resources to clusters based on cluster affinity, match labels, and match expressions
- autoMigration – Dynamically migrating resources away from clusters that have no capacity or have failed
- followerScheduling – Specify whether dependent resources should always follow their leader (e.g., a Deployment depends on a ConfigMap that must be present in the same cluster) to the target cluster(s)
- maxClusters – Specify an upper limit on how many clusters resources can be propagated across
- replicaStrategy – Configure whether Kubeadmiral uses a spread or bin-packing strategy when scheduling replicas across multiple clusters
- schedulingMode – “Duplicate” mode deploys a specified number of replicas across all selected clusters while “Divide” will use a configured strategy to place an overall desired number of replicas across available clusters
- reschedulePolicy – Specify when targeted resources should be rescheduled across target clusters, such as when clusters are added or removed, or manifests and policies are updated
The example YAML below demonstrates some of these features being exercised:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo-server
  namespace: default
  labels:
    app: echo-server
spec:
  replicas: 6
  selector:
    matchLabels:
      app: echo-server
  template:
    metadata:
      labels:
        app: echo-server
    spec:
      containers:
        - name: echo-server
          image: ealen/echo-server:latest
          ports:
            - containerPort: 8080
---
apiVersion: core.kubeadmiral.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: policy-echo-server
  namespace: default
spec:
  # Divide replicas dynamically across clusters
  schedulingMode: Divide
  clusterSelector: {}           # Apply to all clusters
  reschedulePolicy:
    replicaRescheduling:        # Enables replica rescheduling
      avoidDisruption: true     # Avoids moving replicas to prevent service disruption
      intervalSeconds: 300      # Reschedules every 5 minutes if needed
```
In this example we:
- Define a Deployment whose replicas should be spread across spoke clusters
- Define a PropagationPolicy that targets the Deployment for federation:
  - Selecting all known spoke clusters for federation
  - Dividing the replicas (6) across all spoke clusters (e.g., 1 * 6, 2 * 3, 3 * 2)
  - Allowing rescheduling of replicas across known clusters on a regular basis, while avoiding disruptions
Issues and Conclusion
While the functionality offered by Kubeadmiral is a sensible evolution of the capabilities offered by Kubefed v2, which served as its inspiration, the technology itself has a few issues which make adoption hard to recommend.
Notably, the documentation isn’t particularly comprehensive; you will often have to dive into the code to find the available configuration properties and understand their impact. Compared to the alternative products (which we’ll get to later), Kubeadmiral is the most lacking in terms of docs. On top of this, Kubeadmiral’s contributions seem to have slowed down, and no official support beyond Kubernetes 1.24 is mentioned anywhere.
From a technical/deployment perspective, Kubeadmiral makes some interesting choices. When you deploy Kubeadmiral to a Kubernetes cluster (which Kubeadmiral refers to as your meta-cluster), it actually creates a brand new virtual Kubernetes cluster inside your meta-cluster. This virtual cluster is your Kubeadmiral cluster. This means you have your meta-cluster, which runs your **hub** cluster virtually inside it, targeting all of your desired spoke clusters. The below figure shows a meta-cluster, containing a Kubeadmiral cluster, managing two resources in two spoke clusters.
This virtual Kubeadmiral cluster deployment, at the time of writing, uses Kubernetes 1.20 to deploy a customized apiserver and kube-controller-manager; these old versions are missing a lot of core API changes that have landed since that release. The reason Kubeadmiral runs its own apiserver and controller manager is so it can prevent pods actually being scheduled inside your virtual Kubeadmiral cluster, and instead ensure they run only across the managed clusters.
The biggest issues with this approach are:
- The Kubernetes version of the virtual hub cluster determines the API capabilities your developers can use on the objects they ultimately want to synchronize across other clusters, and is likely to be a complete showstopper.
- You end up with a virtual cluster configured entirely by Kubeadmiral (which may not be in line at all with your security or internal standards), plus the added complexity of that cluster subverting standard Kubernetes pod execution outside recommended approaches.
The approach taken by Kubeadmiral, avoiding the generation of a paired Federated* type for every resource you want to sync, is aimed directly at addressing the perceived complexity of the original Kubefed standards.
However, Kubernetes already supports augmenting workloads with new functionality (e.g., scheduling across multiple clusters) by registering your own controllers to be called during the regular scheduling of pods. Kubeadmiral’s choice to instead subvert management by the standard Kubernetes control plane components makes it a hard choice to recommend, especially when modern alternatives already address all of these pain points.
Standards: Work API
The Kubefed v2 standard had two main issues that hampered adoption and usefulness:
- The only supported model was push, where the hub proactively manages Kubernetes manifests across all spoke clusters. Requiring the central cluster to have privileged access to every cluster it manages can be both a security risk and a scaling risk as the number of clusters under management grows.
- There was a proliferation of Kubernetes API types resulting from the dynamic generation of Federated* objects, which can be avoided using modern Kubernetes capabilities such as inserting your own controllers into the scheduling process as middleware.
The Work API simplifies “manifests to be synced” down to Work objects (which live on the hub cluster), defining manifests to be deployed to a given set of clusters, and AppliedWork objects (which can live in the hub or a spoke), which represent the reconciled status of manifests applied within a specific cluster.
```yaml
apiVersion: work/v1
kind: Work
metadata:
  name: example-work
  namespace: cluster1
spec:
  workload:
    manifests:
      - apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: nginx
          namespace: nginx
        spec:
          # ... deployment spec ...
---
apiVersion: work/v1
kind: AppliedWork
metadata:
  name: example-work
  namespace: fleet-system
spec:
  clusterName: cluster1
  workName: example-work
status:
  conditions:
    - type: Applied
      status: "True"
      lastTransitionTime: "2023-10-01T12:00:00Z"
    - type: Completed
      status: "True"
      lastTransitionTime: "2023-10-01T12:05:00Z"
```
You can think of these objects (Work and AppliedWork) as a “signal that some reconciliation needs to happen” and the “status of some ongoing reconciliation” respectively. This simplified kernel allows you to support both push and pull models:
- You can support a push model by having the hub cluster actively manage resources on spoke clusters, and manage the status of AppliedWork objects
- You can support a pull model by allowing spoke clusters to watch for Work objects on the hub cluster, and then reconcile their own local changes as required
You may have noticed that we’ve made no mention of placement; the Work API is deliberately focused on describing the result of a decision, that is, “this set of manifests should be deployed to interested clusters.” Work objects are namespace-scoped, with the expectation that spoke clusters “subscribe” to one or more namespaces they are interested in for Work objects.
Another important note is that the manifests property of Work objects is only validated when it is applied to spoke clusters; the Work API makes no assumptions about validation before a Work object is generated, and will capture failures to apply in AppliedWork and status fields.
This intentionally loose coupling allows you to define your own placement strategies and workflows; whatever is making the decisions simply needs to create Work objects to signal the outcome of those decisions, to be reconciled against the spoke clusters. This is shown at an extremely high level in the below figure:
Another deliberate omission here is how you define spoke clusters from the hub cluster. In a pull model, the hub cluster may act simply as a source of truth from which all the spoke clusters pull; they only need to be able to read via the standard Kubernetes API, without the hub knowing anything about the spokes. For a push model you would require the hub to be configured with knowledge of, and credentials for, the spoke clusters it proactively manages, but this is specifically out of scope of the Work API as defined.
Technology: Karmada
Karmada, currently an incubating CNCF project, is a Kubernetes-native, modern approach to cross-cluster orchestration and federation. It supports both push and pull models: in push mode, the control plane components in the hub cluster connect directly to the spoke clusters to push updates out, while in pull mode a karmada-agent runs in each spoke cluster, pulling changes observed in the hub and reconciling them locally.
Karmada runs its control plane components in the hub cluster, documented here. As described above, spoke clusters are managed either directly from the hub (push) or via the karmada-agent running in the spoke (pull). The following figure shows an example architecture of a hub cluster managing two spoke clusters, one via push, and one via pull.
Karmada implements the Work API but also builds on top of it to offer dynamic placement capabilities:
- The Work API defines manifests to be deployed to subscribed spoke clusters
  - The status field of this object takes the place of a dedicated AppliedWork object, more in line with Kubernetes API norms
- Around the Work API, Karmada offers:
  - Resource Templates: these are your Kubernetes manifests (e.g., Deployment, ConfigMap, or any valid type), which you would like to deploy to one or more spoke clusters
  - Propagation Policy: a PropagationPolicy object targets resources by group, version, kind, name, and possibly namespace, and selects which spoke clusters they should propagate to
  - Resource Binding: a ResourceBinding object represents the result of a resource template being selected for propagation to a spoke cluster, referencing an individual resource template and a known cluster
  - Override Policy: an OverridePolicy object applies cluster-specific settings to resources after they are bound by a ResourceBinding object, but before they are propagated into a Work object to be reconciled on the spoke cluster
Karmada offers the following capabilities for dynamic placement:
- Explicitly list the (sub)set of spoke clusters to deploy to
- Dynamically select the spoke clusters to deploy to using matchLabels or matchExpressions
- Strategies for replica placement and balancing (bin packing, spread, duplicate, divide)
- Trigger rebalancing when clusters come and go
- Dynamic rebalancing based on replica health
- Explicit dependencies between propagated resources
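To make the placement and override concepts concrete, here is a minimal sketch of a PropagationPolicy and OverridePolicy using Karmada’s policy.karmada.io/v1alpha1 API; the cluster names, labels, and the echo-server Deployment being targeted are illustrative assumptions rather than anything prescribed by Karmada:

```yaml
# Hypothetical policies propagating an echo-server Deployment.
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: echo-server-propagation
  namespace: default
spec:
  resourceSelectors:                   # Which resource templates this policy applies to
    - apiVersion: apps/v1
      kind: Deployment
      name: echo-server
  placement:
    clusterAffinity:
      labelSelector:                   # Dynamically select clusters by label (assumed label)
        matchLabels:
          region: eu-west
    replicaScheduling:
      replicaSchedulingType: Divided   # Divide replicas across the selected clusters
      replicaDivisionPreference: Weighted
---
apiVersion: policy.karmada.io/v1alpha1
kind: OverridePolicy
metadata:
  name: echo-server-overrides
  namespace: default
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: echo-server
  overrideRules:
    - targetCluster:
        clusterNames:
          - cluster-a                  # Illustrative cluster name
      overriders:
        plaintext:                     # JSON-patch style override applied before the Work is created
          - path: /spec/replicas
            operator: replace
            value: 5
```

Karmada’s controllers then resolve these policies into ResourceBinding and Work objects for the selected clusters, which are reconciled in push or pull mode as configured.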
In addition to these capabilities, Karmada offers multi-cluster Ingress and Service support, although these capabilities require additional configuration to enable the necessary network connectivity.
Summary
In the evaluation section, we outlined a set of capabilities used to evaluate the standards and technologies in this blog post; the table below summarizes those capabilities against each of them.
Capability / Tech | Kubefed v1 API | Kubefed v2 API | Work API | Kubeadmiral | Karmada |
---|---|---|---|---|---|
Can manifests be dynamically placed on a subset of spoke clusters? | ⛔️ | ✅ | N/A | ✅ | ✅ |
Can replicas be dynamically spread? | ⛔️ | ⛔️ | N/A | ✅ | ✅ |
Can Kubernetes resources of any type be synchronized to spoke clusters? | ⛔️ | ✅ | ✅ | ✅ | ✅ |
Can Kubernetes resources have per-cluster overrides applied to them when synchronized? | ⛔️ | ✅ | ✅ | ✅ | ✅ |
Complexity of operation? | N/A | N/A | N/A | 🔥🔥 | 🔥 |
Complexity of architecture? | N/A | N/A | N/A | 🔥🔥 | 🔥 |
Is push based management supported? | ✅ | ✅ | ✅ | ✅ | ✅ |
Is pull based management supported? | ⛔️ | ⛔️ | ✅ | ⛔️ | ✅ |
Distilling complexity of operation and architecture down to a pictogram isn’t particularly easy; the intention here is to indicate that Karmada has a cleaner, more Kubernetes-friendly approach to architecture and operation when compared to Kubeadmiral.
Conclusion
Karmada is a useful evolution in technology for federated resource management across multiple clusters. It can flexibly support a varied cluster estate operating in push and pull mode as required, and can offer additional niceties such as multi-cluster ingress and multi-cluster services if you wish to expose your workloads from a central cluster.
The functionality, community and sponsor support, CNCF adoption, and overall Kubernetes-friendly design (i.e., avoiding the non-standard deployment and configuration pitfalls that Kubeadmiral fell into) make Karmada an easy and solid choice for a cloud-vendor-agnostic, or on-premise, approach to multi-cluster Kubernetes resource management.
Honorable Mentions
While we were tasked with evaluating and comparing Kubeadmiral and Karmada, a number of other tools exist in similar and adjacent areas. We’ll very briefly touch on them below.
Anthos
Anthos is a managed service offered by GCP, targeted specifically at the application of resources across multiple Kubernetes clusters, benefitting from the enterprise-level security and policy support offered by GCP. It supports a wide range of target clusters, including on-prem and across other Cloud Service Providers (e.g., AWS and Azure). Anthos is a solid offering with great documentation that comes with the usual trade-offs of locking in to a vendor-specific solution.
When directly compared to Karmada, Anthos doesn’t provide intelligent placement of replicas across multiple clusters; similar to Kubefed v1, it simply ensures that a set of manifests is uniformly applied across multiple clusters. However, it is important to note that the Anthos suite can provide multi-cluster meshing, which allows you to pursue similar ends via different means.
Argo CD and Flux CD (GitOps)
Argo CD and Flux CD can both satisfy the use case of multi-cluster manifest management in a couple of ways:
- Locally reconciling manifests pulled from supported repositories (typically GitOps repos)
- Pulling manifests from supported repositories and remotely pushing them to other clusters
As with Anthos, these technologies are not aware of individual replicas and their placement across an estate of clusters, but they do support dynamic generation of per-cluster configuration using techniques such as Argo CD’s ApplicationSets.
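As an illustration of that per-cluster generation, here is a minimal sketch of an Argo CD ApplicationSet using the cluster generator; the repository URL, path, and namespaces are placeholder assumptions:

```yaml
# Hypothetical ApplicationSet that stamps out one Argo CD Application per registered cluster.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: nginx-fleet
  namespace: argocd
spec:
  generators:
    - clusters: {}               # One Application per cluster known to Argo CD
  template:
    metadata:
      name: '{{name}}-nginx'     # Cluster name substituted per generated Application
    spec:
      project: default
      source:
        repoURL: https://github.com/example/fleet-manifests.git  # Placeholder repo
        targetRevision: main
        path: apps/nginx
      destination:
        server: '{{server}}'     # Target cluster API server substituted per Application
        namespace: nginx
```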
Clusterpedia
Definitely on the edge of related technologies, Clusterpedia allows you to federate the status of existing objects across multiple clusters into a single place, allowing aggregated views of objects across your entire estate. It does not support synchronizing resources outwards to other clusters, but may form part of a strategy to monitor your estate.
Farewell
I hope you found this quick sojourn through multi-cluster federation standards and technologies useful and that it helped illuminate how these tools work and why they work the way they do!