Using GitOps to Manage a KWOK Deployment

Table of Contents

  • Overview
  • Requirements
  • File Tree
  • Create a Cluster
  • Install Flux Controllers
  • Create a Secret for Authenticating to the GitHub Repository
  • Connect Flux to the Repository
  • Target the KWOK Manifests using Flux
  • Apply Custom Resources (CRs) of Stages
  • Target the Deployment and Node YAML Manifests using Flux
  • Install Prometheus and Grafana
  • Enable Metrics on Kube-Scheduler
  • Grab the Grafana Username and Password
  • Scale the Deployment to 30 and Modify Node Affinity
  • View Scheduler Metrics on Grafana Dashboard
  • Conclusion


While studying for the GitOps associate exam, I decided to document my learning by applying it to an open-source project I worked on during my internship at DaoCloud.

DaoCloud has an open-source project called Kubernetes WithOut Kubelet (KWOK), which helps developers who build products on Kubernetes test their control plane. It does this by simulating a huge number of pods or nodes to stress-test the control plane. Running such tests in a real environment is expensive, and KWOK makes them cost-effective.

I haven’t found any resource on the internet that shows how to deploy KWOK using Flux, a GitOps tool, and monitor the Kubernetes scheduler using Prometheus, so this article should help.

High-level architecture of a KWOK deployment using GitOps

It’s also worth mentioning that I found a KWOK documentation bug and opened a PR that got merged while I was preparing for this article.


File Tree

GitHub repository

├── cluster.yaml
├── gitops
│   ├── flux
│   │   ├── git.yaml
│   │   ├── kwok.yaml
│   │   ├── others.yaml
│   │   └── prometheus-grafana.yaml
│   └── kustomization
│       ├── kwok
│       │   ├── base
│       │   │   ├── kustomization.yaml
│       │   │   └── namespace.yaml
│       │   └── overlays
│       │       └── kustomization.yaml
│       └── others
│           ├── base
│           │   ├── deployment.yaml
│           │   ├── kustomization.yaml
│           │   ├── namespace.yaml
│           │   └── node.yaml
│           └── overlays
│               └── modify-number-of-pods
│                   ├── deployment.yaml
│                   └── kustomization.yaml
└── secrets.yaml

We will be working with several files, so this tree will guide us through them.

Breaking it into sections:

  • gitops/flux/ → Flux configurations
  • gitops/kustomization/kwok/ → KWOK setup using Kustomize
  • gitops/kustomization/others/ → Additional Kubernetes resources
  • secrets.yaml → Holds authentication credentials

Create a Cluster

The first step is to create a real Kubernetes cluster. KWOK will be deployed into the cluster. Here I will be using a Kubernetes in Docker (KIND) cluster.

kind create cluster --config cluster.yaml


kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 30000 # The k8s service port (NodePort) for grafana
        hostPort: 3000 # The host port on which to expose the k8s service
        protocol: TCP
      - containerPort: 30909 # The k8s service port (NodePort) for prometheus
        hostPort: 9090 # The host port on which to expose the k8s service
        protocol: TCP

The cluster.yaml file configures port mappings that allow the host to access the Grafana and Prometheus services once they are deployed into the Kind cluster for observability.

Install Flux Controllers

Flux is a GitOps tool that takes the infrastructure code defined in a Git repository and automatically deploys it into a Kubernetes cluster. It also watches the repository for changes and synchronizes them to the cluster.

All these interactions are handled by the controllers.

flux install

kubectl get pod -n flux-system

NAME                                       READY   STATUS    RESTARTS        AGE
helm-controller-5bb6849c4f-rbrrj           1/1     Running   1               5m
kustomize-controller-68597c4488-9vtnl      1/1     Running   1               5m
notification-controller-7d6f99878b-plvzl   1/1     Running   1               5m
source-controller-666dc49455-z4gs7         1/1     Running   1               5m

Create a Secret for Authenticating to the GitHub Repository

If your repository is private, Flux needs credentials to authenticate to it.

echo -n "testing" | base64
echo -n "testing1234" | base64

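Your username and password must be base64-encoded. The encoded results become the values of the username and password fields in the Secret. Note that GitHub no longer accepts account passwords for Git over HTTPS, so use a personal access token as the password.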
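As a quick check on the encoding step, here is a minimal sketch that encodes a value and decodes it again. The helper names b64 and b64_decode are hypothetical. printf is used instead of echo so that no trailing newline can slip into the encoded value.

```shell
# Hypothetical helpers: base64-encode and decode a value without a trailing newline.
b64() {
  printf '%s' "$1" | base64
}

b64_decode() {
  printf '%s' "$1" | base64 --decode
}

b64 "testing"        # prints dGVzdGluZw==
b64 "testing1234"    # prints dGVzdGluZzEyMzQ=

# Decode a value to confirm it round-trips to the original text.
b64_decode "dGVzdGluZw=="; echo
```

Decoding the value before committing it to secrets.yaml is a cheap way to catch a stray newline or a copy-paste error.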


apiVersion: v1
kind: Secret
metadata:
  name: github-auth
  namespace: flux-system
type: Opaque
data:
  username: dGVzdGluZw==
  password: dGVzdGluZzEyMzQ=

kubectl apply -f secrets.yaml

Connect Flux to the Repository

├── gitops
│   ├── flux
│   │   ├── git.yaml
│   │   ├── kwok.yaml
│   │   └── others.yaml

kubectl apply -f gitops/flux/git.yaml


apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: git
  namespace: flux-system
spec:
  interval: 1m # Checks the repo for changes every minute
  url: https://github.com/network-charles/kwok-gitops.git
  ref:
    branch: main
  secretRef:
    name: github-auth

For real-world production workloads, a longer interval (e.g., 5-10 minutes) is recommended to avoid hitting API rate limits and to cut down on unnecessary sync attempts.
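To put the interval trade-off in numbers, here is a small sketch that counts how many times the repository would be polled per day at a given interval. The helper name polls_per_day is hypothetical, and it only handles whole-minute intervals.

```shell
# Hypothetical helper: number of repository polls per day for an interval in minutes.
polls_per_day() {
  echo $(( 24 * 60 / $1 ))
}

for minutes in 1 5 10; do
  echo "interval ${minutes}m -> $(polls_per_day "$minutes") polls/day"
done
# interval 1m -> 1440 polls/day
# interval 5m -> 288 polls/day
# interval 10m -> 144 polls/day
```

Moving from 1m to 5m removes 80% of the polls for this one source, and the saving multiplies with every GitRepository you define.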

Target the KWOK Manifests using Flux

When working with Flux, you can use a tool called Kustomize to keep your main configs in a base directory and modify any of them through a copy or patch placed in an overlays directory. This is preferable to editing your main config directly.

├── gitops
│   ├── flux
│   │   └── kwok.yaml
│   └── kustomization
│       └── kwok
│           ├── base
│           │   ├── kustomization.yaml
│           │   └── namespace.yaml
│           └── overlays
│               └── kustomization.yaml

kubectl apply -f gitops/flux/kwok.yaml


apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: kwok
  namespace: flux-system
spec:
  interval: 1m # Reconciles against the repo every minute
  targetNamespace: kwok
  sourceRef:
    kind: GitRepository
    name: git
    namespace: flux-system
  path: "/gitops/kustomization/kwok/overlays"
  prune: true
  timeout: 1m # How long the manifests may take to apply before timing out

Flux targets the path containing the kustomization.yaml file that I use to deploy KWOK into my cluster.

The manifest below is the targeted file. It changes the image version used to deploy KWOK into my cluster. Because it references the base directory through the resources field, everything defined there is loaded first, and the changes I specify in the patches and images fields are then applied on top.


apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base
patches:
  - target:
      kind: Kustomization
    patch: |-
      apiVersion: kustomize.config.k8s.io/v1beta1
      kind: Kustomization
      metadata:
        name: kwok
        namespace: kwok
      resources:
        - https://github.com/kubernetes-sigs/kwok/kustomize/kwok?ref=v0.5.1
images:
  - name: registry.k8s.io/kwok/kwok
    newTag: "v0.5.1"

Apply Custom Resources (CRs) of Stages

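Stage custom resources tell KWOK how to move simulated pods and nodes through their lifecycle. This makes it possible to simulate the state of a pod or node in the cluster with very few resources.

# KWOK repository (the same one referenced in the overlay above)
KWOK_REPO=kubernetes-sigs/kwok
# Look up the latest release tag
KWOK_LATEST_RELEASE=$(curl "https://api.github.com/repos/${KWOK_REPO}/releases/latest" | jq -r '.tag_name')
kubectl apply -f "https://github.com/${KWOK_REPO}/releases/download/${KWOK_LATEST_RELEASE}/stage-fast.yaml"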

Target the Deployment and Node YAML Manifests using Flux

├── gitops
│   ├── flux
│   │   └── others.yaml
│   └── kustomization
│       └── others
│           ├── base
│           │   ├── deployment.yaml
│           │   ├── kustomization.yaml
│           │   ├── namespace.yaml
│           │   └── node.yaml
│           └── overlays
│               └── modify-number-of-pods
│                   ├── deployment.yaml
│                   └── kustomization.yaml

kubectl apply -f gitops/flux/others.yaml

Now all manifests in this others directory will be managed by Flux.


apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: others
  namespace: flux-system
spec:
  interval: 1m # Reconciles against the repo every minute
  sourceRef:
    kind: GitRepository
    name: git
    namespace: flux-system
  path: "/gitops/kustomization/others/overlays/modify-number-of-pods"
  prune: true
  timeout: 1m

Notice how the overlays directory is still being referenced and used to make changes to the main deployment.

Install Prometheus and Grafana

This step deploys Prometheus and Grafana using Helm, with the release managed by Flux. I like this because I don’t need to manage Helm manually from the command line.


apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: prometheus-community
  namespace: flux-system
spec:
  interval: 1m
  url: https://prometheus-community.github.io/helm-charts
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: prometheus-grafana
  namespace: flux-system
spec:
  targetNamespace: monitoring
  interval: 1m
  chart:
    spec:
      chart: kube-prometheus-stack
      version: "*"
      sourceRef:
        kind: HelmRepository
        name: prometheus-community
        namespace: flux-system
  install:
    createNamespace: true
  values:
    prometheus:
      service:
        nodePort: 30909 # Matches the prometheus containerPort in cluster.yaml
        type: NodePort
    grafana:
      service:
        type: NodePort
        nodePort: 30000 # Matches the grafana containerPort in cluster.yaml

kubectl apply -f gitops/flux/prometheus-grafana.yaml

Enable Metrics on Kube-Scheduler

For security reasons, kube-scheduler is not reachable from other pods by default, so its metrics aren’t scraped. If you are running tests and need to collect metrics from it, you must change its bind address so that other pods, such as the Prometheus pod, can reach it.

docker exec -it kind-control-plane bash
apt update
apt install vim -y
vi /etc/kubernetes/manifests/kube-scheduler.yaml

  - command:
    - kube-scheduler
    - --bind-address=0.0.0.0  # Allow listening on all interfaces
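If you prefer not to install an editor inside the node container, the same change can be scripted. This is a minimal sketch that assumes the manifest currently contains --bind-address=127.0.0.1, the usual value in kubeadm-based clusters such as kind (check your file first). The function name allow_scheduler_scrape is hypothetical.

```shell
# Hypothetical helper: rewrite the scheduler's bind address in a manifest file.
# Assumes the file contains "--bind-address=127.0.0.1"; any other value is left untouched.
allow_scheduler_scrape() {
  manifest=$1
  # Build the patched content first so a failed sed never truncates the file.
  patched=$(sed 's/--bind-address=127\.0\.0\.1/--bind-address=0.0.0.0/' "$manifest") || return 1
  printf '%s\n' "$patched" > "$manifest"
}

# Usage: paste the function into the shell opened by
# "docker exec -it kind-control-plane bash", then run:
#   allow_scheduler_scrape /etc/kubernetes/manifests/kube-scheduler.yaml
```

Because kube-scheduler runs as a static pod, the kubelet picks up the edited manifest and restarts the scheduler on its own.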

After you save the file, the kubelet restarts the scheduler, and the Prometheus service monitor that scrapes metrics from the scheduler will show an up state.

kube-scheduler service monitor


Grab the Grafana Username and Password

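# The release is installed into the monitoring namespace (targetNamespace in the HelmRelease)
kubectl get secrets prometheus-grafana -n monitoring -o jsonpath='{.data.admin-password}' | base64 --decode ; echo
kubectl get secrets prometheus-grafana -n monitoring -o jsonpath='{.data.admin-user}' | base64 --decode ; echo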

Scale the Deployment to 30 and Modify Node Affinity

Initially, I had one pod running in the deployment.

kubectl get deployments.apps nginx-deployment

nginx-deployment   1/1     1            1           5s

Now, I’ll scale the deployment to 30. To do this, I will modify a copy of the deployment stored in the overlays directory.

├── gitops
│   └── kustomization
│       └── others
│           ├── base
│           └── overlays
│               └── modify-number-of-pods
│                   ├── deployment.yaml
│                   └── kustomization.yaml

I updated the replicas field and the node affinity values in the deployment file.


replicas: 30 # update this
        - matchExpressions:
            - key: type
              operator: In
              values:
                - kwokkk # update this

The manifest above scales the deployment to 30 pods. It also uses the wrong label value, so the pods are unschedulable.

I intentionally made them unschedulable so we can see some scheduler metrics in action.

Now, push your config to GitHub and wait about 60 seconds for Flux to pull the change and update your cluster.

git commit -am "update deployment"
git push
sleep 60
kubectl get deployments.apps nginx-deployment

nginx-deployment   0/30    30           0           16m
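A fixed sleep can be too short or needlessly long. As an alternative, here is a minimal polling sketch that retries a check until it succeeds or a timeout passes. The function names wait_until and replicas_are are hypothetical, and the check assumes nginx-deployment lives in your current namespace.

```shell
# Hypothetical helper: retry a command every INTERVAL seconds
# until it succeeds or TIMEOUT seconds have elapsed.
wait_until() {
  timeout=$1
  interval=$2
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${elapsed}s: $*" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Succeeds once the Deployment spec shows the expected replica count.
# Assumes the current namespace; add -n <namespace> if yours differs.
replicas_are() {
  [ "$(kubectl get deployment nginx-deployment -o jsonpath='{.spec.replicas}')" = "$1" ]
}

# Usage (needs the running cluster):
#   wait_until 120 5 replicas_are 30
```

The check reads spec.replicas rather than ready replicas on purpose, since the pods in this experiment are meant to stay unschedulable.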

View Scheduler Metrics on Grafana Dashboard

Log in to Grafana by visiting localhost:3000. Unless you overrode the chart defaults, the username and password you retrieved will be these:

username = admin
password = prom-operator

Go to Menu > Dashboard > Kubernetes/Scheduler > Edit > Add > Visualization > Queries > Metrics

Add a metric to view all pending and unschedulable pods.

  • scheduler_pending_pods{queue="unschedulable"}

You should see a graph similar to this, telling you 30 pods are currently pending.

Prometheus metrics graph on Grafana dashboard
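The same metric can also be read from the command line through the Prometheus HTTP API, which is handy for scripted checks. This is a minimal sketch: the helper name prom_query and the PROM_URL variable are hypothetical, and the default URL assumes the 9090 host port mapped in cluster.yaml.

```shell
# Hypothetical helper: run an instant PromQL query against the Prometheus HTTP API.
# PROM_URL defaults to the host port mapped for Prometheus in cluster.yaml.
prom_query() {
  curl -sG "${PROM_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$1"
}

# Usage (needs the running cluster, and jq to extract the sample values):
#   prom_query 'scheduler_pending_pods{queue="unschedulable"}' | jq -r '.data.result[].value[1]'
```

The helper returns the raw JSON response, so you can pipe it to any tool you like instead of jq.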



Conclusion

We have now seen how engineers and developers can use GitOps to manage a KWOK deployment, collect metrics from the control plane, and view them in Grafana. This method applies to several other use cases. In a public cloud’s managed Kubernetes service, access to the cluster’s control plane node is often restricted, but the cloud provider exposes the Prometheus metrics to you. If you manage your own control plane node, this article should help you learn to expose its metrics for testing.