Horizontal pod autoscaling #
Using KWOK, we can deploy a metric server to help us observe and trigger our HorizontalPodAutoscaler.
This image shows you what you should expect when testing this scenario. You can follow the step-by-step guide after seeing this.
Prerequisites #
- KWOK must be installed on the machine. See installation.
- Install kubectl
Get metrics usage #
wget https://github.com/kubernetes-sigs/kwok/releases/download/v0.6.0/metrics-usage.yaml
Create cluster #
kwokctl create cluster --enable-metrics-server --config ./metrics-usage.yaml
The arguments --enable-metrics-server --config ./metrics-usage.yaml
provides the ability to
use a metric server in the cluster. The metric server docker image will be downloaded and running as a docker container.
Create a node #
kwokctl scale node --replicas 1
Create the pod #
kubectl apply -f deployment.yaml
- Deployment specification:
- Current CPU usage: 200m
- CPU Request: 1000m
- HPA Utilization percentage = (Current CPU usage / CPU Request) * 100)
- (200/1000) * 100 ≈ 20%
The current CPU usage was defined in the deployment manifest and passed via the below annotation.
annotations:
kwok.x-k8s.io/usage-cpu: 200m
Wait for 50s #
sleep 50
This gives the metric-server time to collect metrics of the pod.
Confirm metric of pod #
kubectl top pod
NAME CPU(cores) MEMORY(bytes)
fake-app-bdccf9b7f-phwtl 201m 1Mi
Deploy HPA #
kubectl apply -f hpa.yaml
- Core HorizontalPodAutoscaler specification:
- MINPODS = 1
- MAXPODS = 2
- average CPU Utilization = 70%
- scaleUp/scaleDown = 100% (meaning increase/decrease current pod by a 100%)
Wait for 5s #
sleep 5
See more details about the HPA event before scaleUp #
kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
pod-auto-scaler Deployment/fake-app cpu: 20%/70% 1 2 1 9s
Only one replica in the deployment is running, and the threshold is at 20% which is less than 70%.
Let’s increase the usage metrics of the pod #
POD_NAME=$(kubectl top pods | awk 'NR>1 {print $1}')
kubectl patch pod $POD_NAME --type=json -p='[{"op":"add","path":"/metadata/annotations","value":{"kwok.x-k8s.io/usage-cpu":"800m"}}]'
This command extracts and lists the names of the pods. Then it patches the current CPU usage of each pod’s CPU to “800m”. Here we have just one pod, so it uses that alone. This causes the CPU utilization of the pod to be over the threshold of 70%
Wait for 30s #
sleep 30
See more details about the HPA event after scaleUp #
kubectl describe hpa | awk '/Events:/ {found=1} found {print}' | tail -n 1
Normal SuccessfulRescale 20s horizontal-pod-autoscaler New size: 2; reason: cpu resource utilization (percentage of request) above target
This command displays the event data for when the deployment was auto-scaled to a 100% by the HPA.
Get scaled up HPA CPU utilization metrics #
kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
pod-auto-scaler Deployment/fake-app cpu: 50%/70% 1 2 2 54s
We can see now that the CPU utilization of the deployment is less than the threshold. This is because the replica of the deployment is now two.
Let’s reduce the usage metrics to the pods #
POD_NAME=$(kubectl top pods | awk 'NR>1 {print $1}')
kubectl patch pod $POD_NAME --type=json -p='[{"op":"add","path":"/metadata/annotations","value":{"kwok.x-k8s.io/usage-cpu":"200m"}}]'
This command extracts and lists the names of the pods. Then it patches the current CPU usage of each pod’s CPU to “200m”. This causes the CPU utilization of the pod to be less than the threshold of 70%, hence, causing the HPA to scaled down by a 100%.
Wait for 30s #
sleep 30
See more details about the latest HPA event after scaleDown #
kubectl describe hpa | awk '/Events:/ {found=1} found {print}' | tail -n 1
Normal SuccessfulRescale 16s horizontal-pod-autoscaler New size: 1; reason: All metrics below target
Wait for 10s #
sleep 10
Get scaled down HPA CPU utilization metrics #
kubectl get hpa
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
pod-auto-scaler Deployment/fake-app cpu: 20%/70% 1 2 1 2m23s
Delete the cluster #
kwokctl delete cluster
Conclusion #
This example demonstrates how to use KWOK to simulate horizontal pod autoscaling.