Kubernetes workshop: Resource allocation and autoscaling
Resource allocation in Kubernetes
Resource allocation in Kubernetes is configured with requests and limits in the container's definition.
requests: What the container is guaranteed to get. The scheduler uses these values when it decides which node to place a given pod on.
limits: The values that the container cannot exceed.
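To make the role of requests in scheduling concrete, here is a minimal sketch of the underlying check for a single resource. It is a simplification: the real scheduler evaluates every resource and many other constraints. The function name `fits_on_node` and the node figures are hypothetical.

```shell
# Succeeds (exit 0) when the pod's request still fits on the node.
# Arguments are plain integers in the same unit, e.g. CPU millicores.
fits_on_node() {
  allocatable=$1        # what the node can hand out to pods
  already_requested=$2  # sum of the requests of pods already placed there
  pod_request=$3        # request of the pod being scheduled
  [ $(( already_requested + pod_request )) -le "$allocatable" ]
}

# Hypothetical node: 2000m allocatable CPU, 1950m already requested.
if fits_on_node 2000 1950 100; then
  echo "schedulable"
else
  echo "not schedulable: requests exceed allocatable CPU"
fi
```

The check uses requests only. Actual usage and limits play no part in it, which is why a node can look idle in monitoring and still refuse new pods.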
ℹ️ You can use kubectl explain to look at the documentation for resources.
kubectl explain --recursive pod.spec.containers.resources.limits
KIND:     Pod
VERSION:  v1

FIELD:    limits <map[string]string>

DESCRIPTION:
     Limits describes the maximum amount of compute resources allowed. More
...
The wordpress deployment we created in the previous lab has no resources definition.
There are different ways to edit its current state (kubectl edit, apply, patch...).
kubectl edit deploy wordpress
Replace resources: {} with this block:
...
resources:
  requests:
    cpu: 100m
    memory: 100Mi
  limits:
    cpu: 1000m
    memory: 200Mi
...
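As a non-interactive alternative to kubectl edit, the same values can be applied with kubectl set resources. This is a sketch assuming the deployment is named wordpress, as in this lab. The wrapper function `set_wordpress_resources` is a hypothetical name, and without a container selector the command applies to every container in the deployment.

```shell
# Values taken from the resources block used in this lab.
REQUESTS="cpu=100m,memory=100Mi"
LIMITS="cpu=1000m,memory=200Mi"

# Wrap the command in a function so it can be reviewed before it is run.
# Needs kubectl and a cluster context that contains the wordpress deployment.
set_wordpress_resources() {
  kubectl set resources deployment wordpress \
    --requests="$REQUESTS" \
    --limits="$LIMITS"
}
```

Calling `set_wordpress_resources` triggers a rolling update of the deployment, just as saving the change in kubectl edit would.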
Pod resource usage can be displayed with the following command (the metrics might take a few seconds to appear):
kubectl top pods
NAME                               CPU(cores)   MEMORY(bytes)
wordpress-694866c6b7-mqxdd         1m           171Mi
wordpress-mysql-6c597b98bd-4mbbd   1m           531Mi
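To relate these figures to the requests and limits set earlier, it helps to convert the quantities to plain numbers: 100m is 100 millicores (a tenth of a core) and 100Mi is 100 × 1024 × 1024 bytes. The helpers below are a minimal sketch with hypothetical names. They handle only whole-number quantities with the suffixes seen in this lab, not the full Kubernetes quantity syntax.

```shell
# Convert a CPU quantity ("100m" or a whole number of cores) to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# Convert a memory quantity with a Ki, Mi or Gi suffix to bytes.
mem_to_bytes() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *)   echo "$1" ;;
  esac
}

# Usage as a whole-number percentage of a reference value (request or limit).
percent_of() {
  echo $(( $1 * 100 / $2 ))
}

used=$(mem_to_bytes 171Mi)
percent_of "$used" "$(mem_to_bytes 100Mi)"   # memory usage vs. request
percent_of "$used" "$(mem_to_bytes 200Mi)"   # memory usage vs. limit
percent_of "$(cpu_to_millicores 1m)" "$(cpu_to_millicores 100m)"   # CPU usage vs. request
```

With the values shown above this prints 171, 85 and 1: the wordpress pod uses more memory than it requests and sits at 85% of its memory limit, while its CPU usage is 1% of the request.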
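Configure autoscaling based on CPU usage. The autoscaler targets an average CPU utilization of 50%, measured against each pod's CPU request (100m here). When the average across the deployment's pods exceeds that target, new pods are created, up to the configured maximum.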
kubectl autoscale deployment wordpress --cpu-percent=50 --min=1 --max=5
horizontalpodautoscaler.autoscaling/wordpress autoscaled
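The same autoscaler can also be described declaratively. The sketch below writes a manifest equivalent to the command above, assuming a cluster that serves the autoscaling/v2 API (older clusters need a different apiVersion). The file name wordpress-hpa.yaml is an arbitrary choice.

```shell
# Write a HorizontalPodAutoscaler manifest with the settings used in this lab:
# target 50% average CPU utilization, between 1 and 5 replicas.
cat > wordpress-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: wordpress
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: wordpress
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
EOF
```

The file can then be applied with `kubectl apply -f wordpress-hpa.yaml`. Use it instead of the kubectl autoscale command, not in addition to it, since both create an autoscaler named wordpress.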
It takes up to 15 seconds (with the default configuration) for the first values to appear:
kubectl get hpa
NAME        REFERENCE              TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
wordpress   Deployment/wordpress   <unknown>/50%   1         5         0          10s

kubectl get hpa
NAME        REFERENCE              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
wordpress   Deployment/wordpress   1%/50%    1         5         1          20s
Now we'll run an HTTP benchmark using wrk. Open a new shell and run:
kubectl run -ti --rm bench --image=jess/wrk -- /bin/sh -c 'wrk -t12 -c100 -d180s http://wordpress'
During the benchmark above (3 minutes long), let's look at the HPA:
watch kubectl get hpa
Every 2.0s: kubectl get hpa
hostname: Tue Jun 22 11:13:08 2021

NAME        REFERENCE              TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
wordpress   Deployment/wordpress   1%/50%    1         5         1          8m28s
After a few seconds, the deployment is scaled up automatically. Here the number of replicas reaches the maximum we defined (5 pods).
Every 2.0s: kubectl get hpa
hostname: Tue Jun 22 11:14:13 2021

NAME        REFERENCE              TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
wordpress   Deployment/wordpress   998%/50%   1         5         5          9m33s
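The replica count follows from the autoscaler's core rule: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the configured minimum and maximum. Utilization is a percentage of the CPU request, so with a 100m request and a 1000m limit a pod can report up to about 1000%. The sketch below reproduces only that rule. The function name `desired_replicas` is hypothetical, and the real controller also applies a tolerance and stabilization delays that are left out here.

```shell
# desired = ceil(current * utilization / target), clamped to [min, max].
# All arguments are whole numbers; utilization and target are percentages.
desired_replicas() {
  current=$1
  utilization=$2
  target=$3
  min=$4
  max=$5
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 1 1 50 1 5     # idle: 1 replica at 1% of its request
desired_replicas 5 998 50 1 5   # under load: 5 replicas at 998%
```

This prints 1 and then 5. In the second case the unclamped result is 100 replicas, so the autoscaler is held at the maximum of 5 for as long as the load lasts.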
That was a fairly simple configuration. Basing autoscaling on CPU usage makes sense for a web server, but you can also base it on any other metric reported by your application.