In the first part of this post, we looked at autoscaling using the metrics server. In part two, we will use custom metrics, specifically those from Linkerd and ingress-nginx, to autoscale based on latency and requests per second.
Linkerd is a lightweight service mesh and CNCF project. It gives you instant, platform-wide metrics such as success rates, latency, and other traffic-related metrics, without having to change your code.
For the first part of our autoscaling setup, we will use the metrics collected by the Linkerd proxy to scale our deployment up or down. For this to work, you need to have both Linkerd and its Viz extension installed.
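The adapter is configured through a Helm values file, referred to below as prometheus-adapter.yaml. The original file isn't reproduced here, but a minimal sketch could look like the following; the Prometheus URL assumes the Viz extension's bundled Prometheus, and the rule derives a hypothetical p95 latency metric from the Linkerd proxy's response_latency_ms histogram:

```yaml
# prometheus-adapter.yaml -- a minimal sketch.
# Assumptions: Prometheus from the Linkerd Viz extension, and a p95
# latency rule over Linkerd's response_latency_ms histogram.
prometheus:
  url: http://prometheus.linkerd-viz.svc
  port: 9090
rules:
  custom:
  - seriesQuery: 'response_latency_ms_bucket{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    # Expose the histogram as e.g. response_latency_ms_p95
    name:
      matches: "^(.*)_bucket$"
      as: "${1}_p95"
    metricsQuery: |
      histogram_quantile(0.95,
        sum(rate(<<.Series>>{<<.LabelMatchers>>, direction="inbound"}[1m]))
        by (le, <<.GroupBy>>))
```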
This tells the Prometheus Adapter where to find Prometheus, and configures a custom query we can use in our HorizontalPodAutoscaler resource.
Install the adapter with:
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install prometheus-adapter -f prometheus-adapter.yaml prometheus-community/prometheus-adapter
NAME: prometheus-adapter
LAST DEPLOYED: Sat Oct 9 16:37:20 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
prometheus-adapter has been deployed.
In a few minutes you should be able to list metrics using the following command(s):

  kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
After installation, you can verify that the Prometheus Adapter is running correctly by querying the APIService:
$ kubectl get apiservice v1beta1.custom.metrics.k8s.io
NAME                            SERVICE                      AVAILABLE                  AGE
v1beta1.custom.metrics.k8s.io   default/prometheus-adapter   False (MissingEndpoints)   24s
During start-up, the status will remain at MissingEndpoints. Once the pod is up and running, AVAILABLE should jump to True:
$ kubectl get apiservice v1beta1.custom.metrics.k8s.io
NAME                            SERVICE                      AVAILABLE   AGE
v1beta1.custom.metrics.k8s.io   default/prometheus-adapter   True        43s
You should now be able to validate that your custom rules have been installed properly by running:
$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"custom.metrics.k8s.io/v1beta1","resources":[....]}
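Next we need a workload to scale. The example.yaml from the original post isn't reproduced here, but a minimal sketch matching the objects created below could look like this; the nginx image and the linkerd.io/inject annotation (which gives the pod its proxy sidecar, hence 2/2 ready containers) are assumptions:

```yaml
# example.yaml -- a minimal sketch; image and annotation are assumptions.
apiVersion: v1
kind: Namespace
metadata:
  name: scalingtest
  annotations:
    linkerd.io/inject: enabled   # inject the Linkerd proxy into all pods
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sampleapp
  namespace: scalingtest
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sampleapp
  template:
    metadata:
      labels:
        app: sampleapp
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: sampleapp
  namespace: scalingtest
spec:
  selector:
    app: sampleapp
  ports:
  - port: 80
    targetPort: 80
```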
Apply the manifests and verify that the pod starts with its Linkerd proxy sidecar (hence the 2/2 ready containers):

$ kubectl apply -f example.yaml
namespace/scalingtest created
deployment.apps/sampleapp created
service/sampleapp created
$ kubectl get pod
NAME                         READY   STATUS    RESTARTS   AGE
sampleapp-7db7fdcd9d-95czg   2/2     Running   0          4s
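With the adapter's rules in place, a HorizontalPodAutoscaler can target the custom metric. A sketch, assuming the adapter exposes a per-pod latency metric named response_latency_ms_p95 (the exact name and target value depend on your adapter rules and latency goals):

```yaml
# hpa.yaml -- a sketch; metric name and target value are assumptions.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: sampleapp
  namespace: scalingtest
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: sampleapp
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: response_latency_ms_p95
      target:
        type: AverageValue
        averageValue: "50"   # scale up when p95 latency exceeds ~50ms per pod
```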
Next, we deploy a load generator. For this, we use the slow_cooker project from Buoyant, the company behind Linkerd.
$ kubectl run load-generator --image=buoyantio/slow_cooker -- -qps 100 -concurrency 10 http://sampleapp
This generates 100 requests per second against the deployed nginx. You can follow the logs of the load-generator pod to view various metrics, such as latency.
If you wait a minute or so and run kubectl top pod, you will notice that the CPU usage of the sampleapp pod has risen:
$ kubectl top pod
NAME                         CPU(cores)   MEMORY(bytes)
load-generator               128m         5Mi
sampleapp-7db7fdcd9d-95czg   79m          2Mi
In this second part, we created a deployment that autoscales based on custom metrics. We did this with metrics from both ingress-nginx and Linkerd, exposed through the Prometheus Adapter.
Using custom metrics, you can also base your autoscaling on leading indicators such as request rate, spinning up capacity before your peak load arrives.