kubectl top ServiceUnavailable
How to debug and resolve a ServiceUnavailable error when using kubectl top
Situation
When running kubectl top, you may encounter the ServiceUnavailable error, which indicates that the metrics server is unable to provide resource usage metrics.
What is the metrics-server?
The Metrics Server is a Kubernetes component that collects resource usage metrics for containers and pods running in a Kubernetes cluster. It provides a simple way to query resource usage metrics such as CPU and memory utilization, and it can be used by tools such as kubectl top to display resource usage statistics.
The Metrics Server is a cluster-level component, meaning that it collects metrics across all namespaces and pods in the cluster.
The Metrics Server typically runs as a Deployment in the kube-system namespace and requires access to the Kubernetes API server and to the kubelet on each node to collect metrics. It is usually pre-installed by your managed Kubernetes service provider; otherwise you can install it manually.
Possible causes
Some potential causes of the "ServiceUnavailable" error when running kubectl top include:
- Pod not yet ready: The metrics-server has just started up and is still collecting its first metrics. This process usually takes about a minute and should resolve itself automatically.
- CrashLoopBackOff: The metrics-server is unavailable due to issues such as CrashLoopBackOff or misconfigured network policies. See also our pod crashloopbackoff runbook.
- Misconfiguration: The APIService resource, which is used to expose the metrics API to the Kubernetes API server, is misconfigured.
- Kubernetes API Server: The Kubernetes API server has no access to the metrics-server due to network connectivity problems or invalid service account permissions.
- Network Access: The metrics-server is running but cannot access the necessary Kubernetes API resources to collect metrics.
- Unable to schedule the pod: Not enough compute resources or nodes are available in the cluster to schedule the metrics-server pod. To resolve this, see pod scheduling failed.
Diagnosis
Make sure the pod is running. This can be done by running kubectl get pod -n kube-system. Note that depending on how metrics-server was installed, it may be located in a different namespace, such as metrics-server.
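To narrow the listing down to the metrics-server pods, a label selector can help. The following is a minimal sketch, assuming the pods carry the label k8s-app=metrics-server (as in the upstream manifests); the function name metrics_server_pods is hypothetical, and your installation may use a different label.

```shell
# List metrics-server pods with their status, restart count and node.
# Assumption: the pods are labelled k8s-app=metrics-server; adjust the
# selector if your installation uses a different label.
metrics_server_pods() {
  namespace="${1:-kube-system}"
  kubectl get pods -n "$namespace" -l k8s-app=metrics-server -o wide
}

# Usage:
#   metrics_server_pods                  # looks in kube-system
#   metrics_server_pods metrics-server   # looks in another namespace
```

A STATUS other than Running, or a rising RESTARTS count, points to the causes listed above.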
If you find that the metrics-server pod is not running, investigate the cause. A common reason for the metrics-server pod being restarted or briefly unavailable is a cluster upgrade on AME Kubernetes. In this case the situation should resolve itself automatically.
If the metrics-server pod is running, examine its logs to find the cause. This can be done through kubectl logs.
Look for any error messages that might indicate why the metrics server is unable to provide metrics. You may experience an error when doing this if the Kubernetes node on which this metrics-server pod is running has become unavailable.
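As a sketch of the log check, assuming the Deployment is named metrics-server (the function name metrics_server_logs is hypothetical):

```shell
# Print the most recent log lines of the metrics-server Deployment.
# Assumption: the Deployment is named "metrics-server".
metrics_server_logs() {
  namespace="${1:-kube-system}"
  kubectl logs -n "$namespace" deployment/metrics-server --tail=100
}

# Usage:
#   metrics_server_logs                  # logs from kube-system
#   metrics_server_logs metrics-server   # logs from another namespace
```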
Check the connectivity between the metrics server and the Kubernetes API server. If this check returns an error, there may be a network connectivity issue that is preventing the metrics server from accessing the Kubernetes API server.
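One way to run such a check is to inspect the APIService and query the metrics API directly through the API server. This is a sketch under the assumption that the metrics API is registered under the conventional name v1beta1.metrics.k8s.io; the function name metrics_api_check is hypothetical.

```shell
# Check that the metrics API is registered and reachable via the API server.
# Assumption: the APIService is named v1beta1.metrics.k8s.io.
metrics_api_check() {
  # The AVAILABLE column should read True.
  kubectl get apiservice v1beta1.metrics.k8s.io || return 1
  # Query the metrics API directly, without going through kubectl top.
  kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
}

# Usage:
#   metrics_api_check
```

If the APIService is listed but not available, the problem is more likely between the API server and the metrics-server than in kubectl itself.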
Remediation
CrashLoopBackOff
- When using network policies in the pod's namespace, make sure a network policy is in place to allow connectivity to the kubelet ports for each node, as well as the Kubernetes API Server.
- Check for mistakes in the flags provided to the metrics-server. If you do not manage your own metrics-server deployment, this should be done by your cluster provider.
See also our pod crashloopbackoff runbook
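Both remediation points can be inspected from the command line. The sketch below assumes a Deployment named metrics-server whose first container is the metrics-server itself; both function names are hypothetical.

```shell
# Print the command-line flags passed to the metrics-server container.
# Assumptions: the Deployment is named "metrics-server" and the
# metrics-server container is the first one in the pod template.
metrics_server_args() {
  namespace="${1:-kube-system}"
  kubectl get deployment metrics-server -n "$namespace" \
    -o jsonpath='{.spec.template.spec.containers[0].args}'
}

# List the network policies defined in the metrics-server namespace.
metrics_server_netpols() {
  namespace="${1:-kube-system}"
  kubectl get networkpolicy -n "$namespace"
}

# Usage:
#   metrics_server_args
#   metrics_server_netpols
```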
Once the issue has been resolved, retry the kubectl top command to ensure that resource usage metrics are now available.
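Because the metrics-server needs about a minute after startup before it serves metrics, a single retry may still fail. A minimal sketch of a retry loop follows; the function name wait_for_metrics and the defaults of 6 attempts, 10 seconds apart, are assumptions chosen to cover roughly that minute.

```shell
# Retry "kubectl top nodes" until it succeeds or the attempts run out.
# Arguments: number of attempts (default 6), delay in seconds (default 10).
wait_for_metrics() {
  attempts="${1:-6}"
  delay="${2:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if kubectl top nodes; then
      return 0
    fi
    echo "metrics not available yet (attempt $i/$attempts)" >&2
    # Only wait if another attempt follows.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage:
#   wait_for_metrics          # 6 attempts, 10 seconds apart
#   wait_for_metrics 12 5     # 12 attempts, 5 seconds apart
```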