DNS issues
Debugging and Fixing DNS Resolution Issues in Kubernetes
In a Kubernetes cluster, DNS underpins service discovery and inter-service communication: Pods reach each other through Service names instead of IP addresses.
DNS issues can be tricky to diagnose and solve. The main symptom is that Pods cannot reach each other using Kubernetes Service names.
Possible Causes
- CoreDNS is not running or is improperly configured.
- The kube-dns/coredns ConfigMap is incorrectly set.
- The Pod's resolv.conf file is misconfigured.
- Network policies are restricting communication.
- Node local DNS settings are incorrect.
- External factors such as CNI plugins, cloud-provider-specific settings, upstream resolvers, etc.
Diagnosis
Validate a DNS issue
First, confirm that the issue is DNS-related. Try to access the service using its IP address. If that works but the service name does not, it's likely a DNS issue.
Note that not all Pods or containers include curl. You may need to use wget or an alternative. If no suitable tool is available, you can use kubectl debug to attach a new container to your Pod with the right debug tooling pre-installed.
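A minimal sketch of this comparison, assuming the Pod image ships wget. The function name compare_ip_and_name is hypothetical, and the Pod name, ClusterIP, Service name, and port passed to it are placeholders for your own values:

```shell
# Request the same endpoint by ClusterIP and by Service name from inside a Pod.
# Arguments (all placeholders): pod name, Service ClusterIP, Service name, port.
# Assumes the container image provides wget; swap in curl if that is what you have.
compare_ip_and_name() {
  pod="$1"
  svc_ip="$2"
  svc_name="$3"
  port="${4:-80}"
  for target in "$svc_ip" "$svc_name"; do
    # -T 5 limits each attempt to five seconds; the response body is discarded.
    if kubectl exec "$pod" -- wget -q -T 5 -O /dev/null "http://$target:$port/"; then
      echo "$target: reachable"
    else
      echo "$target: FAILED"
    fi
  done
}
```

For example, compare_ip_and_name my-pod 10.96.12.34 my-service 8080. If the ClusterIP is reachable but the name fails, DNS is the likely culprit. If the image has no HTTP client at all, kubectl debug -it my-pod --image=busybox:1.36 --target=my-container -- sh attaches an ephemeral container that has one (my-pod and my-container are again placeholders).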
CoreDNS
CoreDNS became generally available in Kubernetes 1.11 and has been the default cluster DNS server since 1.13, replacing kube-dns. It's responsible for DNS service discovery in a Kubernetes cluster.
The CoreDNS behavior can be customized via a ConfigMap, allowing you to specify custom DNS settings for your Kubernetes cluster. This ConfigMap is usually named coredns and located in the kube-system namespace.
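Confirm that CoreDNS is running as a Deployment in the kube-system namespace (kubectl -n kube-system get pods -l k8s-app=kube-dns). The CoreDNS Pods keep the k8s-app=kube-dns label for backward compatibility. If the Pods are not Running and Ready, check their events and logs.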
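The checks above can be bundled into a small helper. This is a sketch that assumes the Deployment is named coredns, as in kubeadm-based clusters; managed offerings may name it differently, and the function name coredns_health is hypothetical:

```shell
# Summarize CoreDNS health in one pass.
# Assumption: the Deployment is called "coredns" (true for kubeadm clusters).
coredns_health() {
  # Desired vs. ready replicas of the Deployment.
  kubectl -n kube-system get deployment coredns
  # Pod status and node placement (CoreDNS keeps the legacy kube-dns label).
  kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
  # Events in the namespace, sorted by time.
  kubectl -n kube-system get events --sort-by=.lastTimestamp
  # Last log lines from every CoreDNS Pod, prefixed with the Pod name.
  kubectl -n kube-system logs -l k8s-app=kube-dns --tail=50 --prefix
}
```

Crash loops, image pull errors, and failed readiness probes usually show up in the events, while Corefile syntax errors and upstream timeouts appear in the logs.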
DNS ConfigMap
If the CoreDNS Pods are not running properly, the cause may be an error in the CoreDNS ConfigMap.
Check the coredns ConfigMap (kubectl -n kube-system get configmap coredns -o yaml). The upstream nameserver (the forward directive) and the cluster domain (the kubernetes directive) should be correctly configured.
This is an example of a Corefile for configuring CoreDNS:
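As a point of comparison, the sketch below prints a Corefile matching the default that kubeadm typically generates. Treat it as an illustration, since your distribution may ship a different one. The helper names example_corefile and live_corefile are hypothetical:

```shell
# Print a Corefile resembling the kubeadm default (illustrative only).
example_corefile() {
  cat <<'EOF'
.:53 {
    errors
    health {
       lameduck 5s
    }
    ready
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
       ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
}
EOF
}

# Print the Corefile currently stored in the cluster's coredns ConfigMap.
live_corefile() {
  kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}'
}
```

Here the kubernetes line sets the cluster domain (cluster.local) and the forward line sends every other query to the resolvers listed in the node's /etc/resolv.conf. Comparing the output of the two functions is a quick way to spot local customizations.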
Pod resolv.conf
Check the Pod's resolv.conf file (kubectl exec -ti [pod-name] -- cat /etc/resolv.conf). The nameserver should be set to the ClusterIP of the kube-dns Service or, if you use NodeLocal DNSCache, to the node-local DNS IP.
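A sketch of that check, assuming the cluster DNS Service is named kube-dns in kube-system. The function name check_pod_nameserver is hypothetical, and the Pod name and namespace are placeholders:

```shell
# Compare the first nameserver in a Pod's resolv.conf with the kube-dns ClusterIP.
# Arguments (placeholders): pod name, namespace (defaults to "default").
# With the default ClusterFirst DNS policy, a Pod's resolv.conf usually looks like:
#   search <namespace>.svc.cluster.local svc.cluster.local cluster.local
#   nameserver <kube-dns ClusterIP>
#   options ndots:5
check_pod_nameserver() {
  pod="$1"
  ns="${2:-default}"
  dns_ip="$(kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}')"
  pod_dns="$(kubectl -n "$ns" exec "$pod" -- cat /etc/resolv.conf \
    | awk '$1 == "nameserver" { print $2; exit }')"
  if [ "$pod_dns" = "$dns_ip" ]; then
    echo "OK: pod uses cluster DNS at $dns_ip"
  else
    # A mismatch is expected when NodeLocal DNSCache or a custom dnsPolicy is in use.
    echo "MISMATCH: pod nameserver is '$pod_dns', kube-dns ClusterIP is '$dns_ip'"
    return 1
  fi
}
```

For example, check_pod_nameserver my-pod my-namespace. A mismatch is not always an error, so interpret it in light of the Pod's dnsPolicy and whether a node-local cache is deployed.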
Network Policies
Network policies can restrict communication. Make sure there's no NetworkPolicy preventing DNS queries.
If needed, install a network policy to allow DNS queries:
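One possible policy is sketched below. It allows every Pod in a namespace to send DNS queries (UDP and TCP port 53) to the cluster DNS Pods. The policy name allow-dns-egress and the function name allow_dns_egress are hypothetical, and the sketch assumes that the DNS Pods carry the k8s-app=kube-dns label and that the kube-system namespace has the automatic kubernetes.io/metadata.name label:

```shell
# Apply a NetworkPolicy that permits DNS egress for all Pods in a namespace.
# Argument (placeholder): target namespace, defaults to "default".
allow_dns_egress() {
  ns="${1:-default}"
  kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: ${ns}
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF
}
```

Before applying anything, kubectl get networkpolicy --all-namespaces lists the policies already in place. Because NetworkPolicy rules are additive, this policy only adds a DNS allowance and does not loosen other restrictions. Clusters that use NodeLocal DNSCache may need a different rule, since queries then go to a node-local address rather than to the CoreDNS Pods.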
Node Local DNS
NodeLocal DNSCache is a DNS caching agent that runs as a DaemonSet in Kubernetes. Its primary purpose is to improve DNS lookup performance and reliability, particularly in larger or high-load clusters.
NodeLocal DNSCache addresses DNS performance and timeout problems by caching DNS query results locally on each node. When a Pod performs a DNS lookup, it contacts the local caching agent first, which returns a cached response if one is available, avoiding a network round trip to the cluster DNS service. This reduces DNS lookup latency and DNS traffic on the network.
NodeLocal DNSCache also helps avoid problems caused by conntrack entries for DNS queries. Because queries to the local cache skip iptables DNAT and connection tracking, they are less exposed to conntrack races and to UDP DNS entries filling up the conntrack table, both of which can cause DNS lookup timeouts.
- If your cluster uses NodeLocal DNSCache, check the logs of the node local DNS pods.
- Make sure the DNSCache pod is running correctly on your nodes.
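Both checks can be sketched as follows, assuming the DaemonSet is named node-local-dns and its Pods are labeled k8s-app=node-local-dns, as in the upstream manifest; your installation may use other names. The function name node_local_dns_health is hypothetical:

```shell
# Inspect NodeLocal DNSCache across the cluster.
# Assumption: upstream manifest naming (DaemonSet "node-local-dns" in kube-system).
node_local_dns_health() {
  # Desired vs. ready Pods; the counts should match the number of schedulable nodes.
  kubectl -n kube-system get daemonset node-local-dns
  # One Pod per node, with the node name shown in the wide output.
  kubectl -n kube-system get pods -l k8s-app=node-local-dns -o wide
  # Last log lines from every cache Pod, prefixed with the Pod name.
  kubectl -n kube-system logs -l k8s-app=node-local-dns --tail=50 --prefix
}
```

A node with no cache Pod, or a Pod that keeps restarting, is a good place to start when only workloads on certain nodes see DNS failures.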
External Factors
Consider external factors such as CNI plugins, cloud-provider-specific settings, upstream DNS servers and/or firewalls.