Pitfall Guide: troubleshooting k8s DNS (coredns)

A few days ago I built a k8s cluster on UCloud (a build tutorial will follow later). Today I found that DNS names could not be resolved.

Component versions: k8s 1.15.0, coredns 1.3.1

The process went like this:

First, create an nginx Service and Deployment with the following yaml file:

apiVersion: v1
kind: Service
metadata:
  name: nginx-svc-old
  labels:
    app: nginx-svc
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx-old
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80

Because I had only deployed a master node, once these were created I ran the following command directly on the master host:

nslookup nginx-svc-old.default.svc

The domain name could not be resolved, even though I had already configured nameserver {podIP of coredns} in /etc/resolv.conf on the host. So I suspected that coredns might be the problem.

Then I created a busybox Deployment as a debugging tool with the following yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: busybox-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: busybox
    spec:
      restartPolicy: Always
      containers:
      - name: busybox
        command:
        - sleep
        - "3600"
        image: busybox

This uses the latest busybox image as of 2019/07/20. Once it was created, I exec'd into the container and ran the test command. The name still did not resolve:

/ # nslookup nginx-svc-old.default.svc
Server:    10.96.0.10
Address:  10.96.0.10:53

** server can't find nginx-svc-old.default.svc: NXDOMAIN

*** Can't find nginx-svc-old.default.svc: No answer

Here is how coredns resolves domain names inside the cluster:

When service A accesses service B in the same Namespace, the pod can reach it directly with curl b. Across Namespaces, the Namespace has to be appended after the service name, for example curl b.default. How the name is actually resolved depends on the resolv.conf file in the container.

View the /etc/resolv.conf file in the busybox container:


[root@liabio nginx]# kubectl exec -ti busybox-deployment-59755c8c6d-rmrfq sh
/ # nslookup nginx-svc-old.default.svc
Server:    10.96.0.10
Address:  10.96.0.10:53

** server can't find nginx-svc-old.default.svc: NXDOMAIN

*** Can't find nginx-svc-old.default.svc: No answer

/ # cat /etc/resolv.conf 
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ #

This file configures the DNS server. In k8s this is generally the ClusterIP of the kube-dns Service. This IP is a virtual IP: it generally cannot be pinged, but the service behind it can be accessed.
When a request is sent from inside the container, the name is resolved according to /etc/resolv.conf. The resolver sends queries to nameserver 10.96.0.10 and, for nginx-svc-old, appends each domain from the search field of /etc/resolv.conf in turn and performs a DNS lookup, namely:

The search line looks like the following (the first domain varies from pod to pod):

search default.svc.cluster.local svc.cluster.local cluster.local
nginx-svc-old.default.svc.cluster.local -> nginx-svc-old.svc.cluster.local -> nginx-svc-old.cluster.local 

The lookups continue until a match is found. So whether we run ping nginx-svc-old or ping nginx-svc-old.default, the DNS request can be completed. The two commands just go through different DNS lookup steps.
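
The search-list expansion described above can be sketched in a few lines of Python. This is a simplified model of how a glibc-style resolver orders its queries, not the real resolver code. The function name candidate_names is made up for illustration, and other resolvers (for example the one built into busybox) may order or limit the queries differently.

```python
def candidate_names(name, search, ndots):
    """Return the names a glibc-style resolver would try, in order.

    Simplified model (an assumption, not the real resolver code):
    - a name ending in "." is absolute and is tried as-is only;
    - a name with fewer than `ndots` dots tries the search domains first,
      then the name itself;
    - otherwise the name itself is tried first, then the search domains.
    """
    if name.endswith("."):
        return [name.rstrip(".")]
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") < ndots:
        return expanded + [name]
    return [name] + expanded


# Values taken from the busybox pod's /etc/resolv.conf shown earlier.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
NDOTS = 5

for query in ("nginx-svc-old", "nginx-svc-old.default.svc"):
    print(query, "->", candidate_names(query, SEARCH, NDOTS))
```

Under this model, nginx-svc-old hits the Service's full name nginx-svc-old.default.svc.cluster.local on the first candidate, while nginx-svc-old.default.svc only reaches it on the third candidate (via the cluster.local search domain).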

Based on the above, the search domains in busybox's /etc/resolv.conf look fine, and nameserver points to the correct clusterIP of the kube-dns service.

At this point I suspected coredns even more.

But the coredns log shows no errors, which suggests that coredns is not the problem.

So I searched Google for the error reported in busybox:

*** Can't find nginx-svc-old.default.svc: No answer

I found the following two issues:

Issue 1:

https://github.com/kubernetes/kubernetes/issues/66924

Issue 2:

https://github.com/easzlab/kubeasz/issues/260
Both say this is a problem with the busybox image: images after 1.28.4 have this problem. So I tried switching to the 1.28.4 image. The modified yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: busybox-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: busybox
    spec:
      restartPolicy: Always
      containers:
      - name: busybox
        command:
        - sleep
        - "3600"
        image: busybox:1.28.4

After re-applying, I exec'd into the container again. This time the domain name resolved successfully.

So why can the domain name not be resolved when the test command is run directly on the host?

I kept searching and read up on how the domain name resolver works:

nameserver is the key keyword: if no nameserver is specified, the resolver has no configured DNS server to query. The other keywords are optional. nameserver gives the address of the host to use as the domain name server. The servers are queried in the order in which they appear in the file, and a later nameserver is queried only if the earlier one does not respond. Generally no more than three servers are specified.
In /etc/resolv.conf on my host, the first three nameserver entries were all reachable, so the coredns entry further down the list was never queried.
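
To make the ordering rule concrete, here is a small Python sketch that lists which nameserver entries a resolver would actually consider. It assumes the glibc limit of three nameservers (MAXNS). The function name effective_nameservers and the 192.0.2.x addresses are placeholders for illustration and are not the values from my host. 192.168.155.73 is the coredns podIP used in this article.

```python
MAXNS = 3  # glibc uses at most this many nameserver entries


def effective_nameservers(resolv_conf_text, limit=MAXNS):
    """Return the nameserver addresses a glibc-style resolver would use, in order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop trailing comments, then split into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers[:limit]


# Hypothetical host file: the three 192.0.2.x addresses are placeholders.
SAMPLE = """\
nameserver 192.0.2.1
nameserver 192.0.2.2
nameserver 192.0.2.3
nameserver 192.168.155.73
"""

print(effective_nameservers(SAMPLE))
```

In this sample the fourth entry is dropped entirely. Even within the first three, a later server is only asked when an earlier one fails to respond. An NXDOMAIN reply counts as a response, so a cluster-only name is never passed on to the next server.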

Next I put one of the coredns podIPs, 192.168.155.73, in as the first nameserver, and the name now resolves.

In fact, it is best to put the clusterIP of the kube-dns service in /etc/resolv.conf instead, so that names can still be resolved after the coredns pod restarts (a restart changes the podIP, while the clusterIP stays the same).
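
As a rough sketch of that change, the following Python function moves a given address to the front of the nameserver list. The function name put_nameserver_first is made up for illustration. It only returns the new text and does not write /etc/resolv.conf. On many hosts that file is managed by other tools, so a manual edit may be overwritten.

```python
def put_nameserver_first(resolv_conf_text, address):
    """Return resolv.conf text with `nameserver <address>` as the first nameserver.

    Any existing entry for the same address is dropped so it is not listed twice;
    all other lines keep their original order.
    """
    kept = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver" and fields[1] == address:
            continue
        kept.append(line)
    # Insert before the first remaining nameserver line, or at the end if none.
    index = next(
        (i for i, line in enumerate(kept) if line.split()[:1] == ["nameserver"]),
        len(kept),
    )
    kept.insert(index, f"nameserver {address}")
    return "\n".join(kept) + "\n"


# 10.96.0.10 is the kube-dns clusterIP seen in the pod's resolv.conf earlier;
# 192.0.2.1 and example.internal are placeholders for the host's own settings.
before = "search example.internal\nnameserver 192.0.2.1\nnameserver 10.96.0.10\n"
print(put_nameserver_first(before, "10.96.0.10"), end="")
```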

References

Analysis of the Linux file /etc/resolv.conf
https://blog.csdn.net/lcr_happy/article/details/54867510

CoreDNS Series 1: Kubernetes internal DNS principle, drawbacks and optimizations
https://hansedong.github.io/2018/11/20/9/

Origin www.cnblogs.com/liabio/p/11683714.html