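Troubleshooting: PODs in a K8S Cluster Intermittently Fail to Resolve External Domains
<h2 id="背景">Background</h2><ul>
<li>Some services deployed in the <code>dev</code> environment, for example <code>minio</code>, need to be reachable from outside the <code>k8s</code> cluster.</li>
<li>Some workloads running in <code>POD</code>s in the company's <code>dev</code> environment also reach <code>minio</code> through its domain name, and access via the <code>minio</code> domain occasionally fails.</li>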
</ul>
<h2 id="分析">Analysis</h2>
<ul>
<li>Start a <code>POD</code> that ships with the <code>nslookup</code> command for testing</li>
</ul>
<pre><code class="language-yaml">$ cat demo.yaml
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-demo
spec:
  selector:
    matchLabels:
      app: tomcat-demo
  replicas: 1
  template:
    metadata:
      labels:
        app: tomcat-demo
    spec:
      containers:
      - name: tomcat-demo
        image: tomcat:8.0.51-alpine
        ports:
        - containerPort: 8080
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: tomcat-demo
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: tomcat-demo
</code></pre>
<pre><code>kubectl apply -f demo.yaml
</code></pre>
<ul>
<li>Enter the container and resolve the domain, passing the <code>coredns</code> IP address explicitly</li>
</ul>
<pre><code class="language-shell">$ kubectl exec -it tomcat-demo-xxxx -- bash
# Test with ping
bash-4.4# ping dev-minio.evescn.com
ping: bad address 'dev-minio.evescn.com'
bash-4.4# ping dev-minio.evescn.com
ping: bad address 'dev-minio.evescn.com'
bash-4.4#
bash-4.4#
bash-4.4#
bash-4.4# ping dev-minio.evescn.com
PING dev-minio.edocyun.com.cn (172.16.0.223): 56 data bytes
64 bytes from 172.16.0.223: seq=0 ttl=63 time=0.368 ms
64 bytes from 172.16.0.223: seq=1 ttl=63 time=0.391 ms
64 bytes from 172.16.0.223: seq=2 ttl=63 time=0.354 ms
# Test with nslookup
bash-4.4# nslookup dev-minio.evescn.com 10.0.0.2
Server: 10.0.0.2
Address 1: 10.0.0.2 kube-dns.kube-system.svc.cluster.local
nslookup: can't resolve 'dev-minio.evescn.com': Name does not resolve
bash-4.4# nslookup dev-minio.evescn.com 10.0.0.2
Server: 10.0.0.2
Address 1: 10.0.0.2 kube-dns.kube-system.svc.cluster.local
nslookup: can't resolve 'dev-minio.evescn.com': Name does not resolve
bash-4.4# nslookup dev-minio.evescn.com 10.0.0.2
Server: 10.0.0.2
Address 1: 10.0.0.2 kube-dns.kube-system.svc.cluster.local
Name: dev-minio.evescn.com
Address 1: 172.16.0.232 172-16-0-232.node-exporter.kubesphere-monitoring-system.svc.cluster.local
Address 2: 172.16.0.231 172-16-0-231.kubelet.kube-system.svc.cluster.local
</code></pre>
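<p>Because the failure is intermittent, a handful of manual attempts gives only a rough picture. The following is a minimal sketch for counting successes and failures over repeated lookups. The helper name <code>resolve_loop</code> is hypothetical (it is not part of the original troubleshooting), and it assumes the lookup command exits non-zero when resolution fails.</p>

```shell
# resolve_loop <count> <command...>
# Runs the given command <count> times and reports how often it
# succeeded or failed. Hypothetical helper, for illustration only.
resolve_loop() {
  n=$1
  shift
  ok=0
  fail=0
  i=0
  while [ "$i" -lt "$n" ]; do
    # Assumption: the command returns a non-zero status on a failed lookup.
    if "$@" >/dev/null 2>&1; then
      ok=$((ok + 1))
    else
      fail=$((fail + 1))
    fi
    i=$((i + 1))
  done
  echo "ok=$ok fail=$fail"
}

# Example, run inside the test POD:
#   resolve_loop 20 nslookup dev-minio.evescn.com 10.0.0.2
```

<p>Running the same loop before and after a configuration change makes it easier to tell whether the failure rate has actually changed.</p>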
<ul>
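<li>Problem analysis</li>
</ul>
<p>The <code>ping</code> and <code>nslookup</code> tests above show that when a <code>pod</code> inside the <code>k8s</code> cluster resolves an external domain, the query first goes to <code>coredns</code> for in-cluster resolution and is then passed on to the LAN <code>dns</code>. In the failing cases <code>coredns</code> itself returns the error, so the problem lies in how <code>coredns</code> resolves external domains.</p>
<p>Searching online suggested that the <code>coredns</code> forwarding configuration was the likely cause.</p>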
<ul>
<li>The <code>coredns</code> configuration is as follows</li>
</ul>
<pre><code class="language-yaml">apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
</code></pre>
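<pre><code>The line "forward . /etc/resolv.conf" means that when coredns cannot resolve a name itself,
it forwards the DNS query to the nameservers listed in the host's resolv.conf.
When the host has several nameservers, the default policy is random: one upstream is picked
at random, and if it fails to resolve the name, the error is returned to the client.
</code></pre>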
<ul>
<li>The host's <code>/etc/resolv.conf</code></li>
</ul>
<pre><code class="language-shell">$ cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 172.16.0.50
nameserver 114.114.114.114
</code></pre>
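<p>Since the random policy can pick either upstream, it helps to query each nameserver directly and see which one fails to resolve the internal domain. A minimal sketch, meant to be run on the host; the helper names <code>list_nameservers</code> and <code>query_each_upstream</code> are hypothetical and not part of the original troubleshooting.</p>

```shell
# list_nameservers [file]
# Prints the nameserver addresses from a resolv.conf-style file,
# one per line (defaults to /etc/resolv.conf).
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# query_each_upstream <domain> [file]
# Asks every listed nameserver for the domain, one at a time.
query_each_upstream() {
  domain=$1
  for ns in $(list_nameservers "${2:-/etc/resolv.conf}"); do
    echo "== $ns =="
    nslookup "$domain" "$ns"
  done
}

# Example, run on the host:
#   query_each_upstream dev-minio.evescn.com
```

<p>If only one of the upstreams can answer for the domain, that would be consistent with the intermittent failures seen from the POD.</p>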
<ul>
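<li>Set the <code>policy</code> of <code>forward</code> to <code>sequential</code>, so that the nameservers are tried in the order they are listed</li>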
</ul>
<pre><code>        forward . /etc/resolv.conf {
            max_concurrent 1000   # added
            policy sequential     # added
        }
</code></pre>
<h2 id="解决方案">Solution</h2>
<ul>
<li>Edit the <code>coredns</code> configuration with the new settings, then restart the <code>POD</code></li>
</ul>
<pre><code class="language-yaml">$ kubectl -n kube-system edit cm coredns
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000   # added
            policy sequential     # added
        }
        cache 30
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
$ kubectl -n kube-system delete pods coredns-xxxxxx
</code></pre>
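<p>To confirm the change is in place, one option is to check that the live ConfigMap contains the new policy before repeating the lookups from the test POD. A minimal sketch; the helper name <code>has_sequential_policy</code> is hypothetical, and the <code>kubectl</code> call is shown only as a comment because it needs access to the cluster.</p>

```shell
# has_sequential_policy
# Reads a Corefile on stdin and succeeds if it contains a
# "policy sequential" line. Hypothetical helper, for illustration only.
has_sequential_policy() {
  grep -Eq '^[[:space:]]*policy[[:space:]]+sequential'
}

# Example, run from a machine with cluster access:
#   kubectl -n kube-system get cm coredns -o jsonpath='{.data.Corefile}' \
#     | has_sequential_policy && echo "policy sequential is set"
```

<p>This only checks the stored configuration; whether resolution is now stable still has to be verified by repeating the <code>nslookup</code> test from inside the POD.</p>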
<h2 id="参考博客">References</h2>
<pre><code>https://blog.csdn.net/u013812710/article/details/119897020
</code></pre><br><br>
Source: https://www.cnblogs.com/evescn/p/16480697.html