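Changing the hostname and node name of a Kubernetes master
<p>This post records the steps for changing the hostname and node name of the master (control plane) of a k8s cluster. It is a follow-up to the post on restoring a new cluster from a master server image, and the goal is to change the master's hostname and node name from <code>k8s-master0</code> to <code>kube-master0</code>.</p><p>The server operating system is Ubuntu 18.04 and the Kubernetes version is 1.20.2.</p>
<h3 id="第1次修改尝试">Attempt 1</h3>
<p>Change the hostname of the master server:</p>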
<pre><code class="language-sh">hostnamectl set-hostname kube-master0
</code></pre>
<p>Replace the hostname-related settings in /etc/kubernetes/manifests:</p>
<pre><code class="language-sh">oldhost=k8s-master0
newhost=kube-master0
cd /etc/kubernetes/manifests
find . -type f | xargs grep $oldhost
find . -type f | xargs sed -i "s/$oldhost/$newhost/g"
find . -type f | xargs grep $newhost
</code></pre>
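<p>The replacement step can be tried out on a scratch directory first. The sketch below is only an illustration: the helper name <code>replace_host</code> and the sample manifest line are made up, and GNU grep and sed (as on Ubuntu) are assumed. On a real node it is probably wise to keep any backup copy outside /etc/kubernetes/manifests, since kubelet treats files in that directory as static pod manifests.</p>

```shell
#!/bin/sh
# Hypothetical helper: replace every occurrence of $2 with $3
# in all regular files under directory $1 (GNU grep/sed assumed).
replace_host() {
    dir=$1; old=$2; new=$3
    # grep -rl lists only the files that contain the old name
    grep -rl -- "$old" "$dir" | while IFS= read -r f; do
        # the trailing g replaces every match on a line, not just the first
        sed -i "s/$old/$new/g" "$f"
    done
}

# Scratch directory with a made-up line that contains the old name twice
demo=$(mktemp -d)
printf '%s\n' '    - --initial-cluster=k8s-master0=https://k8s-master0:2380' > "$demo/etcd.yaml"

replace_host "$demo" k8s-master0 kube-master0
cat "$demo/etcd.yaml"
```

<p>Without the <code>g</code> flag, sed would only rewrite the first match on each line, so the second occurrence in the sample line would be left behind.</p>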
<p>Replace the hostname in the kubeadm-config ConfigMap:</p>
<pre><code class="language-sh">kubectl edit cm kubeadm-config -n kube-system
# then, inside the editor (vim), run:
:%s/k8s-master0/kube-master0/g
</code></pre>
<p>Restart the relevant services so that the configuration changes take effect:</p>
<pre><code class="language-sh">systemctl daemon-reload && systemctl restart kubelet && systemctl restart docker
</code></pre>
<p>Enter the etcd container and check whether the member name has been updated:</p>
<pre><code class="language-sh">docker exec -it $(docker ps -f name=etcd_etcd -q) /bin/sh
etcdctl --endpoints 127.0.0.1:2379 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key member list
896d19d1d0a08f49, started, kube-master0, https://10.0.9.171:2380, https://10.0.9.171:2379, false
</code></pre>
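<p>The same check can be scripted instead of read by eye. This is a minimal sketch that parses the sample line from the output above; the variable names are illustrative, and it assumes the comma-separated layout shown there (the third field being the member name).</p>

```shell
#!/bin/sh
# Sample line copied from the member list output above
line='896d19d1d0a08f49, started, kube-master0, https://10.0.9.171:2380, https://10.0.9.171:2379, false'
expected=kube-master0

# Fields are separated by ", "; the third one is the member name
member_name=$(printf '%s\n' "$line" | awk -F', ' '{print $3}')

if [ "$member_name" = "$expected" ]; then
    echo "etcd member name is $member_name"
else
    echo "etcd member name is still $member_name" >&2
fi
```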
<p>Check whether the node name has changed:</p>
<pre><code class="language-sh">$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-master0 NotReady control-plane,master 372d v1.20.2
</code></pre>
<p>Unfortunately, it has not.</p>
<h3 id="第2次修改尝试">Attempt 2</h3>
<p>Inspecting the node configuration with <code>kubectl edit node k8s-master0</code> turns up three places related to the node name:</p>
<ol>
<li>metadata -> labels: <code>kubernetes.io/hostname: kube-master0</code> (can be edited directly)</li>
<li>metadata: <code>name: k8s-master0</code> (cannot be changed; fails with "error: At least one of apiVersion, kind and name was changed")</li>
<li>status -> addresses: (after editing, reopening the node shows the original value restored)</li>
</ol>
<pre><code class="language-yaml">- address: k8s-master0
  type: Hostname
</code></pre>
<p>Editing the node configuration did not work.</p>
<h3 id="第3次修改尝试">Attempt 3</h3>
<p>Try using etcdctl to directly modify the data in the etcd database that contains k8s-master0.</p>
<p>Set the environment variables for etcdctl:</p>
<pre><code class="language-sh">export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key
export ETCDCTL_ENDPOINTS=10.0.9.171:2379
</code></pre>
<p>Export all keys and values:</p>
<pre><code class="language-sh">etcdctl get "" --prefix -w json > etcd-kv.json
</code></pre>
<p>From etcd-kv.json, extract all keys that contain k8s-master0:</p>
<pre><code class="language-sh">for k in $(cat etcd-kv.json | jq '.kvs[].key' | cut -d '"' -f2); do echo $k | base64 --decode; echo; done | grep k8s-master0 > kv_k8s-master0.txt
</code></pre>
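<p>The loop above decodes each key because etcdctl prints keys and values base64-encoded when asked for JSON output. A small round trip on one of the keys from this post shows the encoding; the variable names are illustrative only.</p>

```shell
#!/bin/sh
# One of the keys that appears in the list extracted from etcd
key=/registry/minions/k8s-master0

# This is the form in which the key would appear inside etcd-kv.json
encoded=$(printf '%s' "$key" | base64)

# Decoding it gives back the readable key, as the loop does for every entry
decoded=$(printf '%s' "$encoded" | base64 --decode)

echo "encoded: $encoded"
echo "decoded: $decoded"
```

<p>If only the key names are needed, <code>etcdctl get "" --prefix --keys-only</code> prints them as plain text and may be a simpler route than decoding the JSON export.</p>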
<p>The result is as follows:</p>
<pre><code class="language-text">/registry/crd.projectcalico.org/blockaffinities/k8s-master0-192-168-70-128-26
/registry/crd.projectcalico.org/ipamhandles/ipip-tunnel-addr-k8s-master0
/registry/csinodes/k8s-master0
/registry/events/default/k8s-master0.165a969b97e7c4ea
...
/registry/events/kube-system/etcd-k8s-master0.165a984e78509ebd
...
/registry/events/kube-system/kube-apiserver-k8s-master0.165a96905a9bf40c
...
/registry/events/kube-system/kube-controller-manager-k8s-master0.165a7016cd8a6ca9
...
/registry/events/kube-system/kube-scheduler-k8s-master0.165a7016cead2a32
...
/registry/leases/kube-node-lease/k8s-master0
/registry/minions/k8s-master0
/registry/pods/kube-system/etcd-k8s-master0
/registry/pods/kube-system/kube-apiserver-k8s-master0
/registry/pods/kube-system/kube-controller-manager-k8s-master0
/registry/pods/kube-system/kube-scheduler-k8s-master0
</code></pre>
<p>Use the commands below to add the key /registry/minions/kube-master0, based on the value of /registry/minions/k8s-master0:</p>
<pre><code>key=/registry/minions/k8s-master0
etcdctl get $key --print-value-only > kv-temp.txt
sed -i "s/k8s-master0/kube-master0/" kv-temp.txt
cat kv-temp.txt | etcdctl put `echo $key | sed "s/k8s-master0/kube-master0/"`
</code></pre>
<p>After adding it, running kubectl get nodes fails with:</p>
<pre><code class="language-text">Error from server: proto: Unknown: illegal tag 0 (wire type 0)
</code></pre>
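<p>Adding the <code>-w fields</code> option to etcdctl eliminated the error above, but the attempt to modify the data through etcdctl still failed. For details see the cnblogs question thread https://q.cnblogs.com/q/133164/</p>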
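<p>One plausible reason why text tools such as sed are a poor fit for this step: by default Kubernetes stores built-in resources such as nodes in etcd in a binary protobuf encoding, in which strings carry a length prefix. The old and new names differ in length, so a plain substitution changes the payload size without updating those prefixes. This is offered as a likely contributing factor, not as a confirmed diagnosis of the error above. The sketch only compares the lengths of the two names.</p>

```shell
#!/bin/sh
old=k8s-master0
new=kube-master0

# ${#var} expands to the length of the value in characters
old_len=${#old}
new_len=${#new}

echo "old name: $old_len characters, new name: $new_len characters"
if [ "$old_len" -ne "$new_len" ]; then
    echo "a plain substitution would change the length of the encoded value"
fi
```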
<h3 id="第4次修改尝试">Attempt 4</h3>
<p>Export the node configuration of k8s-master0:</p>
<pre><code class="language-sh">kubectl get node k8s-master0 -o yaml > kube-master0.yml
</code></pre>
<p>Replace k8s-master0 with kube-master0 in the exported file:</p>
<pre><code class="language-sh">sed -i "s/k8s-master0/kube-master0/g" kube-master0.yml
</code></pre>
<p>Change the hostname of the host to kube-master0:</p>
<pre><code class="language-sh">hostnamectl set-hostname kube-master0
</code></pre>
<p>Replace the hostname-related settings in /etc/kubernetes/manifests:</p>
<pre><code class="language-sh">oldhost=k8s-master0
newhost=kube-master0
cd /etc/kubernetes/manifests
find . -type f | xargs sed -i "s/$oldhost/$newhost/g"
</code></pre>
<p>Use etcdctl to delete /registry/minions/k8s-master0 from etcd:</p>
<pre><code>etcdctl del /registry/minions/k8s-master0
</code></pre>
<p>Create the kube-master0 node from the configuration file exported and modified earlier:</p>
<pre><code class="language-sh">kubectl apply -f kube-master0.yml
</code></pre>
<p>After all this, kube-master0 appeared in the kubectl get nodes list, but in the NotReady state:</p>
<pre><code class="language-sh">NAME STATUS ROLES AGE VERSION
kube-master0 NotReady control-plane,master 97m v1.20.2
</code></pre>
<p>One of the errors in syslog:</p>
<blockquote>
<p>Jan 20 18:20:27 kube-master0 kubelet: E0120 18:20:27.460470 23220 controller.go:144] failed to ensure lease exists, will retry in 7s, error: leases.coordination.k8s.io "kube-master0" is forbidden: User "system:node:k8s-master0" cannot get resource "leases" in API group "coordination.k8s.io" in the namespace "kube-node-lease": can only access node lease with the same name as the requesting node</p>
</blockquote>
<p>The <code>User "system:node:k8s-master0"</code> part of the log shows that the node's user name has not changed yet. Check /etc/kubernetes/kubelet.conf:</p>
<pre><code class="language-conf">users:
- name: default-auth
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client-current.pem
    client-key: /var/lib/kubelet/pki/kubelet-client-current.pem
</code></pre>
<p>The user identity comes from the certificate file kubelet-client-current.pem in <code>/var/lib/kubelet/pki/</code>. Use openssl to check the common name (CN) bound to the certificate:</p>
<pre><code class="language-sh">$ openssl x509 -noout -subject -in kubelet-client-current.pem
subject=O = system:nodes, CN = system:node:k8s-master0
</code></pre>
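<p>The comparison between the certificate's CN and the intended node name can also be scripted. This is a minimal sketch that works on the sample subject line shown above; the variable names are illustrative, and the exact layout of the subject line may differ between openssl versions, so the pattern is an assumption tied to the output in this post.</p>

```shell
#!/bin/sh
# Sample subject line copied from the openssl output above. On a real node it
# would come from:
#   openssl x509 -noout -subject -in /var/lib/kubelet/pki/kubelet-client-current.pem
subject='subject=O = system:nodes, CN = system:node:k8s-master0'
expected=kube-master0

# Strip everything up to and including the "system:node:" prefix of the CN
cert_node=${subject##*CN = system:node:}

if [ "$cert_node" = "$expected" ]; then
    echo "kubelet client certificate matches node name $expected"
else
    echo "kubelet client certificate is still bound to $cert_node"
fi
```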
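<p>So the certificate still dates from before the rename; a new certificate has to be generated for the node's kubelet under the new hostname.</p>
<p>After some trial and error, the following kubeadm command turned out to do the job:</p>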
<pre><code class="language-sh">kubeadm init phase kubeconfig kubelet
</code></pre>
<p>After running the command above to regenerate the certificate, the users section of /etc/kubernetes/kubelet.conf becomes:</p>
<pre><code class="language-conf">users:
- name: system:node:kube-master0
  user:
    client-certificate-data:
      ***...
    client-key-data:
      ***...
</code></pre>
<p>Restart kubelet:</p>
<pre><code class="language-sh">systemctl restart kubelet
</code></pre>
<p>Finally, it works!</p>
<pre><code class="language-sh">$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
kube-master0 Ready control-plane,master 18h v1.20.2
</code></pre>
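<p>Addendum, May 21, 2022: the certificate files in /etc/kubernetes/pki/etcd other than ca.crt and ca.key also need to be deleted, and the certificates regenerated with the commands below:</p>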
<pre><code class="language-bash">kubeadm init phase certs etcd-server
kubeadm init phase certs etcd-peer
kubeadm init phase certs etcd-healthcheck-client
</code></pre><br><br>
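<p>Rather than deleting the old etcd certificates outright, they could be moved aside first. The sketch below demonstrates the idea on a scratch directory with empty placeholder files; the helper name <code>move_leaf_certs</code> and the backup location are made up for illustration, and on a real master the source directory would be /etc/kubernetes/pki/etcd.</p>

```shell
#!/bin/sh
# Hypothetical helper: move every file in $1 except ca.crt and ca.key into $2
move_leaf_certs() {
    src=$1; backup=$2
    mkdir -p "$backup"
    for f in "$src"/*; do
        case "$(basename "$f")" in
            ca.crt|ca.key) ;;          # keep the CA pair in place
            *) mv "$f" "$backup"/ ;;   # everything else gets regenerated later
        esac
    done
}

# Scratch copy of the directory layout, using empty placeholder files
demo=$(mktemp -d)
mkdir "$demo/etcd"
for name in ca.crt ca.key server.crt server.key peer.crt peer.key \
            healthcheck-client.crt healthcheck-client.key; do
    : > "$demo/etcd/$name"
done

move_leaf_certs "$demo/etcd" "$demo/etcd-backup"
ls "$demo/etcd"
```

<p>With the CA pair left in place, the three kubeadm commands above can then recreate the server, peer and healthcheck-client certificates.</p>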
Source: https://www.cnblogs.com/dudu/p/14286983.html