Summary of Kubernetes installation errors
<p>1. kubeadm init fails during initialization</p><p># kubeadm init --kubernetes-version=v1.13.3 --pod-network-cidr=10.244.0.0/16 --service-cidr=10.96.0.0/12 --ignore-preflight-errors=Swap</p>
<p>[init] Using Kubernetes version: v1.13.3</p>
<p>[preflight] Running pre-flight checks</p>
<p>[WARNING Swap]: running with swap on is not supported. Please disable swap</p>
<p>[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 18.09.2. Latest validated version: 18.06</p>
<p>[preflight] Pulling images required for setting up a Kubernetes cluster</p>
<p>[preflight] This might take a minute or two, depending on the speed of your internet connection</p>
<p>[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'</p>
<p>error execution phase preflight: [preflight] Some fatal errors occurred:</p>
<p>[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-apiserver:v1.13.3: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)</p>
<p>, error: exit status 1</p>
<p>[ERROR ImagePull]: failed to pull image k8s.gcr.io/kube-controller-manager:v1.13.3: output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)</p>
<p>, error: exit status 1</p>
<p> </p>
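<p>Solution: pull the images from a reachable mirror, retag them as k8s.gcr.io, then remove the mirror tags.</p>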
<p># cat 12.sh </p>
<p>#!/bin/bash</p>
<p>docker pull mirrorgooglecontainers/kube-apiserver:v1.13.3</p>
<p>docker pull mirrorgooglecontainers/kube-controller-manager:v1.13.3</p>
<p>docker pull mirrorgooglecontainers/kube-scheduler:v1.13.3</p>
<p>docker pull mirrorgooglecontainers/kube-proxy:v1.13.3</p>
<p>docker pull mirrorgooglecontainers/pause:3.1</p>
<p>docker pull mirrorgooglecontainers/etcd:3.2.24</p>
<p>docker pull coredns/coredns:1.2.6</p>
<p>docker tag mirrorgooglecontainers/kube-proxy:v1.13.3 k8s.gcr.io/kube-proxy:v1.13.3</p>
<p>docker tag mirrorgooglecontainers/kube-scheduler:v1.13.3 k8s.gcr.io/kube-scheduler:v1.13.3</p>
<p>docker tag mirrorgooglecontainers/kube-apiserver:v1.13.3 k8s.gcr.io/kube-apiserver:v1.13.3</p>
<p>docker tag mirrorgooglecontainers/kube-controller-manager:v1.13.3 k8s.gcr.io/kube-controller-manager:v1.13.3</p>
<p>docker tag mirrorgooglecontainers/etcd:3.2.24 k8s.gcr.io/etcd:3.2.24</p>
<p>docker tag coredns/coredns:1.2.6 k8s.gcr.io/coredns:1.2.6</p>
<p>docker tag mirrorgooglecontainers/pause:3.1 k8s.gcr.io/pause:3.1</p>
<p>docker rmi mirrorgooglecontainers/kube-apiserver:v1.13.3</p>
<p>docker rmi mirrorgooglecontainers/kube-controller-manager:v1.13.3</p>
<p>docker rmi mirrorgooglecontainers/kube-scheduler:v1.13.3</p>
<p>docker rmi mirrorgooglecontainers/kube-proxy:v1.13.3</p>
<p>docker rmi mirrorgooglecontainers/pause:3.1</p>
<p>docker rmi mirrorgooglecontainers/etcd:3.2.24</p>
<p>docker rmi coredns/coredns:1.2.6</p>
<p># ./12.sh</p>
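<p>The same pull, tag and remove sequence can also be written as a loop. The following is a minimal sketch, not the original 12.sh: the function names mirror_commands and print_all are made up for illustration, and the script only prints the docker commands so they can be reviewed before anything runs.</p>

```shell
#!/bin/sh
# Sketch: generate the pull/tag/rmi commands per image instead of listing
# them by hand. Nothing is executed unless the output is piped to a shell.
# MIRROR and IMAGES repeat the values used in 12.sh.
MIRROR="mirrorgooglecontainers"
IMAGES="kube-apiserver:v1.13.3 kube-controller-manager:v1.13.3 kube-scheduler:v1.13.3 kube-proxy:v1.13.3 pause:3.1 etcd:3.2.24"

# mirror_commands SRC_REPO IMAGE... (hypothetical helper)
# Prints the three docker commands needed for each image.
mirror_commands() {
    src_repo=$1
    shift
    for image in "$@"; do
        echo "docker pull $src_repo/$image"
        echo "docker tag $src_repo/$image k8s.gcr.io/$image"
        echo "docker rmi $src_repo/$image"
    done
}

# print_all (hypothetical helper): commands for every image in 12.sh.
print_all() {
    # Word splitting of $IMAGES is intended here.
    mirror_commands "$MIRROR" $IMAGES
    # CoreDNS comes from its own repository (coredns/coredns), as in 12.sh.
    mirror_commands "coredns" "coredns:1.2.6"
}
```

<p>Running print_all | sh would execute the printed commands. The order is per image (pull, tag, rmi) rather than grouped as in 12.sh, but it should leave the same k8s.gcr.io tags on the host.</p>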
<p> </p>
<p>2. Swap must be disabled (kubeadm init times out waiting for the kubelet)</p>
<p>Unfortunately, an error has occurred:</p>
<p>timed out waiting for the condition</p>
<p>This error is likely caused by:</p>
<p>- The kubelet is not running</p>
<p>- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)</p>
<p>If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:</p>
<p>- 'systemctl status kubelet'</p>
<p>- 'journalctl -xeu kubelet'</p>
<p> </p>
<p>Solution:</p>
<p># echo "KUBELET_EXTRA_ARGS=--fail-swap-on=false" > /etc/sysconfig/kubelet</p>
<p># vim /etc/fstab </p>
<p>#UUID=c5f6d686-6b5a-48ae-92c0-df2f44b6402b swap swap defaults 0 0 -- comment out the swap entry</p>
<p># </p>
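<p>Commenting out the entry in /etc/fstab only takes effect at the next boot; swap that is already active normally has to be turned off separately with swapoff -a (an addition here, not part of the original steps). The fstab edit itself can be sketched as a filter. comment_swap_entries is a hypothetical helper name, and it writes to stdout rather than modifying /etc/fstab in place.</p>

```shell
#!/bin/sh
# comment_swap_entries (hypothetical helper): read fstab-style text on stdin
# and comment out every active line whose filesystem type (third field) is
# "swap". Lines that are already commented are passed through unchanged.
comment_swap_entries() {
    awk '!/^[[:space:]]*#/ && $3 == "swap" { print "#" $0; next } { print }'
}

# Possible usage (review the result before replacing the real file):
#   comment_swap_entries < /etc/fstab > /tmp/fstab.new
#   swapoff -a        # turn off swap that is currently active
```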
<p> </p>
<p>3. Pods cannot be listed from a k8s worker node</p>
<p> </p>
<p># kubectl get pods</p>
<p>The connection to the server localhost:8080 was refused - did you specify the right host or port?</p>
<p># </p>
<p> </p>
<p>Solution:</p>
<p># scp -r /etc/kubernetes/admin.conf root@k8s2:/etc/kubernetes/ -- copy admin.conf to each of the other worker nodes</p>
<p># vim /root/.bash_profile -- add the environment variable on each worker node</p>
<p>export KUBECONFIG=/etc/kubernetes/admin.conf</p>
<p># source /root/.bash_profile </p>
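<p>The "localhost:8080" error suggests that kubectl found no kubeconfig on the node. Instead of editing the profile by hand on every node, the export line could be appended idempotently. A minimal sketch; ensure_line is a hypothetical helper, not part of kubectl.</p>

```shell
#!/bin/sh
# ensure_line FILE LINE (hypothetical helper)
# Appends LINE to FILE only if an identical line is not already present,
# so the function can be run repeatedly without duplicating the entry.
ensure_line() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Possible usage on a worker node:
#   ensure_line /root/.bash_profile 'export KUBECONFIG=/etc/kubernetes/admin.conf'
```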
<p> </p>
<p>4. Error when logging in to Harbor to push images</p>
<p> </p>
<p># docker login http://192.168.8.10</p>
<p>Username: admin</p>
<p>Password: </p>
<p>Error response from daemon: Get https://192.168.8.10/v2/: dial tcp 192.168.8.10:443: connect: connection refused</p>
<p> </p>
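<p>Solution: Harbor is served over HTTP here, so the registry has to be declared as insecure to the Docker daemon.</p>
<p># vim /usr/lib/systemd/system/docker.service -- modify the ExecStart parameters</p>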
<p>ExecStart=/usr/bin/dockerd -H fd:// --insecure-registry=192.168.8.10</p>
<p># systemctl daemon-reload</p>
<p># systemctl restart docker</p>
<p># ps -ef | grep -i docker</p>
<p>root 808561 1884 0 16:08 ? 00:00:00 containerd-shim -namespace moby -workdir /var/lib/containerd/io.containerd.runtime.v1.linux/moby/5f13be571995f00b5ec00b8941f612bbf0b5429ae183ab0553e579d31c43f300 -address /run/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc</p>
<p>root 874547 1 6 16:50 ? 00:00:00 /usr/bin/dockerd -H fd:// --insecure-registry=192.168.8.10</p>
<p># docker-compose ps</p>
<p> Name Command State Ports </p>
<p>------------------------------------------------------------------------------------------------------------------------------</p>
<p>harbor-adminserver /harbor/start.sh Up </p>
<p>harbor-core /harbor/start.sh Up </p>
<p>harbor-db /entrypoint.sh postgres Up 5432/tcp </p>
<p>harbor-jobservice /harbor/start.sh Up </p>
<p>harbor-log /bin/sh -c /usr/local/bin/ ... Up 127.0.0.1:1514->10514/tcp </p>
<p>harbor-portal nginx -g daemon off; Up 80/tcp </p>
<p>nginx nginx -g daemon off; Up 0.0.0.0:443->443/tcp, 0.0.0.0:4443->4443/tcp, 0.0.0.0:80->80/tcp</p>
<p>redis docker-entrypoint.sh redis ... Up 6379/tcp </p>
<p>registry /entrypoint.sh /etc/regist ... Up 5000/tcp </p>
<p>registryctl /harbor/start.sh Up </p>
<p># docker login 192.168.8.10</p>
<p>Username: admin</p>
<p>Password: </p>
<p>WARNING! Your password will be stored unencrypted in /root/.docker/config.json.</p>
<p>Configure a credential helper to remove this warning. See</p>
<p>https://docs.docker.com/engine/reference/commandline/login/#credentials-store</p>
<p>Login Succeeded</p>
<p>#</p>
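<p>As an alternative to editing the systemd unit, Docker also reads the "insecure-registries" key from /etc/docker/daemon.json. The sketch below only prints such a document; insecure_registry_json is a hypothetical helper name. The option should be set in one place only, since dockerd is known to refuse to start when the same directive appears both as a flag and in the configuration file.</p>

```shell
#!/bin/sh
# insecure_registry_json HOST... (hypothetical helper)
# Prints a daemon.json document listing the given registries under
# Docker's "insecure-registries" key.
insecure_registry_json() {
    printf '{\n  "insecure-registries": ['
    sep=""
    for host in "$@"; do
        printf '%s"%s"' "$sep" "$host"
        sep=", "
    done
    printf ']\n}\n'
}

# Possible usage (merge by hand if /etc/docker/daemon.json already exists):
#   insecure_registry_json 192.168.8.10 > /etc/docker/daemon.json
#   systemctl restart docker
```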
<p> </p>
<p>5. CNI error reported after adding a node to Kubernetes</p>
<p># kubectl describe pods ecs-web-desktop-7cbc98dcdb-nw4fw -n ecs-local-area</p>
<p> Normal SuccessfulMountVolume 8h kubelet, ecsnode03 MountVolume.SetUp succeeded for volume "volume-data"<br> Normal SuccessfulMountVolume 8h kubelet, ecsnode03 MountVolume.SetUp succeeded for volume "setenv-sh"<br> Normal SuccessfulMountVolume 8h kubelet, ecsnode03 MountVolume.SetUp succeeded for volume "default-token-69l45"<br> Warning FailedCreatePodSandBox 8h (x11 over 8h) kubelet, ecsnode03 Failed create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod "ecs-web-desktop-7cbc98dcdb-nw4fw_ecs-local-area" network: could not initialize etcdv3 client: open /etc/cni/etcd/pki/calico-etcd-client.crt: no such file or directory<br> Normal SandboxChanged 8h (x123 over 8h) kubelet, ecsnode03 Pod sandbox changed, it will be killed and re-created.<br> Normal Scheduled 7m default-scheduler Successfully assigned ecs-web-desktop-7cbc98dcdb-nw4fw to ecsnode03</p>
<p> </p>
<p>Solution:</p>
<p>The etcd on the master node does not yet hold the new node's data; the problem goes away by itself after about 5 minutes.</p>
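<p>The event above reports that /etc/cni/etcd/pki/calico-etcd-client.crt does not exist on the node yet. Rather than waiting a fixed five minutes, one could poll for that file. A minimal sketch; wait_for_file is a hypothetical helper, and the 300-second timeout in the usage line simply matches the five minutes mentioned above.</p>

```shell
#!/bin/sh
# wait_for_file PATH TIMEOUT_SECONDS [INTERVAL_SECONDS] (hypothetical helper)
# Returns 0 as soon as PATH exists, or 1 if it is still missing once
# TIMEOUT_SECONDS have been spent waiting. INTERVAL_SECONDS defaults to 5.
wait_for_file() {
    path=$1
    timeout=$2
    interval=${3:-5}
    waited=0
    while [ ! -e "$path" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Possible usage on the new node:
#   wait_for_file /etc/cni/etcd/pki/calico-etcd-client.crt 300 && echo "certificate present"
```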
<p> </p>
<p>6. CoreDNS restarts frequently after a successful Kubernetes installation (cause: the iptables rules were in a bad state)</p>
<p># kubectl get pods -n kube-system | grep coredns</p>
<p>NAME READY STATUS RESTARTS AGE</p>
<p>coredns-5c98db65d4-8vc5h 0/1 CrashLoopBackOff 3 7m42s</p>
<p>coredns-5c98db65d4-vq9j5 0/1 CrashLoopBackOff 3 7m42s</p>
<p># kubectl logs pods coredns-5c98db65d4-8vc5h -n kube-system</p>
<p>Error from server (NotFound): pods "pods" not found</p>
<p># kubectl logs coredns-5c98db65d4-8vc5h -n kube-system</p>
<p>E0907 06:44:40.045232 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host</p>
<p>E0907 06:44:40.045232 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:317: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: connect: no route to host</p>
<p>log: exiting because of error: log: cannot create log: open /tmp/coredns.coredns-5c98db65d4-8vc5h.unknownuser.log.ERROR.20190907-064440.1: no such file or directory</p>
<p># </p>
<p> </p>
<p>Solution: flush the iptables rules, then restart kubelet and docker so that their rules are rebuilt.</p>
<p># iptables -F</p>
<p># iptables -Z</p>
<p># systemctl restart kubelet</p>
<p># systemctl restart docker</p>
<p># kubectl get pods -n kube-system | grep coredns</p>
<p>NAME READY STATUS RESTARTS AGE</p>
<p>coredns-5c98db65d4-8vc5h 1/1 Running 7 12m</p>
<p>coredns-5c98db65d4-vq9j5 1/1 Running 8 12m</p>
<p>#</p>
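<p>To confirm that nothing else in kube-system is still failing after the restart, the kubectl output can be filtered by its STATUS column. A small sketch that assumes the five-column layout shown above; not_running is a hypothetical helper name.</p>

```shell
#!/bin/sh
# not_running (hypothetical helper): read `kubectl get pods` output on stdin
# and print the name and status of every pod whose STATUS column (third
# field) is not "Running". The header line, if present, is skipped.
not_running() {
    awk '$1 != "NAME" && $3 != "Running" { print $1, $3 }'
}

# Possible usage:
#   kubectl get pods -n kube-system | not_running
```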
<p> </p>
<p>7. CoreDNS pods stuck in ContainerCreating (flannel was not initialized and has to be reinstalled)</p>
<p># kubectl get pods -n kube-system | grep -i coredns<br>coredns-5644d7b6d9-8wvgt 0/1 ContainerCreating 0 79m<br>coredns-5644d7b6d9-pzr7g 0/1 ContainerCreating 0 79m</p>
<p># journalctl -u kubelet -f</p>
<p>Oct 19 14:46:20 k8s01 kubelet: E1019 14:46:20.701679 653 pod_workers.go:191] Error syncing pod e641b551-7f22-40fa-b847-658f6c7696fa ("tiller-deploy-8557598fbc-6jfp7_kube-system(e641b551-7f22-40fa-b847-658f6c7696fa)"), skipping: network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized<br>Oct 19 14:46:20 k8s01 kubelet: E1019 14:46:20.702091 653 pod_workers.go:191] Error syncing pod bd45bbe0-8529-4ee4-9fcf-90528178dc0d ("coredns-5c98db65d4-rtktb_kube-system(bd45bbe0-8529-4ee4-9fcf-90528178dc0d)"), skipping: network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized<br>Oct 19 14:46:20 k8s01 kubelet: E1019 14:46:20.702396 653 pod_workers.go:191] Error syncing pod 87d24c8c-bba8-420b-8901-9e2b8bc339ac ("coredns-5644d7b6d9-8wvgt_kube-system(87d24c8c-bba8-420b-8901-9e2b8bc339ac)"), skipping: network is not ready: runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized</p>
<p> </p>
<p>Solution:</p>
<p># wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml</p>
<p># kubectl apply -f kube-flannel.yml<br>podsecuritypolicy.policy/psp.flannel.unprivileged configured<br>clusterrole.rbac.authorization.k8s.io/flannel unchanged<br>clusterrolebinding.rbac.authorization.k8s.io/flannel unchanged<br>serviceaccount/flannel unchanged<br>configmap/kube-flannel-cfg configured<br>daemonset.apps/kube-flannel-ds-amd64 configured<br>daemonset.apps/kube-flannel-ds-arm64 configured<br>daemonset.apps/kube-flannel-ds-arm configured<br>daemonset.apps/kube-flannel-ds-ppc64le configured<br>daemonset.apps/kube-flannel-ds-s390x configured<br># kubectl get pods -n kube-system | grep -i coredns<br>coredns-5644d7b6d9-8wvgt 1/1 Running 0 92m<br>coredns-5644d7b6d9-pzr7g 1/1 Running 0 92m<br>#</p>
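<p>The kubelet message "cni config uninitialized" generally means that no CNI configuration was found on the node. Assuming the default configuration directory /etc/cni/net.d (flannel normally writes 10-flannel.conflist there), a quick check could look like the sketch below; cni_config_present is a hypothetical helper name.</p>

```shell
#!/bin/sh
# cni_config_present [DIR] (hypothetical helper)
# Succeeds if DIR (default: /etc/cni/net.d, an assumption about the node's
# kubelet settings) contains at least one *.conf, *.conflist or *.json file.
cni_config_present() {
    dir=${1:-/etc/cni/net.d}
    for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -e "$f" ] && return 0
    done
    return 1
}

# Possible usage:
#   cni_config_present || echo "no CNI config yet; network plugin not installed?"
```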
<p> </p>
<p>8. The apiserver consumes a lot of CPU in the k8s cluster, which makes it restart frequently.</p>
<p><img src="http://img.blog.itpub.net/blog/2019/10/25/ed530eb532ac6478.png?x-oss-process=style/bb" alt="" title=""></p>
<p># kubectl get nodes<br>The connection to the server 172.31.129.93:6443 was refused - did you specify the right host or port?<br>#</p>
<p>Solution:</p>
<p>(1). The server time is wrong</p>
<p># date<br>Fri Oct 25 05:04:24 CST 2019</p>
<p># ntpdate time1.aliyun.com<br>25 Oct 13:04:46 ntpdate: step time server 203.107.6.88 offset 28800.550337 sec<br># date<br>Fri Oct 25 13:04:54 CST 2019<br>#</p>
<p>(2). Check the system log (the log shows that the problem is related to the etcd pod)</p>
<p># tail -200f /var/log/messages</p>
<p>Oct 25 05:04:43 localhost kernel: XFS (dm-5): Unmounting Filesystem<br>Oct 25 05:04:43 localhost dockerd: time="2019-10-25T05:04:43.961739020+08:00" level=error msg="Handler for POST /v1.24/containers/c2d922d8a203383725b819e497272b25b8e8db315d03a75540b6fbfb3a3ed565/stop returned error: Container c2d922d8a203383725b819e497272b25b8e8db315d03a75540b6fbfb3a3ed565 is already stopped"<br>Oct 25 05:04:44 localhost kernel: XFS (dm-2): Unmounting Filesystem<br>Oct 25 05:04:44 localhost dockerd: time="2019-10-25T05:04:44.187195506+08:00" level=error msg="Handler for POST /v1.24/containers/create returned error: Conflict. The name \"/k8s_POD_etcd-ecsmaster01_kube-system_fb87a1f1730f1fe65cd7ce2b1d5a84b8_3\" is already in use by container a468c762916792739e99fc48a85c4a24ce848e09c430be1da99a44576266e548. You have to remove (or rename) that container to be able to reuse that name."<br>Oct 25 05:04:44 localhost kubelet: W1025 05:04:44.187756 8889 helpers.go:284] Unable to create pod sandbox due to conflict. Attempting to remove sandbox "a468c762916792739e99fc48a85c4a24ce848e09c430be1da99a44576266e548"<br>Oct 25 05:04:44 localhost systemd-udevd: inotify_add_watch(7, /dev/dm-2, 10) failed: No such file or directory<br>Oct 25 05:04:44 localhost kubelet: E1025 05:04:44.481171 8889 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to create a sandbox for pod "etcd-ecsmaster01": Error response from daemon: Conflict. The name "/k8s_POD_etcd-ecsmaster01_kube-system_fb87a1f1730f1fe65cd7ce2b1d5a84b8_3" is already in use by container a468c762916792739e99fc48a85c4a24ce848e09c430be1da99a44576266e548. 
You have to remove (or rename) that container to be able to reuse that name.<br>Oct 25 05:04:44 localhost kubelet: E1025 05:04:44.481217 8889 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "etcd-ecsmaster01_kube-system(fb87a1f1730f1fe65cd7ce2b1d5a84b8)" failed: rpc error: code = Unknown desc = failed to create a sandbox for pod "etcd-ecsmaster01": Error response from daemon: Conflict. The name "/k8s_POD_etcd-ecsmaster01_kube-system_fb87a1f1730f1fe65cd7ce2b1d5a84b8_3" is already in use by container a468c762916792739e99fc48a85c4a24ce848e09c430be1da99a44576266e548. You have to remove (or rename) that container to be able to reuse that name.</p>
<p>(3). Kill the etcd and apiserver processes</p>
<p># ps -ef | grep etcd</p>
<p># kill 6577</p>
<p># ps -ef | grep apiserver</p>
<p># kill 11737</p>
<p># systemctl restart kubelet</p>
<p>(4). Problem solved</p>
<p><img src="http://img.blog.itpub.net/blog/2019/10/25/a20131861107d806.png?x-oss-process=style/bb" alt="" title=""></p>
<p># kubectl get nodes<br>NAME STATUS ROLES AGE VERSION<br>ecsmaster01 Ready master 70d v1.10.0<br>ecsnode01 Ready <none> 70d v1.10.0<br>#</p>
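<p>The system log in step (2) names the container that blocks the etcd sandbox from being recreated. The IDs can be pulled out of the log with a filter like the following sketch; conflicting_containers is a hypothetical helper, and removing the listed containers with docker rm is an alternative that the steps above did not use.</p>

```shell
#!/bin/sh
# conflicting_containers (hypothetical helper): read log text on stdin and
# print the unique 64-character IDs that appear in Docker's
# "already in use by container <id>" conflict messages.
conflicting_containers() {
    grep -o 'already in use by container [0-9a-f]\{64\}' |
        awk '{ print $NF }' |
        sort -u
}

# Possible usage:
#   conflicting_containers < /var/log/messages
#   docker ps -a --filter "id=<id printed above>"   # inspect before removing
```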
<p> </p>
<p>9. kubectl cannot list the nodes and no apiserver process is running</p>
<p># kubectl get nodes<br>The connection to the server 192.168.54.128:6443 was refused - did you specify the right host or port?<br># ps -ef | grep etcd<br>root 5833 4841 0 13:47 pts/0 00:00:00 grep --color=auto etcd<br># ps -ef | grep apiserver<br>root 5911 4841 0 13:47 pts/0 00:00:00 grep --color=auto apiserver</p>
<div># systemctl status kubelet<br>● kubelet.service - kubelet: The Kubernetes Node Agent<br> Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)<br> Drop-In: /usr/lib/systemd/system/kubelet.service.d<br> └─10-kubeadm.conf<br> Active: active (running) since Sat 2019-10-26 13:38:53 CST; 9min ago<br> Docs: https://kubernetes.io/docs/<br> Main PID: 657 (kubelet)<br> Memory: 104.3M<br> CGroup: /system.slice/kubelet.service<br> └─657 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni --pod-inf...</div>
<div>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.472872 657 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://192.168.54.128:6443/api/v1/pods?fieldSel...onnection refused<br>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.480930 657 kubelet.go:2267] node "k8s01" not found<br>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.581915 657 kubelet.go:2267] node "k8s01" not found<br>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.674114 657 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:450: Failed to list *v1.Service: Get https://192.168.54.128:6443/api/v1/services?limit=50...onnection refused<br>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.682619 657 kubelet.go:2267] node "k8s01" not found<br>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.783349 657 kubelet.go:2267] node "k8s01" not found<br>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.871726 657 reflector.go:123] k8s.io/kubernetes/pkg/kubelet/kubelet.go:459: Failed to list *v1.Node: Get https://192.168.54.128:6443/api/v1/nodes?fieldSelector=...onnection refused<br>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.884269 657 kubelet.go:2267] node "k8s01" not found<br>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.913384 657 controller.go:135] failed to ensure node lease exists, will retry in 7s, error: Get https://192.168.54.128:6443/apis/coordination.k8s.io/v1/namespac...onnection refused<br>Oct 26 13:48:07 k8s01 kubelet: E1026 13:48:07.984757 657 kubelet.go:2267] node "k8s01" not found<br>Hint: Some lines were ellipsized, use -l to show in full.<br>#</div>
<p> </p>
<p>Solution: the control-plane images are missing from the local Docker image store, so pull them again.</p>
<p># docker images -- no images are listed<br>REPOSITORY TAG IMAGE ID CREATED SIZE<br># cat 16.sh<br>#!/bin/bash<br># download k8s 1.16.0 images<br># get image-list by 'kubeadm config images list --kubernetes-version=v1.16.0'<br># gcr.azk8s.cn/google-containers == k8s.gcr.io<br>images=(<br>kube-apiserver:v1.16.0<br>kube-controller-manager:v1.16.0<br>kube-scheduler:v1.16.0<br>kube-proxy:v1.16.0<br>pause:3.1<br>etcd:3.3.15-0<br>coredns:1.6.2<br>)<br>for imageName in "${images[@]}";do<br> docker pull "gcr.azk8s.cn/google-containers/$imageName"<br> docker tag "gcr.azk8s.cn/google-containers/$imageName" "k8s.gcr.io/$imageName"<br> docker rmi "gcr.azk8s.cn/google-containers/$imageName"<br>done<br># bash 16.sh -- pull the images</p>
<p># docker images<br>REPOSITORY TAG IMAGE ID CREATED SIZE<br>k8s.gcr.io/kube-apiserver v1.16.0 b305571ca60a 5 weeks ago 217MB<br>k8s.gcr.io/kube-proxy v1.16.0 c21b0c7400f9 5 weeks ago 86.1MB<br>k8s.gcr.io/kube-controller-manager v1.16.0 06a629a7e51c 5 weeks ago 163MB<br>k8s.gcr.io/kube-scheduler v1.16.0 301ddc62b80b 5 weeks ago 87.3MB<br>k8s.gcr.io/etcd 3.3.15-0 b2756210eeab 7 weeks ago 247MB<br>k8s.gcr.io/coredns 1.6.2 bf261d157914 2 months ago 44.1MB<br>k8s.gcr.io/pause 3.1 da86e6ba6ca1 22 months ago 742kB<br># kubectl get nodes<br>NAME STATUS ROLES AGE VERSION<br>k8s01 Ready master 48d v1.16.0<br>k8s02 Ready <none> 48d v1.16.0<br>k8s03 Ready <none> 8d v1.16.0<br>#</p>
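<p>Before re-pulling everything, the images kubeadm expects can be compared against what Docker actually has. The following is a sketch under the assumption that both lists are saved one image reference per line; missing_images is a hypothetical helper name.</p>

```shell
#!/bin/sh
# missing_images REQUIRED_FILE PRESENT_FILE (hypothetical helper)
# Prints every line of REQUIRED_FILE that has no identical line in
# PRESENT_FILE. grep exits with 1 when nothing is missing, which is not an
# error here, so only other exit codes are propagated.
missing_images() {
    grep -vxF -f "$2" "$1" || [ $? -eq 1 ]
}

# Possible usage:
#   kubeadm config images list --kubernetes-version=v1.16.0 > required.txt
#   docker images --format '{{.Repository}}:{{.Tag}}' > present.txt
#   missing_images required.txt present.txt
```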
<p> </p>
<p> </p>
<p> </p>
<p class="translate">From the ITPUB blog, link: http://blog.itpub.net/25854343/viewspace-2636166/. If you repost this article, please credit the source; otherwise legal liability will be pursued.</p><br><br>
Source: https://www.cnblogs.com/Gdavid/p/13353714.html