K8S Troubleshooting: Calico CNI Cannot Reach the Kubernetes API Server, Causing Cleanup to Fail

邱少羽梦
2025-03-05
Last updated on 2025-03-05.

kubeadm reset -f --cri-socket unix:///run/containerd/containerd.sock
[preflight] Running pre-flight checks
W0305 16:42:40.141287   31801 removeetcdmember.go:106] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
W0305 16:42:40.166536   31801 cleanupnode.go:134] [reset] Failed to evaluate the "/var/lib/kubelet" directory. Skipping its unmount and cleanup: lstat /var/lib/kubelet: no such file or directory
W0305 16:42:42.220281   31801 cleanupnode.go:99] [reset] Failed to remove containers: [failed to stop running pod 77c3648139b04fa5925e645cf26b78fe69b122584959ef390cc080494a96799a: output: E0305 16:42:41.207891   31928 remote_runtime.go:248] "StopPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to destroy network for sandbox \"77c3648139b04fa5925e645cf26b78fe69b122584959ef390cc080494a96799a\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: Get \"https://10.1.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default\": dial tcp 10.1.0.1:443: connect: connection refused" podSandboxID="77c3648139b04fa5925e645cf26b78fe69b122584959ef390cc080494a96799a"
time="2025-03-05T16:42:41+08:00" level=fatal msg="stopping the pod sandbox \"77c3648139b04fa5925e645cf26b78fe69b122584959ef390cc080494a96799a\": rpc error: code = Unknown desc = failed to destroy network for sandbox \"77c3648139b04fa5925e645cf26b78fe69b122584959ef390cc080494a96799a\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: Get \"https://10.1.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default\": dial tcp 10.1.0.1:443: connect: connection refused"
: exit status 1, failed to stop running pod 31310acd9b4330ee909d06bfb986ca2d1a8340144c330458b32bb437b01b5e4e: output: E0305 16:42:42.215736   32071 remote_runtime.go:248] "StopPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to destroy network for sandbox \"31310acd9b4330ee909d06bfb986ca2d1a8340144c330458b32bb437b01b5e4e\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: Get \"https://10.1.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default\": dial tcp 10.1.0.1:443: connect: connection refused" podSandboxID="31310acd9b4330ee909d06bfb986ca2d1a8340144c330458b32bb437b01b5e4e"
time="2025-03-05T16:42:42+08:00" level=fatal msg="stopping the pod sandbox \"31310acd9b4330ee909d06bfb986ca2d1a8340144c330458b32bb437b01b5e4e\": rpc error: code = Unknown desc = failed to destroy network for sandbox \"31310acd9b4330ee909d06bfb986ca2d1a8340144c330458b32bb437b01b5e4e\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: Get \"https://10.1.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default\": dial tcp 10.1.0.1:443: connect: connection refused"
: exit status 1]
[reset] Deleting contents of directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
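
The warning above is long, but it repeats one failure per pod sandbox. A minimal sketch for condensing such output, assuming it was saved to a file (the file name reset.log and both function names are illustrative, not part of kubeadm):

```shell
#!/bin/sh
# Hypothetical helpers for summarising a saved `kubeadm reset` log.
# Both read the log text on stdin; neither talks to the cluster.

# Print each distinct "host:port" whose connection was refused.
refused_endpoints() {
  grep -o 'dial tcp [0-9.]*:[0-9]*: connect: connection refused' \
    | awk '{ sub(/:$/, "", $3); print $3 }' \
    | sort -u
}

# Print each distinct pod sandbox ID that kubeadm failed to stop.
failed_sandboxes() {
  grep -o 'failed to stop running pod [0-9a-f]\{64\}' \
    | awk '{ print $NF }' \
    | sort -u
}

# Usage, on a saved log:
#   refused_endpoints < reset.log
#   failed_sandboxes  < reset.log
```

For the log shown here, the first function would report the single endpoint 10.1.0.1:443 and the second the two sandbox IDs, which makes it clear that every failure has the same cause.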

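According to the error messages, the cleanup fails mainly because the Calico CNI plugin cannot connect to the Kubernetes API server (10.1.0.1:443). A step-by-step solution follows: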


Step-by-step solution

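  1. Stop the related services and clean up containers

    # Stop kubelet first so that it does not restart any containers
    sudo systemctl stop kubelet
    
    # Remove all containers manually (containerd environment).
    # crictl and ctr both need the containerd socket, so containerd must still be running here.
    sudo crictl rm -fa 2>/dev/null || \
      sudo ctr -n k8s.io containers list -q | xargs -r sudo ctr -n k8s.io containers delete
    
    # Stop containerd once the containers are gone
    sudo systemctl stop containerd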
    
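  2. Delete the CNI configuration and leftover network interfaces

    # Delete the CNI configuration files
    sudo rm -rf /etc/cni/net.d/*
    
    # Remove leftover network interfaces and runtime directories
    # (cni0 and flannel.1 exist only if a bridge or flannel plugin was used)
    sudo ip link delete cni0 2>/dev/null
    sudo ip link delete flannel.1 2>/dev/null
    sudo rm -rf /var/run/calico /var/run/flannel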
    
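  3. Clean up iptables and IPVS rules

    # Reset iptables (this flushes every rule in these tables, not only the Kubernetes ones)
    sudo iptables -F
    sudo iptables -t nat -F
    sudo iptables -t mangle -F
    # Delete the now-empty user-defined chains in each table
    sudo iptables -X
    sudo iptables -t nat -X
    sudo iptables -t mangle -X
    
    # If IPVS is in use
    sudo ipvsadm --clear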
    
  4. Run kubeadm reset and remove leftover files

    # Run kubeadm reset
    sudo kubeadm reset -f --cri-socket unix:///run/containerd/containerd.sock
    
    # Remove leftover directories
    sudo rm -rf /var/lib/kubelet /etc/kubernetes "$HOME/.kube"
    
  5. Reboot the node (optional)

    # Make sure all changes take effect
    sudo reboot
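
After the cleanup (and the optional reboot), it can help to confirm that nothing Calico- or kube-proxy-related is left behind. The sketch below is one way to do that. The function names are illustrative, and the interface names are the ones Calico typically creates (cali*, tunl0, vxlan.calico) plus the two removed in step 2.

```shell
#!/bin/sh
# Hypothetical post-cleanup checks. Each filter reads command output on
# stdin, so it can be tried on saved output without touching a live node.

# From `ip -o link show` output, print interfaces typically created by
# Calico (cali*, tunl0, vxlan.calico) or by the bridge/flannel plugins
# (cni0, flannel.1). Prints nothing when none are left.
leftover_cni_links() {
  awk -F': ' '{ print $2 }' \
    | sed 's/@.*//' \
    | grep -E '^(cali[0-9a-f]+|tunl0|vxlan\.calico|cni0|flannel\.1)$' || true
}

# From `iptables-save` output, count chain declarations that belong to
# kube-proxy (KUBE-*) or Calico (cali-*).
leftover_chain_count() {
  grep -cE '^:(KUBE-|cali-)' || true
}

# On a node (iptables-save needs root):
#   ip -o link show | leftover_cni_links
#   sudo iptables-save | leftover_chain_count
```

An empty interface list and a chain count of 0 suggest the node is clean enough for a fresh network plugin installation.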
    

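Key explanations

  • Calico connection failure: kubeadm reset has already shut down the API server, so the Calico CNI plugin cannot communicate through 10.1.0.1:443 and its network teardown fails. Containers and network configuration therefore have to be cleaned up manually.
  • Forced container cleanup: containers are deleted directly with crictl or ctr, bypassing the kubelet.
  • Leftover network state: the CNI configuration and network interfaces are removed so that old settings do not affect a new cluster.
  • iptables/IPVS: clearing old rules ensures they do not interfere with the network components installed later.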
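
The first point can be checked directly: if nothing answers on the address from the log, a CNI delete that needs the API server will fail in the same way. A minimal sketch follows. It assumes bash (built with /dev/tcp support) and coreutils timeout are installed. The name api_reachable is illustrative, and the address should be taken from your own log.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a TCP connection to host:port can
# be opened within 3 seconds. It relies on bash's /dev/tcp pseudo-device,
# so bash is invoked explicitly even when this file is run by another sh.
api_reachable() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage with the address from the log above:
#   if api_reachable 10.1.0.1 443; then
#     echo "API server endpoint is reachable"
#   else
#     echo "API server endpoint is not reachable; CNI teardown will likely fail"
#   fi
```

This only tests that the TCP port accepts connections. It says nothing about TLS or authentication, which is enough to distinguish "connection refused" from other failures.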

After completing the steps above, you can re-initialize the cluster (kubeadm init) or carry out other operations.

Notes

  1. If some files cannot be deleted, reboot the server and repeat the steps.
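
A common reason for files under /var/lib/kubelet refusing deletion is that pod volumes are still mounted there. As a sketch (the function name is illustrative), the mount points can be listed from /proc/mounts, children before parents, before deciding whether to unmount them or simply reboot:

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mounts-style text on stdin and print the
# mount points located at or under the directory given as $1. Reverse
# sorting lists nested mounts before their parents, which is the order in
# which they would have to be unmounted.
mounts_under() {
  awk -v dir="$1" '$2 == dir || index($2, dir "/") == 1 { print $2 }' \
    | LC_ALL=C sort -r
}

# On a node:
#   mounts_under /var/lib/kubelet < /proc/mounts
#   mounts_under /var/lib/kubelet < /proc/mounts | xargs -r sudo umount
```

If the list is empty and deletion still fails, rebooting as described above remains the simplest option.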
