Last active: August 17, 2024 09:53
Gist: HoKim98/c50515e16d391e7c73b7d6dae9d66622
Resolve some issues while operating K8S on Bare-Metal
- Back up your ETCD data to a safe location.
- Open the `etcd.env` file on one of your ETCD cluster nodes and append the lines below.
  - `ETCD_FORCE_NEW_CLUSTER=true`
  - `ETCD_INITIAL_CLUSTER=(remove the broken nodes)`
- Restart the `etcd` service.
  - Check whether the `etcd` service is running.
  - Check whether the broken nodes have been removed from the member list.
- Remove the `ETCD_FORCE_NEW_CLUSTER` flag and restart the `etcd` service again.
- Wait a few minutes and check whether your Kubernetes cluster has recovered.
  - Restarting `kubelet` is recommended: it will recover broken core K8S services.
  - Restarting your provisioning services is recommended.
  - Rebooting the nodes will resolve most container-related issues.
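
The `etcd.env` edits in the steps above can be sketched as a few small shell helpers. This is a minimal sketch, not the author's tooling: the function names (`drop_member`, `enable_force_new_cluster`, `disable_force_new_cluster`) are hypothetical, and it assumes `ETCD_INITIAL_CLUSTER` holds a comma-separated list of `name=peer-url` pairs.

```shell
#!/bin/sh
# Sketch only: helpers for the forced-new-cluster recovery steps above.
# Function names, member names, and URLs are illustrative assumptions.

# Print a "name=peer-url" list with one broken member removed.
# $1: current ETCD_INITIAL_CLUSTER value, $2: name of the broken member
drop_member() {
    printf '%s\n' "$1" | tr ',' '\n' | awk -F= -v name="$2" '$1 != name' | paste -sd, -
}

# Append ETCD_FORCE_NEW_CLUSTER=true unless the flag is already set.
# $1: path to etcd.env
enable_force_new_cluster() {
    [ -f "$1" ] || return 1
    grep -q '^ETCD_FORCE_NEW_CLUSTER=' "$1" || echo 'ETCD_FORCE_NEW_CLUSTER=true' >> "$1"
}

# Remove the flag again once the surviving member is healthy.
# $1: path to etcd.env
disable_force_new_cluster() {
    [ -f "$1" ] || return 1
    tmp="$(mktemp)"
    grep -v '^ETCD_FORCE_NEW_CLUSTER=' "$1" > "$tmp"
    cat "$tmp" > "$1"   # overwrite in place so owner and mode are kept
    rm -f "$tmp"
}

# Usage on the surviving node (not run here; /etc/etcd.env is an assumed path):
#   enable_force_new_cluster /etc/etcd.env
#   systemctl restart etcd && etcdctl member list
#   disable_force_new_cluster /etc/etcd.env
#   systemctl restart etcd
```

The helpers only touch the env file. Restarting the service and checking the member list stay manual so each step can be checked before moving on.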
- Back up your data to a safe location.
  - ETCD: `/opt/etcd/`, `/etc/etcd`, `/etc/etcd.env`
  - Control Plane: `/etc/kubernetes`, `/var/lib/kubelet`
  - Rook Ceph: `/var/lib/rook`
- Drain the nodes.
- Reinstall the OS.
  - Rook Ceph: DO NOT WIPE THE DATA VOLUME
- Restore the data and reinstall K8S.
- Uncordon ("undrain") the nodes.
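
The backup step above could be scripted roughly as follows. This is a hedged sketch: `backup_paths` is a hypothetical helper, the tarball location is a placeholder, and it assumes a `tar` with gzip support (`-z`). Paths that do not exist on a given node are skipped, since not every node carries ETCD or Rook Ceph data.

```shell
#!/bin/sh
# Sketch only: archive whichever of the listed paths exist on this node.
# backup_paths is a hypothetical helper, not part of any Kubernetes tooling.

# $1: output tarball, remaining arguments: paths to back up
backup_paths() {
    out="$1"; shift
    # Rebuild the argument list so it contains only paths that exist.
    for p in "$@"; do
        shift
        [ -e "$p" ] && set -- "$@" "$p"
    done
    [ "$#" -gt 0 ] || return 1   # nothing to back up
    tar -czf "$out" "$@"
}

# Usage (not run here; NODE and the tarball path are placeholders):
#   backup_paths /mnt/safe/node-backup.tgz \
#       /opt/etcd /etc/etcd /etc/etcd.env \
#       /etc/kubernetes /var/lib/kubelet /var/lib/rook
#   kubectl drain "$NODE" --ignore-daemonsets --delete-emptydir-data
#   ... reinstall the OS, restore the tarball, reinstall K8S ...
#   kubectl uncordon "$NODE"
```

Store the tarball off the node, because the reinstall wipes local disks. The Rook Ceph data volume is left untouched rather than archived, as the list above insists.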
- Add an ETCD node to the existing Kubernetes ETCD cluster.
  - `etcdctl member add [new-node-name] --peer-urls=https://[new-node-ip]:2380`
  - You may need to pass cert files to authorize the command, like below:
    - `--cacert /etc/etcd/ssl/ca.pem`
    - `--cert /etc/etcd/ssl/admin-[old-node-k8s-name].pem`
    - `--key /etc/etcd/ssl/admin-[old-node-k8s-name]-key.pem`
- Update `/etc/kubernetes/manifests/kube-apiserver.yaml`.
  - `--etcd-servers=https://[new-node-ip]:2379`
  - The Kubernetes manifest directory may differ (e.g. with Kubespray).
- Restart the `kubelet` service.
  - `systemctl restart kubelet.service`
- Wait a few seconds and check that the K8S cluster is running.
- Remove the old ETCD node from your cluster.
  - `etcdctl member remove [old-node-id]`
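
The member swap above can be sketched as two helpers. This is only a sketch under stated assumptions: `etcdctl_secure` and `set_etcd_servers` are hypothetical names, `OLD_NODE` is a placeholder variable, the cert paths are the ones listed above, and the manifest is assumed to carry a single `--etcd-servers=` argument. It also assumes `etcdctl` speaks the v3 API. Older releases may need `ETCDCTL_API=3`, and you may have to add `--endpoints`.

```shell
#!/bin/sh
# Sketch only: helpers for replacing an ETCD member as described above.
# etcdctl_secure and set_etcd_servers are hypothetical names.

# Run etcdctl with the cert flags from the list above.
# OLD_NODE must hold the K8S name of a node that already has admin certs.
etcdctl_secure() {
    etcdctl \
        --cacert /etc/etcd/ssl/ca.pem \
        --cert "/etc/etcd/ssl/admin-${OLD_NODE}.pem" \
        --key "/etc/etcd/ssl/admin-${OLD_NODE}-key.pem" \
        "$@"
}

# Rewrite the --etcd-servers value in a kube-apiserver manifest.
# $1: manifest path, $2: new value (must not contain '|' or '&')
set_etcd_servers() {
    [ -f "$1" ] || return 1
    # The temp file lives outside the manifest directory so kubelet
    # does not try to load it as a static pod.
    tmp="$(mktemp)"
    sed "s|--etcd-servers=[^[:space:]]*|--etcd-servers=$2|" "$1" > "$tmp" \
        && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Usage (not run here; bracketed values are placeholders from the steps above):
#   etcdctl_secure member add [new-node-name] --peer-urls=https://[new-node-ip]:2380
#   set_etcd_servers /etc/kubernetes/manifests/kube-apiserver.yaml https://[new-node-ip]:2379
#   systemctl restart kubelet.service
#   etcdctl_secure member list        # look up [old-node-id]
#   etcdctl_secure member remove [old-node-id]
```

Add the new member first and remove the old one last, as in the list above, so the cluster does not lose a voter before its replacement has joined.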