When one of the control plane instances fails (i.e. the instance has failed at the cloud provider), it’s necessary to replace the failed instance with a new one as fast as possible, to avoid losing etcd quorum and blocking all kube-apiserver operations.
This guide demonstrates how to restore your cluster to the normal state (i.e. to have all kube-apiservers with etcd instances running and healthy).
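Before starting, it can help to get an overview of the control plane components. A minimal sketch, assuming a kubeadm-based control plane (which KubeOne provisions), where the static pods carry the tier=control-plane label:
kubectl -n kube-system get pods -l tier=control-plane -o wide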
If the instance is not in the appropriate healthy state (i.e. underlying hardware issues), and/or is unresponsive (for a myriad of reasons), it’s often easier to replace it than to try to fix it. Delete (in the cloud console) the malfunctioning instance if there is still one in the running state.
Even when one etcd member is physically (and abruptly) removed, the etcd ring still hopes it might come back online at a later time. Unfortunately, this is not our case, and we need to let the etcd ring know that the dead etcd member is gone forever (i.e. remove the dead etcd member from the known peers list).
First of all, check your Nodes:
kubectl get node --selector node-role.kubernetes.io/master -o wide
The failed control plane node will be displayed as NotReady, or may even be absent from the output (a running Cloud Controller Manager will eventually remove the Node object).
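If the Node object for the failed instance lingers (for example, if no Cloud Controller Manager is running), it can also be removed manually. A minimal sketch, where <FAILED-HOSTNAME> is a placeholder for the name reported by the command above:
kubectl delete node <FAILED-HOSTNAME>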
Even when a control plane node is absent, there are still other live nodes that contain healthy etcd ring members. Exec into the shell of one of those live etcd containers:
kubectl -n kube-system exec -it etcd-<ALIVE-HOSTNAME> -- sh
Set up client TLS authentication so that you can communicate with etcd:
export ETCDCTL_API=3
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/healthcheck-client.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/healthcheck-client.key
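The steps below assume these variables are exported. Equivalently, the same options can be passed as flags on each call, a sketch using the etcdctl v3 flag names:
etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  member list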
Retrieve the currently known members list:
etcdctl member list
Example output:
2ce40012b4b4e4e6, started, ip-172-31-153-216.eu-west-3.compute.internal, https://172.31.153.216:2380, https://172.31.153.216:2379, false
2e39cf93b81fb7ed, started, ip-172-31-153-246.eu-west-3.compute.internal, https://172.31.153.246:2380, https://172.31.153.246:2379, false
6713c8f2e74fb553, started, ip-172-31-153-235.eu-west-3.compute.internal, https://172.31.153.235:2380, https://172.31.153.235:2379, false
By comparing the Nodes list with the etcd members list (hostnames, IPs), we can find the ID of the missing etcd member (the dead etcd member will be missing from the Nodes list, or will be in the NotReady state).
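To make that comparison easier, the Node names and internal IPs can be printed side by side. A sketch using kubectl’s JSONPath output (on newer clusters the role label is node-role.kubernetes.io/control-plane instead):
kubectl get node --selector node-role.kubernetes.io/master \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.addresses[?(@.type=="InternalIP")].address}{"\n"}{end}'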
For example, we find that there is no control plane Node with IP 172.31.153.235. This means that etcd member ID 6713c8f2e74fb553 is the one we are looking to remove.
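Optionally, this can be cross-checked by probing every endpoint known to the cluster; the dead member should be reported as unhealthy or unreachable. A sketch, assuming the TLS variables exported above are still set:
etcdctl endpoint health --cluster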
To remove the dead etcd member:
etcdctl member remove 6713c8f2e74fb553
Example output:
Member 6713c8f2e74fb553 removed from cluster 4ec111e0dee094c3
Now, the members list should display only 2 members:
etcdctl member list
Example output:
2ce40012b4b4e4e6, started, ip-172-31-153-216.eu-west-3.compute.internal, https://172.31.153.216:2380, https://172.31.153.216:2379, false
2e39cf93b81fb7ed, started, ip-172-31-153-246.eu-west-3.compute.internal, https://172.31.153.246:2380, https://172.31.153.246:2379, false
Exit the shell in the etcd pod.
Assuming you’ve used Terraform to provision your cloud infrastructure, use terraform apply to restore the cloud infrastructure to the declared state, e.g. 3 control plane instances for highly available clusters.
From your local machine:
terraform apply
The result should be 3 running control plane VM instances: two existing ones that are currently members of the cluster, and a fresh one that will be joined to the cluster as a replacement for the failed VM.
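If you want to review the planned changes before applying them, Terraform can print them first. A sketch, run from the same directory as your Terraform configuration:
terraform plan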
kubeone apply
kubeone apply will install Kubernetes binaries and dependencies on the freshly created instance and join it back to the cluster as one of the control plane nodes.
If you’re using Terraform, make sure to regenerate the Terraform state file using the terraform output command:
terraform output -json > tf.json
Run the following apply command:
kubeone apply --manifest kubeone.yaml -t tf.json
The apply command will analyze the cluster and find the instance that needs to be provisioned and joined to the cluster. You’ll be asked to confirm your intention to provision a new node by typing yes.
INFO[15:33:55 CEST] Determine hostname…
INFO[15:33:59 CEST] Determine operating system…
INFO[15:34:02 CEST] Running host probes…
INFO[15:34:02 CEST] Electing cluster leader…
INFO[15:34:02 CEST] Elected leader "ip-172-31-220-51.eu-west-3.compute.internal"…
INFO[15:34:05 CEST] Building Kubernetes clientset…
INFO[15:34:06 CEST] Running cluster probes…
The following actions will be taken:
Run with --verbose flag for more information.
+ join control plane node "ip-172-31-221-102.eu-west-3.compute.internal" (172.31.221.102) using 1.18.6
+ ensure machinedeployment "marko-1-eu-west-3b" with 1 replica(s) exists
+ ensure machinedeployment "marko-1-eu-west-3c" with 1 replica(s) exists
+ ensure machinedeployment "marko-1-eu-west-3a" with 1 replica(s) exists
Do you want to proceed (yes/no):
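As the output notes, more detail about the planned actions can be obtained by re-running the command with the --verbose flag, for example:
kubeone apply --manifest kubeone.yaml -t tf.json --verbose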
After confirming the intention, KubeOne will start provisioning the newly created instance. This can take several minutes. After the command is done, you can run kubectl get nodes to verify that all nodes are running and healthy.
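To additionally confirm that the etcd ring is back to three members, the earlier member listing can be repeated in a single command. A sketch reusing the certificate paths shown above, with <ALIVE-HOSTNAME> as a placeholder for one of the healthy control plane hostnames:
kubectl -n kube-system exec etcd-<ALIVE-HOSTNAME> -- sh -c \
  'ETCDCTL_API=3 etcdctl --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key member list'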