Cluster reconciliation is the process of determining the actual state of the cluster and taking actions based on the difference between the actual and the expected states. Cluster reconciliation can automatically install, upgrade, and repair the cluster. On top of that, it can change cluster properties, such as applying addons, creating new worker nodes, and enabling or disabling features.
The cluster reconciliation process is implemented as part of the kubeone apply
command. The apply command runs a set of predefined probes to determine the
actual state, while the expected state is defined as a KubeOne configuration
manifest. Generally, the predefined probes determine:
Depending on the difference between the determined actual state and the
expected state, the apply command takes the following actions:
For an etcd cluster with n members, quorum is (n/2)+1, where n/2 is rounded down.
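As a quick check of the quorum rule, the following shell sketch computes (n/2)+1 with integer division, along with the number of members that can fail while quorum is kept. The function names etcd_quorum and etcd_failure_tolerance are illustrative and not part of KubeOne.

```shell
#!/bin/sh
# Illustrative helpers (not part of KubeOne): quorum arithmetic for an etcd
# cluster with n members.

# Quorum is (n/2)+1, where n/2 uses integer division (rounds down).
etcd_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Members that can be lost while the remaining ones still form a quorum.
etcd_failure_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}
```

For example, etcd_quorum 3 prints 2 and etcd_failure_tolerance 3 prints 1, so a three-member cluster can afford to lose only one member at a time.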
The following flowchart describes the reconciliation process graphically:
The apply command can detect whether the cluster is in an unhealthy state. The
cluster is considered unhealthy if at least one node is unhealthy, which can
happen if:
In such a case, there are two options:
Run kubeone apply to join the new instance to the cluster.
If there are multiple unhealthy instances, you might need to replace and repair them one instance at a time in order to maintain the etcd quorum. KubeOne recommends which instances are safe to delete without affecting the quorum. It’s strongly advised to follow the recommended order; otherwise, you risk losing the quorum and all cluster data. If the cluster can’t be repaired without affecting the quorum, KubeOne will fail to repair it. In that case, disaster recovery might be required.
The apply command doesn’t modify or delete existing MachineDeployments. Modifying existing MachineDeployments should be done by the operator either by using kubectl or the Kubernetes API directly.
To make managing MachineDeployments easier, the operator can generate the manifest containing all MachineDeployments defined in the KubeOne configuration (or Terraform state) by using the following command:
kubeone config machinedeployments --manifest config.yaml -t tf.json
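One possible workflow, sketched below, saves the generated manifest to a file so it can be reviewed and edited before it is applied with kubectl. This assumes the command prints the manifest to standard output. The helper name save_machinedeployments, the output filename machinedeployments.yaml, and the KUBEONE override variable are illustrative and not part of KubeOne.

```shell
#!/bin/sh
# Hypothetical helper: write the generated MachineDeployments manifest to a file.
# Assumes `kubeone config machinedeployments` prints the manifest to stdout.
# KUBEONE is an assumed override; it defaults to the kubeone binary on PATH.
save_machinedeployments() {
  config="$1"    # KubeOne configuration manifest, e.g. config.yaml
  tf_output="$2" # Terraform output, e.g. tf.json
  out_file="$3"  # where to store the generated manifest
  "${KUBEONE:-kubeone}" config machinedeployments \
    --manifest "$config" -t "$tf_output" > "$out_file"
}

# Example:
#   save_machinedeployments config.yaml tf.json machinedeployments.yaml
#   # review and edit machinedeployments.yaml, then:
#   kubectl apply -f machinedeployments.yaml
```

Keeping the manifest in a file makes it possible to inspect the changes before they reach the cluster.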
The apply command doesn’t remove or unprovision static worker nodes. That can
be done by removing the appropriate instance manually.
If a CCM (cloud-controller-manager) is running in the cluster, the Node object
for the removed worker node should be deleted automatically.
If there’s no CCM running in the cluster, you can remove the Node object
manually using kubectl.
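When no CCM is running, the manual cleanup can be as small as a single kubectl call. The sketch below wraps it in a hypothetical helper, remove_node_object. The KUBECTL variable is an assumption added here so the command can be swapped out (for example with echo) to preview what would run, and worker-1 is a placeholder node name.

```shell
#!/bin/sh
# Hypothetical helper (not part of KubeOne): delete the Node object that is
# left behind after a static worker instance was removed manually.
# KUBECTL is an assumed override; it defaults to the kubectl binary on PATH.
remove_node_object() {
  node_name="$1"
  if [ -z "$node_name" ]; then
    echo "usage: remove_node_object <node-name>" >&2
    return 1
  fi
  "${KUBECTL:-kubectl}" delete node "$node_name"
}

# Example (placeholder node name):
#   kubectl get nodes            # find the stale Node object
#   remove_node_object worker-1
```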
Currently, the apply command doesn’t reconcile features. If you enable or
disable any feature on an already provisioned cluster, you have to explicitly
run the upgrade process for the changes to take effect.
You don’t have to change the Kubernetes version; instead, you can use the
--force-upgrade flag to force the upgrade process:
kubeone apply --manifest config.yaml -t . --force-upgrade