The control plane is a set of components and services that serve the Kubernetes API, manage worker nodes, and continuously reconcile desired state using control loops. The control plane components are usually placed on dedicated node(s) which are often referred to as control plane nodes.
It’s recommended to run multiple replicas of the control plane components to ensure fault tolerance and resilience. If one of the control plane nodes fails, the other nodes continue serving user requests and ensure that the workload stays up and running. Running multiple replicas of the control plane components is called a Highly-Available Control Plane.
Replicas are run on different control plane nodes. It’s advised to have an odd number of nodes (e.g. 3 or 5) and a minimum of 3 control plane nodes. For more details, check the etcd documentation on how quorum works.
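The recommendation for an odd number of nodes follows from majority-based quorum arithmetic. The following is a minimal sketch of that arithmetic, not code taken from etcd or KubeOne; the function names are chosen here for illustration only.

```python
def quorum_size(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def failure_tolerance(members: int) -> int:
    """Number of members that can fail while a majority is still available."""
    return members - quorum_size(members)


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} node(s): quorum={quorum_size(n)}, "
              f"tolerated failures={failure_tolerance(n)}")
```

Under this model, a fourth node raises the quorum size to 3 without raising the number of tolerated failures above 1, which is why 3 or 5 nodes are the usual choices.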
All required infrastructure for the cluster, along with the instances needed for control plane nodes, is managed by the user. This can be done manually or by integrating with tools such as Terraform.
Instances for worker nodes can be managed in two ways: using Kubermatic machine-controller or using KubeOne Static Workers.
Using Kubermatic machine-controller is highly advised if your provider is natively supported. Otherwise, KubeOne Static Workers are recommended.
To make it easier to get started, we provide example Terraform scripts that you can use to create the needed infrastructure and instances. The example Terraform scripts are available for all natively supported providers and can be found on GitHub.
The example Terraform scripts are meant to be used as a foundation for building your own scripts. They are optimized for ease of use and for use in E2E tests, and therefore might not be suitable for production usage out of the box.
Please check the Production Recommendations document for more details about making the example configs suitable for production usage.
KubeOne integrates with Terraform by reading the Terraform state for the information about the cluster including:
All you need to do to utilize the integration is to ensure that you have the appropriate output.tf file along with your other Terraform files. It’s required that your output.tf file follows the template used by KubeOne, which can be found along with the example Terraform scripts (an example for AWS).
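To check what your own output.tf exposes, it can help to inspect the outputs yourself. The sketch below parses JSON in the shape printed by `terraform output -json`, where each output name maps to an object with a `value` field. The output names in the sample (`example_endpoint`, `example_hosts`) are hypothetical placeholders, not the names required by the KubeOne template; refer to the template itself for those.

```python
import json


def output_values(terraform_output_json: str) -> dict:
    """Map each Terraform output name to its plain value."""
    outputs = json.loads(terraform_output_json)
    return {name: entry["value"] for name, entry in outputs.items()}


# Hypothetical sample in the shape printed by `terraform output -json`.
# The output names are placeholders, not the ones KubeOne expects.
SAMPLE = """
{
  "example_endpoint": {
    "sensitive": false,
    "type": "string",
    "value": "lb.example.com"
  },
  "example_hosts": {
    "sensitive": false,
    "type": ["list", "string"],
    "value": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
  }
}
"""

if __name__ == "__main__":
    for name, value in output_values(SAMPLE).items():
        print(f"{name}: {value}")
```

This is only an inspection aid; KubeOne reads the Terraform state on its own and does not need such a script.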
KubeOne takes care of the full cluster lifecycle, including provisioning, upgrading, repairing, and unprovisioning clusters. KubeOne utilizes Kubernetes’ kubeadm for handling provisioning and upgrading tasks. Kubeadm allows us to follow the best practices and create conformant and production-ready clusters.
Most of the tasks are carried out by running commands over SSH, therefore SSH access to the control plane nodes is required. Such tasks include installing and upgrading binaries, generating and distributing configuration files and certificates, running kubeadm, and more. Manifests are mostly applied programmatically using the client-go and controller-runtime libraries.
This approach allows us to manage clusters on any infrastructure, be it cloud, on-prem, bare metal, edge, or IoT.
Clusters are defined declaratively using the KubeOne Configuration Manifest. The configuration manifest is a YAML file that defines properties of a cluster such as:
You can grab the KubeOne Configuration Manifest reference by running the following command:
kubeone config print --full
Kubermatic machine-controller is an open-source Cluster API implementation that takes care of:
You can find more details about machine-controller in the Managing Worker Nodes Using Kubermatic machine-controller document.
Cluster API is a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters. You can learn more about the Cluster API by checking out the Cluster API repository and the Cluster API documentation website.
We use Cluster API for managing worker nodes, while control plane nodes are managed as described in the Cluster Provisioning and Management section.
The Cluster API controller (e.g. Kubermatic machine-controller) is responsible for acting on Cluster API objects — Machines, MachineSets, and MachineDeployments. The controller takes care of reconciling the desired state and ensuring that the requested machines exist and are part of the cluster.
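The reconciliation described above can be illustrated with a small in-memory sketch. It is not machine-controller's implementation: `FakeCloud` and `reconcile` are hypothetical names, and a real controller would watch Machine objects through the Kubernetes API and call a cloud provider instead of mutating a set.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """Hypothetical in-memory provider standing in for a real cloud API."""
    instances: set = field(default_factory=set)

    def create(self, name: str) -> None:
        self.instances.add(name)

    def delete(self, name: str) -> None:
        self.instances.discard(name)


def reconcile(desired: set, cloud: FakeCloud) -> list:
    """Make the provider's instances match the requested machines.

    Returns the list of actions taken, in the order they were applied.
    """
    actions = []
    # Create every requested machine that does not exist yet.
    for name in sorted(desired - cloud.instances):
        cloud.create(name)
        actions.append(("create", name))
    # Remove every instance that is no longer requested.
    for name in sorted(cloud.instances - desired):
        cloud.delete(name)
        actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    cloud = FakeCloud({"worker-a", "stale"})
    print(reconcile({"worker-a", "worker-b"}, cloud))
    # A second pass finds nothing left to do.
    print(reconcile({"worker-a", "worker-b"}, cloud))
```

Running the same reconciliation twice shows the key property of a control loop: once the actual state matches the desired state, further passes take no action.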
Machines (machines.cluster.k8s.io) define a single machine and node in the cluster. In our case, a worker node is requested by creating a Machine object which contains all the needed information to create the instance (e.g. region, instance size, security groups…). Machines are often compared to Pods, i.e. a Machine is an atomic unit representing a single node.
MachineSets (machinesets.cluster.k8s.io) maintain a stable set of Machines running at any given time. They are often used to guarantee the availability of a specified number of Machines. As such, MachineSets work similarly to ReplicaSets.
MachineDeployments (machinedeployments.cluster.k8s.io) are similar to Deployments. They are used to provide declarative updates for MachineSets/Machines and allow advanced use cases such as rolling updates.
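To make the rolling update idea concrete, here is a simplified simulation of replacing an old MachineSet with a new one. It assumes Deployment-like semantics with a surge budget and no tolerated unavailability, and it assumes new machines become ready immediately. The `rolling_update` function and its `max_surge` parameter are illustrative and do not reproduce the controller's actual algorithm.

```python
def rolling_update(replicas: int, max_surge: int = 1) -> list:
    """Return the (old, new) MachineSet sizes after each rollout step.

    The total never drops below `replicas` and never exceeds
    `replicas + max_surge`.
    """
    if replicas < 0:
        raise ValueError("replicas must not be negative")
    if max_surge < 1:
        raise ValueError("max_surge must be at least 1 for the rollout to progress")

    old, new = replicas, 0
    steps = [(old, new)]
    while old > 0:
        # Scale the new MachineSet up, staying within the surge budget.
        grow = min(replicas - new, replicas + max_surge - (old + new))
        new += grow
        steps.append((old, new))
        # Scale the old MachineSet down until the total is back at `replicas`.
        shrink = min(old, old + new - replicas)
        old -= shrink
        steps.append((old, new))
    return steps


if __name__ == "__main__":
    for old, new in rolling_update(3):
        print(f"old MachineSet: {old}, new MachineSet: {new}")
```

With 3 replicas and a surge of 1, the simulation alternates between adding one new machine and removing one old machine, so capacity is preserved throughout the update.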