Control Plane Expose Strategy
Expose Strategies
This chapter describes the control plane expose strategies in the Kubermatic Kubernetes Platform (KKP).
The expose strategy defines how the control plane components (e.g. Kubernetes API server) are exposed
outside the seed cluster - to the worker nodes and the actual cluster users.
The expose strategies rely on a component called nodeport-proxy. It is a L4 service proxy based on Envoy,
capable of routing the traffic based on TCP destination port, SNI or HTTP/2 tunnels, depending on which expose strategy is used.
There are 3 different expose strategies in KKP: `NodePort`, `LoadBalancer` and `Tunneling`.
Different user clusters in KKP may use different expose strategies at the same time.
NodePort
NodePort is the default expose strategy in KKP. With this strategy, a Kubernetes Service of type NodePort is created for each
exposed component (e.g. Kubernetes API server) of each user cluster. This implies that each apiserver will be
exposed on a randomly assigned TCP port from the NodePort range configured for the Seed cluster.
Each cluster normally consumes 2 NodePorts (one for the apiserver and one for the worker-nodes to control-plane “tunnel”),
which limits the total number of clusters per Seed based on the NodePort range configured in the Seed cluster.
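As a rough capacity sketch, assuming the Kubernetes default NodePort range of 30000-32767 (your Seed may be configured with a different `--service-node-port-range`):

```shell
# Assumes the Kubernetes default NodePort range; adjust range_start/range_end
# to your Seed's --service-node-port-range setting.
range_start=30000
range_end=32767
ports_per_cluster=2  # one for the apiserver + one for the worker-nodes tunnel
max_clusters=$(( (range_end - range_start + 1) / ports_per_cluster ))
echo "$max_clusters"  # prints 1384: the upper bound on user clusters per Seed
```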
By default, the nodeport-proxy is enabled for this expose strategy. If services of type LoadBalancer are
available in the Seed, all the services of all user clusters will be made available through a single LoadBalancer
service in front of the nodeport-proxy. This approach however has a limitation on some cloud platforms, where the
number of listeners per load balancer is limited, which can limit the total number of user clusters per Seed even more.
The nodeport-proxy can be disabled
if your platform doesn’t support LoadBalancer services or if the front LoadBalancer service is not required.
LoadBalancer
In the LoadBalancer expose strategy, a dedicated service of type LoadBalancer will be created for each user cluster.
This strategy requires services of type LoadBalancer to be available on the Seed cluster and usually results in higher cloud resource costs.
However, this expose strategy supports more user clusters per Seed, and provides better ingress traffic isolation per user cluster.
The port number on which the apiserver is exposed via the LoadBalancer service is still randomly allocated from the
NodePort range configured in the Seed cluster.
Tunneling
The Tunneling expose strategy addresses both the scaling issues of the NodePort strategy and cost issues of the LoadBalancer strategy.
With this strategy, the traffic is routed by the nodeport-proxy based on a combination of SNI and HTTP/2 tunnels.
Another benefit of this expose strategy is that the control plane components of each user cluster are always exposed
on fixed port numbers: `6443` for the apiserver and `8088` for the worker-nodes to control-plane “tunnel”.
This allows for a stricter firewall configuration than exposing the whole NodePort range.
Note that only port `6443` needs to be allowed for external apiserver access; port `8088` needs to be allowed
only between the worker-nodes network and the seed cluster network.
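A firewall in front of the seed cluster therefore only needs two rules for user cluster traffic. A minimal iptables sketch, using the hypothetical worker-node network `192.0.2.0/24` (substitute your own):

```shell
# Hypothetical example: 192.0.2.0/24 stands in for your worker-node network.
# Allow external clients to reach the apiservers (SNI-routed on 6443):
iptables -A INPUT -p tcp --dport 6443 -j ACCEPT
# Allow only the worker-node network to open the control-plane tunnel:
iptables -A INPUT -p tcp -s 192.0.2.0/24 --dport 8088 -j ACCEPT
iptables -A INPUT -p tcp --dport 8088 -j DROP
```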
The following limitations apply to the Tunneling expose strategy:
- It is not supported in setups where the worker nodes must pass through a
corporate proxy (HTTPS proxy) to reach the control plane.
- An agent is deployed on each worker node to provide access to the apiserver from
within the cluster via the `kubernetes` service. The agent binds the IP used as the
`kubernetes` service endpoint. This address cannot collide with any other address in the cluster / datacenter.
The default tunneling agent IP is `100.64.30.10`. The default value can be changed via the cluster API (`spec.clusterNetwork.tunnelingAgentIP`).
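As a sketch, the agent IP could be overridden in the Cluster spec like this (the address shown is only an illustration; pick one that is free in your environment):

```yaml
apiVersion: kubermatic.k8c.io/v1
kind: Cluster
metadata:
  name: clustername
spec:
  exposeStrategy: Tunneling
  clusterNetwork:
    # Illustrative value; must not collide with any other address
    # in the cluster or datacenter.
    tunnelingAgentIP: 100.64.30.10
```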
Configuring the Expose Strategy
The expose strategy can be configured at 3 levels:
- globally,
- on Seed level,
- on User Cluster level.
Seed level configuration overrides the global one, and user cluster level configuration overrides both.
Global Configuration
The expose strategy can be configured globally with the `KubermaticConfiguration` as follows:
```yaml
apiVersion: kubermatic.k8c.io/v1
kind: KubermaticConfiguration
metadata:
  name: kubermatic
  namespace: kubermatic
spec:
  exposeStrategy: NodePort
```
The valid values for `exposeStrategy` are `NodePort` / `LoadBalancer` / `Tunneling`.
NOTE: If the `exposeStrategy` is not specified in the `KubermaticConfiguration`, it defaults to `NodePort`.
Seed Level Configuration
The expose strategy can be overridden at the Seed level in the `Seed` CR, e.g.:
```yaml
apiVersion: kubermatic.k8c.io/v1
kind: Seed
metadata:
  name: kubermatic
  namespace: kubermatic
spec:
  # Override the global expose strategy with 'LoadBalancer'
  exposeStrategy: LoadBalancer
```
User Cluster Level Configuration
The expose strategy can also be overridden on the user cluster level. To do that, configure the
desired expose strategy in the cluster's `spec.exposeStrategy` in the cluster API, for example:
```yaml
apiVersion: kubermatic.k8c.io/v1
kind: Cluster
metadata:
  name: clustername
spec:
  # Override the expose strategy for this cluster only
  exposeStrategy: Tunneling
```
Disabling Nodeport-Proxy for the NodePort Expose Strategy
By default, the nodeport-proxy is enabled when using the `NodePort` expose strategy.
If services of type LoadBalancer are available in the Seed, all the services of all user clusters will be made
available through a single LoadBalancer service in front of the nodeport-proxy.
The nodeport-proxy can be disabled if your platform doesn’t support LoadBalancer services or if the front LoadBalancer service is not required.
If nodeport-proxy is disabled, the DNS entries for the Seed cluster need to point directly to the Seed cluster’s node IPs.
Note that this approach has limitations in case of Seed cluster node failures and DNS entries need
to be manually re-configured upon Seed cluster node rotation.
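As an illustration only, the DNS entries might then look like the following zone-file sketch, where the domain and addresses are placeholders for your seed's actual base domain and node IPs:

```text
; Placeholders: substitute your seed's base domain and real node IPs.
*.seed.example.com.  300  IN  A  203.0.113.10
*.seed.example.com.  300  IN  A  203.0.113.11
```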
The nodeport-proxy can be disabled at the Seed level, as shown in the following example:
```yaml
apiVersion: kubermatic.k8c.io/v1
kind: Seed
metadata:
  name: kubermatic
  namespace: kubermatic
spec:
  # Configure the expose strategy to NodePort and disable the nodeport-proxy
  exposeStrategy: NodePort
  nodeportProxy:
    disable: true
```
Migrating the Expose Strategy for Existing Clusters
The expose strategy of a user cluster normally cannot be changed after cluster creation.
However, for experienced KKP administrators, it is still possible to migrate a user cluster from one expose strategy to another using some manual steps.
This procedure will cause a temporary outage in the user cluster, so it should be performed during a maintenance window. It is also recommended to try this procedure first on a testing cluster with the same setup (same Seed, same cloud provider, same worker node OS images, etc.) before performing it on a production cluster.
Step 1
In order to allow the expose strategy migration, the cluster first needs to be labeled with the `unsafe-expose-strategy-migration` label (e.g. `unsafe-expose-strategy-migration: "true"`).
By putting this label on your cluster you acknowledge that this type of upgrade is not supported by Kubermatic and you are fully responsible for the consequences it may have.
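Assuming `kubectl` access to the Seed cluster, the label could be applied like this (the cluster name is a placeholder):

```shell
# <cluster-id> is a placeholder for your cluster's internal name in the Seed
kubectl label cluster <cluster-id> unsafe-expose-strategy-migration=true
```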
It is recommended to suspend/terminate all workloads (e.g. by draining all active nodes via `kubectl drain`) as all nodes will
lose connectivity to the cluster control plane upon updating the expose strategy. Updating the expose strategy
in the next step therefore removes the ability to coordinate and properly terminate workloads on existing nodes. A planned shutdown
might be desired to prevent potential data loss in your application stack.
Step 2
At this point, you are able to change the expose strategy of the cluster in the Cluster API.
Change the Cluster `spec.exposeStrategy` to the desired strategy:
- either using the KKP API endpoint `/api/v2/projects/{project_id}/clusters/{cluster_id}`,
- or by editing the cluster CR in the Seed cluster (`kubectl edit cluster <cluster-name>`).
When migrating from the Tunneling expose strategy (to any other), it is also necessary to delete `clusterNetwork.tunnelingAgentIP` in the cluster spec.
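One way to remove that field, sketched with a placeholder cluster name, is a JSON patch against the cluster CR in the Seed:

```shell
# <cluster-name> is a placeholder; this removes the tunneling agent IP field
kubectl patch cluster <cluster-name> --type=json \
  -p '[{"op": "remove", "path": "/spec/clusterNetwork/tunnelingAgentIP"}]'
```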
Now wait until control-plane components in the seed cluster redeploy.
Step 3
At this point, all existing kubeconfig files used to access the cluster are invalid and no longer work.
To access the cluster, download a new kubeconfig file.
Step 4
Perform a rolling restart of all machine deployments in the user cluster. All nodes need to be rotated so that
kubelet running on the nodes starts using the new API server endpoint, and all workloads in the cluster do the
same as well. This can be done from KKP UI, or using kubectl, e.g.:
```shell
forceRestartAnnotations="{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"forceRestart\":\"$(date +%s)\"}}}}}"
for md in $(kubectl get machinedeployments -n kube-system --no-headers | awk '{print $1}'); do
  kubectl patch machinedeployment -n kube-system "$md" --type=merge -p "$forceRestartAnnotations"
done
```
Afterwards, all `Node` objects that show up as `NotReady` should be deleted. No quick shell script is provided because
nodes might still be joining the cluster and temporarily show up as `NotReady`. Those should of course not be deleted.
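As a starting point for that manual check, the not-ready nodes can be listed like this; deletion is deliberately left as a per-node decision:

```shell
# Print nodes whose STATUS column is not exactly "Ready"; review each one
# (it may simply still be joining) before deleting it.
kubectl get nodes --no-headers | awk '$2 != "Ready" {print $1, $2}'
```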