This version is under construction, please use an official release version

Upgrading to KKP 2.26

Upgrading to KKP 2.26 is only supported from version 2.25. Do not attempt to upgrade directly from earlier versions; instead, apply the upgrade step by step across minor versions (e.g. from 2.24 to 2.25 and then to 2.26). It is also strongly advised to be on the latest 2.25.x patch release before upgrading to 2.26.

This guide will walk you through upgrading Kubermatic Kubernetes Platform (KKP) to version 2.26. For the full list of changes in this release, please check out the KKP changelog for v2.26. Please read the full document before proceeding with the upgrade.

Pre-Upgrade Considerations

Please review known issues before upgrading to understand if any issues might affect you.

KKP 2.26 adjusts the list of supported Kubernetes versions and removes support for Kubernetes 1.27. Existing user clusters need to be migrated to 1.28 or later before the KKP upgrade can begin.

Helm Chart Versioning

Beginning with KKP 2.26, Helm chart versions use strict semantic versions without a leading “v” (i.e. 1.2.3 instead of v1.2.3). This change was made to improve compatibility with GitOps tooling that enforces strict semver parsing. The Git tags and container tags have not changed.
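
Automation that derives the chart version from a Git or container tag may need a small adjustment. A minimal sketch, with a hypothetical helper name:

```shell
# Converts a Git or container tag such as "v2.26.0" into the chart version
# "2.26.0" by removing one leading "v", if present.
chart_version_from_tag() {
  printf '%s\n' "${1#v}"
}

# Usage:
#   chart_version_from_tag v2.26.0
```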

Helm Chart Upgrades

KKP 2.26 ships a lot of major version upgrades for the Helm charts, most notably

  • Loki & Promtail v2.5 to v2.9.x
  • Grafana 9.x to 10.4.x

Some of these updates require manual intervention or at least checking whether a given KKP system is affected by upstream changes. Please read the following sections carefully before beginning the upgrade.

Velero 1.13

Velero was updated from 1.10 to 1.13, which includes a number of significant improvements internally.

  • The default file-level backup tool was changed from Restic to Kopia. To keep backwards-compatibility, the KKP velero chart now explicitly configures Restic, but we expect that switching to Kopia will be mandatory in the future. Please use the restic.uploaderType variable in the values.yaml to switch to Kopia when desired.

cert-manager 1.14

The configuration syntax for cert-manager has changed slightly.

  • Breaking: If you have .featureGates value set in values.yaml, the features defined there will no longer be passed to cert-manager webhook, only to cert-manager controller. Use the webhook.featureGates field instead to define features to be enabled on webhook.
  • Potentially breaking: Webhook validation of CertificateRequest resources is stricter now: all KeyUsages and ExtendedKeyUsages must be defined directly in the CertificateRequest resource, and the encoded CSR can never contain more usages than defined there.

oauth2-proxy (IAP) 7.6

This upgrade includes one breaking change and one deprecation:

  • A change to how auth routes are evaluated using the flags skip-auth-route/skip-auth-regex: the new behaviour uses the regex you specify to evaluate the full path including query parameters. For more details please read the detailed PR description.
  • The environment variable OAUTH2_PROXY_GOOGLE_GROUP has been deprecated in favor of OAUTH2_PROXY_GOOGLE_GROUPS. The next major release will remove the deprecated variable.
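
The effect of the new route evaluation can be illustrated without oauth2-proxy itself. The sketch below uses grep -E as a stand-in for oauth2-proxy's regular expression matching, which is an approximation for illustration only; route_matches and the example paths are hypothetical.

```shell
# Succeeds when the extended regular expression $1 matches the request
# target $2 (path plus optional query string).
route_matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

# An anchored pattern written for the bare path no longer matches once the
# query string is part of the evaluated text:
#   route_matches '^/metrics$' '/metrics'              -> success
#   route_matches '^/metrics$' '/metrics?format=json'  -> failure
# Allowing an optional query string restores the match:
#   route_matches '^/metrics(\?.*)?$' '/metrics?format=json'  -> success
```

Patterns that end in an anchor are the ones most likely to need a review.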

Loki & Promtail 2.9 (Seed MLA)

The Loki upgrade from 2.5 to 2.9 might be the most significant bump in this KKP release. Due to the number of changes, it’s necessary to delete the existing loki StatefulSet and let Helm recreate it. Deleting the StatefulSet will not touch the PVCs, and the new StatefulSet’s pods will reuse the existing PVCs after the upgrade.
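
A minimal sketch of the deletion step. It assumes the Loki StatefulSet is named loki and lives in the logging namespace; check both against your installation before running anything. The function name is made up for this example.

```shell
# Deletes only the StatefulSet object in the given namespace (default:
# "logging", an assumption). PVCs created from its volume claim templates
# are not removed by this command.
delete_loki_statefulset() {
  namespace="${1:-logging}"
  kubectl --namespace "$namespace" get pvc   # record the PVCs beforehand
  kubectl --namespace "$namespace" delete statefulset loki
}

# Usage:
#   delete_loki_statefulset            # uses the "logging" namespace
#   delete_loki_statefulset my-logging # hypothetical custom namespace
```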

Before upgrading, review your values.yaml for Loki, as a number of syntax changes were made:

  • Most importantly, loki.config is now a templated string that aggregates many other individual values specified in loki, for example loki.tableManager gets rendered into loki.config.table_manager, and loki.loki.schemaConfig gets rendered into loki.config.schema_config. To follow these changes, if you have loki.config in your values.yaml, rename it to loki.loki. Ideally you should not need to manually override the templating string in loki.config from the upstream chart anymore. Additionally, some values are moved out or renamed slightly:
    • loki.config.schema_config becomes loki.loki.schemaConfig
    • loki.config.table_manager becomes loki.tableManager (sic)
    • loki.config.server was removed; if you need to specify something, use loki.loki.server
  • The base volume path for the Loki PVC was changed from /data/loki to /var/loki.
  • Configuration for the default image has changed: the single loki.image.repository field was split into loki.image.registry and loki.image.repository.
  • loki.affinity is now a templated string and enabled by default; if you use multiple Loki replicas, your cluster needs to have multiple nodes to host these pods.
  • All fields related to the Loki pod (loki.tolerations, loki.resources, loki.nodeSelector etc.) were moved below loki.singleBinary.
  • Self-monitoring, Grafana Agent and selftests are disabled by default now, reducing the default resource requirements for the logging stack.
  • loki.singleBinary.persistence.enableStatefulSetAutoDeletePVC is set to false to ensure that when the StatefulSet is deleted, the PVCs will not also be deleted. This allows for easier upgrades in the future, but if you scale down Loki, you have to manually delete the leftover PVCs.

Alertmanager 0.27 (Seed MLA)

This version removes the v1 API, which has been deprecated since 2019. If you have custom integrations with Alertmanager, ensure none of them use the now-removed API.
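
One way to look for leftover v1 usage is to search your own configuration and scripts for typical v1 API paths. This is only a rough sketch: the function name is hypothetical, and because Prometheus exposes similar /api/v1/ paths, every hit needs a manual review.

```shell
# Lists lines below directory $1 that reference typical Alertmanager v1 API
# paths. Hits may also belong to the Prometheus API and must be checked
# by hand.
find_alertmanager_v1_usage() {
  grep -rnE '/api/v1/(alerts|silences?|receivers|status)' "$1"
}

# Usage, with a hypothetical directory holding custom integrations:
#   find_alertmanager_v1_usage ./integrations
```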

blackbox-exporter 0.25 (Seed MLA)

This version changes the proxy_connect_header configuration structure to match Prometheus (see PR); update your values.yaml accordingly if you configured this option.

helm-exporter 1.2.16 (Seed MLA)

KKP 2.26 removes the custom Helm chart and instead now reuses the official upstream chart. Before upgrading you must delete the existing Helm release in your cluster:

$ helm --namespace monitoring delete helm-exporter

Afterwards you can install the new release from the chart.
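
If the deletion is part of a script that may run more than once, it can be guarded so that it only acts when the old release still exists. A small sketch; the function name is made up for this example:

```shell
# Deletes the old helm-exporter release only if it is still installed, so the
# step can be repeated safely.
remove_old_helm_exporter() {
  if helm --namespace monitoring status helm-exporter >/dev/null 2>&1; then
    helm --namespace monitoring delete helm-exporter
  else
    echo "helm-exporter release not found, nothing to delete"
  fi
}
```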

kube-state-metrics 2.12 (Seed MLA)

As is typical for kube-state-metrics, the upgrade is simple, but the devil is in the details. There have been many minor changes since v2.8; please review the changelog carefully if you built upon metrics provided by kube-state-metrics:

  • The deprecated experimental VerticalPodAutoscaler metrics are no longer supported and have been removed. It’s recommended to use CustomResourceState metrics to gather metrics from custom resources like the Vertical Pod Autoscaler.
  • Label names were adjusted to adhere to OTel-Prometheus standards, so existing label names that do not follow them may be replaced by ones that do. Please refer to the PR for more details.
  • Label and annotation metrics aren’t exposed by default anymore to reduce the memory usage of the default configuration of kube-state-metrics. Before this change, they only included the name and namespace of the objects, which is not relevant to users who have not opted in to these metrics.

node-exporter 1.7 (Seed MLA)

This new version comes with a few minor backwards-incompatible changes:

  • metrics of offline CPUs in CPU collector were removed
  • bcache cache_readaheads_totals metrics were removed
  • ntp collector was deprecated
  • supervisord collector was deprecated

Prometheus 2.51 (Seed MLA)

Prometheus had many improvements and some changes to the remote-write functionality that might affect you:

  • Remote-write:
    • raise default samples per send to 2,000
    • respect Retry-After header on 5xx errors
    • error storage.ErrTooOldSample is now generating HTTP error 400 instead of HTTP error 500
  • Scraping:
    • Do experimental timestamp alignment even if tolerance is bigger than 1% of scrape interval

nginx-ingress-controller 1.10

ingress-nginx v1.10 brings quite a few potentially breaking changes:

  • does not support chroot image (this will be fixed in a future minor patch release)
  • dropped Opentracing and zipkin modules, just Opentelemetry is supported as of this release
  • dropped support for PodSecurityPolicy
  • dropped support for GeoIP (legacy), only GeoIP2 is supported
  • The automatically generated NetworkPolicy from nginx 1.9.3 is now disabled by default, refer to https://github.com/kubernetes/ingress-nginx/pull/10238 for more information.

Dex 2.39

The validation of username and password in the LDAP connector is much stricter now. Dex uses the EscapeFilter function to check for special characters in credentials and prevents injections by denying such requests. Please ensure this is not an issue before upgrading.

Upgrade Procedure

Before starting the upgrade, make sure your KKP Master and Seed clusters are healthy with no failing or pending Pods. If any Pod is showing problems, investigate and fix the individual problems before applying the upgrade. This includes the control plane components for user clusters; unhealthy user clusters should not be submitted to an upgrade.
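
A quick first check for failing or pending Pods could look like the sketch below, run once against the master and once against each seed cluster. The function name is hypothetical, and the check is deliberately coarse.

```shell
# Lists Pods in all namespaces whose phase is neither Running nor Succeeded
# (for example Pending or Failed). Crash-looping Pods can still report the
# Running phase, so also check READY and RESTARTS in "kubectl get pods -A".
list_unhealthy_pods() {
  kubectl get pods --all-namespaces \
    --field-selector=status.phase!=Running,status.phase!=Succeeded
}
```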

KKP Master Upgrade

Download the latest 2.26.x release archive for the correct edition (ce for Community Edition, ee for Enterprise Edition) from the release page and extract it locally on your computer. Make sure you have the values.yaml you used to deploy KKP 2.25 available and already adjusted for any 2.26 changes (also see Pre-Upgrade Considerations), as you need to pass it to the installer. The KubermaticConfiguration is no longer necessary (unless you are adjusting it), as the KKP operator will use its in-cluster representation. From within the extracted directory, run the installer:

$ ./kubermatic-installer deploy kubermatic-master --helm-values path/to/values.yaml

# example output for a successful upgrade
INFO[0000] 🚀 Initializing installer…                     edition="Enterprise Edition" version=v2.26.0
INFO[0000] 🚦 Validating the provided configuration…
WARN[0000]    Helm values: kubermaticOperator.imagePullSecret is empty, setting to spec.imagePullSecret from KubermaticConfiguration
INFO[0000] ✅ Provided configuration is valid.
INFO[0000] 🚦 Validating existing installation…
INFO[0001]    Checking seed cluster…                     seed=kubermatic
INFO[0001] ✅ Existing installation is valid.
INFO[0001] 🛫 Deploying KKP master stack…
INFO[0001]    💾 Deploying kubermatic-fast StorageClass…
INFO[0001]    ✅ StorageClass exists, nothing to do.
INFO[0001]    📦 Deploying nginx-ingress-controller…
INFO[0001]       Deploying Helm chart…
INFO[0002]       Updating release from 2.25.3 to 2.26.0…
INFO[0005]    ✅ Success.
INFO[0005]    📦 Deploying cert-manager…
INFO[0005]       Deploying Custom Resource Definitions…
INFO[0006]       Deploying Helm chart…
INFO[0007]       Updating release from 2.25.3 to 2.26.0…
INFO[0026]    ✅ Success.
INFO[0026]    📦 Deploying Dex…
INFO[0027]       Updating release from 2.25.3 to 2.26.0…
INFO[0030]    ✅ Success.
INFO[0030]    📦 Deploying Kubermatic Operator…
INFO[0030]       Deploying Custom Resource Definitions…
INFO[0034]       Deploying Helm chart…
INFO[0035]       Updating release from 2.25.3 to 2.26.0…
INFO[0064]    ✅ Success.
INFO[0064]    📦 Deploying Telemetry
INFO[0065]       Updating release from 2.25.3 to 2.26.0…
INFO[0066]    ✅ Success.
INFO[0066]    📡 Determining DNS settings…
INFO[0066]       The main LoadBalancer is ready.
INFO[0066]
INFO[0066]         Service             : nginx-ingress-controller / nginx-ingress-controller
INFO[0066]         Ingress via hostname: <Load Balancer>.eu-central-1.elb.amazonaws.com
INFO[0066]
INFO[0066]       Please ensure your DNS settings for "<KKP FQDN>" include the following records:
INFO[0066]
INFO[0066]          <KKP FQDN>.    IN  CNAME  <Load Balancer>.eu-central-1.elb.amazonaws.com.
INFO[0066]          *.<KKP FQDN>.  IN  CNAME  <Load Balancer>.eu-central-1.elb.amazonaws.com.
INFO[0066]
INFO[0066] 🛬 Installation completed successfully. ✌

Upgrading seed clusters manually is not necessary, unless you are running the minio Helm chart or User Cluster MLA as distributed by KKP on them. Seed clusters are upgraded automatically by the KKP components.

You can follow the upgrade process by either supervising the Pods on master and seed clusters (by simply checking kubectl get pods -n kubermatic frequently) or checking status information for the Seed objects. A possible command to extract the current status by seed would be:

$ kubectl get seeds -A -o jsonpath="{range .items[*]}{.metadata.name} - {.status}{'\n'}{end}"
kubermatic - {"clusters":5,"conditions":{"ClusterInitialized":{"lastHeartbeatTime":"2024-03-11T10:53:34Z","message":"All KKP CRDs have been installed successfully.","reason":"CRDsUpdated","status":"True"},"KubeconfigValid":{"lastHeartbeatTime":"2024-03-11T16:50:09Z","reason":"KubeconfigValid","status":"True"},"ResourcesReconciled":{"lastHeartbeatTime":"2024-03-11T16:50:14Z","reason":"ReconcilingSuccess","status":"True"}},"phase":"Healthy","versions":{"cluster":"v1.27.11","kubermatic":"v2.25.0"}}

Of particular interest for the upgrade process is whether the ResourcesReconciled condition succeeded and whether the versions.kubermatic field shows the target KKP version. If this is not the case yet, the upgrade is still in flight. If the upgrade is stuck, try kubectl -n kubermatic describe seed <seed name> to see what exactly is keeping the KKP Operator from updating the Seed cluster.
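
The two fields can also be extracted directly instead of reading the full status. The sketch below follows the status structure shown in the example output above; the function names are hypothetical, and the target version passed in is an example value.

```shell
# Prints "<seed> <KKP version> <ResourcesReconciled status>" for every Seed.
# The field paths follow the status structure shown in the example output.
seed_upgrade_status() {
  kubectl get seeds -A -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.versions.kubermatic}{" "}{.status.conditions.ResourcesReconciled.status}{"\n"}{end}'
}

# Succeeds when every Seed reports the target version $1 (for example
# "v2.26.0") and ResourcesReconciled is "True"; prints the Seeds that do not.
all_seeds_upgraded() {
  target="$1"
  seed_upgrade_status | (
    ok=0
    while read -r name version reconciled; do
      if [ "$version" != "$target" ] || [ "$reconciled" != "True" ]; then
        echo "seed $name not ready: version=$version reconciled=$reconciled"
        ok=1
      fi
    done
    exit "$ok"
  )
}

# Usage:
#   all_seeds_upgraded v2.26.0 && echo "all seeds upgraded"
```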

Post-Upgrade Considerations

Deprecations and Removals

Some functionality of KKP has been deprecated or removed with KKP 2.26. You should review the full changelog and adjust any automation or scripts that might be using deprecated fields or features. Below is a list of changes that might affect you:

  • TBD

Next Steps