This page contains an administrator guide for the User Cluster MLA Stack. The user guide is available on the User Cluster MLA User Guide page.
The User Cluster MLA stack components have to be manually installed into every KKP Seed cluster.
At the minimal scale (to process MLA data from several user clusters), the stack requires the following resources in the seed cluster:
Apart from that, it will claim the following storage from the kubermatic-fast storage class:
The kubermatic/mla GitHub repository contains all the Helm charts of the User Cluster MLA stack and the scripts to install them. Clone or download it, so that we can deploy the MLA stack into a KKP Seed cluster. Please make sure you are using the tag that matches your KKP version, as described in the “KKP Compatibility Matrix”.
Before deploying the MLA stack into the KKP Seed cluster, let’s create two Kubernetes Secrets that contain credentials for MinIO and Grafana, and which will be used by the MLA stack and KKP controllers. The MLA repo contains a Helm chart that will auto-generate the necessary Secrets - for creating them, simply run:
helm --namespace mla upgrade --atomic --create-namespace --install mla-secrets charts/mla-secrets --values config/mla-secrets/values.yaml
After the secrets are created, the MLA stack can be deployed by using the helper script:
./hack/deploy-seed.sh
This will deploy all MLA stack components with the default settings, which may be sufficient for smaller scale setups (several user clusters). If any customization is needed for any of the components, the steps in the helper script can be manually reproduced with tweaked Helm values. See the “Setup Customization” section for more information.
After deploying MLA components into a KKP Seed cluster, Grafana and Alertmanager UI are exposed only via ClusterIP services by default. To expose them to users outside of the Seed cluster with proper authentication in place, we will use the IAP Helm chart from the Kubermatic repository.
As a rule, to integrate well with the KKP UI, Grafana and Alertmanager should be exposed at a URL of the form https://<any-prefix>.<seed-name>.<kkp-domain>, for example:
https://grafana.<seed-name>.<kkp-domain>
https://alertmanager.<seed-name>.<kkp-domain>
The prefixes chosen for Grafana and Alertmanager then need to be configured in the KKP Admin Panel Configuration to enable KKP UI integration.
Let’s start with preparing the values.yaml for the IAP Helm Chart. A starting point can be found in the config/iap/values.example.yaml file of the MLA repository:
kkp.example.com in iap.oidc_issuer_url).
grafana.seed-cluster-x.kkp.example.com in iap.deployments.grafana.ingress.host).
iap.deployments.grafana.client_secret + iap.deployments.grafana.encryption_key and iap.deployments.alertmanager.client_secret + iap.deployments.alertmanager.encryption_key to the newly generated key values (they can be generated e.g. with cat /dev/urandom | tr -dc A-Za-z0-9 | head -c32).
iap.deployments.grafana.config and iap.deployments.alertmanager.config (e.g. modify YOUR_GITHUB_ORG and YOUR_GITHUB_TEAM placeholders). Please check the OAuth Provider Configuration for more details.
It is also necessary to set up your infrastructure accordingly:
iap.deployments.grafana.ingress.host and iap.deployments.alertmanager.ingress.host so that it points to the ingress-controller service of KKP.
RedirectURIs with your domain name used in iap.deployments.grafana.ingress.host and iap.deployments.alertmanager.ingress.host and secret with your iap.deployments.grafana.client_secret and iap.deployments.alertmanager.client_secret:
dex:
  clients:
  - RedirectURIs:
    - https://grafana.seed-cluster-x.kkp.example.com/oauth/callback
    id: mla-grafana
    name: mla-grafana
    secret: YOUR_CLIENT_SECRET
  - RedirectURIs:
    - https://alertmanager.seed-cluster-x.kkp.example.com/oauth/callback
    id: mla-alertmanager
    name: mla-alertmanager
    secret: YOUR_CLIENT_SECRET
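The client secrets and encryption keys referenced in the IAP values can be produced with a small helper built on the same /dev/urandom idea as the one-liner mentioned earlier. This is only a sketch: the function name generate_key and the variable names are illustrative and not part of the MLA repository, and Bash is assumed.

```shell
#!/usr/bin/env bash
# Print a random alphanumeric key of the given length (default: 32).
generate_key() {
  local length="${1:-32}" key=""
  # Read fixed-size chunks and keep only alphanumeric bytes until enough
  # characters are collected. LC_ALL=C makes tr treat the input as raw bytes.
  while [ "${#key}" -lt "$length" ]; do
    key="${key}$(head -c 256 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9')"
  done
  printf '%s\n' "${key:0:$length}"
}

# One independent key per field; these variable names are hypothetical.
GRAFANA_CLIENT_SECRET="$(generate_key)"
GRAFANA_ENCRYPTION_KEY="$(generate_key)"
ALERTMANAGER_CLIENT_SECRET="$(generate_key)"
ALERTMANAGER_ENCRYPTION_KEY="$(generate_key)"
```

The generated client secrets have to be used consistently in both places: in the IAP values.yaml and in the matching Dex client entry.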
At this point, we can install the IAP Helm chart into the mla namespace, e.g. as follows:
helm --namespace mla upgrade --atomic --create-namespace --install iap charts/iap --values config/iap/values.yaml
For more information about how to secure your services in KKP using IAP and Dex, please check Securing System Services Documentation.
Once the User Cluster MLA stack is installed in all necessary seed clusters, it needs to be configured as described in this section.
Since the User Cluster MLA feature is in alpha stage, it has to be explicitly enabled via a feature gate in the KubermaticConfiguration, e.g.:
apiVersion: operator.kubermatic.io/v1alpha1
kind: KubermaticConfiguration
metadata:
  name: kubermatic
  namespace: kubermatic
spec:
  featureGates:
    UserClusterMLA:
      enabled: true
Since the MLA stack has to be manually installed into every KKP Seed Cluster, it is necessary to explicitly enable it on the Seed Cluster level after it is installed. This can be done via mla.user_cluster_mla_enabled option of the Seed Custom Resource / API object, e.g.:
apiVersion: kubermatic.k8s.io/v1
kind: Seed
metadata:
  name: europe-west3-c
  namespace: kubermatic
spec:
  mla:
    user_cluster_mla_enabled: true
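Instead of editing the Seed manifest by hand, the same option can be toggled with a JSON merge patch. The following is a minimal sketch, assuming kubectl is pointed at the cluster that holds the Seed object; the helper names seed_mla_patch and set_seed_mla are made up for this example.

```shell
#!/usr/bin/env bash
# Build the JSON merge patch that sets spec.mla.user_cluster_mla_enabled.
# $1: "true" or "false"
seed_mla_patch() {
  local enabled="$1"
  printf '{"spec":{"mla":{"user_cluster_mla_enabled":%s}}}' "$enabled"
}

# Apply the patch to a Seed object in the kubermatic namespace.
# $1: Seed name, $2: "true" or "false"
set_seed_mla() {
  local seed="$1" enabled="$2"
  kubectl --namespace kubermatic patch seed "$seed" \
    --type merge --patch "$(seed_mla_patch "$enabled")"
}

# Usage (Seed name taken from the example manifest):
#   set_seed_mla europe-west3-c true
```

The same helper with false as the second argument corresponds to disabling the feature for that Seed.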
There are several options in the KKP “Admin Panel” which are related to user cluster MLA, as shown on the picture below:

User Cluster Logging:
User Cluster Monitoring:
User Cluster Alertmanager Prefix:
alertmanager the final URL would be https://alertmanager.<seed-name>.<kkp-domain>.
User Cluster Grafana Prefix:
grafana the final URL would be https://grafana.<seed-name>.<kkp-domain>.
KKP provides several addons for user clusters, that can be helpful when the User Cluster Monitoring feature is enabled, namely:
When these addons are deployed to user clusters, no further configuration of the user cluster MLA stack is needed: the exposed metrics will be scraped by the user cluster Prometheus and become available in Grafana automatically.
Before addons can be deployed into KKP user clusters, the KKP installation has to be configured to enable them
as accessible addons. The node-exporter and kube-state-metrics
addons are part of the KKP default accessible addons, so they should be available out of the box, unless the KKP installation
administrator has changed it.
The default settings of the MLA stack components are sufficient for smaller scale setups (several user clusters). Whenever a larger scale is needed, these settings should be adapted accordingly.
The settings of the User Cluster MLA stack components can be adapted by modifying their values.yaml files (or supplying your own). Available Helm chart options can be reviewed in the MLA repo:
For larger scales, you may start with tweaking the following:
values.yaml - persistence.size) - default: 50Gi
ingester.replicas) - default 3
ingester.persistentVolume.size) - default 10Gi
ingester.replicas) - default 3
ingester.persistentVolume.size) - default 10Gi
For more details about configuring these components in an HA manner, you can review the following links:
Cortex:
Loki:
Grafana:
By default, the MLA stack is configured to hold the logs and metrics in the object store for 7 days. This can be overridden for logs and metrics separately:
For the metrics:
config.limits.max_query_lookback to the desired value (default: 168h = 7 days).
lifecycleMgr.buckets[name=cortex].expirationDays to the value used in the cortex Helm chart + 1 day (default: 8d).
For the logs:
loki.config.chunk_store_config.max_look_back_period to the desired value (default: 168h = 7 days).
lifecycleMgr.buckets[name=loki].expirationDays to the value used in the loki Helm chart + 1 day (default: 8d).
In the User Cluster MLA Grafana, there are several predefined Grafana dashboards that are automatically available across all Grafana organizations (KKP projects). KKP administrators can modify the list of these dashboards.
There are three ways for managing them:
Modify the already existing (pre-created) configmaps with the grafana-dashboards prefix in the mla namespace in the Seed cluster. These configmaps contain the Grafana dashboards that are already available across all KKP projects. You can add or remove Dashboards by modifying these configmaps. Be aware that these changes can be overwritten by MLA stack upgrade.
Create a new configmap with the grafana-dashboards name prefix in the mla namespace in the Seed cluster. You can add multiple such configmaps with your dashboards json data. For example:
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards-example
  namespace: mla
data:
  example-dashboard.json: <your-dashboard-json-data>
dashboards section of the values.yaml file.
After the new dashboards are applied to the Seed Cluster, they will become available across all Grafana Organizations, and they can be found in the Grafana UI under Dashboards -> Manage.
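For the option of creating a new ConfigMap, the manifest can be rendered from a dashboard JSON file rather than pasting the JSON by hand. The helper below is a sketch under a few assumptions: the function name render_dashboard_configmap is invented for this example, the JSON file ends with a newline, and its first line is not indented (so that the YAML block scalar indentation is detected correctly).

```shell
#!/usr/bin/env bash
# Render a ConfigMap manifest that embeds one dashboard JSON file.
# $1: suffix appended to the required "grafana-dashboards" name prefix
# $2: path to the dashboard JSON file
render_dashboard_configmap() {
  local suffix="$1" file="$2"
  printf 'apiVersion: v1\nkind: ConfigMap\nmetadata:\n'
  printf '  name: grafana-dashboards-%s\n  namespace: mla\n' "$suffix"
  printf 'data:\n  %s: |\n' "$(basename "$file")"
  # Indent every JSON line by four spaces so it forms a YAML block scalar.
  sed 's/^/    /' "$file"
}

# Usage (run against the Seed cluster):
#   render_dashboard_configmap example ./example-dashboard.json | kubectl apply -f -
```

Keeping custom dashboards in their own ConfigMaps, instead of editing the pre-created ones, avoids losing them when the MLA stack is upgraded.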
To prevent denial of service caused by abusive users or misconfigured applications, the write path and read path of the User Cluster MLA stack can be configured with rate limits per user cluster.
Rate-limiting can be configured via the following API endpoints of MLAAdminSetting - available only for KKP administrator users:
GET /api/v2/projects/{project_id}/clusters/{cluster_id}/mlaadminsetting - get admin settings
POST /api/v2/projects/{project_id}/clusters/{cluster_id}/mlaadminsetting - create admin settings
PUT /api/v2/projects/{project_id}/clusters/{cluster_id}/mlaadminsetting - update admin settings
DELETE /api/v2/projects/{project_id}/clusters/{cluster_id}/mlaadminsetting - delete admin settings
By default, no rate-limiting is applied. Configuring the rate-limiting options with zero values has the same effect.
For metrics, the following rate-limiting options are supported as part of the monitoringRateLimits:
| Option | Direction | Enforced by | Description |
|---|---|---|---|
| ingestionRate | Write path | Cortex | Ingestion rate limit in samples per second (Cortex ingestion_rate). |
| ingestionBurstSize | Write path | Cortex | Ingestion burst size in number of samples (Cortex ingestion_burst_size). |
| maxSeriesPerMetric | Write path | Cortex | Maximum number of series per metric (Cortex max_series_per_metric). |
| maxSeriesTotal | Write path | Cortex | Maximum number of series per this user cluster (Cortex max_series_per_user). |
| queryRate | Read path | MLA Gateway | Query request rate limit per second (NGINX rate in r/s). |
| queryBurstSize | Read path | MLA Gateway | Query burst size in number of requests (NGINX burst). |
| maxSamplesPerQuery | Read path | Cortex | Maximum number of samples during a query (Cortex max_samples_per_query). |
| maxSeriesPerQuery | Read path | Cortex | Maximum number of timeseries during a query (Cortex max_series_per_query). |
For logs, the following rate-limiting options are supported as part of the loggingRateLimits:
| Option | Direction | Enforced by | Description |
|---|---|---|---|
| ingestionRate | Write path | MLA Gateway | Ingestion rate limit in requests per second (NGINX rate in r/s). |
| ingestionBurstSize | Write path | MLA Gateway | Ingestion burst size in number of requests (NGINX burst). |
| queryRate | Read path | MLA Gateway | Query request rate limit per second (NGINX rate in r/s). |
| queryBurstSize | Read path | MLA Gateway | Query burst size in number of requests (NGINX burst). |
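As an illustration of how the MLAAdminSetting endpoints and the rate-limiting options fit together, the sketch below builds the endpoint URL and sends an update with curl. Several parts are assumptions rather than documented behavior: the request body shape (the two option groups as top-level keys), the bearer-token authentication via a KKP_TOKEN variable, the numeric limits, and all helper function names.

```shell
#!/usr/bin/env bash
# Build the MLAAdminSetting endpoint URL for one user cluster.
# $1: KKP base URL, $2: project ID, $3: cluster ID
mla_admin_setting_url() {
  local base="$1" project="$2" cluster="$3"
  printf '%s/api/v2/projects/%s/clusters/%s/mlaadminsetting' \
    "${base%/}" "$project" "$cluster"
}

# Assumed request body; the limits are arbitrary example values.
rate_limit_payload() {
  cat <<'EOF'
{
  "monitoringRateLimits": {"ingestionRate": 10000, "queryRate": 5},
  "loggingRateLimits": {"ingestionRate": 10, "queryRate": 5}
}
EOF
}

# Update the settings of one user cluster (PUT endpoint).
# KKP_TOKEN is a hypothetical variable holding a KKP administrator's token.
update_rate_limits() {
  curl --fail --silent --show-error -X PUT \
    -H "Authorization: Bearer ${KKP_TOKEN}" \
    -H "Content-Type: application/json" \
    --data "$(rate_limit_payload)" \
    "$(mla_admin_setting_url "$1" "$2" "$3")"
}

# Usage:
#   KKP_TOKEN=... update_rate_limits https://kkp.example.com <project_id> <cluster_id>
```

Options left out of the payload, or set to zero, are not rate-limited, in line with the default behavior described for these endpoints.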
This chapter describes some potential problems that you may face in a KKP installation and the steps you can take to resolve them.
Prometheus / Loki datasource for a user cluster is not available in the Grafana UI:

Metrics / Logs are not available in Grafana UI for some user cluster:

kubectl get pods -n mla-system
Output will be similar to this:
NAME READY STATUS RESTARTS AGE
prometheus-68f7485456-jj7v6 1/1 Running 0 11m
promtail-cm4qd 1/1 Running 0 6m11s
kubectl get pods -n cluster-cxfmstjqkw | grep mla-gateway
Output will be similar to this:
mla-gateway-6dd8c68d67-knmq7 1/1 Running 0 22m
mla namespace in the seed cluster:
kubectl get pods -n mla
Before proceeding with any of the following steps, make sure that you back up all data that you may still need: metrics data / logs in the object store, alertmanager / rules configuration, and Grafana dashboards.
In order to uninstall the User Cluster MLA stack from a seed cluster (and all user clusters serviced by that seed cluster), follow the 3 steps in this order:
In order to disable the User Cluster MLA feature for a Seed Cluster, set the mla.user_cluster_mla_enabled option of the Seed Custom Resource / API object to false, e.g.:
apiVersion: kubermatic.k8s.io/v1
kind: Seed
metadata:
  name: europe-west3-c
  namespace: kubermatic
spec:
  mla:
    user_cluster_mla_enabled: false
In order to uninstall the user cluster MLA stack components from a Seed cluster, first disable it in the Seed Custom Resource / API object as described in the previous section. After that, you can safely remove the resources in the mla namespace of the Seed Cluster.
You can do that on a per-component basis using Helm. To see the list of Helm charts installed in the mla namespace, run:
helm ls -n mla
E.g. to uninstall Cortex, run:
helm delete cortex -n mla
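If every component is to be removed, the per-release command can be looped over the output of helm ls. This is a minimal sketch, assuming the feature has already been disabled in the Seed object and that every release in the namespace should go, which includes the iap and mla-secrets releases installed earlier; the function name is illustrative.

```shell
#!/usr/bin/env bash
# Uninstall every Helm release found in the given namespace (default: mla).
uninstall_mla_releases() {
  local ns="${1:-mla}"
  # --short prints one release name per line.
  helm ls -n "$ns" --short | while read -r release; do
    [ -n "$release" ] || continue
    echo "Uninstalling ${release} from namespace ${ns}"
    # Redirect stdin so helm cannot consume the remaining release names.
    helm delete "$release" -n "$ns" < /dev/null
  done
}

# Usage:
#   uninstall_mla_releases mla
```

Running helm ls -n mla afterwards is a quick way to confirm that no releases are left.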