Creating a Kubernetes Cluster
In this tutorial, we’re going to show how to use Kubermatic KubeOne to create
a highly-available Kubernetes cluster. The tutorial covers downloading KubeOne
and setting up the environment, creating the infrastructure using example
Terraform configs, and finally, provisioning a cluster using KubeOne. As a
result, you’ll get a production-ready and Kubernetes/CNCF conformant cluster,
running on a cloud provider of your choice.
Prerequisites
This tutorial assumes that you’re using Linux or macOS. KubeOne currently
doesn’t release Windows binaries. If you’re using Windows, we recommend
checking out the Windows Subsystem for Linux (WSL).
How Kubermatic KubeOne Works
Kubermatic KubeOne is a CLI tool for managing highly-available Kubernetes
clusters in any environment (cloud, on-prem, baremetal, edge…). Clusters
created by KubeOne are production-ready and Kubernetes/CNCF conformant out of
the box. Generally, KubeOne runs the following tasks:
- install dependencies and required packages (container runtime, kubelet,
kubeadm…)
- run Kubernetes’ Kubeadm to provision a Kubernetes cluster
- deploy components such as CNI, metrics-server, and Kubermatic
machine-controller
- create worker nodes by creating the appropriate MachineDeployment object(s)
Infrastructure Management
The infrastructure for the control plane is created by the user. This allows
KubeOne to work with any provider out of the box. Once the infrastructure is
created, the user provides information about the instances that will be used,
the load balancer that’s running in the front of the control plane nodes, and
the information that will be used later to create the worker nodes.
To make this task easier, KubeOne integrates with Terraform by reading the
Terraform state, and provides example Terraform configs that can be used to
create the infrastructure. We’ll use both the Terraform integration and the
example configs in this tutorial.
The infrastructure for the worker nodes can be managed in two ways:
- automatically, by using Kubermatic machine-controller (deployed by default
for supported providers)
- by creating the instances manually and using KubeOne to provision
them
The first approach is recommended if your provider is
natively-supported (AWS, Azure, DigitalOcean, GCP,
Hetzner Cloud, OpenStack, Packet, and VMware vSphere), and we will use it in
this tutorial. If your provider is not supported (e.g. bare-metal), you can
check the KubeOne Static Workers feature for more information
about the second approach.
The example Terraform configs that we’ll be using satisfy all infrastructure
requirements out of the box. However, if you’re planning on customizing configs
or using different approach, make sure to adhere to the requirements described
in the Infrastructure Management document.
Below, you can find a diagram that shows how KubeOne works.
Default Configuration
By default, KubeOne installs the following components:
- Container Runtime: Docker (containerd is supported as an alternative)*
- CNI: Canal (based on Calico and Flannel) (WeaveNet and user-provided CNI are
supported)
- metrics-server for collecting and exposing metrics from
Kubelets
- NodeLocal DNSCache for caching DNS queries to improve the
cluster performance
- Kubermatic machine-controller, a Cluster-API based
implementation for managing worker nodes
* containerd will be installed by default for newly-created Kubernetes 1.21+
clusters
It’s possible to configure which components are installed and how they are
configured by adjusting the KubeOne configuration manifest that we’ll create
later in the Step 5 (Provisioning The Cluster). To see possible configuration
options, refer to the configuration manifest reference which can be obtained
by running kubeone config print --full
.
Step 1 — Downloading KubeOne
The easiest way to download KubeOne is to use our installation script.
The following command will download and run the script:
curl -sfL https://get.kubeone.io | sh
The script downloads the latest version of KubeOne from GitHub, and unpacks it
in the /usr/local/bin
directory. Additionally, the script unpacks the example
Terraform configs, addons, and helper scripts in your current working
directory. At the end of the script output, you can find the path to the
unpacked files:
The addons and helper scripts are supposed to be used for advanced deployments
and we will not use those in this tutorial. They’re not required by KubeOne, so
you’re not required to keep to them.
...
Kubermatic KubeOne has been installed into /usr/local/bin/kubeone
Terraform example configs, addons, and helper scripts have been downloaded into the ./kubeone_1.2.0-beta.1_linux_amd64 directory
You can confirm that KubeOne has been installed successfully by running the
kubeone version
command. If you see an error, ensure that /usr/local/bin
is
in your PATH
or modify the script to install KubeOne in a different place.
You can also check the Getting KubeOne guide for alternative
installation methods.
As described in the How KubeOne Works section, we’ll use Terraform to manage
the infrastructure for the control plane, therefore we need to install it.
Terraform has several installation methods: manually, using a package manager
such as apt
, using Homebrew (for macOS users). In this tutorial, we’ll do it
manually, but you can check out the
official installation guide for other options.
First, visit the Terraform download page and grab the
link for the latest version for your platform and architecture.
KubeOne requires Terraform 1.0 or newer. You can download it from the
browser, or use cURL
such as:
curl -LO https://releases.hashicorp.com/terraform/1.0.0/terraform_1.0.0_linux_amd64.zip
Once you download the archive, unzip it:
unzip terraform_1.0.0_linux_amd64.zip
Finally, move the unpacked terraform
binary to somewhere in your PATH
.
We’ll use /usr/local/bin
for the purposes of this tutorial:
mv terraform /usr/local/bin
After that is done, Terraform is ready to be used. You can run
terraform version
to check is it properly installed. If you see an error,
ensure that the directory where you moved the binary is in your PATH
.
Step 3 — Configuring The Environment
KubeOne and Terraform require you to have the cloud provider credentials
exported as the environment variables. Below you can find required environment
variables for each natively-supported cloud provider.
You need an IAM account
with the appropriate permissions for Terraform to create the infrastructure
and for machine-controller to create worker nodes.
Environment Variable | Description |
---|
AWS_ACCESS_KEY_ID | The AWS Access Key |
AWS_SECRET_ACCESS_KEY | The AWS Secret Access Key |
AWS_PROFILE | Name of the profile defined in the ~/.aws/credentials file. This variable is considered only if AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY is unset |
The following environment variables are needed by machine-controller for
creating worker nodes.
For the Terraform reference please take a look at
Azure provider docs.
Environment Variable | Description |
---|
ARM_CLIENT_ID | Azure ClientID |
ARM_CLIENT_SECRET | Azure Client secret |
ARM_TENANT_ID | Azure TenantID |
ARM_SUBSCRIPTION_ID | Azure SubscriptionID |
You need an API Access Token
with read and write permission for Terraform to create the infrastructure,
machine-controller to create the worker nodes, and for DigitalOcean Cloud
Controller Manager.
Environment Variable | Description |
---|
DIGITALOCEAN_TOKEN | The DigitalOcean API Access Token with read/write permissions |
You need an Service Account
with the appropriate permissions for Terraform to create the infrastructure
and for machine-controller to create worker nodes.
The needed permissions are are:
- Compute Admin:
roles/compute.admin
- Service Account User:
roles/iam.serviceAccountUser
- Viewer:
roles/viewer
If the gcloud
CLI is installed,
a service account can be created like follow:
# create new service account
gcloud iam service-accounts create k1-cluster-provisioner
# get your service account id
gcloud iam service-accounts list
# get your project id
gcloud projects list
# create policy binding
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID --member 'serviceAccount:YOUR_SERVICE_ACCOUNT_ID' --role='roles/compute.admin'
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID --member 'serviceAccount:YOUR_SERVICE_ACCOUNT_ID' --role='roles/iam.serviceAccountUser'
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID --member 'serviceAccount:YOUR_SERVICE_ACCOUNT_ID' --role='roles/viewer'
A Google Service Account for the platform has to be created, see
Creating and managing service accounts.
The result is a JSON file containing the fields:
type
project_id
private_key_id
private_key
client_email
client_id
auth_uri
token_uri
auth_provider_x509_cert_url
client_x509_cert_url
# create a new json key for your service account
gcloud iam service-accounts keys create --iam-account YOUR_SERVICE_ACCOUNT k1-cluster-provisioner-sa-key.json
Also, the Compute Engine API has to be enabled for the project in the
Google APIs Console.
Once you have the Service Account, you need to set GOOGLE_CREDENTIALS
environment variable:
# export JSON file content of created service account json key
export GOOGLE_CREDENTIALS=$(cat ./k1-cluster-provisioner-sa-key.json)
Environment Variable | Description |
---|
GOOGLE_CREDENTIALS | GCE Service Account |
You need a Hetzner API Token for Terraform to create the infrastructure,
machine-controller to create worker nodes, and for Hetzner Cloud Controller
Manager.
Environment Variable | Description |
---|
HCLOUD_TOKEN | The Hetzner API Access Token |
The following environment variables are needed by Terraform for creating the
infrastructure and for machine-controller to create the worker nodes.
Environment Variable | Description |
---|
OS_AUTH_URL | The URL of OpenStack Identity Service |
OS_USERNAME | The username of the OpenStack user |
OS_PASSWORD | The password of the OpenStack user |
OS_REGION_NAME | The name of the OpenStack region |
OS_DOMAIN_NAME | The name of the OpenStack domain |
OS_TENANT_ID | The ID of the OpenStack tenant |
OS_TENANT_NAME | The name of the OpenStack tenant |
You need an API Access Token
for Terraform to create the infrastructure, machine-controller to create worker
nodes, and for Packet Cloud Controller Manager.
Environment Variable | Description |
---|
PACKET_AUTH_TOKEN | Packet auth token |
PACKET_PROJECT_ID | Packet project ID |
The following environment variables are needed by machine-controller for
creating the worker nodes.
For the Terraform reference, please take a look at
vSphere provider docs
Environment Variable | Description |
---|
VSPHERE_SERVER | The address of the vSphere instance |
VSPHERE_USER | The username of the vSphere user |
VSPHERE_PASSWORD | The password of the vSphere user |
As credentials are deployed to the cluster, it’s recommended to use
dedicated, non-administrator credentials whenever it’s possible.
To set the environment variable, use the export
command such as:
export <VARIABLE_NAME>=<VARIABLE_VALUE>
Step 4 — Creating The Infrastructure
With KubeOne installed and the environment configured, we’re ready to create
the infrastructure for our cluster. We’ll use the example Terraform configs for
this step.
First, move to the directory with the example Terraform configs that has been
created while installing KubeOne in the Step 1. For example (the directory
name depends on the latest KubeOne version):
cd ./kubeone_1.2.0-beta.1_linux_amd64/examples/terraform
In this directory, you can find a subdirectory for each supported provider.
Move to the directory for a provider that you want to use. For example, if you
want to run your cluster on GCP:
Before we can use Terraform, we need to initialize the directory structure and
download the required plugins. This is done by using the init
command:
In the same directory, create a file called terraform.tfvars
which will
contain Terraform variables used to customize the infrastructure creation
process. We’ll set the following two variables:
cluster_name = "kubeone-cluster"
ssh_public_key_file = "~/.ssh/id_rsa.pub"
The cluster_name
variable is used as a prefix for cloud resources.
The ssh_public_key_file
is a path to a SSH public key that will be deployed
on instances. KubeOne connects to instances over SSH to provision and configure
them. If you don’t have an SSH key, you can generate one by running
ssh-keygen
.
The terraform.tfvars
files can also be used to customize properties such as
instances size. We’ll use the default settings, but if you wish to customize
settings, you can check the
Using Terraform Configs document for more details.
Once the terraform.tfvars
file is created, run the plan
command to see what
changes will be made:
If you agree with the proposed changes, you can apply them by running the
apply
command. You’ll be asked to confirm your intention by typing yes
.
If you’re provisioning a cluster on GCP, initial terraform apply
requires
variable control_plane_target_pool_members_count
to be set to 1.
This command will create all the infrastructure needed to get started.
This might take up to several minutes, depending on the cloud provider.
Once terraform apply
is complete, check the output and ensure that the
endpoint
variable in the kubeone_api
block is set (i.e. it’s not an empty
string). If it isn’t set, wait a minute or two, and then run
terraform refresh
.
Finally, we need to save the Terraform state in a format that can be parsed by
KubeOne for information about the infrastructure (security groups, IAM roles,
etc.), instances, and load balancer. That information will be used for
accessing the instances, provisioning the cluster, and later for creating the
worker nodes. The format is already defined in a file called output.tf
and
everything you need to do is to run the output
command:
terraform output -json > tf.json
This command will create a file called tf.json
with JSON representation of
the Terraform state that can be parsed by KubeOne. With that done, we’re ready
to provision our cluster which will do in the next step.
Step 5 — Provisioning The Cluster
Now that we have the infrastructure, we can use KubeOne to provision a
Kubernetes cluster.
The first step is to create a KubeOne configuration manifest that describes how
the cluster will be provisioned, which Kubernetes version will be used,
and more. The manifest can be saved in a file called kubeone.yaml
. In the
following table you can find example configuration manifest for each
supported provider.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
kubernetes: '1.20.4'
cloudProvider:
aws: {}
Make sure to replace the placeholder values with real values in the
cloud-config section.
In the Kubermatic documentation
you can find more information regarding how to set up a service account.
This service account is needed to proceed.
All required resources (VPCs, Resource Groups, Route Tables…) are created by
Terraform out of the box. The resources are prefixed with your cluster name
defined in terraform.tfvars
(Step 4), so you need to replace <cluster-name>
with your cluster name in the cloud-config example below.
Warning: due to Azure limitations, you can have only one Basic SKU Load
Balancer per Availability Set. Since we already create a Basic SKU load
balancer for the API server in the Availability Set used by control plane
nodes, you can’t create other load balancers in the same set. This also means
that you can’t create Kubernetes Load Balancer Services because the creation
would fail due to the mentioned limit.
To mitigate this, our Terraform configs will create a dedicated Availability
Set to be used for worker nodes and Kubernetes Load Balancer Services. With
that setup, all pods exposed via a Kubernetes Load Balancer Service must be
scheduled on worker nodes. Scheduling pods on control plane nodes would make
Azure CCM fail to find underlying instances and add them to the appropriate
Azure load balancer because the newly-created load balancer and control plane
nodes are in different availability sets.
Please check the Production Recommendations
document for more details.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
kubernetes: '1.20.4'
cloudProvider:
azure: {}
cloudConfig: |
{
"cloud": "AZUREPUBLICCLOUD",
"tenantId": "<YOUR-TENANT-ID>",
"subscriptionId": "<YOUR-SUBSCRIPTION-ID>",
"aadClientId": "<YOUR-CLIENT-ID>",
"aadClientSecret": "<YOUR-CLIENT-SECRET>",
"resourceGroup": "<cluster-name>-rg",
"location": "<YOUR-CLUSTER-REGION>",
"vnetName": "<cluster-name>-vpc",
"subnetName": "<cluster-name>-subnet",
"routeTableName": "<cluster-name>-rt",
"securityGroupName": "<cluster-name>-sg",
"primaryAvailabilitySetName": "<cluster-name>-avset-workers",
"useInstanceMetadata": true,
"loadBalancerSku": ""
}
external: true
instructs KubeOne to deploy the
DigitalOcean Cloud Controller Manager.
The CCM provides additional cluster features, such as LoadBalancer Services,
and fetches information about nodes from the API.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
kubernetes: '1.20.4'
cloudProvider:
digitalocean: {}
external: true
Setting regional = true
in the cloud-config is required when control plane
nodes are across multiple availability zones. We deploy control plane hosts
in multiple AZs by default in our example Terraform configs.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
kubernetes: '1.20.4'
cloudProvider:
gce: {}
cloudConfig: |
[global]
regional = true
Due to how GCP LBs work, initial terraform apply
requires variable control_plane_target_pool_members_count
to be set
to 1.
terraform apply -var=control_plane_target_pool_members_count=1
Once initial kubeone install
or kubeone apply
is done, the control_plane_target_pool_members_count
should not be
used.
external: true
instructs KubeOne to deploy the
Hetzner Cloud Controller Manager.
The Hetzner CCM fetches information about nodes from the API.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
kubernetes: '1.20.4'
cloudProvider:
hetzner: {}
external: true
Make sure to replace the placeholder values with real values in the
cloud-config section.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
kubernetes: '1.20.4'
cloudProvider:
openstack: {}
cloudConfig: |
[Global]
username=OS_USERNAME
password=OS_PASSWORD
auth-url=https://OS_AUTH_URL/v3
tenant-id=OS_TENANT_ID
domain-id=OS_DOMAIN_ID
[LoadBalancer]
subnet-id=SUBNET_ID
external: true
instructs KubeOne to deploy the
Packet Cloud Controller Manager.
The Packet CCM fetches information about nodes from the API.
It’s important to provide custom clusterNetwork settings in order to avoid
colliding with the Packet private network which is 10.0.0.0/8
.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
kubernetes: "1.20.4"
cloudProvider:
packet: {}
external: true
clusterNetwork:
podSubnet: "192.168.0.0/16"
serviceSubnet: "172.16.0.0/12"
Make sure to replace the placeholder values with real values in the
cloud-config section. The vsphere-ccm-credentials
Secret is created
automatically by KubeOne as of v1.0.4.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
kubernetes: '1.20.4'
cloudProvider:
vsphere: {}
cloudConfig: |
[Global]
secret-name = "vsphere-ccm-credentials"
secret-namespace = "kube-system"
port = "443"
insecure-flag = "0"
[VirtualCenter "1.1.1.1"]
[Workspace]
server = "1.1.1.1"
datacenter = "dc-1"
default-datastore="exsi-nas"
resourcepool-path="kubeone"
folder = "kubeone"
[Disk]
scsicontrollertype = pvscsi
[Network]
public-network = "NAT Network"
Before proceeding, make sure to select the Kubernetes version that you want
to use, and replace the placeholder values with real values where applicable.
In the following table, you can find a list of supported Kubernetes version
for latest KubeOne versions (you can run kubeone version
to find the version
that you’re running).
KubeOne version | 1.21 | 1.20 | 1.19 | 1.18 | 1.17 |
---|
v1.2+ | ✓ | ✓ | ✓ | ✓ | - |
v1.0+ | - | - | ✓ | ✓ | ✓** |
** Kubernetes 1.17 has reached End-of-Life (EOL) and is not recommended
for new clusters
Now, we’re ready to provision the cluster! This is done by running the
kubeone apply
command and providing it the configuration manifest and the
Terraform state file that we created in the previous step.
kubeone apply -m kubeone.yaml -t tf.json
This command analyzes the provided instances by running a set of probes to
determine is it needed to provision a cluster or is there already a Kubernetes
cluster running. If the cluster is already there, the probes will check is the
cluster healthy and is the actual state matching the expected state defined by
the configuration manifest. This allows us to use one single command for all
operations (provision, upgrade, enable features, and more). This process is
called Cluster Reconciliation and is described with additional details in the
Cluster Reconciliation document.
The output will show steps that will be taken to provision a cluster. You’ll be
asked to confirm the intention to provision a cluster by typing yes
.
INFO[11:37:21 CEST] Determine hostname…
INFO[11:37:28 CEST] Determine operating system…
INFO[11:37:30 CEST] Running host probes…
The following actions will be taken:
Run with --verbose flag for more information.
+ initialize control plane node "ip-172-31-220-51.eu-west-3.compute.internal" (172.31.220.51) using 1.20.4
+ join control plane node "ip-172-31-221-177.eu-west-3.compute.internal" (172.31.221.177) using 1.20.4
+ join control plane node "ip-172-31-222-48.eu-west-3.compute.internal" (172.31.222.48) using 1.20.4
+ ensure machinedeployment "marko-1-eu-west-3a" with 1 replica(s) exists
+ ensure machinedeployment "marko-1-eu-west-3b" with 1 replica(s) exists
+ ensure machinedeployment "marko-1-eu-west-3c" with 1 replica(s) exists
Do you want to proceed (yes/no):
After confirming your intention to provision the cluster, the provisioning will
start. It usually takes 5-10 minutes for cluster to be provisioned. At the end,
you should see output such as the following one:
...
INFO[11:46:54 CEST] Downloading kubeconfig…
INFO[11:46:54 CEST] Ensure node local DNS cache…
INFO[11:46:54 CEST] Activating additional features…
INFO[11:46:56 CEST] Applying canal CNI plugin…
INFO[11:47:10 CEST] Creating credentials secret…
INFO[11:47:10 CEST] Installing machine-controller…
INFO[11:47:17 CEST] Installing machine-controller webhooks…
INFO[11:47:17 CEST] Waiting for machine-controller to come up…
INFO[11:48:07 CEST] Creating worker machines…
At this point, your cluster is provisioned, but it may take several more
minutes for worker nodes to get created by Kubermatic machine-controller and
joined the cluster. In meanwhile, you can configure the access to the cluster.
Step 6 — Configuring The Cluster Access
KubeOne automatically downloads the Kubeconfig file for the cluster. It’s named
as <cluster_name>-kubeconfig, where <cluster_name> is the name
provided in the terraform.tfvars
file. You can use it with kubectl such as:
kubectl --kubeconfig=<cluster_name>-kubeconfig
or export the KUBECONFIG
environment variable:
export KUBECONFIG=$PWD/<cluster_name>-kubeconfig
If you want to learn more about kubeconfig and managing access to your
clusters, you can check the
Configure Access To Multiple Clusters document.
You can try to list all nodes in the cluster to confirm that you can access
the cluster:
You should see output such as the following one. You should have three control
plane node and one (three for AWS) worker node(s). If you don’t see all worker
nodes yet, note that it might take several minutes for all nodes to join the
cluster.
NAME STATUS ROLES AGE VERSION
ip-172-31-220-166.eu-west-3.compute.internal Ready <none> 38m v1.20.4
ip-172-31-220-51.eu-west-3.compute.internal Ready master 43m v1.20.4
ip-172-31-221-177.eu-west-3.compute.internal Ready master 42m v1.20.4
ip-172-31-221-18.eu-west-3.compute.internal Ready <none> 38m v1.20.4
ip-172-31-222-211.eu-west-3.compute.internal Ready <none> 38m v1.20.4
ip-172-31-222-48.eu-west-3.compute.internal Ready master 41m v1.20.4
Conclusion
Congratulations!!! You have successfully provisioned a Kubernetes cluster using
Kubermatic KubeOne. You’re now ready to run your workload on this cluster.
We recommend checking the following learn more section for additional resources
and recommendations.
Learn More