Install Kubernetes using kubeadm
Using kubeadm, a minimum viable Kubernetes cluster can be created that conforms to best practices. In fact, kubeadm can be used to set up a cluster that will pass the Kubernetes Conformance tests. kubeadm also supports other cluster lifecycle functions, such as bootstrap tokens and cluster upgrades. [1]
- 1. Installing kubeadm and container runtime
- 2. Creating a cluster with kubeadm
- 2.1. Customizing components with the kubeadm API
- 2.2. Initializing control-plane node
- 2.3. Joining the worker nodes
- 2.4. Joining the stacked control plane and etcd nodes
- 2.5. Removing the nodes
- 2.6. Installing Addons
- 3. Upgrading kubeadm clusters
- References
1. Installing kubeadm and container runtime
- A compatible Linux host.
  The Kubernetes project provides generic instructions for Linux distributions based on Debian and Red Hat, and those distributions without a package manager. [2]
- 2 GB or more of RAM per machine (any less will leave little room for other apps), and 2 CPUs or more.
  The --ignore-preflight-errors=NumCPU,Mem flag can be used to ignore these preflight errors on kubeadm init or kubeadm join.
- Full network connectivity between all machines in the cluster (public or private network is fine).
  kubeadm, similarly to other Kubernetes components, tries to find a usable IP on the network interfaces associated with a default gateway on a host. Such an IP is then used for the advertising and/or listening performed by a component. [1]
  To find out what this IP is on a Linux host:
  ip route show # Look for a line starting with "default via"
- Unique hostname, MAC address, and product_uuid for every node.
  The MAC address of the network interfaces can be obtained using the command ip link or ifconfig -a.
  The product_uuid can be checked by using the command sudo cat /sys/class/dmi/id/product_uuid.
- Certain ports are open on the machines.
Table 1. Control plane
Protocol   Direction   Port Range   Purpose                   Used By
TCP        Inbound     6443         Kubernetes API server     All
TCP        Inbound     2379-2380    etcd server client API    kube-apiserver, etcd
TCP        Inbound     10250        Kubelet API               Self, Control plane
TCP        Inbound     10259        kube-scheduler            Self
TCP        Inbound     10257        kube-controller-manager   Self
Table 2. Worker node(s)
Protocol   Direction   Port Range    Purpose             Used By
TCP        Inbound     10250         Kubelet API         Self, Control plane
TCP        Inbound     30000-32767   NodePort Services   All
These required ports need to be open in order for Kubernetes components to communicate with each other. The pod network plugin used may also require certain ports to be open.
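To quickly verify that a required port is reachable, a simple probe with netcat (if installed) and ss can be used; the address below is only a placeholder:
# From another machine, check that the API server port answers
nc -zv 192.168.56.130 6443
# On the node itself, confirm a process is listening on the port
sudo ss -tlnp | grep 6443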
- Swap disabled.
  The default behavior of a kubelet (/ˈkuːblɛt/) is to fail to start if swap memory is detected on a node.
  - To tolerate swap, add failSwapOn: false to the kubelet configuration or as a command line argument.
  - Note: even if failSwapOn: false is provided, workloads wouldn’t have swap access by default.
  To check swap status, use: [3]
  swapon --show
  Or to show physical memory as well as swap usage:
  free -h
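If swap is currently enabled, one common way to disable it both for the current boot and persistently is sketched below (review /etc/fstab after editing it; the sed pattern is an assumption about a typical fstab layout):
# Turn off swap immediately
sudo swapoff -a
# Comment out swap entries so they do not come back after a reboot
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab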
1.1. Installing a container runtime
To run containers in Pods, Kubernetes uses a container runtime.
By default, Kubernetes uses the Container Runtime Interface (CRI) to interface with a chosen container runtime.
If a runtime isn’t specified, kubeadm automatically tries to detect an installed container runtime by scanning through a list of known endpoints. [2]
Runtime                             Path to Unix domain socket
containerd                          unix:///var/run/containerd/containerd.sock
CRI-O                               unix:///var/run/crio/crio.sock
Docker Engine (using cri-dockerd)   unix:///var/run/cri-dockerd.sock
If multiple or no container runtimes are detected, kubeadm will throw an error and request that you specify which one to use.
Docker Engine does not implement the CRI, which is a requirement for a container runtime to work with Kubernetes. For that reason, an additional service, cri-dockerd, has to be installed. cri-dockerd is a project based on the legacy built-in Docker Engine support that was removed from the kubelet in version 1.24.
Kubernetes 1.26 defaults to using v1 of the CRI API. If a container runtime does not support the v1 API, the kubelet falls back to using the (deprecated) v1alpha2 API instead. [4]
// Show the details of the `cri` plugin on an existing containerd installation using `ctr`
$ sudo ctr plugins ls -d id==cri
Type: io.containerd.grpc.v1
ID: cri
Requires:
io.containerd.event.v1
io.containerd.service.v1
io.containerd.warning.v1
Platforms: linux/amd64
Exports:
CRIVersion v1
CRIVersionAlpha v1alpha2
1.1.1. Forwarding IPv4 and letting iptables see bridged traffic
Verify that the br_netfilter module is loaded by running lsmod | grep br_netfilter.
To load it explicitly, run sudo modprobe br_netfilter.
In order for a Linux node’s iptables to correctly view bridged traffic, verify that net.bridge.bridge-nf-call-iptables is set to 1 in the sysctl config. For example:
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply sysctl params without reboot
sudo sysctl --system
# Verify that the `br_netfilter`, `overlay` modules are loaded
lsmod | grep br_netfilter
lsmod | grep overlay
# Verify that the
# `net.bridge.bridge-nf-call-iptables`, `net.bridge.bridge-nf-call-ip6tables`, and `net.ipv4.ip_forward`
# system variables are set to `1`
sudo sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
1.1.2. Cgroup drivers
Both the kubelet and the underlying container runtime need to interface with control groups to enforce resource management for pods and containers and set resources such as CPU/memory requests and limits.
It’s critical that the kubelet and the container runtime use the same cgroup driver and are configured the same. [4]
The cgroupfs driver is NOT recommended when systemd is the init system because systemd expects a single cgroup manager on the system.
Starting with v1.22, when creating a cluster with kubeadm, if the user does not set the cgroupDriver field under KubeletConfiguration, kubeadm defaults it to systemd.
Check the cgroup driver of the kubelet at the cluster level of an existing cluster:
$ kubectl get -n kube-system cm kubelet-config -oyaml | grep cgroupDriver
cgroupDriver: systemd
Check the systemd driver status of the containerd runtime using crictl:
$ sudo crictl info | jq '.config.containerd.runtimes.runc.options'
{
. . .
"SystemdCgroup": true
}
1.1.3. Containerd
Follow the instructions for getting started with containerd.
For more information about cgroups, see Linux CGroups and Containers. For more information about containerd, see RUNC CONTAINERD CRI DOCKERSHIM.
In the containerd config /etc/containerd/config.toml:
- To use the systemd cgroup driver:
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
- To overwrite the sandbox (pause) image:
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "registry.k8s.io/pause:3.2"
  Please note that it is a best practice for the kubelet to declare the matching pod-infra-container-image. If not configured, the kubelet may attempt to garbage collect the pause image.
- Find or overwrite the settings for persistent and runtime storage locations as well as grpc, debug, and metrics addresses for the various APIs.
  #root = "/var/lib/containerd"
  #state = "/run/containerd"
- Check the CRI integration plugin status:
  $ sudo ctr plugin ls id==cri
  TYPE                    ID    PLATFORMS     STATUS
  io.containerd.grpc.v1   cri   linux/amd64   ok
- Check the systemd driver status using crictl:
  $ sudo crictl info -o go-template --template '{{.config.containerd.runtimes.runc.options.SystemdCgroup}}'
  true
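If /etc/containerd/config.toml has not been customized yet, one common approach is to regenerate the default configuration, switch on the systemd cgroup driver, and restart the service. This is a sketch assuming containerd 1.6+, whose generated default config already contains a SystemdCgroup entry:
# Write the default configuration to the config file
containerd config default | sudo tee /etc/containerd/config.toml
# Enable the systemd cgroup driver for the runc runtime
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
sudo systemctl restart containerd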
1.2. Installing kubeadm, kubelet and kubectl
Note: The legacy package repositories (apt.kubernetes.io and yum.kubernetes.io ) have been deprecated and frozen starting from September 13, 2023. Using the new package repositories hosted at pkgs.k8s.io is strongly recommended and required in order to install Kubernetes versions released after September 13, 2023. The deprecated legacy repositories, and their contents, might be removed at any time in the future and without a further notice period. The new package repositories provide downloads for Kubernetes versions starting with v1.24.0. [2]
-
Debian-based distributions
sudo apt-get update && sudo apt-get install -y apt-transport-https ca-certificates curl
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.29/deb/Release.key \
  | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg   (1)
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.26/deb/ /' \
  | sudo tee /etc/apt/sources.list.d/kubernetes.list   (2)
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl   (3)
sudo apt-mark hold kubelet kubeadm kubectl
1 Download the public signing key for the Kubernetes package repositories. The same signing key is used for all repositories, so the version in the URL can be disregarded.
2 Please NOTE that this repository has packages only for Kubernetes 1.26; for other Kubernetes minor versions, change the Kubernetes minor version in the URL to match the desired minor version. Such as:
  deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /
  deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /
  deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.27/deb/ /
  deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.26/deb/ /
3 A specific package version can also be installed:
  $ apt-cache madison kubeadm | head -n 5
     kubeadm | 1.26.4-1.1 | https://pkgs.k8s.io/core:/stable:/v1.26/deb  Packages
     kubeadm | 1.26.3-1.1 | https://pkgs.k8s.io/core:/stable:/v1.26/deb  Packages
     kubeadm | 1.26.2-1.1 | https://pkgs.k8s.io/core:/stable:/v1.26/deb  Packages
     kubeadm | 1.26.1-1.1 | https://pkgs.k8s.io/core:/stable:/v1.26/deb  Packages
     kubeadm | 1.26.0-2.1 | https://pkgs.k8s.io/core:/stable:/v1.26/deb  Packages
  $ sudo apt-get install -y kubelet=1.26.0-2.1 kubeadm=1.26.0-2.1 kubectl=1.26.0-2.1
Output shell completion code for the specified shell (bash or zsh).
# Install the bash-completion framework
sudo apt-get install -y bash-completion
# Output bash completion
sudo sh -c 'kubeadm completion bash > /etc/bash_completion.d/kubeadm'
sudo sh -c 'kubectl completion bash > /etc/bash_completion.d/kubectl'
sudo sh -c 'crictl completion > /etc/bash_completion.d/crictl'
# Load the completion code for bash into the current shell
source /etc/bash_completion
-
Red Hat-based distributions
# This overwrites any existing configuration in /etc/yum.repos.d/kubernetes.repo
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.26/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.26/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni   (1)
EOF

# Set SELinux in permissive mode (effectively disabling it)   (2)
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes   (3)
sudo systemctl enable --now kubelet
1 The exclude parameter in the repository definition ensures that the packages related to Kubernetes are not upgraded upon running yum update, as there’s a special procedure that must be followed for upgrading Kubernetes. Please NOTE that this repository has packages only for Kubernetes 1.26; for other Kubernetes minor versions, change the Kubernetes minor version in the URL to match the desired minor version.
2 Setting SELinux in permissive mode by running setenforce 0 and sed effectively disables it. This is required to allow containers to access the host filesystem, which is needed by pod networks for example, and has to be done until SELinux support is improved in the kubelet. SELinux can be left enabled if you know how to configure it, but that may require settings that are not supported by kubeadm.
3 A specific package version can also be installed:
  $ yum --showduplicates --disableexcludes=kubernetes list kubeadm | tail -n 5
  kubeadm.x86_64  1.26.0-150500.2.1  kubernetes
  kubeadm.x86_64  1.26.1-150500.1.1  kubernetes
  kubeadm.x86_64  1.26.2-150500.1.1  kubernetes
  kubeadm.x86_64  1.26.3-150500.1.1  kubernetes
  kubeadm.x86_64  1.26.4-150500.1.1  kubernetes
  $ sudo yum --disableexcludes=kubernetes install kubelet-1.26.0-150500.2.1 kubeadm-1.26.0-150500.2.1 kubectl-1.26.0-150500.2.1
Output shell completion code for the specified shell (bash or zsh).
# Install the bash-completion framework
sudo yum install -y bash-completion
# Output bash completion
sudo sh -c 'kubeadm completion bash > /etc/bash_completion.d/kubeadm'
sudo sh -c 'kubectl completion bash > /etc/bash_completion.d/kubectl'
sudo sh -c 'crictl completion > /etc/bash_completion.d/crictl'
# Load the completion code for bash into the current shell
source /usr/share/bash-completion/bash_completion
The runtime endpoint of crictl may need to be set explicitly, for example: sudo crictl config --set runtime-endpoint=unix:///run/containerd/containerd.sock
Consider enabling the containerd snapshotters feature on Docker Engine.
The cgroup driver can also be explicitly specified to systemd on Docker.
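After the packages are installed and held, it can be useful to confirm the versions actually present on the node:
kubeadm version -o short
kubelet --version
kubectl version --client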
2. Creating a cluster with kubeadm
kubeadm provides commands that can pre-pull the required images, which is useful when creating a cluster whose nodes have no internet connection.
The images can be listed and pulled using the kubeadm config images sub-command:
kubeadm config images list # [--kubernetes-version=v1.26.0] [--image-repository=registry.k8s.io]
kubeadm config images pull # [--kubernetes-version=v1.26.0] [--image-repository=registry.k8s.io]
Kubeadm allows using a custom image repository for the required images. For example:
kubernetes_version=v1.26.0
sudo kubeadm config images pull \
--kubernetes-version=$kubernetes_version \
--image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers
Use ctr to retag the images in the k8s.io namespace back to the default repository registry.k8s.io:
#!/bin/sh
kubernetes_version=v1.26.0
image_repository=registry.cn-hangzhou.aliyuncs.com/google_containers
images=$(kubeadm config images list \
--kubernetes-version $kubernetes_version \
--image-repository $image_repository)
for i in $images; do
case "$i" in
*coredns*)
new_repo="registry.k8s.io/coredns"
;;
*)
new_repo="registry.k8s.io"
;;
esac
newtag=$(echo "$i" | sed "s@$image_repository@$new_repo@")
sudo ctr -n k8s.io images tag $i $newtag
done
Or, list these images with crictl and remove them with ctr:
sudo crictl images | \
grep registry.cn-hangzhou.aliyuncs.com/google_containers | \
awk '{print $1":"$2}' | \
xargs sudo ctr -n k8s.io i rm
The image repository used by kubeadm init can also be overridden by using kubeadm with a configuration file.
# Run `kubeadm config print init-defaults` to see the default Init configuration.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
imageRepository: registry.k8s.io
2.1. Customizing components with the kubeadm API
The preferred way to configure kubeadm is to pass a YAML configuration file with the --config option. A kubeadm config file can contain multiple configuration types, separated using three dashes (---).
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: JoinConfiguration
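For example, if the documents above are saved to files (kubeadm-config.yaml and kubeadm-join.yaml are only example names), they can be passed to kubeadm like this:
# On the first control-plane node
sudo kubeadm init --config kubeadm-config.yaml
# On a joining node, using a file that contains a JoinConfiguration
sudo kubeadm join --config kubeadm-join.yaml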
2.1.1. Customizing the control plane with flags in ClusterConfiguration
The kubeadm ClusterConfiguration object exposes a way for users to override the default flags passed to control plane components such as the APIServer, ControllerManager, Scheduler and etcd. [6]
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
timeoutForControlPlane: 4m0s
controllerManager: {}
scheduler: {}
etcd:
local:
dataDir: /var/lib/etcd
networking:
dnsDomain: cluster.local
serviceSubnet: 10.96.0.0/12
dns: {}
imageRepository: registry.k8s.io
kubernetesVersion: 1.26.0
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
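To actually override component flags, each component section accepts an extraArgs map in the v1beta3 API; the flags and values below are only illustrative:
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.26.0
apiServer:
  extraArgs:
    enable-admission-plugins: NodeRestriction
controllerManager:
  extraArgs:
    node-cidr-mask-size: "24"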
2.1.2. Customizing with patches
Kubeadm allows passing a directory with patch files to InitConfiguration and JoinConfiguration on individual nodes. These patches can be used as the last customization step before component configuration is written to disk.
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
patches:
directory: /home/user/somedir
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: JoinConfiguration
patches:
directory: /home/user/somedir
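Patch files in that directory are named after the target component, an optional suffix, and an optional patch type, for example kube-apiserver0+strategic.yaml. The content below is only an illustrative sketch that raises the API server's resource requests:
# /home/user/somedir/kube-apiserver0+strategic.yaml (illustrative name and content)
spec:
  containers:
  - name: kube-apiserver
    resources:
      requests:
        cpu: 300m
        memory: 512Mi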
2.1.3. Customizing the kubelet
Some kubelet configuration details need to be the same across all kubelets involved in the cluster, while other configuration aspects need to be set on a per-kubelet basis to accommodate the different characteristics of a given machine (such as OS, storage, and networking). [7]
2.1.3.1. Kubelet configuration patterns
- Propagating cluster-level configuration to each kubelet
  Default values for the kubelet can be provided by the kubeadm init and kubeadm join commands. Interesting examples include using a different container runtime or setting the default subnet used by services.
  To make services use the subnet 10.96.0.0/12 as the default, pass the --service-cidr parameter to kubeadm:
  kubeadm init --service-cidr 10.96.0.0/12
  The kubelet provides a versioned, structured API object, called KubeletConfiguration, that can configure most of its parameters. It can be passed to kubeadm init, and kubeadm will apply the same base KubeletConfiguration to all nodes in the cluster:
  kind: ClusterConfiguration
  apiVersion: kubeadm.k8s.io/v1beta3
  ---
  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  clusterDNS:
  - 10.96.0.10
  cgroupDriver: systemd
- Providing instance-specific configuration details
  Some hosts require specific kubelet configurations due to differences in hardware, operating system, networking, or other host-specific parameters. The following list provides a few examples.
  - The path to the DNS resolution file, as specified by the --resolv-conf kubelet configuration flag, may differ among operating systems, or depending on whether you are using systemd-resolved. If this path is wrong, DNS resolution will fail on the Node whose kubelet is configured incorrectly.
  - The Node API object .metadata.name is set to the machine’s hostname by default, unless you are using a cloud provider. You can use the --hostname-override flag to override the default behavior if you need to specify a Node name different from the machine’s hostname.
  - Currently, the kubelet cannot automatically detect the cgroup driver used by the container runtime, but the value of --cgroup-driver must match the cgroup driver used by the container runtime to ensure the health of the kubelet.
  - To specify the container runtime you must set its endpoint with the --container-runtime-endpoint=<path> flag.
  The recommended way of applying such instance-specific configuration is by using KubeletConfiguration patches.
2.1.3.2. Configure kubelets using kubeadm
When you call kubeadm init, the kubelet configuration is marshalled to disk at /var/lib/kubelet/config.yaml, and also uploaded to a kubelet-config ConfigMap in the kube-system namespace of the cluster.
To address the second pattern of providing instance-specific configuration details, kubeadm writes an environment file to /var/lib/kubelet/kubeadm-flags.env, which contains a list of flags to pass to the kubelet when it starts. The flags are presented in the file like this:
KUBELET_KUBEADM_ARGS="--flag1=value1 --flag2=value2 ..."
In addition to the flags used when starting the kubelet, the file also contains dynamic parameters such as the cgroup driver and whether to use a different container runtime socket (--cri-socket).
When you run kubeadm join, kubeadm uses the Bootstrap Token credential to perform a TLS bootstrap, which fetches the credential needed to download the kubelet-config ConfigMap and writes it to /var/lib/kubelet/config.yaml. The dynamic environment file is generated in exactly the same way as with kubeadm init.
2.1.3.3. The kubelet drop-in file for systemd
kubeadm ships with configuration for how systemd should run the kubelet [7], written to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf and used by systemd. For example:
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generate at runtime, populating
# the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably,
# the user should use the .NodeRegistration.KubeletExtraArgs object in the configuration files instead.
# KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
This file specifies the default locations for all of the files managed by kubeadm for the kubelet.
- The KubeConfig file to use for the TLS Bootstrap is /etc/kubernetes/bootstrap-kubelet.conf, but it is only used if /etc/kubernetes/kubelet.conf does not exist.
- The KubeConfig file with the unique kubelet identity is /etc/kubernetes/kubelet.conf.
- The file containing the kubelet’s ComponentConfig is /var/lib/kubelet/config.yaml.
- The dynamic environment file that contains KUBELET_KUBEADM_ARGS is sourced from /var/lib/kubelet/kubeadm-flags.env.
- The file that can contain user-specified flag overrides with KUBELET_EXTRA_ARGS is sourced from /etc/default/kubelet (for DEBs), or /etc/sysconfig/kubelet (for RPMs). KUBELET_EXTRA_ARGS is last in the flag chain and has the highest priority in the event of conflicting settings.
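As an example of such a last-resort override (the node IP below is only a placeholder), a flag can be placed in /etc/default/kubelet (or /etc/sysconfig/kubelet on RPM systems) and picked up after a kubelet restart:
# /etc/default/kubelet
KUBELET_EXTRA_ARGS=--node-ip=192.168.56.130

sudo systemctl daemon-reload
sudo systemctl restart kubelet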
2.1.3.4. Configurations for local ephemeral storage
Nodes have local ephemeral storage, backed by locally-attached writeable devices or, sometimes, by RAM. [8] [9]
Pods use ephemeral local storage for scratch space, caching, and for logs. The kubelet can provide scratch space to Pods using local ephemeral storage to mount emptyDir volumes into containers.
The kubelet also uses this kind of storage to hold node-level container logs, container images, and the writable layers of running containers.
Note: The kubelet tracks tmpfs emptyDir volumes as container memory use, rather than as local ephemeral storage.
Note: The kubelet will only track the root filesystem for ephemeral storage. OS layouts that mount a separate disk to /var/lib/kubelet or /var/lib/containers will not report ephemeral storage correctly.
Note: The kubelet writes logs to files inside its configured log directory (/var/log by default), and has a base directory for other locally stored data (/var/lib/kubelet by default).
The kubelet recognizes two specific filesystem identifiers: [10]
- nodefs: The node’s main filesystem, used for local disk volumes, emptyDir volumes not backed by memory, log storage, and more. For example, nodefs contains /var/lib/kubelet/.
- imagefs: An optional filesystem that container runtimes use to store container images and container writable layers. [11]
  The containerd runtime uses a TOML configuration file to control where persistent (default "/var/lib/containerd") and ephemeral data (default "/run/containerd") is stored.
Kubelet auto-discovers these filesystems and ignores other node local filesystems. Kubelet does not support other configurations.
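To see what the kubelet reports as ephemeral-storage capacity on a node (node-0 is a placeholder name; a standard kubectl setup is assumed):
kubectl get node node-0 -o jsonpath='{.status.capacity.ephemeral-storage}{"\n"}'
kubectl describe node node-0 | grep -i ephemeral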
2.2. Initializing control-plane node
The control-plane node is the machine where the control plane components run, including etcd (the cluster database) and the API Server (which the kubectl command line tool communicates with). [1]
kubernetes_version=v1.26.0
sudo kubeadm init \
--kubernetes-version=$kubernetes_version \
--control-plane-endpoint=cluster-endpoint \
--apiserver-advertise-address=192.168.0.100 \
--pod-network-cidr=10.244.0.0/16 \
--service-cidr=10.96.0.0/12 \
--image-repository=registry.cn-hangzhou.aliyuncs.com/google_containers \
--ignore-preflight-errors=NumCPU,Mem \
--dry-run
- (Recommended) If you have plans to upgrade this single control-plane kubeadm cluster to high availability, you should specify the --control-plane-endpoint to set the shared endpoint for all control-plane nodes. Such an endpoint can be either a DNS name or an IP address of a load-balancer.
- Choose a Pod network add-on, and verify whether it requires any arguments to be passed to kubeadm init. Depending on which third-party provider you choose, you might need to set the --pod-network-cidr to a provider-specific value.
- (Optional) kubeadm tries to detect the container runtime by using a list of well known endpoints. To use a different container runtime, or if there is more than one installed on the provisioned node, specify the --cri-socket argument to kubeadm.
Considerations about apiserver-advertise-address and ControlPlaneEndpoint:
- Unless otherwise specified, kubeadm uses the network interface associated with the default gateway to set the advertise address for this particular control-plane node’s API server. To use a different network interface, specify the --apiserver-advertise-address=<ip-address> argument to kubeadm init.
- While --apiserver-advertise-address can be used to set the advertise address for this particular control-plane node’s API server, --control-plane-endpoint can be used to set the shared endpoint for all control-plane nodes.
- --control-plane-endpoint allows both IP addresses and DNS names that can map to IP addresses, for example an /etc/hosts entry such as:
  192.168.56.130 cluster-endpoint
  Where 192.168.56.130 is the IP address of this node and cluster-endpoint is a custom DNS name that maps to this IP. Later you can modify cluster-endpoint to point to the address of your load-balancer in a high availability scenario.
Run the following command to initialize a control plane:
kubernetes_version=v1.26.0
sudo kubeadm init \
--kubernetes-version=$kubernetes_version \
--control-plane-endpoint=cluster-endpoint \
--pod-network-cidr=10.244.0.0/16
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:
kubeadm join cluster-endpoint:6443 --token ed790l.ylclzoyoa7l9v0e9 \
--discovery-token-ca-cert-hash sha256:cb046f4d8183a66f930155654cc34354612eeab839d7ed97971154fa8f35072f \
--control-plane
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join cluster-endpoint:6443 --token ed790l.ylclzoyoa7l9v0e9 \
--discovery-token-ca-cert-hash sha256:cb046f4d8183a66f930155654cc34354612eeab839d7ed97971154fa8f35072f
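Before using kubectl against the new cluster, copy the admin kubeconfig for a regular user, as also printed in the kubeadm init output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config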
2.2.1. Installing a Pod network add-on
You must deploy a Container Network Interface (CNI) based Pod network add-on so that your Pods can communicate with each other. Cluster DNS (CoreDNS) will not start up before a network is installed.
Flannel is a simple and easy way to configure a layer 3 network fabric designed for Kubernetes. For Kubernetes v1.17+, deploying Flannel with kubectl:
-
Deploying Flannel with kubectl
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
If you use a custom podCIDR (not 10.244.0.0/16), you first need to download the above manifest and modify the network to match your own.
Deploying Flannel with helm
# Needs manual creation of namespace to avoid helm error
kubectl create ns kube-flannel
kubectl label --overwrite ns kube-flannel pod-security.kubernetes.io/enforce=privileged
helm repo add flannel https://flannel-io.github.io/flannel/
helm install flannel --set podCidr="10.244.0.0/16" --namespace kube-flannel flannel/flannel
# helm install flannel oci://registry-1.docker.io/qqbuby/flannel --namespace kube-flannel --version v0.24.4
Flannel may be paired with several different backends. Once set, the backend should not be changed at runtime.
-
VXLAN is the recommended choice.
-
host-gw is recommended for more experienced users who want the performance improvement and whose infrastructure supports it (typically it can’t be used in cloud environments).
-
UDP is suggested for debugging only or for very old kernels that don’t support VXLAN.
Several external projects provide Kubernetes Pod networks using CNI, some of which also support Network Policy. See a list of add-ons that implement the Kubernetes networking model.
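After the add-on is deployed, its Pods should become Ready and CoreDNS should start; a quick check (the namespace and label below assume the Flannel manifests above and a standard kubeadm deployment):
kubectl get pods -n kube-flannel
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get nodes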
2.2.2. Control plane node isolation
By default, Pods will not be scheduled on the control plane nodes for security reasons. To be able to schedule Pods on the control plane nodes, run:
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
2.3. Joining the worker nodes
To add new nodes to your cluster, do the following for each machine:
- SSH to the machine
- Become root (e.g. sudo su -)
- Install a runtime if needed
- Run the command that was output by kubeadm init. For example:
  Then you can join any number of worker nodes by running the following on each as root:
  kubeadm join cluster-endpoint:6443 --token ed790l.ylclzoyoa7l9v0e9 \
      --discovery-token-ca-cert-hash sha256:cb046f4d8183a66f930155654cc34354612eeab839d7ed97971154fa8f35072f
If you do not have the token, you can get it by running the following command on the control-plane node:
kubeadm token list
By default, tokens expire after 24 hours. If you are joining a node to the cluster after the current token has expired, you can create a new token by running the following command on the control-plane node:
kubeadm token create
If you don’t have the value of --discovery-token-ca-cert-hash, you can get it by running the following command chain on the control-plane node:
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | \
openssl dgst -sha256 -hex | sed 's/^.* //'
You can also run the following command to create a token and print the join command:
kubeadm token create --print-join-command
2.4. Joining the stacked control plane and etcd nodes
- Upload the certificates that should be shared across all the control-plane instances to the cluster, and note the certificate key.
  sudo kubeadm init phase upload-certs --upload-certs
  [upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
  [upload-certs] Using certificate key:
  a455917454410f7d8bcdfa5795ed54526c7484e4e6316ef57a3aa16c3454ada2
- Run the command that was output by kubeadm init with the additional --certificate-key <certificate key> generated above.
  You can now join any number of control-plane nodes by copying certificate authorities and service account keys on each node and then running the following as root:
  kubeadm join cluster-endpoint:6443 --token ed790l.ylclzoyoa7l9v0e9 \
      --discovery-token-ca-cert-hash sha256:cb046f4d8183a66f930155654cc34354612eeab839d7ed97971154fa8f35072f \
      --control-plane
If the following error occurs, check etcd endpoint connectivity and time synchronization between nodes. Time skew can invalidate certificates and disrupt etcd’s consensus mechanisms, hindering cluster operations.
[check-etcd] Checking that the etcd cluster is healthy
I0226 10:44:22.265859    4919 local.go:71] [etcd] Checking etcd cluster health
I0226 10:44:22.266518    4919 local.go:74] creating etcd client that connects to etcd pods
I0226 10:44:22.266642    4919 etcd.go:215] retrieving etcd endpoints from "kubeadm.kubernetes.io/etcd.advertise-client-urls" annotation in etcd Pods
I0226 10:44:22.267022    4919 envvar.go:172] "Feature gate default state" feature="InformerResourceVersion" enabled=false
I0226 10:44:22.267134    4919 envvar.go:172] "Feature gate default state" feature="WatchListClient" enabled=false
I0226 10:44:22.295054    4919 etcd.go:149] etcd endpoints read from pods: https://192.168.56.130:2379
context deadline exceeded
error syncing endpoints with etcd
kubeadm join cluster-endpoint:6443 --token ed790l.ylclzoyoa7l9v0e9 \
    --discovery-token-ca-cert-hash sha256:cb046f4d8183a66f930155654cc34354612eeab839d7ed97971154fa8f35072f \
    --control-plane \
    --certificate-key a455917454410f7d8bcdfa5795ed54526c7484e4e6316ef57a3aa16c3454ada2
This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
$ kubectl get nodes
NAME     STATUS   ROLES           AGE   VERSION
node-0   Ready    control-plane   92m   v1.26.0
node-2   Ready    control-plane   27s   v1.26.13
2.5. Removing the nodes
Talking to the control-plane node with the appropriate credentials, run:
kubectl drain <node name> --delete-emptydir-data --force --ignore-daemonsets
Then, on the node being removed, reset the state installed by kubeadm:
kubeadm reset
Now remove the node:
kubectl delete node <node name>
2.6. Installing Addons
Add-ons extend the functionality of Kubernetes.
2.6.1. Ingress controllers
In order for the Ingress resource to work, the cluster must have an ingress controller running. Unlike other types of controllers which run as part of the kube-controller-manager binary, Ingress controllers are not started automatically with a cluster. The Kubernetes project supports and maintains the AWS, GCE, and nginx ingress controllers. [14]
There are multiple ways to install the Ingress-Nginx Controller: [15]
- with Helm, using the project repository chart;
- with kubectl apply, using YAML manifests;
- with specific addons (e.g. for minikube or MicroK8s).
You can also expose the Ingress Nginx over a NodePort service. [16]
Aliyun (a Chinese corporation) provides a mirror repository. [17]
To check the running ingress controller version, run the controller's version command inside its Pod (see the sketch below).
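As one concrete path (a sketch following the upstream chart defaults; the release name, namespace, and label selector are assumptions, and a mirror repository would replace the chart or image sources accordingly):
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace
# Check the deployed controller version from inside one of its Pods
POD_NAME=$(kubectl -n ingress-nginx get pods -l app.kubernetes.io/name=ingress-nginx -o jsonpath='{.items[0].metadata.name}')
kubectl -n ingress-nginx exec $POD_NAME -- /nginx-ingress-controller --version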
2.6.2. Metrics server
Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines. [18]
Metrics Server can be installed either directly from a YAML manifest or via the official Helm chart. To install the latest Metrics Server release from the components.yaml manifest, run the following command.
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
You can also consider updating the YAML before applying it; one possible adjustment is sketched below.
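One common adjustment in lab clusters whose kubelets serve self-signed certificates, offered here only as an assumption about what that note referred to, is relaxing kubelet certificate verification:
# Assumption: append --kubelet-insecure-tls to the Metrics Server args (not recommended for production)
kubectl -n kube-system patch deployment metrics-server --type=json \
  -p='[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'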
3. Upgrading kubeadm clusters
If you are performing a minor version upgrade for any kubelet, you must first drain the node (or nodes) that you are upgrading. In the case of control plane nodes, they could be running CoreDNS Pods or other critical workloads. [19]
The Kubernetes project recommends that you match your kubelet and kubeadm versions. You can instead use a version of kubelet that is older than kubeadm, provided it is within the range of supported versions.
If you’re using the community-owned package repositories (pkgs.k8s.io), you need to enable the package repository for the desired Kubernetes minor release.
# /etc/apt/sources.list.d/kubernetes.list
deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.29/deb/ /
# /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.29/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
# Find the latest 1.29 version in the list.
# It should look like 1.29.x-*, where x is the latest patch.
sudo apt update
sudo apt-cache madison kubeadm # OR apt-cache policy kubeadm
# Find the latest 1.29 version in the list.
# It should look like 1.29.x-*, where x is the latest patch.
sudo yum clean all --disablerepo="*" --enablerepo=kubernetes # Make sure the YUM cache of the kubernetes repo is cleaned.
sudo yum list --showduplicates kubeadm --disableexcludes=kubernetes
(Optional) Pre-pulled images:
#!/bin/sh
# replace x in 1.29.x with the latest patch version
kubernetes_version=v1.29.x
image_repository=registry.cn-hangzhou.aliyuncs.com/google_containers
sudo kubeadm config images pull \
--kubernetes-version=$kubernetes_version \
--image-repository=$image_repository
images=$(kubeadm config images list \
--kubernetes-version $kubernetes_version \
--image-repository $image_repository)
for i in $images; do
case "$i" in
*coredns*)
new_repo="registry.k8s.io/coredns"
;;
*)
new_repo="registry.k8s.io"
;;
esac
newtag=$(echo "$i" | sed "s@$image_repository@$new_repo@")
sudo ctr -n k8s.io images tag $i $newtag
done
3.1. Upgrading control plane nodes
The upgrade procedure on control plane nodes should be executed one node at a time.
3.1.1. Upgrade kubeadm
For the first control plane node
-
Upgrade kubeadm:
# replace x in 1.29.x-* with the latest patch version
sudo apt-mark unhold kubeadm && \
  sudo apt-get update && sudo apt-get install -y kubeadm='1.29.x-*' && \
  sudo apt-mark hold kubeadm

# replace x in 1.29.x-* with the latest patch version
sudo yum install -y kubeadm-'1.29.x-*' --disableexcludes=kubernetes
-
Verify that the download works and has the expected version:
kubeadm version
-
Verify the upgrade plan:
sudo kubeadm upgrade plan
-
Choose a version to upgrade to, and run the appropriate command. For example:
# replace x with the patch version you picked for this upgrade
sudo kubeadm upgrade apply v1.29.x
For the other control plane nodes
Same as the first control plane node but use:
sudo kubeadm upgrade node
instead of:
sudo kubeadm upgrade apply
3.1.2. Upgrade kubelet and kubectl
-
Drain the node, prepare the node for maintenance by marking it unschedulable and evicting the workloads:
# replace <node-to-drain> with the name of your node you are draining
kubectl drain <node-to-drain> --ignore-daemonsets
-
Upgrade the kubelet and kubectl:
# replace x in 1.29.x-* with the latest patch version
sudo apt-mark unhold kubelet kubectl && \
  sudo apt-get update && sudo apt-get install -y kubelet='1.29.x-*' kubectl='1.29.x-*' && \
  sudo apt-mark hold kubelet kubectl

# replace x in 1.29.x-* with the latest patch version
sudo yum install -y kubelet-'1.29.x-*' kubectl-'1.29.x-*' --disableexcludes=kubernetes
-
Restart the kubelet:
sudo systemctl daemon-reload
sudo systemctl restart kubelet
-
Uncordon the node, bring the node back online by marking it schedulable:
# replace <node-to-uncordon> with the name of your node
kubectl uncordon <node-to-uncordon>
3.2. Upgrade worker nodes
The upgrade procedure on worker nodes should be executed one node at a time or few nodes at a time, without compromising the minimum required capacity for running your workloads. [20]
3.2.1. Upgrade kubeadm
# replace x in 1.29.x-* with the latest patch version
sudo apt-mark unhold kubeadm && \
sudo apt-get update && sudo apt-get install -y kubeadm='1.29.x-*' && \
sudo apt-mark hold kubeadm
# replace x in 1.29.x-* with the latest patch version
sudo yum install -y kubeadm-'1.29.x-*' --disableexcludes=kubernetes
# For worker nodes this upgrades the local kubelet configuration:
sudo kubeadm upgrade node
3.2.2. Upgrade kubelet and kubectl
-
Drain the node, prepare the node for maintenance by marking it unschedulable and evicting the workloads:
# execute this command on a control plane node
# replace <node-to-drain> with the name of your node you are draining
kubectl drain <node-to-drain> --ignore-daemonsets
-
Upgrade the kubelet and kubectl:
# replace x in 1.29.x-* with the latest patch version
sudo apt-mark unhold kubelet kubectl && \
  sudo apt-get update && sudo apt-get install -y kubelet='1.29.x-*' kubectl='1.29.x-*' && \
  sudo apt-mark hold kubelet kubectl

# replace x in 1.29.x-* with the latest patch version
sudo yum install -y kubelet-'1.29.x-*' kubectl-'1.29.x-*' --disableexcludes=kubernetes
-
Restart the kubelet:
sudo systemctl daemon-reload
sudo systemctl restart kubelet
-
Uncordon the node, bring the node back online by marking it schedulable:
# execute this command on a control plane node
# replace <node-to-uncordon> with the name of your node
kubectl uncordon <node-to-uncordon>
3.3. Verify the status of the cluster
After the kubelet is upgraded on all nodes verify that all nodes are available again by running the following command from anywhere kubectl can access the cluster:
kubectl get nodes
The STATUS column should show Ready for all your nodes, and the version number should be updated.
References
- [1] https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
- [2] https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
- [4] https://kubernetes.io/docs/setup/production-environment/container-runtimes/
- [5] https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/
- [6] https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/control-plane-flags/
- [7] https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/kubelet-integration/
- [8] https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#local-ephemeral-storage
- [9] https://stackoverflow.com/questions/70931881/what-does-kubelet-use-to-determine-the-ephemeral-storage-capacity-of-the-node
- [10] https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/
- [11] https://kubernetes.io/blog/2024/01/23/kubernetes-separate-image-filesystem/
- [13] https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/#stacked-control-plane-and-etcd-nodes
- [14] https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/
- [16] https://kubernetes.github.io/ingress-nginx/deploy/baremetal/#over-a-nodeport-service
- [17] https://minikube.sigs.k8s.io/docs/faq/#i-am-in-china-and-i-encounter-errors-when-trying-to-start-minikube-what-should-i-do
- [19] https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/
- [20] https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/upgrading-linux-nodes/