Kubernetes - Node
Node is the abstraction of a machine, which may mean different things in different environments:
- in a bare metal cluster: node = a bare metal machine
- in a cluster on a cloud provider e.g. GKE: node = a GCE VM.
- in a kind cluster: node = a Docker container
To mark a Node unschedulable, run:
$ kubectl cordon $NODENAME
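To make the Node schedulable again:
$ kubectl uncordon $NODENAME
To cordon a Node and also evict the Pods already running on it:
$ kubectl drain $NODENAME --ignore-daemonsets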
Pods that are part of a DaemonSet tolerate being run on an unschedulable Node. DaemonSets typically provide node-local services that should run on the Node even if it is being drained of workload applications.
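This works because the DaemonSet controller automatically adds several tolerations to DaemonSet Pods, including one for the unschedulable taint; conceptually:
tolerations:
- key: node.kubernetes.io/unschedulable
  operator: Exists
  effect: NoSchedule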
Capacity
The default is 110 pods per node. The pod limit is set and enforced by the kubelet running on the node. It can be configured in the kubelet's config (the --max-pods flag is deprecated; instead, change maxPods in the config file specified by --config).
Only Pods that have been assigned to a node and are not terminated (i.e. not in the Failed or Succeeded phase) are counted against this capacity.
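For reference, maxPods is a top-level field of the KubeletConfiguration object in the config file (typically /var/lib/kubelet/config.yaml); a minimal sketch, with 250 as an example value:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
maxPods: 250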
To check the pod limit of a node:
$ kubectl get nodes <node_name> -o json | jq -r '.status.capacity.pods'
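Or with jsonpath instead of jq:
$ kubectl get nodes <node_name> -o jsonpath='{.status.capacity.pods}'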
Check kubelet status:
$ systemctl status kubelet
If you see something like /usr/bin/kubelet --max-pods ..., the pod limit is configured by the --max-pods flag. The command-line flag can be changed in the kubelet service config, e.g. /etc/systemd/system/kubelet.service.d/10-kubeadm.conf. Otherwise, find the config file path from the command line, which may look like /usr/bin/kubelet --config=/var/lib/kubelet/config.yaml ....
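systemctl can print the unit file together with all of its drop-ins, which is an easy way to see the full kubelet command line:
$ systemctl cat kubelet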
To change maxPods in the config file:
$ yq -i e '.maxPods=500' /var/lib/kubelet/config.yaml
Restart kubelet:
$ systemctl restart kubelet
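Then verify that the new limit took effect:
$ kubectl get nodes <node_name> -o json | jq -r '.status.capacity.pods'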
npd
node-problem-detector aims to make various node problems visible to the upstream layers in the cluster management stack. It is a daemon that runs on each node, detecting node problems and reporting them to the apiserver. node-problem-detector can run either as a DaemonSet or standalone. It currently runs as a Kubernetes addon enabled by default in GKE clusters, and it is also enabled by default in AKS as part of the AKS Linux Extension.
npd (node-problem-detector) uses crictl pods --latest to determine if containerd is healthy. If not, npd will constantly restart it.
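npd surfaces the problems it detects as node conditions (e.g. KernelDeadlock, ReadonlyFilesystem; the exact set depends on its config) and events. To inspect the conditions it has set on a node:
$ kubectl get nodes <node_name> -o json | jq '.status.conditions'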
Control Plane Nodes
Control plane nodes may have a node-role.kubernetes.io/control-plane:NoSchedule or PreferNoSchedule taint:
spec:
  taints:
  - effect: PreferNoSchedule
    key: node-role.kubernetes.io/control-plane
Pods need to tolerate node-role.kubernetes.io/control-plane in order to run on control plane nodes:
tolerations:
- key: node-role.kubernetes.io/control-plane
  operator: Exists
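To check which taints a node currently has:
$ kubectl get nodes node1 -o jsonpath='{.spec.taints}'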
To remove the taint:
$ kubectl taint nodes node1 node-role.kubernetes.io/control-plane:NoSchedule-
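To add it back:
$ kubectl taint nodes node1 node-role.kubernetes.io/control-plane:NoSchedule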