k8s doc reading: Concepts - Cluster Architecture: Node

doc link: https://kubernetes.io/docs/concepts/architecture/nodes/

A node is the hardware resource that k8s is installed on and runs on.
It can be a VM or bare metal.

add node

First, a heads-up:
different k8s distributions have different ways to add a node.

Unless you're running vanilla upstream k8s,
a quick skim of this section is enough.

If you're using a cloud provider's k8s cluster,
it's usually just a couple of clicks in the console.

For on-premise clusters,
there are two ways to add a node:

  • self-registers
    when the kubelet starts, it automatically registers the node with the control plane
  • manually add a Node
    register it with kubectl (a sketch follows below)

Note that the node name defaults to the hostname, and it must be unique within the k8s cluster.

kubectl: the CLI tool for operating k8s, introduced later
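
For the manual option, here's a minimal sketch; the node name node5 is hypothetical, and the Node object will stay NotReady until a kubelet on that machine actually starts reporting:

# create a Node object by hand (self-registration is the kubelet default, --register-node=true)
kubectl apply -f - <<EOF
apiVersion: v1
kind: Node
metadata:
  name: node5
EOF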

Node status

A few commands are used here,
but since kubectl hasn't been covered yet, a rough understanding is fine.

# check node status
kubectl describe node

As mentioned earlier, kubectl is the tool we use to operate k8s.
describe node fetches a node's information,
e.g. the node IP, the pods running on that node,
and CPU/memory resource usage.

Example

$ k describe node node1
Name:               node1
Roles:              control-plane,etcd,master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=k3s
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=node1
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/control-plane=true
                    node-role.kubernetes.io/etcd=true
                    node-role.kubernetes.io/master=true
                    node.kubernetes.io/instance-type=k3s
                    p2p.k3s.cattle.io/enabled=true
Annotations:        alpha.kubernetes.io/provided-node-ip: 192.168.56.101,fd17:625c:f037:2:a00:27ff:fe94:112e
                    csi.volume.kubernetes.io/nodeid: {"driver.longhorn.io":"node1"}
                    etcd.k3s.cattle.io/local-snapshots-timestamp: 2025-06-17T07:07:56Z
                    etcd.k3s.cattle.io/node-address: 192.168.56.101
                    etcd.k3s.cattle.io/node-name: node1-0d481970
                    k3s.io/hostname: node1
                    k3s.io/internal-ip: 192.168.56.101,fd17:625c:f037:2:a00:27ff:fe94:112e
                    k3s.io/node-args:
                      ["server","--disable","traefik","--write-kubeconfig-mode","644","--cluster-init","--tls-san","192.168.56.100","--flannel-backend","none","...
                    k3s.io/node-config-hash: 3YGRMKVSKKIFXYFIXWEHZNCZIVK6HVSSOA5N7TVZOS737YFETYAA====
                    k3s.io/node-env: {}
                    node.alpha.kubernetes.io/ttl: 0
                    p2p.k3s.cattle.io/node-address: /ip4/192.168.56.101/tcp/5001/p2p/QmdFujBrJdhruLKp9Cop3DyFcciJmam3WTqW5cbykJjhhz
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sat, 14 Jun 2025 08:36:06 +0000
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  node1
  AcquireTime:     <unset>
  RenewTime:       Tue, 17 Jun 2025 07:08:16 +0000
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Sat, 14 Jun 2025 08:45:49 +0000   Sat, 14 Jun 2025 08:45:49 +0000   CiliumIsUp                   Cilium is running on this node
  EtcdIsVoter          True    Tue, 17 Jun 2025 07:08:05 +0000   Sat, 14 Jun 2025 08:36:11 +0000   MemberNotLearner             Node is a voting member of the etcd cluster
  MemoryPressure       False   Tue, 17 Jun 2025 07:07:56 +0000   Sat, 14 Jun 2025 08:36:06 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 17 Jun 2025 07:07:56 +0000   Sat, 14 Jun 2025 08:36:06 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Tue, 17 Jun 2025 07:07:56 +0000   Sat, 14 Jun 2025 08:36:06 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Tue, 17 Jun 2025 07:07:56 +0000   Sat, 14 Jun 2025 08:45:56 +0000   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  192.168.56.101
  InternalIP:  fd17:625c:f037:2:a00:27ff:fe94:112e
  Hostname:    node1
Capacity:
  cpu:                2
  ephemeral-storage:  61608748Ki
  hugepages-2Mi:      0
  memory:             4010048Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  59932990008
  hugepages-2Mi:      0
  memory:             4010048Ki
  pods:               110
System Info:
  Machine ID:                 9f6371c13b0749708493e43754b8116c
  System UUID:                c5ae2699-ab20-9440-bc68-7c08b1f869e0
  Boot ID:                    9383dce5-abe9-4ad2-8a0e-82f5a0b96b7d
  Kernel Version:             6.8.0-60-generic
  OS Image:                   Ubuntu 24.04.2 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://2.0.5-k3s1.32
  Kubelet Version:            v1.32.5+k3s1
  Kube-Proxy Version:         v1.32.5+k3s1
PodCIDR:                      10.42.0.0/24
PodCIDRs:                     10.42.0.0/24
ProviderID:                   k3s://node1
Non-terminated Pods:          (26 in total)
  Namespace                   Name                                           CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                                           ------------  ----------  ---------------  -------------  ---
  haproxy-controller          haproxy-kubernetes-ingress-67b459d7f8-8h9mn    250m (12%)    0 (0%)      400Mi (10%)      0 (0%)         2d22h
  haproxy-controller          haproxy-kubernetes-ingress-67b459d7f8-zs5g5    250m (12%)    0 (0%)      400Mi (10%)      0 (0%)         2d22h
  kube-system                 cilium-envoy-slwh6                             0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  kube-system                 cilium-nl6xt                                   100m (5%)     0 (0%)      10Mi (0%)        0 (0%)         2d22h
  kube-system                 cilium-operator-5444dfc444-2wttm               0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  kube-system                 coredns-697968c856-9dm6b                       100m (5%)     0 (0%)      70Mi (1%)        170Mi (4%)     2d22h
  kube-system                 local-path-provisioner-774c6665dc-fbklx        0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  kube-system                 metrics-server-6f4c6675d5-62p4b                100m (5%)     0 (0%)      70Mi (1%)        0 (0%)         2d22h
  longhorn-system             csi-attacher-5d68b48d9-64n7m                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-attacher-5d68b48d9-m2qk6                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-attacher-5d68b48d9-p7pf2                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-provisioner-6fcc6478db-4r4cs               0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-provisioner-6fcc6478db-gzczt               0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-provisioner-6fcc6478db-xl85j               0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-resizer-6c558c9fbc-hkhwh                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-resizer-6c558c9fbc-lcm5w                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-resizer-6c558c9fbc-n87lw                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-snapshotter-874b9f887-8p2rl                0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-snapshotter-874b9f887-pqv2f                0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             csi-snapshotter-874b9f887-rkjr4                0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             engine-image-ei-e09d8caa-dcgp2                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             longhorn-csi-plugin-chngl                      0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             longhorn-driver-deployer-54599c447c-w8fs6      0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             longhorn-manager-6ffpt                         0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             longhorn-ui-6b44759848-glhbh                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
  longhorn-system             longhorn-ui-6b44759848-jqlp5                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         2d22h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                800m (40%)   0 (0%)
  memory             950Mi (24%)  170Mi (4%)
  ephemeral-storage  0 (0%)       0 (0%)
  hugepages-2Mi      0 (0%)       0 (0%)
Events:
  Type     Reason                          Age   From             Message
  ----     ------                          ----  ----             -------
  Normal   Starting                        27s   kubelet          Starting kubelet.
  Normal   NodeAllocatableEnforced         26s   kubelet          Updated Node Allocatable limit across pods
  Normal   NodeHasSufficientMemory         26s   kubelet          Node node1 status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure           26s   kubelet          Node node1 status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID            26s   kubelet          Node node1 status is now: NodeHasSufficientPID
  Warning  Rebooted                        26s   kubelet          Node node1 has been rebooted, boot id: 9383dce5-abe9-4ad2-8a0e-82f5a0b96b7d
  Normal   NodePasswordValidationComplete  26s   k3s-supervisor   Deferred node password secret validation complete
  Normal   RegisteredNode                  14s   node-controller  Node node1 event: Registered Node node1 in Controller

As you can see from the example, the information is quite detailed.
So whenever you want to check node status, just remember to run describe.
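
If you only need a single field instead of the full dump, kubectl's jsonpath output is handy; a quick sketch using node1 from the example above:

# print just the Ready condition of node1
kubectl get node node1 -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
# prints True on a healthy node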

Node heartbeats

This just says the control plane monitors node status; by default the node controller checks every 5 seconds.
If a node loses contact,
the scheduler will avoid scheduling pods onto that node,
and the existing pods will also be evicted (rescheduled onto other nodes).
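
Heartbeats come in two forms: node status updates and Lease objects in the kube-node-lease namespace (the Lease/RenewTime in the describe output above is exactly this). You can inspect the leases directly:

# each node has a Lease that its kubelet renews, every 10 seconds by default
kubectl get lease -n kube-node-lease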

maintain a Node

cordon

If you want to do maintenance on a node,
you can put the node into an unschedulable state.
This tells the scheduler not to schedule any new pods onto that node,
but existing pods are not affected.

kubectl cordon <node name>

After that, the node status will show STATUS: SchedulingDisabled

$ k get node 
NAME    STATUS                     ROLES                       AGE     VERSION
node4   Ready,SchedulingDisabled   <none>                      2d22h   v1.32.5+k3s1

To recover, make the node schedulable again:

kubectl uncordon <node name>

drain

drain is similar to cordon,
but it goes one step further and evicts the pods on that node,
i.e. it leaves the node with no user workloads at all.

You can run drain on the node directly; k8s will cordon it first, then evict,
so you never end up evicting pods on one side while new pods keep getting scheduled in on the other.

You would usually add --ignore-daemonsets here;
what a DaemonSet is will be explained later.

kubectl drain --ignore-daemonsets <node name>
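
In practice drain may refuse to proceed, for example when pods use emptyDir volumes; a sketch of a common variant, reusing node4 from the get node output above:

# also delete pods that use emptyDir volumes (their data is lost)
kubectl drain node4 --ignore-daemonsets --delete-emptydir-data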

To recover, make the node schedulable again:

kubectl uncordon <node name>

If you don't want this node anymore,
just run:

kubectl delete node <node name>

That's a quick introduction to the basic concepts of a node.