Kubernetes Node Management and Scheduling Strategies: An In-Depth Analysis

张建站

2026/5/20 3:44:47

10 min read

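Introduction

In Kubernetes, nodes are the foundation on which Pods run. Effective node management and scheduling are critical to cluster stability and resource utilization. This article examines the node management mechanisms and scheduling strategies that Kubernetes provides.

Node Fundamentals

Node Types

| Type | Description | Characteristics |
|---|---|---|
| Master Node | Control plane node | Runs the API Server, Scheduler, and Controller Manager |
| Worker Node | Worker node | Runs the kubelet, kube-proxy, and the container runtime |
| Control Plane Node | Control plane node | Current term that replaces "Master Node" |

Node Status

    # List all nodes with their status
    kubectl get nodes

    # Show detailed information for a single node
    kubectl describe node node-1

Node Conditions

| Condition | Description |
|---|---|
| Ready | Whether the node is healthy and ready to accept Pods |
| MemoryPressure | The node is running low on memory |
| DiskPressure | The node is running low on disk space |
| PIDPressure | Too many processes are running on the node |
| NetworkUnavailable | The node's network is not correctly configured |

Node Selectors and Affinity

Node Selector

    apiVersion: v1
    kind: Pod
    metadata:
      name: node-selector-pod
    spec:
      containers:
        - name: app
          image: my-app:latest
      # Schedule only onto nodes labeled disktype=ssd
      nodeSelector:
        disktype: ssd

Node Affinity

    apiVersion: v1
    kind: Pod
    metadata:
      name: node-affinity-pod
    spec:
      containers:
        - name: app
          image: my-app:latest
      affinity:
        nodeAffinity:
          # Hard requirement: only nodes in these zones are eligible
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: zone
                    operator: In
                    values:
                      - us-east-1a
                      - us-east-1b
          # Soft preference: favor SSD nodes when any are eligible
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: disktype
                    operator: In
                    values:
                      - ssd

Pod Affinity/Anti-Affinity

    apiVersion: v1
    kind: Pod
    metadata:
      name: pod-affinity-pod
    spec:
      containers:
        - name: app
          image: my-app:latest
      affinity:
        # Must land on a node that already runs a Pod labeled app=backend
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - backend
              topologyKey: kubernetes.io/hostname
        # Prefer nodes that do not run a Pod labeled app=database
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - database
                topologyKey: kubernetes.io/hostname

Taints and Tolerations

Taints

    # Add a taint
    kubectl taint nodes node-1 dedicated=special:NoSchedule

    # View taints
    kubectl describe node node-1 | grep Taints

    # Remove every taint with the key "dedicated"
    kubectl taint nodes node-1 dedicated-

Tolerations

    apiVersion: v1
    kind: Pod
    metadata:
      name: toleration-pod
    spec:
      containers:
        - name: app
          image: my-app:latest
      # Tolerates the dedicated=special:NoSchedule taint
      tolerations:
        - key: dedicated
          operator: Equal
          value: special
          effect: NoSchedule

Taint Effects

| Effect | Description |
|---|---|
| NoSchedule | Pods are not scheduled onto the node unless they tolerate the taint |
| PreferNoSchedule | The scheduler tries to avoid the node, but this is not guaranteed |
| NoExecute | Pods already running on the node that do not tolerate the taint are evicted, and new ones are not scheduled |

Scheduler Configuration

Default Scheduler

The scheduler reads its configuration from the file passed through its --config flag, so a ConfigMap like the one below only takes effect once it is mounted into the scheduler Pod and referenced by that flag. Note that schedulerName belongs under profiles, not at the top level.

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: kube-scheduler-config
      namespace: kube-system
    data:
      config.yaml: |
        apiVersion: kubescheduler.config.k8s.io/v1
        kind: KubeSchedulerConfiguration
        leaderElection:
          leaderElect: true
        profiles:
          - schedulerName: default-scheduler

Scheduling Policy

In current configuration API versions, least-allocated and most-allocated scoring are selected through the scoringStrategy argument of the NodeResourcesFit plugin, not through separate NodeResourcesLeastAllocated and NodeResourcesMostAllocated plugins.

    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    profiles:
      - schedulerName: default-scheduler
        plugins:
          score:
            enabled:
              - name: NodeResourcesBalancedAllocation
                weight: 1
        pluginConfig:
          - name: NodeResourcesFit
            args:
              # Favor nodes with the most free resources
              scoringStrategy:
                type: LeastAllocated
                resources:
                  - name: cpu
                    weight: 1
                  - name: memory
                    weight: 1

Scheduling Priority

    apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: high-priority
    value: 1000000
    globalDefault: false
    description: "High priority pods."
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: high-priority-pod
    spec:
      priorityClassName: high-priority
      containers:
        - name: app
          image: my-app:latest

Node Management Best Practices

Node Label Management

    # Add a node label
    kubectl label nodes node-1 zone=production

    # Update a node label
    kubectl label nodes node-1 zone=staging --overwrite

    # Remove a node label
    kubectl label nodes node-1 zone-

Node Isolation

    apiVersion: v1
    kind: Pod
    metadata:
      name: isolated-pod
    spec:
      containers:
        - name: app
          image: my-app:latest
      # Restrict the Pod to nodes labeled environment=production
      nodeSelector:
        environment: production
      # Allow scheduling onto control plane nodes that carry this taint
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule

Node Maintenance

    # Mark the node as unschedulable
    kubectl cordon node-1

    # Evict the Pods running on the node
    kubectl drain node-1 --ignore-daemonsets

    # Mark the node as schedulable again
    kubectl uncordon node-1

Extending the Scheduler

Custom Scheduler

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: custom-scheduler
      namespace: kube-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: custom-scheduler
      template:
        metadata:
          labels:
            app: custom-scheduler
        spec:
          serviceAccountName: custom-scheduler
          containers:
            - name: custom-scheduler
              image: my-custom-scheduler:latest
              command:
                - /custom-scheduler
                - --scheduler-name=custom-scheduler

Using the Custom Scheduler

    apiVersion: v1
    kind: Pod
    metadata:
      name: custom-scheduler-pod
    spec:
      # Must match the name the custom scheduler registers under
      schedulerName: custom-scheduler
      containers:
        - name: app
          image: my-app:latest

Node Monitoring and Health Checks

Node Metrics Monitoring

    # Requires the Prometheus Operator, which provides the ServiceMonitor CRD
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: node-monitor
    spec:
      selector:
        matchLabels:
          k8s-app: kubelet
      endpoints:
        - port: http-metrics
          interval: 30s

Node Health Check

    apiVersion: v1
    kind: Pod
    metadata:
      name: node-health-check
    spec:
      containers:
        - name: health-check
          image: busybox:1.35
          # Placeholder loop that prints a heartbeat every 5 seconds
          command: ["sh", "-c", "while true; do echo Healthy; sleep 5; done"]
      nodeSelector:
        kubernetes.io/os: linux

Common Problems and Solutions

Problem 1: Pod cannot be scheduled

Troubleshooting steps:

    # Check the Pod status
    kubectl describe pod my-pod

    # Check node resources
    kubectl describe node node-1

    # View scheduling events for the Pod
    kubectl get events --field-selector involvedObject.name=my-pod

Solutions:

- Check whether the nodes have sufficient resources
- Verify the node selector configuration
- Check the taint and toleration configuration

Problem 2: Low node resource utilization

Solutions:

- Adjust Pod resource requests and limits
- Use HPA to scale the replica count automatically
- Configure node affinity to improve placement

Problem 3: Handling node failures

Troubleshooting steps:

    # Check node status
    kubectl get nodes

    # Check kubelet logs (run on the node itself; assumes the kubelet is managed by systemd)
    journalctl -u kubelet

    # Check node events
    kubectl get events --field-selector involvedObject.kind=Node

Solutions:

- Mark the node as unschedulable
- Evict the Pods running on the node
- Fix the node problem or replace the node

Summary

Node management and scheduling strategy are core parts of operating a Kubernetes cluster. Configuring node selectors, affinity, taints, and tolerations appropriately makes efficient resource use and reliable service deployment achievable. In practice, these mechanisms need to be combined with monitoring and automation tooling to build a stable and efficient node management system.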
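As a supplement to the taints and tolerations section, the matching rules can be sketched in a few lines of Python. This is a simplified model for illustration only, not the scheduler's actual implementation. The names `Taint`, `Toleration`, `tolerates`, and `is_schedulable` are hypothetical, and the sketch ignores details such as `tolerationSeconds`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str      # "NoSchedule", "PreferNoSchedule", or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # an empty key with operator Exists matches every key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # an empty effect matches every effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if this toleration matches the given taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.operator == "Exists":
        return toleration.key == "" or toleration.key == taint.key
    # Operator "Equal": both key and value must match.
    return toleration.key == taint.key and toleration.value == taint.value


def is_schedulable(taints: list, tolerations: list) -> bool:
    """A Pod may be placed only if every hard taint is tolerated.

    PreferNoSchedule is a soft preference, so it never blocks placement.
    """
    hard = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in hard)


# Mirrors the dedicated=special:NoSchedule example from the article.
taint = Taint(key="dedicated", effect="NoSchedule", value="special")
toleration = Toleration(key="dedicated", operator="Equal",
                        value="special", effect="NoSchedule")
print(is_schedulable([taint], [toleration]))  # True
print(is_schedulable([taint], []))            # False
```

The model treats `NoSchedule` and `NoExecute` as blocking and `PreferNoSchedule` as advisory, which matches the taint effects described in the article.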
