Kubernetes node affinity and pod anti-affinity not deploying pods as expected


I am experimenting with a 2-node cluster on EKS for MongoDB (to be scaled up once it is stable). The two nodes run in different AWS availability zones. The spec is described below:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongod
  labels:
    name: mongo-repl
spec:
  serviceName: mongodb-service
  replicas: 2
  selector:
    matchLabels:
      app: mongod
      role: mongo
      environment: test
  template:
    metadata:
      labels:
        app: mongod
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 15
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                - ap-south-1a
                - ap-south-1b
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mongod
              - key: role
                operator: In
                values:
                - mongo
              - key: environment
                operator: In
                values:
                - test
            topologyKey: kubernetes.io/hostname
      containers:
        .....

The goal here is to NOT schedule another pod on a node that is already running a pod with the labels app=mongod,role=mongo,environment=test.
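
(A quick way to double-check where the matching pods actually land is sketched below; it just reuses the labels above, and -o wide prints the node for each pod.)

# Show which node each matching pod was scheduled onto
kubectl get pods -l app=mongod,role=mongo,environment=test -o wide
# Show the zone label carried by each worker node
kubectl get nodes -L failure-domain.beta.kubernetes.io/zone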

When I deploy the spec, only one mongo pod gets created, on one node.

ubuntu@ip-192-170-0-18:~$ kubectl describe statefulset mongod
Name:               mongod
Namespace:          default
CreationTimestamp:  Sun, 16 Feb 2020 16:44:16 +0000
Selector:           app=mongod,environment=test,role=mongo
Labels:             name=mongo-repl
Annotations:        <none>
Replicas:           2 desired | 2 total
Update Strategy:    OnDelete
Pods Status:        1 Running / 1 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=mongod
           environment=test
           role=mongo
  Containers:

Describing the pod mongod-1 with kubectl:

Node:           <none>
Labels:         app=mongod
                controller-revision-hash=mongod-66f7c87bbb
                environment=test
                role=mongo
                statefulset.kubernetes.io/pod-name=mongod-1
Annotations:    kubernetes.io/psp: eks.privileged
Status:         Pending
....
....
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  42s (x14 over 20m)  default-scheduler  0/2 nodes are available: 1 Insufficient pods, 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules.


I cannot figure out the conflict in the affinity specs. Would really appreciate some insight here!
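
(A side note on the "1 Insufficient pods" part of that message: on EKS the per-node pod limit depends on the instance type, so it is worth confirming how many pods each node can still accept. A minimal check with standard kubectl could look like this; <node-name> is a placeholder.)

# Allocatable pod count per node (small instance types such as t3.small allow only a handful of pods)
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.allocatable.pods
# Pods already running on a given node
kubectl describe node <node-name> | grep -A 15 "Non-terminated Pods"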


Edit, Feb 21: adding details of the new error message below

As suggested, I have now scaled up the worker nodes and have started getting a clearer error message --

Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  51s (x554 over 13h)  default-scheduler  0/2 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules, 1 node(s) had volume node affinity conflict.

So the main issue now (after scaling up the worker nodes) is:

1 node(s) had volume node affinity conflict

Posting my entire config files again:

apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongod
  labels:
    name: mongo-repl
spec:
  serviceName: mongodb-service
  replicas: 2
  selector:
    matchLabels:
      app: mongod
      role: mongo
      environment: test
  template:
    metadata:
      labels:
        app: mongod
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 15
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                - ap-south-1a
                - ap-south-1b
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mongod
              - key: role
                operator: In
                values:
                - mongo
              - key: environment
                operator: In
                values:
                - test
            topologyKey: kubernetes.io/hostname
      containers:
      - name: mongod-container
        .......
      volumes:
      - name: mongo-vol
        persistentVolumeClaim:
          claimName: mongo-pvc

PVC --
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-pvc
spec:
  storageClassName: gp2-multi-az
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi

PV --
apiVersion: "v1"
kind: "PersistentVolume"
metadata:
  name: db-volume-0
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2-multi-az
  awsElasticBlockStore:
    volumeID: vol-06f12b1d6c5c93903
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
        #- key: topology.kubernetes.io/zone
          operator: In
          values:
          - ap-south-1a

apiVersion: "v1"
kind: "PersistentVolume"
metadata:
  name: db-volume-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2-multi-az
  awsElasticBlockStore:
    volumeID: vol-090ab264d4747f131
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
        #- key: topology.kubernetes.io/zone
          operator: In
          values:
          - ap-south-1b

Storage Class --

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2-multi-az
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: gp2
  fsType: ext4
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - ap-south-1a
    - ap-south-1b

I do not want to go with dynamic PVCs.
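
(For context on the conflict: the pod template above mounts a single claim, mongo-pvc, and a PVC binds to exactly one PV, so both replicas end up tied to whichever zone-pinned PV that claim bound to. A quick way to confirm the actual binding, assuming the objects above are applied as-is:)

# Which PV did the shared claim bind to?
kubectl get pvc mongo-pvc
# Which zone is that PV pinned to via its node affinity?
kubectl describe pv db-volume-0 | grep -A 6 "Node Affinity"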

Adding the outputs below as suggested by @rabello --

kubectl get pods --show-labels
NAME       READY   STATUS    RESTARTS   AGE   LABELS
mongod-0   1/1     Running   0          14h   app=mongod,controller-revision-hash=mongod-5b4699fd85,environment=test,role=mongo,statefulset.kubernetes.io/pod-name=mongod-0
mongod-1   0/1     Pending   0          14h   app=mongod,controller-revision-hash=mongod-5b4699fd85,environment=test,role=mongo,statefulset.kubernetes.io/pod-name=mongod-1

kubectl get nodes --show-labels
NAME                                           STATUS   ROLES    AGE   VERSION              LABELS
ip-192-170-0-8.ap-south-1.compute.internal     Ready    <none>   14h   v1.14.7-eks-1861c5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/nodegroup-image=ami-07fd6cdebfd02ef6e,eks.amazonaws.com/nodegroup=trl_compact_prod_db_node_group,failure-domain.beta.kubernetes.io/region=ap-south-1,failure-domain.beta.kubernetes.io/zone=ap-south-1a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-192-170-0-8.ap-south-1.compute.internal,kubernetes.io/os=linux
ip-192-170-80-14.ap-south-1.compute.internal   Ready    <none>   14h   v1.14.7-eks-1861c5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/nodegroup-image=ami-07fd6cdebfd02ef6e,eks.amazonaws.com/nodegroup=trl_compact_prod_db_node_group,failure-domain.beta.kubernetes.io/region=ap-south-1,failure-domain.beta.kubernetes.io/zone=ap-south-1b,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-192-170-80-14.ap-south-1.compute.internal,kubernetes.io/os=linux

@suren -- "Do you have two worker nodes, or one master and one worker?" -- 2 worker nodes - Rajesh

@AmitKumarGupta Unfortunately it did not work even with a larger instance shape. Allocatable: attachable-volumes-aws-ebs: "25", cpu: "2", ephemeral-storage: "19316009748", hugepages-1Gi: "0", hugepages-2Mi: "0", memory: 1899960Ki, pods: "11". Capacity: attachable-volumes-aws-ebs: "25", cpu: "2", ephemeral-storage: 20959212Ki, hugepages-1Gi: "0", hugepages-2Mi: "0", memory: 2002360Ki, pods: "11". Still facing the same issue!! - Rajesh
OK, now it is clear! You are trying to mount the same EBS volume on two nodes that sit in different availability zones. That is not actually possible. AWS recently released a feature that lets multiple instances attach the same EBS volume, but only for instances within the same AZ. So you will need a different solution, such as EFS, to achieve this. - Mr.KoopaKiller
@rabello, actually I am using 2 different PVs backing 2 different zones, each with its own EBS volume. The whole idea of running the replica set across availability zones is to make my StatefulSet available/fault-tolerant. If that is not possible, doesn't it defeat the whole purpose of availability zones? - Rajesh
@AmitKumarGupta, "Do you want exactly the same volume...": to me this is 1 logical k8s volume backed by 2 EBS volumes in 2 availability zones. "Why don't you want dynamic storage?": as far as I know, if I have to shut my cluster down and restart it, I may not be able to come back up with the existing mongo db files if I use dynamic PVCs / PVC templates. - Rajesh
1 Answer


EBS volumes are zonal. They can only be accessed by pods located in the same AZ. Your StatefulSet allows pods to be scheduled across multiple zones (ap-south-1a and ap-south-1b), so given your other constraints, the scheduler can end up placing a pod on a node in a different AZ than its volume. I suggest you either restrict the StatefulSet to a single AZ, or install Mongo using an operator.
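
(A minimal sketch of the single-AZ option, assuming the pods should follow db-volume-0 into ap-south-1a: everything else in the StatefulSet stays the same, and only the node affinity in the pod template is narrowed to that one zone.)

# Pod template nodeAffinity restricted to the zone that holds the bound EBS volume
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
          operator: In
          values:
          - ap-south-1a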

