Kubernetes pod stuck in Pending state when attaching a new volume (EKS)

Let me describe my scenario:

Brief description

When I create a deployment on Kubernetes with one volume attached, everything works perfectly. But when I create the same deployment with a second volume attached (two volumes in total), the pod gets stuck in the "Pending" state with these errors:

pod has unbound PersistentVolumeClaims (repeated 2 times)
0/2 nodes are available: 2 node(s) had no available volume zone.

I already checked that the volumes were created in the correct availability zones.

Detailed description

I set up a cluster with two nodes using Amazon EKS, and I have the following default storage class:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug

I have a mongodb deployment that needs two volumes: one mounted on the /data/db folder and the other mounted on some random directory I need. Here is the minimal yaml used to create the three components (I intentionally commented out some lines):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: my-project
  creationTimestamp: null
  labels:
    io.kompose.service: my-project-db-claim0
  name: my-project-db-claim0
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: my-project
  creationTimestamp: null
  labels:
    io.kompose.service: my-project-db-claim1
  name: my-project-db-claim1
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: my-project
  name: my-project-db
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        name: my-db
    spec:
      containers:
        - name: my-project-db-container
          image: mongo
          imagePullPolicy: Always
          resources: {}
          volumeMounts:
          - mountPath: /my_dir
            name: my-project-db-claim0
          # - mountPath: /data/db
          #   name: my-project-db-claim1
          ports:
            - containerPort: 27017
      restartPolicy: Always
      volumes:
      - name: my-project-db-claim0
        persistentVolumeClaim:
          claimName: my-project-db-claim0
      # - name: my-project-db-claim1
      #   persistentVolumeClaim:
      #     claimName: my-project-db-claim1

That yaml file works perfectly. The output for the volumes is:

$ kubectl describe pv

Name:            pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6
Labels:          failure-domain.beta.kubernetes.io/region=us-east-1
                failure-domain.beta.kubernetes.io/zone=us-east-1c
Annotations:     kubernetes.io/createdby: aws-ebs-dynamic-provisioner
                pv.kubernetes.io/bound-by-controller: yes
                pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    gp2
Status:          Bound
Claim:           my-project/my-project-db-claim0
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        5Gi
Node Affinity:   <none>
Message:        
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://us-east-1c/vol-xxxxx
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
Events:         <none>


Name:            pvc-308d8979-039e-11e9-b78d-0a68bcb24bc6
Labels:          failure-domain.beta.kubernetes.io/region=us-east-1
                failure-domain.beta.kubernetes.io/zone=us-east-1b
Annotations:     kubernetes.io/createdby: aws-ebs-dynamic-provisioner
                pv.kubernetes.io/bound-by-controller: yes
                pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    gp2
Status:          Bound
Claim:           my-project/my-project-db-claim1
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        10Gi
Node Affinity:   <none>
Message:        
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://us-east-1b/vol-xxxxx
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
Events:         <none>

And the output for the pod is:

$ kubectl describe pods

Name:               my-project-db-7d48567b48-slncd
Namespace:          my-project
Priority:           0
PriorityClassName:  <none>
Node:               ip-192-168-212-194.ec2.internal/192.168.212.194
Start Time:         Wed, 19 Dec 2018 15:55:58 +0100
Labels:             name=my-db
                    pod-template-hash=3804123604
Annotations:        <none>
Status:             Running
IP:                 192.168.216.33
Controlled By:      ReplicaSet/my-project-db-7d48567b48
Containers:
  my-project-db-container:
    Container ID:   docker://cf8222f15e395b02805c628b6addde2d77de2245aed9406a48c7c6f4dccefd4e
    Image:          mongo
    Image ID:       docker-pullable://mongo@sha256:0823cc2000223420f88b20d5e19e6bc252fa328c30d8261070e4645b02183c6a
    Port:           27017/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 19 Dec 2018 15:56:42 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /my_dir from my-project-db-claim0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  my-project-db-claim0:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  my-project-db-claim0
    ReadOnly:   false
  default-token-pf9ks:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pf9ks
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                    From                                      Message
  ----     ------                  ----                   ----                                      -------
  Warning  FailedScheduling        7m22s (x5 over 7m23s)  default-scheduler                         pod has unbound PersistentVolumeClaims (repeated 2 times)
  Normal   Scheduled               7m21s                  default-scheduler                         Successfully assigned my-project/my-project-db-7d48567b48-slncd to ip-192-168-212-194.ec2.internal
  Normal   SuccessfulMountVolume   7m21s                  kubelet, ip-192-168-212-194.ec2.internal  MountVolume.SetUp succeeded for volume "default-token-pf9ks"
  Warning  FailedAttachVolume      7m13s (x5 over 7m21s)  attachdetach-controller                   AttachVolume.Attach failed for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6" : "Error attaching EBS volume \"vol-01a863d0aa7c7e342\"" to instance "i-0a7dafbbdfeabc50b" since volume is in "creating" state
  Normal   SuccessfulAttachVolume  7m1s                   attachdetach-controller                   AttachVolume.Attach succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
  Normal   SuccessfulMountVolume   6m48s                  kubelet, ip-192-168-212-194.ec2.internal  MountVolume.SetUp succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
  Normal   Pulling                 6m48s                  kubelet, ip-192-168-212-194.ec2.internal  pulling image "mongo"
  Normal   Pulled                  6m39s                  kubelet, ip-192-168-212-194.ec2.internal  Successfully pulled image "mongo"
  Normal   Created                 6m38s                  kubelet, ip-192-168-212-194.ec2.internal  Created container
  Normal   Started                 6m37s                  kubelet, ip-192-168-212-194.ec2.internal  Started container

Everything is created without any problem. But if I uncomment the commented lines in the yaml file so that both volumes are attached to the db deployment, the pv output is the same as before, but the pod gets stuck in Pending with the following output:

$ kubectl describe pods

Name:               my-project-db-b8b8d8bcb-l64d7
Namespace:          my-project
Priority:           0
PriorityClassName:  <none>
Node:               <none>
Labels:             name=my-db
                    pod-template-hash=646484676
Annotations:        <none>
Status:             Pending
IP:                 
Controlled By:      ReplicaSet/my-project-db-b8b8d8bcb
Containers:
  my-project-db-container:
    Image:        mongo
    Port:         27017/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:
      /data/db from my-project-db-claim1 (rw)
      /my_dir from my-project-db-claim0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  my-project-db-claim0:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  my-project-db-claim0
    ReadOnly:   false
  my-project-db-claim1:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  my-project-db-claim1
    ReadOnly:   false
  default-token-pf9ks:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pf9ks
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  60s (x5 over 60s)  default-scheduler  pod has unbound PersistentVolumeClaims (repeated 2 times)
  Warning  FailedScheduling  2s (x16 over 59s)  default-scheduler  0/2 nodes are available: 2 node(s) had no available volume zone.

I have already read these two questions:

Dynamic volume provisioning creates EBS volume in the wrong availability zone

PersistentVolume on EBS can be created in availability zones with no nodes (closed)

But I already checked that the volumes were created in the same zones as the cluster's node instances. In fact, EKS creates two EBS volumes by default in the us-east-1b and us-east-1c zones, and those volumes work. The volumes created by the yaml posted above are also in those zones.
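
One quick way to compare the zones directly, using the same labels shown in the pv output above:

# zone of each worker node
$ kubectl get nodes -L failure-domain.beta.kubernetes.io/zone

# zone of each dynamically provisioned volume
$ kubectl get pv -L failure-domain.beta.kubernetes.io/zone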


Not fully related, but I also cannot bind multiple PVCs (separate PVs) to a single node on m5 AWS instances (a single pvc/pv works fine). - Ho Man
3 Answers

You have to delete and recreate the storage class. There will be no downtime for the existing volumes. volumeBindingMode is immutable. - Radu Cosnita
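
This comment presumably refers to recreating the StorageClass with volumeBindingMode: WaitForFirstConsumer, which delays provisioning until the pod is scheduled, so all of a pod's EBS volumes are created in the zone of the node it lands on. A minimal sketch of such a class, assuming the same gp2 settings as the one above:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
# Delay volume creation until a consuming pod is scheduled, so the EBS
# volume is provisioned in that node's availability zone
volumeBindingMode: WaitForFirstConsumer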

Sounds like it is trying to create a volume in an availability zone where you don't have any volumes. You can try restricting your StorageClass to the availability zones where you have nodes:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - us-east-1b
    - us-east-1c

This is very similar to this question and this answer, except that the issue described there was on GCP and in this case it is AWS.
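
Note that StorageClass fields like these are immutable (the comment above says so for volumeBindingMode, and as far as I know allowedTopologies behaves the same way), so applying this change means deleting and recreating the class rather than editing it in place; a sketch, where the file name is just an example:

$ kubectl delete storageclass gp2
$ kubectl apply -f gp2-storageclass.yaml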


I already considered that, but the volumes EKS creates by default are in us-east-1b and us-east-1c, and the two new volumes are also created in those zones. I can update the question and add some screenshots of the EBS volumes taken from EC2. - Ale Sanchez
Small update: just to try it, I added the allowedTopologies you suggested; the volumes are still created in the 1b and 1c zones, but the pod still shows the same error and stays in the Pending state. - Ale Sanchez
Did you solve your issue? - Rico

In this case, you should check the availability zone of your worker nodes (EC2 instances). For example:

Worker node 1 = eu-central-1b
Worker node 2 = eu-central-1c

Then create the volume in one of the availability zones listed above (do not create a volume in eu-central-1a).
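
If you use the AWS CLI, creating the volume would look something like this sketch (zone, size, and type are examples):

$ aws ec2 create-volume --availability-zone eu-central-1b --size 100 --volume-type gp2
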
After creating the volume, create your PersistentVolume and PersistentVolumeClaim, attaching the newly created volume to your cluster, as shown below.
apiVersion: v1
kind: PersistentVolume
metadata:
  labels:
    failure-domain.beta.kubernetes.io/region: eu-central-1
    failure-domain.beta.kubernetes.io/zone: eu-central-1b
  name: mongo-pv
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 100Gi
  awsElasticBlockStore:
    fsType: ext4
    volumeID: aws://eu-central-1b/vol-063342ab9be5d2929
  storageClassName: gp2

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-pvc
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: gp2
  volumeName: mongo-pv
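
A pod can then reference the claim as usual; a sketch (the pod and volume names are just examples):

apiVersion: v1
kind: Pod
metadata:
  name: mongo
  namespace: default
spec:
  containers:
  - name: mongo
    image: mongo
    volumeMounts:
    - mountPath: /data/db
      name: mongo-data
  volumes:
  - name: mongo-data
    persistentVolumeClaim:
      claimName: mongo-pvc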

PersistentVolume is a non-namespaced object. - Shinebayar G
