I am experimenting with a 2-node cluster for MongoDB on EKS (to be scaled up once it is stable). The two nodes run in different AWS availability zones. The spec is as follows:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongod
  labels:
    name: mongo-repl
spec:
  serviceName: mongodb-service
  replicas: 2
  selector:
    matchLabels:
      app: mongod
      role: mongo
      environment: test
  template:
    metadata:
      labels:
        app: mongod
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 15
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                - ap-south-1a
                - ap-south-1b
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mongod
              - key: role
                operator: In
                values:
                - mongo
              - key: environment
                operator: In
                values:
                - test
            topologyKey: kubernetes.io/hostname
      containers:
        .....
The goal here is to NOT schedule another pod on a node that is already running a pod with the labels app=mongod,role=mongo,environment=test.
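For what it's worth, my mental model of how the scheduler evaluates this required anti-affinity rule is roughly the following (a hypothetical simplification in Python, not actual scheduler code; the pod/node names are made up):

```python
# Hypothetical sketch of how a required podAntiAffinity term is evaluated
# per candidate node (simplified; not the real Kubernetes scheduler).

ANTI_AFFINITY_SELECTOR = {"app": "mongod", "role": "mongo", "environment": "test"}

def selector_matches(selector, pod_labels):
    """True if every key/value pair in the selector is present on the pod."""
    return all(pod_labels.get(k) == v for k, v in selector.items())

def node_allowed(candidate_node, running_pods):
    """With topologyKey kubernetes.io/hostname, a node is rejected when any
    pod already running on that same node matches the label selector."""
    return not any(
        pod["node"] == candidate_node
        and selector_matches(ANTI_AFFINITY_SELECTOR, pod["labels"])
        for pod in running_pods
    )

pods = [{"node": "node-a",
         "labels": {"app": "mongod", "role": "mongo", "environment": "test"}}]
print(node_allowed("node-a", pods))  # mongod-0 already there -> False
print(node_allowed("node-b", pods))  # empty node -> True
```

So, as I understand it, each mongod replica should be pushed onto a different node.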
When I deploy the spec, only one mongod pod gets created, on one node.
ubuntu@ip-192-170-0-18:~$ kubectl describe statefulset mongod
Name: mongod
Namespace: default
CreationTimestamp: Sun, 16 Feb 2020 16:44:16 +0000
Selector: app=mongod,environment=test,role=mongo
Labels: name=mongo-repl
Annotations: <none>
Replicas: 2 desired | 2 total
Update Strategy: OnDelete
Pods Status: 1 Running / 1 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=mongod
environment=test
role=mongo
Containers:
kubectl describe pod mongod-1
Node: <none>
Labels: app=mongod
controller-revision-hash=mongod-66f7c87bbb
environment=test
role=mongo
statefulset.kubernetes.io/pod-name=mongod-1
Annotations: kubernetes.io/psp: eks.privileged
Status: Pending
....
....
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 42s (x14 over 20m) default-scheduler 0/2 nodes are available: 1 Insufficient pods, 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules.
I am unable to figure out the conflict in the affinity specs. Would really appreciate some insight here!
Edit, Feb 21: added details of the new error message below.
As suggested, I have now scaled up the worker nodes and started receiving a clearer error message:
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  51s (x554 over 13h)  default-scheduler  0/2 nodes are available: 1 node(s) didn't match pod affinity/anti-affinity, 1 node(s) didn't satisfy existing pods anti-affinity rules, 1 node(s) had volume node affinity conflict.
So the main issue now (after scaling up the worker nodes) is:
1 node(s) had volume node affinity conflict
Posting my entire config files again:
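My reading of the conflict, sketched as a hedged feasibility check in Python (the node and zone names mirror my cluster; the logic is my guess at what the scheduler enforces, not its actual code):

```python
# Hypothetical sketch: a node is feasible for mongod-1 only if it
# (a) does not already run a matching mongod pod (pod anti-affinity), and
# (b) is in the zone that the already-bound PV is pinned to (volume node affinity).

nodes = {
    "ip-192-170-0-8":   {"zone": "ap-south-1a", "runs_mongod": True},   # hosts mongod-0
    "ip-192-170-80-14": {"zone": "ap-south-1b", "runs_mongod": False},
}
pv_zone = "ap-south-1a"  # zone of the PV bound to the single shared mongo-pvc

feasible = [
    name for name, node in nodes.items()
    if not node["runs_mongod"] and node["zone"] == pv_zone
]
print(feasible)  # -> [] : no node satisfies both constraints at once
```

If this model is right, the anti-affinity rule pushes mongod-1 off the ap-south-1a node while the bound volume pulls it back into ap-south-1a, leaving no feasible node.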
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongod
  labels:
    name: mongo-repl
spec:
  serviceName: mongodb-service
  replicas: 2
  selector:
    matchLabels:
      app: mongod
      role: mongo
      environment: test
  template:
    metadata:
      labels:
        app: mongod
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 15
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                - ap-south-1a
                - ap-south-1b
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mongod
              - key: role
                operator: In
                values:
                - mongo
              - key: environment
                operator: In
                values:
                - test
            topologyKey: kubernetes.io/hostname
      containers:
      - name: mongod-container
        .......
      volumes:
      - name: mongo-vol
        persistentVolumeClaim:
          claimName: mongo-pvc
PVC --
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-pvc
spec:
  storageClassName: gp2-multi-az
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
PV --
apiVersion: "v1"
kind: "PersistentVolume"
metadata:
  name: db-volume-0
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2-multi-az
  awsElasticBlockStore:
    volumeID: vol-06f12b1d6c5c93903
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
        #- key: topology.kubernetes.io/zone
          operator: In
          values:
          - ap-south-1a
---
apiVersion: "v1"
kind: "PersistentVolume"
metadata:
  name: db-volume-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: gp2-multi-az
  awsElasticBlockStore:
    volumeID: vol-090ab264d4747f131
    fsType: ext4
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/zone
        #- key: topology.kubernetes.io/zone
          operator: In
          values:
          - ap-south-1b
Storage class --
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2-multi-az
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
parameters:
  type: gp2
  fsType: ext4
allowedTopologies:
- matchLabelExpressions:
  - key: failure-domain.beta.kubernetes.io/zone
    values:
    - ap-south-1a
    - ap-south-1b
I do not want to opt for dynamically provisioned PVCs.
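In case it helps frame answers: I gather (not yet verified) that a StatefulSet can still bind pre-created PVs if the single shared claim is replaced with per-pod volumeClaimTemplates, something like the fragment below. The field names follow the StatefulSet API; the sizes and names mirror my setup, and this is a sketch of what I think the layout would be, not a tested config.

```yaml
# Hedged sketch: instead of one shared persistentVolumeClaim volume,
# give each pod its own claim; each claim would then bind one of the
# pre-created PVs (db-volume-0 / db-volume-1) in its own zone via the
# gp2-multi-az storage class.
  volumeClaimTemplates:
  - metadata:
      name: mongo-vol
    spec:
      storageClassName: gp2-multi-az
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 8Gi
```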
Adding the outputs below, as suggested by @rabello --
kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
mongod-0 1/1 Running 0 14h app=mongod,controller-revision-hash=mongod-5b4699fd85,environment=test,role=mongo,statefulset.kubernetes.io/pod-name=mongod-0
mongod-1 0/1 Pending 0 14h app=mongod,controller-revision-hash=mongod-5b4699fd85,environment=test,role=mongo,statefulset.kubernetes.io/pod-name=mongod-1
kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
ip-192-170-0-8.ap-south-1.compute.internal Ready <none> 14h v1.14.7-eks-1861c5 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/nodegroup-image=ami-07fd6cdebfd02ef6e,eks.amazonaws.com/nodegroup=trl_compact_prod_db_node_group,failure-domain.beta.kubernetes.io/region=ap-south-1,failure-domain.beta.kubernetes.io/zone=ap-south-1a,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-192-170-0-8.ap-south-1.compute.internal,kubernetes.io/os=linux
ip-192-170-80-14.ap-south-1.compute.internal Ready <none> 14h v1.14.7-eks-1861c5 beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t3.small,beta.kubernetes.io/os=linux,eks.amazonaws.com/nodegroup-image=ami-07fd6cdebfd02ef6e,eks.amazonaws.com/nodegroup=trl_compact_prod_db_node_group,failure-domain.beta.kubernetes.io/region=ap-south-1,failure-domain.beta.kubernetes.io/zone=ap-south-1b,kubernetes.io/arch=amd64,kubernetes.io/hostname=ip-192-170-80-14.ap-south-1.compute.internal,kubernetes.io/os=linux
@AmitKumarGupta Unfortunately it did not work even with a bigger shape.
Allocatable:
  attachable-volumes-aws-ebs:  "25"
  cpu:                         "2"
  ephemeral-storage:           "19316009748"
  hugepages-1Gi:               "0"
  hugepages-2Mi:               "0"
  memory:                      1899960Ki
  pods:                        "11"
Capacity:
  attachable-volumes-aws-ebs:  "25"
  cpu:                         "2"
  ephemeral-storage:           20959212Ki
  hugepages-1Gi:               "0"
  hugepages-2Mi:               "0"
  memory:                      2002360Ki
  pods:                        "11"
Still facing the same issue!! - Rajesh