Kubernetes kube-dns pod is stuck in Pending state

Question

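I am trying to install and set up Kubernetes in an Ubuntu virtual machine by following the docs. I have completed 3 of the 4 steps, and the kube-dns pod is now stuck in the Pending state.

How can I find out what the problem is? Here is the output of kubectl get pods --namespace=kube-system and kubectl describe pod <pod name>: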

# kubectl get pods --namespace=kube-system
NAME                              READY     STATUS    RESTARTS   AGE
dummy-2088944543-jk2t2            1/1       Running   0          3h
etcd-ubuntu                       1/1       Running   0          3h
kube-apiserver-ubuntu             1/1       Running   0          3h
kube-controller-manager-ubuntu    1/1       Running   0          3h
kube-discovery-1769846148-h88v4   1/1       Running   0          3h
kube-dns-2924299975-dfp17         0/4       Pending   0          3h
kube-proxy-zdcxw                  1/1       Running   0          3h
kube-scheduler-ubuntu             1/1       Running   0          3h
weave-net-xwfhj                   2/2       Running   0          2h

# kubectl describe pod kube-dns-2924299975-dfp17
Error from server (NotFound): pods "kube-dns-2924299975-dfp17" not found

- Lakmal Vithanage

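Try using kubectl to check the labels in the pod's configuration data. Kubernetes will only schedule the pod onto a host/kubelet with matching labels. For example, the pod might carry a label saying it should only run on nodes labeled "region=infrastructure" or similar (this is just a made-up example). Also, each kubelet has a notion of being "schedulable"; if that is not turned on, nothing is deployed to it automatically. One last thought: it may be failing to download the container image. You can check the logs to find out why. - Chunko

@lakmal-vithanage, could you post the output of kubectl describe pod <pod-name>? - Antoine Cotten

@AntoineCotten, I have updated the question. - Lakmal Vithanage

@lakmal-vithanage You forgot --namespace=kube-system in your command, which is why you got the error. Could you update the question once more? - Antoine Cotten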

1 Answer

- Antoine Cotten · Accepted Answer

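Reason

Most likely your cluster does not have enough compute resources available.

If you are using the example from cluster/addons/dns, you are certainly using a Deployment with resource requests (highlighted if you follow the link). It is likely that your other pods have already requested all the resources available in the cluster, so your pod cannot be scheduled.

You can confirm this theory by running kubectl --namespace=kube-system describe pod kube-dns-2924299975-dfp17 and looking for events like the following: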

Reason                Message
------                -------
FailedScheduling      pod (kube-dns-2924299975-dfp17) failed to fit in any node
fit failure summary on nodes : Insufficient cpu (3)
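
The describe output can be long, so a small filter helps to isolate the scheduling events. The following is a minimal sketch, not part of the original answer: scheduling_failures is a hypothetical helper name, and the patterns it matches are taken from the event text shown above, so they may need adjusting for other Kubernetes versions.

```shell
# Hypothetical helper: read `kubectl describe pod` output on stdin and
# print only the lines that describe a scheduling failure.
# Assumption: the event wording matches the sample shown in the answer.
scheduling_failures() {
  grep -E 'FailedScheduling|fit failure|Insufficient'
}

# Usage against a live cluster (pod name taken from the question):
#   kubectl --namespace=kube-system describe pod kube-dns-2924299975-dfp17 | scheduling_failures
```

If the filter prints nothing, the pod is probably pending for a reason other than insufficient resources.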

You can also describe the node with kubectl describe node <node-name> and look at the information at the end of the output:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.
  CPU Requests  CPU Limits      Memory Requests Memory Limits
  ------------  ----------      --------------- -------------
  320m (8%)     300m (7%)       150Mi (1%)      150Mi (1%)

In your case, the CPU or memory allocation should be close to 100%.
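
To read that percentage without scanning the whole node description, the value can be pulled out with awk. This is a rough sketch that assumes the exact "Allocated resources" layout shown above (a header line containing "CPU Requests", a dashed line, then the values); newer kubectl versions print this table differently. cpu_request_percent is a hypothetical name.

```shell
# Hypothetical helper: read `kubectl describe node` output on stdin and
# print the CPU request percentage, e.g. 8 for "320m (8%)".
# Assumption: the values sit two lines below the "CPU Requests" header.
cpu_request_percent() {
  awk '
    /CPU Requests/ { header = NR }   # remember where the column header is
    header && NR == header + 2 {     # skip the dashed line, land on the values
      gsub(/[()%]/, "", $2)          # "(8%)" becomes "8"
      print $2
      exit
    }
  '
}

# Usage against a live cluster:
#   kubectl describe node <node-name> | cpu_request_percent
```

A value near 100 means the scheduler has almost no unreserved CPU left on that node, even if actual usage is low, because scheduling is based on requests rather than on measured load.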

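Solutions

Add more compute resources/nodes to the cluster (preferred)
Remove the resource requests from the Pod, at the cost of overcommitting resources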
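
For the second option, one way to change the requests is to edit the Deployment that owns the pod. The sketch below is an illustration under assumptions, not the answer's own procedure: the Deployment name kube-dns is inferred from the pod name kube-dns-2924299975-dfp17, and edit_dns_deployment is a hypothetical wrapper.

```shell
# Hypothetical wrapper: open the DNS Deployment in an editor.
# Assumption: the Deployment is named "kube-dns" (inferred from the pod
# name); pass a different name as the first argument if yours differs.
edit_dns_deployment() {
  kubectl --namespace=kube-system edit deployment "${1:-kube-dns}"
}

# Usage:
#   edit_dns_deployment
```

In the editor, remove or lower the requests entries under each container's resources section. Saving the change makes the Deployment roll out replacement pods with the new settings.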