Kubernetes - Container image already present on machine

29

I have 2 similar deployments on k8s that pull the same image from GitLab. Apparently this caused my second deployment to go into a CrashLoopBackOff error, and I can't seem to connect to the port to check the /healthz of my container. Logging the pod shows that the container received an interrupt signal, while describing the pod shows the following messages.

  FirstSeen LastSeen    Count   From            SubObjectPath                   Type        Reason          Message
  --------- --------    -----   ----            -------------                   --------    ------          -------
  29m       29m     1   default-scheduler                           Normal      Scheduled       Successfully assigned java-kafka-rest-kafka-data-2-development-5c6f7f597-5t2mr to 172.18.14.110
  29m       29m     1   kubelet, 172.18.14.110                          Normal      SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "default-token-m4m55" 
  29m       29m     1   kubelet, 172.18.14.110  spec.containers{consul}             Normal      Pulled          Container image "..../consul-image:0.0.10" already present on machine
  29m       29m     1   kubelet, 172.18.14.110  spec.containers{consul}             Normal      Created         Created container
  29m       29m     1   kubelet, 172.18.14.110  spec.containers{consul}             Normal      Started         Started container
  28m       28m     1   kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Normal      Killing         Killing container with id docker://java-kafka-rest-development:Container failed liveness probe.. Container will be killed and recreated.
  29m       28m     2   kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Normal      Created         Created container
  29m       28m     2   kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Normal      Started         Started container
  29m       27m     10  kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Warning     Unhealthy       Readiness probe failed: Get http://10.5.59.35:7533/healthz: dial tcp 10.5.59.35:7533: getsockopt: connection refused
  28m       24m     13  kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Warning     Unhealthy       Liveness probe failed: Get http://10.5.59.35:7533/healthz: dial tcp 10.5.59.35:7533: getsockopt: connection refused
  29m       19m     8   kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Normal      Pulled          Container image "r..../java-kafka-rest:0.3.2-dev" already present on machine
  24m       4m      73  kubelet, 172.18.14.110  spec.containers{java-kafka-rest-development}    Warning     BackOff         Back-off restarting failed container

I have tried redeploying the deployments under different images, and they seem to run just fine. However, I don't think that is efficient, since the images are identical. How do I go about this?

Here is my deployment file:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: "java-kafka-rest-kafka-data-2-development"
  labels:
    repository: "java-kafka-rest"
    project: "java-kafka-rest"
    service: "java-kafka-rest-kafka-data-2"
    env: "development"
spec:
  replicas: 1
  selector:
    matchLabels:
      repository: "java-kafka-rest"
      project: "java-kafka-rest"
      service: "java-kafka-rest-kafka-data-2"
      env: "development"
  template:
    metadata:
      labels:
        repository: "java-kafka-rest"
        project: "java-kafka-rest"
        service: "java-kafka-rest-kafka-data-2"
        env: "development"
        release: "0.3.2-dev"
    spec:
      imagePullSecrets:
      - name: ...
      containers:
      - name: java-kafka-rest-development
        image: registry...../java-kafka-rest:0.3.2-dev
        env:
        - name: DEPLOYMENT_COMMIT_HASH
          value: "0.3.2-dev"
        - name: DEPLOYMENT_PORT
          value: "7533"
        livenessProbe:
          httpGet:
            path: /healthz
            port: 7533
          initialDelaySeconds: 30
          timeoutSeconds: 1
        readinessProbe:
          httpGet:
            path: /healthz
            port: 7533
          timeoutSeconds: 1
        ports:
        - containerPort: 7533
        resources:
          requests:
            cpu: 0.5
            memory: 6Gi
          limits:
            cpu: 3
            memory: 10Gi
        command:
          - /envconsul
          - -consul=127.0.0.1:8500
          - -sanitize
          - -upcase
          - -prefix=java-kafka-rest/
          - -prefix=java-kafka-rest/kafka-data-2
          - java
          - -jar
          - /build/libs/java-kafka-rest-0.3.2-dev.jar
        securityContext:
          readOnlyRootFilesystem: true
      - name: consul
        image: registry.../consul-image:0.0.10
        env:
        - name: SERVICE_NAME
          value: java-kafka-rest-kafka-data-2
        - name: SERVICE_ENVIRONMENT
          value: development
        - name: SERVICE_PORT
          value: "7533"
        - name: CONSUL1
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: node1
        - name: CONSUL2
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: node2
        - name: CONSUL3
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: node3
        - name: CONSUL_ENCRYPT
          valueFrom:
            configMapKeyRef:
              name: consul-config-...
              key: encrypt
        ports:
        - containerPort: 8300
        - containerPort: 8301
        - containerPort: 8302
        - containerPort: 8400
        - containerPort: 8500
        - containerPort: 8600
        command: [ entrypoint, agent, -config-dir=/config, -join=$(CONSUL1), -join=$(CONSUL2), -join=$(CONSUL3), -encrypt=$(CONSUL_ENCRYPT) ]
      terminationGracePeriodSeconds: 30
      nodeSelector:
        env: ...

It is probably your readinessProbe that is causing the container to exit. Is this a Kafka broker image or some other kind of image? - Urosh T.
Yes, that is what I assumed as well. It is indeed a Kafka image, used to produce Kafka messages. But I am confused about what causes the readinessProbe to trigger; as I understand it, an image pulled from GitLab should be placed on the k8s pod without being affected by the images other pods pull. - AlphaCR
Actually, as far as I know, Kafka doesn't even have any health check endpoint. Have you implemented a custom health check or some other solution? - Urosh T.
@UroshT. I have indeed implemented a custom health check; I pasted it on [pastebin](https://pastebin.com/PLrVpX5E) and added my deployment file for reference. However, even if the `readinessProbe` really is the cause, why does it affect my deployments when they pull from the same image, but not when each pulls its own separate image? - AlphaCR
My mistake, it is your livenessProbe that is killing your pod; it is right there in the logs: Killing container with id docker://java-kafka-rest-development:Container failed liveness probe.. Container will be killed and recreated. So what you are saying is that when you pull the image explicitly there is no problem, but when the image is not pulled (the same image), the problem appears? - Urosh T.
2 Answers

23
To those having this problem, I have figured out where the issue lies. Apparently the problem was in my service.yml file: my targetPort was pointing to a different port than the one I had opened in my Docker image. Make sure the port opened in the Docker image connects to the correct targetPort in the Service.
Hope this helps.
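
For illustration, a minimal sketch of such a Service for the deployment above, assuming the selector labels from the question's pod template; the Service name and the cluster-facing port 80 are assumptions, and the key point is that targetPort must match the containerPort (7533) the app actually listens on:

apiVersion: v1
kind: Service
metadata:
  name: java-kafka-rest-kafka-data-2-development   # hypothetical name
spec:
  selector:
    service: "java-kafka-rest-kafka-data-2"
    env: "development"
  ports:
  - port: 80           # port other pods use to reach the Service (assumed)
    targetPort: 7533   # must match the containerPort the app listens on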

3

You can also check the logs of the pod. In my case, the error was inside the pod itself.

kubectl logs <pod> -n your-namespace
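
For a container stuck in CrashLoopBackOff, two standard kubectl invocations also help: the --previous flag prints the logs of the last terminated instance of the container, and describe shows the probe-failure events like the ones in the question:

kubectl logs <pod> -n your-namespace --previous   # logs from the crashed container
kubectl describe pod <pod> -n your-namespace      # events, including probe failures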
