Elasticsearch health check fails in docker-compose

7
In docker-compose, the health check on Elasticsearch prevents any dependent service from starting, because the container is always reported as unhealthy. I can see this when I run docker ps -a --format "table {{.Names}}\t{{.Image}}\t{{.Status}}":
NAMES           IMAGE                  STATUS
elasticsearch   elasticsearch:7.12.1   Up 26 seconds (unhealthy)

I am trying to bring up Metricbeat once Elasticsearch, Kibana and Logstash are up:
metricbeat:
  image: elastic/metricbeat:7.12.1
  user: root
  depends_on:
    elasticsearch:
      condition: service_healthy
    kibana:
      condition: service_healthy
    logstash:
      condition: service_healthy
    redis:
      condition: service_healthy

How can I make sure that elasticsearch (and the other containers) are healthy, so that metricbeat can start with all of its dependencies available?

I would like to avoid building custom Docker images for them unless it is absolutely necessary.

My docker-compose file is as follows:

version: '3.7'
services:
  elasticsearch:
    # specifying discovery.type='single-node' bypasses bootstrapping checks.
    image: elasticsearch:7.12.1
    container_name: elasticsearch
    healthcheck:
      test: [ "CMD", "curl",  "--fail" , "http://elasticsearch:9200/_cluster/health?wait_for_status=green&timeout=1s", "||", "exit", "1" ]
      interval: 5s
      timeout: 3s
          
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data
    networks:
      - elastic
    ports:
      - 9200:9200
      - 9300:9300
    labels:
      co.elastic.metrics/module: "elasticsearch"
      co.elastic.metrics/hosts: "http://elasticsearch:9200"
      co.elastic.metrics/metricsets: "node_stats,node"
      co.elastic.metrics/xpack.enabled: "true"
    environment:
      - node.name=elasticsearch
      - cluster.name=cluster-7
      - discovery.type=single-node
      - 'ES_JAVA_OPTS=-Xms512m -Xmx512m'
      - xpack.monitoring.enabled=true
      - xpack.monitoring.elasticsearch.collection.enabled=true
      
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
    cap_add:
      - IPC_LOCK
4 Answers

6

The approach may differ depending on the Elastic version. If you use OpenSearch you will need something else, because its health output is slightly different (just FYI).

This is what I use in docker-compose:

healthcheck:
  interval: 10s
  retries: 80
  test: curl --write-out 'HTTP %{http_code}' --fail --silent --output /dev/null http://localhost:8200/

Or:
healthcheck:
   test: curl -s http://elasticsearch01:9200 >/dev/null || exit 1
   interval: 30s
   timeout: 10s
   retries: 50
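
One likely reason the check in the question never passes: with the exec form ("CMD") no shell is involved, so "||", "exit" and "1" are passed to curl as additional arguments instead of acting as shell operators. In addition, wait_for_status=green may never be satisfied on a single-node cluster if any index has replica shards, because those replicas cannot be assigned and the cluster stays yellow. Below is a minimal sketch of the same check in shell form. It assumes curl exists in the image (the question's check already relies on it) and that yellow is acceptable; the retries value is an arbitrary choice, not something taken from the question.

```yaml
services:
  elasticsearch:
    image: elasticsearch:7.12.1
    healthcheck:
      # CMD-SHELL runs the string through the container's shell,
      # so "|| exit 1" is interpreted instead of being passed to curl.
      test: ["CMD-SHELL", "curl --silent --fail 'http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=1s' || exit 1"]
      interval: 5s
      timeout: 3s
      retries: 30   # assumed value, gives Elasticsearch time to start
```

Using localhost inside the check avoids depending on network name resolution, since the command runs inside the Elasticsearch container itself.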


3

For docker-compose file version 3.7, I use the following:

healthcheck:
  test: curl -u elastic:elastic -s -f elasticsearch:9200/_cat/health >/dev/null || exit 1
  interval: 30s
  timeout: 10s
  retries: 5
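
Because metricbeat in the question also waits on service_healthy for kibana, logstash and redis, each of those services needs its own healthcheck as well. The following is only a sketch under assumptions: the image names are guesses (the question does not show them), curl is assumed to be present in the Kibana and Logstash images, the ports are the defaults for each product, and security is assumed to be disabled.

```yaml
services:
  kibana:
    image: kibana:7.12.1      # assumed image name
    healthcheck:
      # /api/status is Kibana's status endpoint on its default port 5601
      test: ["CMD-SHELL", "curl -s -f http://localhost:5601/api/status >/dev/null || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
  logstash:
    image: logstash:7.12.1    # assumed image name
    healthcheck:
      # 9600 is the default port of the Logstash monitoring API
      test: ["CMD-SHELL", "curl -s -f http://localhost:9600 >/dev/null || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
  redis:
    image: redis              # assumed image name
    healthcheck:
      # exec form is fine here because no shell operators are needed
      test: ["CMD", "redis-cli", "ping"]
      interval: 30s
      timeout: 10s
      retries: 5
```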

1
For the health check, we tried several approaches.
With SSL:
    healthcheck:
      test:
        [
          "CMD-SHELL",
          "curl -s --user elastic:${ELASTIC_PASSWORD} --cacert /usr/share/elasticsearch/config/certs/elasticsearch/elasticsearch.crt -X GET https://localhost:9200/_cluster/health?pretty | grep status | grep -q '\\(green\\|yellow\\)'"
        ]
      interval: 10s
      timeout: 10s
      retries: 24


Without SSL:
    healthcheck:
      test:     
        [
          "CMD-SHELL",
          "curl -s --user elastic:${ELASTIC_PASSWORD} -X GET http://localhost:9200/_cluster/health?pretty | grep status | grep -q '\\(green\\|yellow\\)'"
        ]
      interval: 10s
      timeout: 10s
      retries: 24

The location of the cacert may differ depending on the Docker build. For example, if you use deviantony/docker-elk, the basic command is docker compose exec -it elasticsearch /bin/bash -c 'curl -u elastic:I4Umonitor --cacert /usr/share/elasticsearch/config/ca.crt -XGET https://localhost:9200/_cluster/health?pretty | grep status | grep "\(green\|yellow\)"' so the health check looks like this:
    healthcheck:
      test:     
        [
          "CMD-SHELL",
          "curl -s --user elastic:${ELASTIC_PASSWORD} --cacert /usr/share/elasticsearch/config/ca.crt -XGET https://localhost:9200/_cluster/health?pretty | grep status | grep -q '\\(green\\|yellow\\)'"
        ]
      interval: 10s
      timeout: 10s
      retries: 24

0

Quick troubleshooting tips:

1 - Use docker inspect to quickly check the health of the service:

sudo docker inspect --format "{{json .State.Health }}" elasticsearch | jq

2 - If the status is not "healthy", go inside the container and test from there:

sudo docker exec -it <container_id> bash

root@container_id:/# curl -s http://elasticsearch:9200

The command should return something like the following:

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    66  100    66    0     0   6186      0 --:--:-- --:--:-- --:--:--  6600

If it does not, your hostname or credentials are probably wrong.

Otherwise, try it again in the Docker health check, as suggested:

test: curl -s http://elasticsearch:9200 >/dev/null || exit 1

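Note: if you are using environment variables, they may not be recognized when written with a single dollar sign ($VARIABLE).
In that case, try a double dollar sign ($$VARIABLE).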
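
The dollar-sign note can be illustrated with a small sketch. Compose substitutes ${VAR} itself, from the host environment or a .env file, when it parses the file. $$ is an escape that leaves a literal $ in the command, so the shell inside the container expands the variable at run time. The example assumes ELASTIC_PASSWORD is set in the container's environment; the value shown is hypothetical and does not come from the question.

```yaml
services:
  elasticsearch:
    image: elasticsearch:7.12.1
    environment:
      - ELASTIC_PASSWORD=changeme   # hypothetical value for illustration
    healthcheck:
      # $$ELASTIC_PASSWORD reaches the container shell as $ELASTIC_PASSWORD
      # and is expanded there, not by Compose on the host.
      test: ["CMD-SHELL", "curl -s -f -u elastic:$$ELASTIC_PASSWORD http://localhost:9200/_cluster/health >/dev/null || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 24
```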
Hope this helps.
