Neo4j causal cluster instances cannot discover each other in Docker Swarm

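I'm using a slightly modified version of the demo docker-compose file published by the GraphAware team (thanks to them).

I can run a causal cluster successfully with docker-compose up. However, I cannot get the same result with docker swarm.

The compose file is the same in both cases: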

version: '3.3'

networks:
  neonet:
    driver: overlay
    attachable: true
    ipam:
      config:
        - subnet: 10.161.0.0/24

services:

  neo-1:
    image: neo4j:3.3.4-enterprise
    networks:
      - neonet
    volumes:
      - /srv/neo4j/neo4j-core1/data:/data
      - /srv/neo4j/neo4j-core1/logs:/logs
    environment:
      - NEO4J_AUTH=neo4j/blah
      - NEO4J_dbms_mode=CORE
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_causalClustering_expectedCoreClusterSize=3
      - NEO4J_causalClustering_initialDiscoveryMembers=neo-1:5000,neo-2:5000,neo-3:5000
      - NEO4J_dbms_connector_http_listen__address=:7474
      - NEO4J_dbms_connector_https_listen__address=:6477
      - NEO4J_dbms_connector_bolt_listen__address=:7687

  neo-2:
    image: neo4j:3.3.4-enterprise
    networks:
      - neonet
    volumes:
      - /srv/neo4j/neo4j-core2/data:/data
      - /srv/neo4j/neo4j-core2/logs:/logs
    environment:
      - NEO4J_AUTH=neo4j/blah
      - NEO4J_dbms_mode=CORE
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_causalClustering_expectedCoreClusterSize=3
      - NEO4J_causalClustering_initialDiscoveryMembers=neo-1:5000,neo-2:5000,neo-3:5000
      - NEO4J_dbms_connector_http_listen__address=:7474
      - NEO4J_dbms_connector_https_listen__address=:6477
      - NEO4J_dbms_connector_bolt_listen__address=:7687

  neo-3:
    image: neo4j:3.3.4-enterprise
    networks:
      - neonet
    volumes:
      - /srv/neo4j/neo4j-core3/data:/data
      - /srv/neo4j/neo4j-core3/logs:/logs
    environment:
      - NEO4J_AUTH=neo4j/blah
      - NEO4J_dbms_mode=CORE
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_causalClustering_expectedCoreClusterSize=3
      - NEO4J_causalClustering_initialDiscoveryMembers=neo-1:5000,neo-2:5000,neo-3:5000
      - NEO4J_dbms_connector_http_listen__address=:7474
      - NEO4J_dbms_connector_https_listen__address=:6477
      - NEO4J_dbms_connector_bolt_listen__address=:7687

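The only difference is that for docker-compose up I specify neither the overlay network details nor any deploy settings. Both clusters run on a single machine.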

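If I exec into the standalone docker-compose containers, the IP addresses look normal and port 5000 is reachable. Doing the same thing in the swarm-deployed containers (curl ip:5000) results in "connection refused".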
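
The curl check described above can also be scripted. The following is a minimal sketch, not part of the original question, that uses only the Python standard library; the function name `can_connect` is made up for illustration, and it assumes it is run from a container attached to the same network that has Python available.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

Calling `can_connect("neo-1", 5000)` for each of the service names in the compose file would show which discovery ports accept connections; a refused connection simply yields `False`.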

Running netstat -ntlp gives:

/var/lib/neo4j # netstat -ntlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 10.161.0.166:5000       0.0.0.0:*               LISTEN      -
tcp        0      0 127.0.0.11:44137        0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:7000            0.0.0.0:*               LISTEN      -

Port 5000 is listening on an IP address that does not belong to any interface on this machine (ifconfig):

eth0      Link encap:Ethernet  HWaddr 02:42:0A:A1:00:A7
          inet addr:10.161.0.167  Bcast:10.161.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:119 errors:0 dropped:0 overruns:0 frame:0
          TX packets:119 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:7110 (6.9 KiB)  TX bytes:7110 (6.9 KiB)

eth1      Link encap:Ethernet  HWaddr 02:42:AC:12:00:06
          inet addr:172.18.0.6  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:648 (648.0 B)  TX bytes:0 (0.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:58 errors:0 dropped:0 overruns:0 frame:0
          TX packets:58 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:3604 (3.5 KiB)  TX bytes:3604 (3.5 KiB)

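As you can see, there are two interfaces: my neonet network and (I assume) Docker's ingress network.

Moreover, Neo4j has been configured to listen on all interfaces for discovery: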

causal_clustering.transaction_listen_address=0.0.0.0:6000
causal_clustering.transaction_advertised_address=2a9e1683a92e:6000
causal_clustering.raft_listen_address=0.0.0.0:7000
causal_clustering.raft_advertised_address=2a9e1683a92e:7000
causal_clustering.initial_discovery_members=neo1:5000,neo2:5000,neo3:5000
causal_clustering.expected_core_cluster_size=3
causal_clustering.discovery_listen_address=0.0.0.0:5000
causal_clustering.discovery_advertised_address=2a9e1683a92e:5000
EDITION=enterprise
ACCEPT.LICENSE.AGREEMENT=yes

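...but it somehow decides to listen on one specific IP address. That is the case for port 5000, although not for port 7000.

I'm no networking expert, but listening on an IP address that is not attached to any interface on the machine does not seem right.

How do I instruct Neo4j to bind to all interfaces, or at least to a valid one?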
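
The mismatch between the netstat and ifconfig output above can be confirmed mechanically. This is a small sketch using Python's standard `ipaddress` module; the helper name `nonlocal_listeners` is hypothetical, and the input lists are copied by hand from the output shown in the question.

```python
import ipaddress


def nonlocal_listeners(listen_addrs, interface_ips):
    """Return listen addresses whose IP is not owned by any local interface.

    Wildcard (0.0.0.0) and loopback (127.0.0.0/8) binds are treated as local.
    """
    local = {ipaddress.ip_address(ip) for ip in interface_ips}
    suspicious = []
    for addr in listen_addrs:
        # Split "ip:port" on the last colon and keep only the IP part.
        host, _, _port = addr.rpartition(":")
        ip = ipaddress.ip_address(host)
        if ip.is_unspecified or ip.is_loopback or ip in local:
            continue
        suspicious.append(addr)
    return suspicious


# Values copied from the netstat and ifconfig output in the question.
listeners = ["10.161.0.166:5000", "127.0.0.11:44137", "0.0.0.0:7000"]
interfaces = ["10.161.0.167", "172.18.0.6", "127.0.0.1"]
print(nonlocal_listeners(listeners, interfaces))  # prints ['10.161.0.166:5000']
```

Only the discovery listener on port 5000 is flagged: 10.161.0.166 is one address away from eth0's 10.161.0.167 but is not assigned to any interface listed by ifconfig.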

1 Answer

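It turned out that several fixes were needed, the key one being to set deploy.endpoint_mode: dnsrr, which prevents Docker from creating a virtual IP for the service. My final working swarm file is shown below.
"Working" here means a working multi-node Neo4j causal cluster (core members only); a Neo4J OGM v3 client using the connection URL bolt+routing://neo-1:7687 works 100%. I have not yet been brave enough to try failover for the initial connection, so neo-1 is (initially) a single point of failure.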
version: '3.3'

services:
  neo-1:
    image: neo4j:3.3.4-enterprise
    volumes:
      - neo-data:/data
      - neo-logs:/var/lib/neo4j/logs
    environment:
      - NEO4J_AUTH=neo4j/blah
      - NEO4J_causalClustering_discoveryAdvertisedAddress=neo-1:5000
      - NEO4J_causalClustering_transactionAdvertisedAddress=neo-1:6000
      - NEO4J_causalClustering_raftAdvertisedAddress=neo-1:7000
      - NEO4J_causalClustering_expectedCoreClusterSize=3
      - NEO4J_causalClustering_initialDiscoveryMembers=neo-1:5000,neo-2:5000,neo-3:5000
      - NEO4J_dbms_connectors_default__advertised__address=neo-1
      - NEO4J_dbms_connector_bolt_advertised__address=:7687
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_dbms_mode=CORE

    deploy:
      mode: global
      endpoint_mode: dnsrr
      placement:
        constraints:
          - node.labels.neodb == 1
    networks:
      - neonet

  neo-2:
    image: neo4j:3.3.4-enterprise
    volumes:
      - neo-data:/data
      - neo-logs:/var/lib/neo4j/logs
    environment:
      - NEO4J_AUTH=neo4j/blah
      - NEO4J_causalClustering_discoveryAdvertisedAddress=neo-2:5000
      - NEO4J_causalClustering_transactionAdvertisedAddress=neo-2:6000
      - NEO4J_causalClustering_raftAdvertisedAddress=neo-2:7000
      - NEO4J_causalClustering_expectedCoreClusterSize=3
      - NEO4J_causalClustering_initialDiscoveryMembers=neo-1:5000,neo-2:5000,neo-3:5000
      - NEO4J_dbms_connectors_default__advertised__address=neo-2
      - NEO4J_dbms_connector_bolt_advertised__address=:7687
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_dbms_mode=CORE

    deploy:
      mode: global
      endpoint_mode: dnsrr
      placement:
        constraints:
          - node.labels.neodb == 2
    networks:
      - neonet

  neo-3:
    image: neo4j:3.3.4-enterprise
    volumes:
      - neo-data:/data
      - neo-logs:/var/lib/neo4j/logs
    environment:
      - NEO4J_AUTH=neo4j/blah
      - NEO4J_causalClustering_discoveryAdvertisedAddress=neo-3:5000
      - NEO4J_causalClustering_transactionAdvertisedAddress=neo-3:6000
      - NEO4J_causalClustering_raftAdvertisedAddress=neo-3:7000
      - NEO4J_causalClustering_expectedCoreClusterSize=3
      - NEO4J_causalClustering_initialDiscoveryMembers=neo-1:5000,neo-2:5000,neo-3:5000
      - NEO4J_dbms_connectors_default__advertised__address=neo-3
      - NEO4J_dbms_connector_bolt_advertised__address=:7687
      - NEO4J_ACCEPT_LICENSE_AGREEMENT=yes
      - NEO4J_dbms_mode=CORE

    deploy:
      mode: global
      endpoint_mode: dnsrr
      placement:
        constraints:
          - node.labels.neodb == 3
    networks:
      - neonet

networks:
  neonet:
    driver: overlay

volumes:
  neo-data:
  neo-logs:

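I believe this compose file is more verbose than it needs to be, and by now there may well be a solution that allows declaring just one service (with multiple replicas).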
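
One way to cut the repetition without changing the three-service layout is to generate the file. The sketch below is not the single-service, multi-replica approach mentioned above; it only rebuilds the same three service definitions from one template. The function names `core_service` and `compose` are made up for illustration, and the output is JSON, which should be usable as a compose file because JSON is a subset of YAML (this has not been verified against every Docker version).

```python
import json


def core_service(index: int, cores: int = 3) -> dict:
    """Build the definition of core member neo-<index>, mirroring the file above."""
    name = f"neo-{index}"
    members = ",".join(f"neo-{i}:5000" for i in range(1, cores + 1))
    return {
        "image": "neo4j:3.3.4-enterprise",
        "volumes": ["neo-data:/data", "neo-logs:/var/lib/neo4j/logs"],
        "environment": [
            "NEO4J_AUTH=neo4j/blah",
            f"NEO4J_causalClustering_discoveryAdvertisedAddress={name}:5000",
            f"NEO4J_causalClustering_transactionAdvertisedAddress={name}:6000",
            f"NEO4J_causalClustering_raftAdvertisedAddress={name}:7000",
            f"NEO4J_causalClustering_expectedCoreClusterSize={cores}",
            f"NEO4J_causalClustering_initialDiscoveryMembers={members}",
            f"NEO4J_dbms_connectors_default__advertised__address={name}",
            "NEO4J_dbms_connector_bolt_advertised__address=:7687",
            "NEO4J_ACCEPT_LICENSE_AGREEMENT=yes",
            "NEO4J_dbms_mode=CORE",
        ],
        "deploy": {
            "mode": "global",
            "endpoint_mode": "dnsrr",  # no virtual IP; DNS returns task addresses
            "placement": {"constraints": [f"node.labels.neodb == {index}"]},
        },
        "networks": ["neonet"],
    }


def compose(cores: int = 3) -> dict:
    """Assemble the full stack definition for the given number of core members."""
    return {
        "version": "3.3",
        "services": {f"neo-{i}": core_service(i, cores) for i in range(1, cores + 1)},
        "networks": {"neonet": {"driver": "overlay"}},
        "volumes": {"neo-data": None, "neo-logs": None},
    }


print(json.dumps(compose(), indent=2))
```

Each generated service still relies on a node carrying the matching neodb label, exactly as in the hand-written file.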


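I know this is old, but I had to stop and say thank you. We have a long-running cluster that we had to migrate to a new container orchestration system, and I could not get it running. I used your config as a sanity check, connected successfully, and eventually found the fix! - Senica Gonzalez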
