Apache Ignite: frequent "cache closed" exceptions

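We are using Apache Ignite as a cache to speed up our authorization and permission calls. When the application loads on the client, it hits the Ignite cache with 18 get calls, and during that time we frequently get a "cache closed" exception from Ignite. We have tried to reproduce this with a large number of calls, and the error shows up in about 3-4 of every 18 calls, including when running locally. We have applied all of the recommended configuration to the cluster.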



 <!-- Alter configuration below as needed. -->
   <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
      <property name="peerClassLoadingEnabled" value="true" />
      <property name="cacheConfiguration">
         <list>
            <!-- Partitioned cache example configuration (Atomic mode). -->
            <bean class="org.apache.ignite.configuration.CacheConfiguration">
               <property name="name" value="default" />
               <property name="atomicityMode" value="ATOMIC" />
               <property name="backups" value="1" />
            </bean>
         </list>
      </property>
      <property name="includeEventTypes">
         <list>
            <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_STARTED" />
            <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FINISHED" />
            <util:constant static-field="org.apache.ignite.events.EventType.EVT_TASK_FAILED" />
         </list>
      </property>
      <property name="binaryConfiguration">

         <bean class="org.apache.ignite.configuration.BinaryConfiguration">

            <property name="compactFooter" value="false" />

         </bean>
      </property>
      <!-- Configure internal thread pool. -->
      <property name="publicThreadPoolSize" value="64" />
      <!-- Configure system thread pool. -->
      <property name="systemThreadPoolSize" value="32" />
      <!--<property name="clientMode" value="false" />-->
      <property name="sqlSchemas">
         <list>
            <value>BA_EV</value>
            <value>BA_DEMO</value>
            <value>TEST_1</value>
            <value>TEST_2</value>
            <value>TEST_3</value>
         </list>
      </property>
      <!-- Enabling Apache Ignite Persistent Store. -->
      <property name="dataStorageConfiguration">
         <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <property name="defaultDataRegionConfiguration">
               <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                  <property name="persistenceEnabled" value="true" />
               </bean>
            </property>
         </bean>
      </property>
      <property name="communicationSpi">
         <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
            <property name="slowClientQueueLimit" value="1000" />
            <property name="messageQueueLimit" value="1024" />
         </bean>
      </property>
      <!-- Enabling authentication. -->
      <property name="authenticationEnabled" value="true" />
      <!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
      <property name="discoverySpi">
         <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <property name="ipFinder">
               <!--
                        Ignite provides several options for automatic discovery that can be used
                        instead of static IP based discovery. For information on all options refer
                        to our documentation: http://apacheignite.readme.io/docs/cluster-config
                    -->
               <!-- Uncomment static IP finder to enable static-based discovery of initial nodes. -->
               <!--<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">-->
               <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
                  <property name="addresses">
                     <list>
                        <!-- In distributed environment, replace with actual host IP address. -->
                        <value>localhost:47500..47509</value>
                     </list>
                  </property>
               </bean>
            </property>
         </bean>
      </property>
     </bean>
    </beans>


We use plain cache.get() and cache.put() calls, invoked through annotations.
This is what we currently use:

    try {
        cache.put(key, returnType.cast(result));
    } catch (CacheException | org.hibernate.cache.CacheException e) {
        if (e.getCause() instanceof IgniteClientDisconnectedException) {
            IgniteClientDisconnectedException cause = (IgniteClientDisconnectedException) e.getCause();
            cause.reconnectFuture().get(); // Wait for reconnection.
            cache.put(key, returnType.cast(result));
            addDiconnectCount();
            LOGGER.error("Diconnection Reason Trace: ", e);
        }
    }

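We tried to work around the error with this retry, but we could not find the root cause of the frequent cache reconnections. Locally we run with a 4 GB Xmx and about 86% of the heap unused, and there are hardly any nodes in the Ignite cluster or among the clients.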

We are trying to store objects of the following types:

    Map<String, Serializable>
    map.put(String, String[]);
    List<String>

Logs:

>>> +----------------------------------------------------------------------+
>>> Ignite ver. 2.7.6#20190911-sha1:21f7ca41c4348909e2fd26ccf59b5b2ce1f4474e
>>> +----------------------------------------------------------------------+
>>> OS name: Windows 10 10.0 amd64
>>> CPU(s): 4
>>> Heap: 2.0GB
>>> VM name: 4484@xxxxxxx
>>> Local node [ID=7F470894-A979-443C-BC4F-7BCB047C7550, order=6, clientMode=true]
>>> Local node addresses: [xxxxxxx.xx.xxxxxxx.com/0:0:0:0:0:0:0:1, BLREQX1352123L.xx.xxxxxxx.com/10.73.4.44, 192.168.138.1/127.0.0.1, BLREQX1352123L.xx.xxxxxxx.com/172.22.192.1, /192.168.138.1, /192.168.234.1]
>>> Local ports: TCP:10801 TCP:47101

[16:39:14,562][INFO][Thread-10][GridDiscoveryManager] Topology snapshot [ver=6, locNode=7f470894, servers=1, clients=1, state=ACTIVE, CPUs=4, offheap=3.2GB, heap=4.0GB]
[16:39:45,598][INFO][main][Http11NioProtocol] Starting ProtocolHandler ["http-nio-5014"]
[16:39:45,598][INFO][main][NioSelectorPool] Using a shared selector for servlet write/read
[16:40:14,564][INFO][grid-timeout-worker-#23][IgniteKernal]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=7f470894, uptime=00:01:00.016]
    ^-- H/N/C [hosts=1, nodes=2, CPUs=4]
    ^-- CPU [cur=0.13%, avg=31.59%, GC=0%]
    ^-- PageMemory [pages=0]
    ^-- Heap [used=183MB, free=91.06%, comm=1024MB]
    ^-- Off-heap [used=0MB, free=-1%, comm=0MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=0, qSize=0]
[16:40:20,864][INFO][exchange-worker-#38][GridCacheProcessor] Started cache [name=userRoleCache, id=-2021367519, memoryPolicyName=null, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false], encryptionEnabled=false]
[16:40:20,888][INFO][exchange-worker-#38][GridCacheProcessor] Finish proxy initialization, cacheName=userRoleCache, localNodeId=7f470894-a979-443c-bc4f-7bcb047c7550
[16:40:20,997][INFO][exchange-worker-#38][GridCacheProcessor] Stopped cache [cacheName=userRoleCache]
[16:40:21,000][INFO][exchange-worker-#38][GridCacheProcessor] Can not finish proxy initialization because proxy does not exist, cacheName=userRoleCache, localNodeId=7f470894-a979-443c-bc4f-7bcb047c7550
[16:40:21,001][INFO][exchange-worker-#38][GridCacheProcessor] Can not finish proxy initialization because proxy does not exist, cacheName=userRoleCache, localNodeId=7f470894-a979-443c-bc4f-7bcb047c7550
[16:40:21,001][INFO][exchange-worker-#38][GridCacheProcessor] Can not finish proxy initialization because proxy does not exist, cacheName=userRoleCache, localNodeId=7f470894-a979-443c-bc4f-7bcb047c7550
[16:40:21,015][INFO][exchange-worker-#38][GridCacheProcessor] Started cache [name=roleAppPermissions, id=-686786119, memoryPolicyName=null, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false], encryptionEnabled=false]
[16:40:21,039][INFO][exchange-worker-#38][GridCacheProcessor] Finish proxy initialization, cacheName=roleAppPermissions, localNodeId=7f470894-a979-443c-bc4f-7bcb047c7550
[16:40:21,063][INFO][exchange-worker-#38][GridCacheProcessor] Stopped cache [cacheName=roleAppPermissions]
[16:40:21,131][SEVERE][http-nio-5014-exec-4][JerseyConfig]] Servlet.service() for servlet [com.xxxx.xxxx.base.config.JerseyConfig] in context with path [/ba-cct-api] threw exception

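It's hard to say what is happening here without looking at the logs. Do you see any JVM pause messages in your logs? Can you share them? - alamar
I have added the logs, please check. - Chetan Munigangappa
The logs look fine. Can you show the details of the exception you are catching? - alamar
The rest is just the exception propagating up the stack. There is nothing notable in it apart from the message that the cache is closed. I have added a check that retries the operation after reconnecting. I can add those logs if necessary, but they have nothing more to tell. - Chetan Munigangappa
Can you provide the complete log with IGNITE_QUIET set to false? It may contain information about the reason for the reconnect. - alamar
It turned out that the issue was not related to Ignite at all. Thanks for your efforts. - Chetan Munigangappa
1 Answer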


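I finally figured it out: the problem was a try-with-resources block in our custom wrapper. Because the cache connection to Apache Ignite was being reopened on every call, the error only showed up under simultaneous access.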
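
For anyone who runs into the same thing, here is a minimal sketch of the try-with-resources pitfall described above. It deliberately does not use the Ignite API: `CacheHandle` and `CacheWrapper` are hypothetical stand-ins written only for this illustration, and the assumption is that the real wrapper held a shared, closeable cache handle in a similar way. The point is that try-with-resources calls `close()` on the handle when the block exits, so any other caller still holding the same handle finds it closed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a closeable cache handle; this is NOT the Ignite API. */
final class CacheHandle implements AutoCloseable {
    private final Map<String, String> data = new ConcurrentHashMap<>();
    private volatile boolean closed;

    String get(String key) {
        ensureOpen();
        return data.get(key);
    }

    void put(String key, String value) {
        ensureOpen();
        data.put(key, value);
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("Cache has been closed");
        }
    }
}

/** Hypothetical wrapper showing the buggy and the fixed access pattern. */
final class CacheWrapper {
    private final CacheHandle shared;

    CacheWrapper(CacheHandle shared) {
        this.shared = shared;
    }

    // Buggy: try-with-resources closes the shared handle when the block exits,
    // so every other caller that still uses it sees a closed cache.
    String getAndClose(String key) {
        try (CacheHandle c = shared) {
            return c.get(key);
        }
    }

    // Fixed: use the shared handle without closing it; close it once at shutdown.
    String get(String key) {
        return shared.get(key);
    }
}

class TryWithResourcesPitfall {
    public static void main(String[] args) {
        CacheHandle handle = new CacheHandle();
        handle.put("user-1", "admin");
        CacheWrapper wrapper = new CacheWrapper(handle);

        System.out.println(wrapper.get("user-1"));         // handle stays open
        System.out.println(wrapper.getAndClose("user-1")); // succeeds, then closes the handle
        try {
            wrapper.get("user-1");                          // the next caller now fails
        } catch (IllegalStateException e) {
            System.out.println("Next call failed: " + e.getMessage());
        }
    }
}
```

In the sketch the failure is shown sequentially so that it is deterministic. With real concurrent requests the same thing happens only when one request closes the handle while another is still using it, which would explain why the error appeared in only some of the calls. The "Started cache" / "Stopped cache" pairs in the client log above, a few milliseconds apart, look consistent with that.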

Thanks, everyone, for your replies.

