NoSpamLogger.java: Maximum memory usage reached - Cassandra

I have a cluster of 5 Cassandra nodes with roughly 650GB of data on each node and a replication factor of 3. I recently noticed the following error message in /var/log/cassandra/system.log.
INFO  [ReadStage-5] 2017-10-17 17:06:07,887 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
I tried increasing file_cache_size_in_mb, but the same error soon reappeared. I raised this parameter as high as 2GB with no effect.
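For reference, this is the only related setting I changed, in cassandra.yaml (the 2048 shown here is the 2GB attempt mentioned above, not a recommended value):

    # cassandra.yaml: cap on off-heap memory used for the sstable chunk cache / buffer pool
    file_cache_size_in_mb: 2048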
When the error occurs, CPU utilization spikes and read latency becomes very erratic. I see this spike roughly every half hour; note the timestamps in the log lines below.
INFO  [ReadStage-5] 2017-10-17 17:06:07,887 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
INFO  [ReadStage-36] 2017-10-17 17:36:09,807 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
INFO  [ReadStage-15] 2017-10-17 18:05:56,003 NoSpamLogger.java:91 - Maximum memory usage reached (2.000GiB), cannot allocate chunk of 1.000MiB
INFO  [ReadStage-28] 2017-10-17 18:36:01,177 NoSpamLogger.java:91 - Maximum memory usage reached (2.000GiB), cannot allocate chunk of 1.000MiB
I have two tables that are partitioned by hour, and the partitions are large. For example, here is their output from nodetool tablestats.
    Read Count: 4693453
    Read Latency: 0.36752741680805157 ms.
    Write Count: 561026
    Write Latency: 0.03742310516803143 ms.
    Pending Flushes: 0
        Table: raw_data
        SSTable count: 55
        Space used (live): 594395754275
        Space used (total): 594395754275
        Space used by snapshots (total): 0
        Off heap memory used (total): 360753372
        SSTable Compression Ratio: 0.20022598072758296
        Number of keys (estimate): 45163
        Memtable cell count: 90441
        Memtable data size: 685647925
        Memtable off heap memory used: 0
        Memtable switch count: 1
        Local read count: 0
        Local read latency: NaN ms
        Local write count: 126710
        Local write latency: 0.096 ms
        Pending flushes: 0
        Percent repaired: 52.99
        Bloom filter false positives: 167775
        Bloom filter false ratio: 0.16152
        Bloom filter space used: 264448
        Bloom filter off heap memory used: 264008
        Index summary off heap memory used: 31060
        Compression metadata off heap memory used: 360458304
        Compacted partition minimum bytes: 51
        **Compacted partition maximum bytes: 3449259151**
        Compacted partition mean bytes: 16642499
        Average live cells per slice (last five minutes): 1.0005435888450147
        Maximum live cells per slice (last five minutes): 42
        Average tombstones per slice (last five minutes): 1.0
        Maximum tombstones per slice (last five minutes): 1
        Dropped Mutations: 0



    Read Count: 4712814
    Read Latency: 0.3356051004771247 ms.
    Write Count: 643718
    Write Latency: 0.04168356951335834 ms.
    Pending Flushes: 0
        Table: customer_profile_history
        SSTable count: 20
        Space used (live): 9423364484
        Space used (total): 9423364484
        Space used by snapshots (total): 0
        Off heap memory used (total): 6560008
        SSTable Compression Ratio: 0.1744084338623116
        Number of keys (estimate): 69
        Memtable cell count: 35242
        Memtable data size: 789595302
        Memtable off heap memory used: 0
        Memtable switch count: 1
        Local read count: 2307
        Local read latency: NaN ms
        Local write count: 51772
        Local write latency: 0.076 ms
        Pending flushes: 0
        Percent repaired: 0.0
        Bloom filter false positives: 0
        Bloom filter false ratio: 0.00000
        Bloom filter space used: 384
        Bloom filter off heap memory used: 224
        Index summary off heap memory used: 400
        Compression metadata off heap memory used: 6559384
        Compacted partition minimum bytes: 20502
        **Compacted partition maximum bytes: 4139110981**
        Compacted partition mean bytes: 708736810
        Average live cells per slice (last five minutes): NaN
        Maximum live cells per slice (last five minutes): 0
        Average tombstones per slice (last five minutes): NaN
        Maximum tombstones per slice (last five minutes): 0
        Dropped Mutations: 0

Here goes:

cdsdb/raw_data histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)                  
50%             0.00             61.21              0.00           1955666               642
75%             1.00             73.46              0.00          17436917              4768
95%             3.00            105.78              0.00         107964792             24601
98%             8.00            219.34              0.00         186563160             42510
99%            12.00            315.85              0.00         268650950             61214
Min             0.00              6.87              0.00                51                 0
Max            14.00           1358.10              0.00        3449259151           7007506

cdsdb/customer_profile_history histograms
Percentile  SSTables     Write Latency      Read Latency    Partition Size        Cell Count
                              (micros)          (micros)           (bytes)                  
50%             0.00             73.46              0.00         223875792             61214
75%             0.00             88.15              0.00         668489532            182785
95%             0.00            152.32              0.00        1996099046            654949
98%             0.00            785.94              0.00        3449259151           1358102
99%             0.00            943.13              0.00        3449259151           1358102
Min             0.00             24.60              0.00              5723                 4
Max             0.00           5839.59              0.00        5960319812           1955666

Can you suggest ways to mitigate this issue?

Can you share the "nodetool cfhistograms" output for these two tables? - dilsingi
I have posted the histograms in the question. - Varsha
1 Answer


Based on the cfhistograms output posted, the partitions are huge.

The 95th-percentile partition size of the raw_data table is 107MB, with a maximum of 3.44GB. The 95th-percentile partition size of customer_profile_history is 1.99GB, with a maximum of 5.96GB.

This clearly corresponds to the issue you notice every half hour, as these huge partitions are written out to sstables. The data model has to change: given the partition sizes above, it is better to bucket partitions by "minute" rather than by "hour", as sketched below. That way a 2GB partition shrinks to roughly 33MB partitions.
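As a rough sketch only (your real column and key names will differ, since the actual schema was not posted), the change amounts to truncating the time component of the partition key to the minute instead of the hour:

    -- hypothetical current schema: one partition per source per hour
    CREATE TABLE cdsdb.raw_data (
        source_id   text,
        hour_bucket timestamp,    -- event time truncated to the hour
        event_time  timestamp,
        payload     blob,
        PRIMARY KEY ((source_id, hour_bucket), event_time)
    );

    -- same data bucketed per minute: ~60x more partitions, each ~60x smaller
    CREATE TABLE cdsdb.raw_data_by_minute (
        source_id     text,
        minute_bucket timestamp,  -- event time truncated to the minute
        event_time    timestamp,
        payload       blob,
        PRIMARY KEY ((source_id, minute_bucket), event_time)
    );

Reads covering a time range then fan out across the minute buckets in that range, which is normally far cheaper than pulling a multi-GB partition.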

The recommended partition size is to stay as close to a 100MB maximum as possible. Though we can theoretically store more than 100MB, performance suffers. Remember that every read of that partition moves 100MB+ of data across the wire; in your case it is over 2GB, with all the performance implications that come with it.


If there are no follow-up questions, please remember to accept the answer (check mark). - dilsingi
I haven't truncated the old data yet, since that requires application-level changes. However, I have experimented enough to be confident that the root cause of the memory error is indeed the large partitions. Thank you very much for all your input on this issue. - Varsha
@dilsingi, could you advise on the question above? - vishal paalakurthi
@vishalpaalakurthi The partition interval or partition key depends on your own table and use case. If you can elaborate on your use case and table details, it will be easier to explain. - dilsingi
