What does rows_merged mean in compactionhistory?

When I issue the command
$ nodetool compactionhistory

I get

. . . compacted_at        bytes_in       bytes_out      rows_merged
. . . 1404936947592       8096           7211           {1:3, 3:1}

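What does {1:3, 3:1} mean? The only documentation I can find is this, which states:

the number of partitions merged

But that does not explain why there are multiple values or what the colon means.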

1 Answer


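Basically it means {tables:rows}. For example, {1:3, 3:1} means that 3 rows were each taken from a single SSTable (1:3) and 1 row was taken from three SSTables (3:1), all to produce the one SSTable written by that compaction operation.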
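To make the {tables:rows} reading concrete, here is a minimal Python sketch that parses a rows_merged value and derives two totals from it. The helper names parse_rows_merged and summarize are hypothetical and not part of nodetool, and the "rows written" figure assumes that no row is dropped entirely during the compaction.

```python
import re

def parse_rows_merged(text):
    """Parse a rows_merged value such as '{1:3, 3:1}' into {sstable_count: row_count}."""
    return {int(n): int(rows) for n, rows in re.findall(r"(\d+)\s*:\s*(\d+)", text)}

def summarize(histogram):
    """Return (rows written to the new SSTable, row fragments read from the inputs).

    Assumption: every merged row is written out, i.e. nothing is purged.
    """
    rows_out = sum(histogram.values())
    fragments_in = sum(n * rows for n, rows in histogram.items())
    return rows_out, fragments_in

histogram = parse_rows_merged("{1:3, 3:1}")
rows_out, fragments_in = summarize(histogram)
print(histogram, rows_out, fragments_in)  # {1: 3, 3: 1} 4 6
```

Under this reading, {1:3, 3:1} describes 4 distinct rows: 3 were copied from a single SSTable, and 1 was assembled from fragments found in three SSTables.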

I tried it out myself, so here is an example that will hopefully help:

Create the keyspace and table:

cqlsh> create keyspace space1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

cqlsh> create TABLE space1.tb1 ( key text, val1 text, primary KEY (key));

cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key1','111');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key2','222');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key3','333');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key4','444');
cqlsh> INSERT INTO space1.tb1 (key, val1 ) VALUES ( 'key5','555');
cqlsh> exit

Now we flush to create the SSTable:

$ nodetool flush space1

We see that only one SSTable (generation 1) was created:

$ sudo ls -lR /var/lib/cassandra/data/space1

/var/lib/cassandra/data/space1:
total 4
drwxr-xr-x. 2 cassandra cassandra 4096 Feb  3 12:51 tb1

/var/lib/cassandra/data/space1/tb1:
total 32
-rw-r--r--. 1 cassandra cassandra   43 Feb  3 12:51 space1-tb1-jb-1-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra  146 Feb  3 12:51 space1-tb1-jb-1-Data.db
-rw-r--r--. 1 cassandra cassandra   24 Feb  3 12:51 space1-tb1-jb-1-Filter.db
-rw-r--r--. 1 cassandra cassandra   90 Feb  3 12:51 space1-tb1-jb-1-Index.db
-rw-r--r--. 1 cassandra cassandra 4389 Feb  3 12:51 space1-tb1-jb-1-Statistics.db
-rw-r--r--. 1 cassandra cassandra   80 Feb  3 12:51 space1-tb1-jb-1-Summary.db
-rw-r--r--. 1 cassandra cassandra   79 Feb  3 12:51 space1-tb1-jb-1-TOC.txt

Using the sstable2json tool, we can see our data:

$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-1-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","columns": [["","",1422967817740000], ["val1","111",1422967817740000]]},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","columns": [["","",1422967825116000], ["val1","222",1422967825116000]]}
]

At this point, "nodetool compactionhistory" shows nothing for this table, but let's run a compaction to see what we get (scroll right):

$ nodetool compact space1
$ nodetool compactionhistory | awk 'NR == 2 || /space1/'
id                                       keyspace_name      columnfamily_name            compacted_at              bytes_in       bytes_out      rows_merged
5725f890-aba4-11e4-9f73-351725b0ac5b     space1             tb1                          1422968305305             146            146            {1:5}

Now let's delete two rows and flush:

cqlsh> delete from space1.tb1 where key='key1';
cqlsh> delete from space1.tb1 where key='key2';
cqlsh> exit

$ nodetool flush space1

$ sudo ls -l /var/lib/cassandra/data/space1/tb1/
[sudo] password for datastax: 
total 64
-rw-r--r--. 1 cassandra cassandra   43 Feb  3 12:58 space1-tb1-jb-2-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra  146 Feb  3 12:58 space1-tb1-jb-2-Data.db
-rw-r--r--. 1 cassandra cassandra  336 Feb  3 12:58 space1-tb1-jb-2-Filter.db
-rw-r--r--. 1 cassandra cassandra   90 Feb  3 12:58 space1-tb1-jb-2-Index.db
-rw-r--r--. 1 cassandra cassandra 4393 Feb  3 12:58 space1-tb1-jb-2-Statistics.db
-rw-r--r--. 1 cassandra cassandra   80 Feb  3 12:58 space1-tb1-jb-2-Summary.db
-rw-r--r--. 1 cassandra cassandra   79 Feb  3 12:58 space1-tb1-jb-2-TOC.txt
-rw-r--r--. 1 cassandra cassandra   43 Feb  3 13:02 space1-tb1-jb-3-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra   49 Feb  3 13:02 space1-tb1-jb-3-Data.db
-rw-r--r--. 1 cassandra cassandra   16 Feb  3 13:02 space1-tb1-jb-3-Filter.db
-rw-r--r--. 1 cassandra cassandra   36 Feb  3 13:02 space1-tb1-jb-3-Index.db
-rw-r--r--. 1 cassandra cassandra 4413 Feb  3 13:02 space1-tb1-jb-3-Statistics.db
-rw-r--r--. 1 cassandra cassandra   80 Feb  3 13:02 space1-tb1-jb-3-Summary.db
-rw-r--r--. 1 cassandra cassandra   79 Feb  3 13:02 space1-tb1-jb-3-TOC.txt

Let's check the contents of both SSTables:

$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-2-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","columns": [["","",1422967817740000], ["val1","111",1422967817740000]]},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","columns": [["","",1422967825116000], ["val1","222",1422967825116000]]}
]

$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-3-Data.db
[
{"key": "6b657931","metadata": {"deletionInfo": {"markedForDeleteAt":1422968551313000,"localDeletionTime":1422968551}},"columns": []},
{"key": "6b657932","metadata": {"deletionInfo": {"markedForDeleteAt":1422968553322000,"localDeletionTime":1422968553}},"columns": []}
]

Now let's compact:

$ nodetool compact space1

As expected, there is now only one SSTable:

$ sudo ls -l /var/lib/cassandra/data/space1/tb1/
total 32
-rw-r--r--. 1 cassandra cassandra   43 Feb  3 13:05 space1-tb1-jb-4-CompressionInfo.db
-rw-r--r--. 1 cassandra cassandra  133 Feb  3 13:05 space1-tb1-jb-4-Data.db
-rw-r--r--. 1 cassandra cassandra  656 Feb  3 13:05 space1-tb1-jb-4-Filter.db
-rw-r--r--. 1 cassandra cassandra   90 Feb  3 13:05 space1-tb1-jb-4-Index.db
-rw-r--r--. 1 cassandra cassandra 4429 Feb  3 13:05 space1-tb1-jb-4-Statistics.db
-rw-r--r--. 1 cassandra cassandra   80 Feb  3 13:05 space1-tb1-jb-4-Summary.db
-rw-r--r--. 1 cassandra cassandra   79 Feb  3 13:05 space1-tb1-jb-4-TOC.txt

Now let's check the contents of the new SSTable, where we can see the tombstones:

$ sudo -u cassandra /usr/bin/sstable2json /var/lib/cassandra/data/space1/tb1/space1-tb1-jb-4-Data.db
[
{"key": "6b657935","columns": [["","",1422967847005000], ["val1","555",1422967847005000]]},
{"key": "6b657931","metadata": {"deletionInfo": {"markedForDeleteAt":1422968551313000,"localDeletionTime":1422968551}},"columns": []},
{"key": "6b657934","columns": [["","",1422967840622000], ["val1","444",1422967840622000]]},
{"key": "6b657933","columns": [["","",1422967832341000], ["val1","333",1422967832341000]]},
{"key": "6b657932","metadata": {"deletionInfo": {"markedForDeleteAt":1422968553322000,"localDeletionTime":1422968553}},"columns": []}
]

Finally, let's check the compaction history (scroll right):

$ nodetool compactionhistory | awk 'NR == 2 || /space1/'
id                                       keyspace_name      columnfamily_name            compacted_at              bytes_in       bytes_out      rows_merged
5725f890-aba4-11e4-9f73-351725b0ac5b     space1             tb1                          1422968305305             146            146            {1:5}
46112600-aba5-11e4-9f73-351725b0ac5b     space1             tb1                          1422968706144             195            133            {1:3, 2:2}
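
The second entry follows from the two input SSTables shown earlier: key3, key4 and key5 exist only in jb-2, while key1 and key2 exist in both jb-2 (the original rows) and jb-3 (their tombstones). The bytes_in value of 195 is likewise the sum of the two Data.db sizes, 146 + 49. As a rough illustration only, and not Cassandra's actual implementation, the following Python sketch rebuilds the histogram from the partition keys of each input SSTable. The function name rows_merged_histogram is hypothetical.

```python
from collections import Counter

def rows_merged_histogram(sstables):
    """Map 'number of SSTables a key appears in' to 'number of such keys'."""
    # Count, for every partition key, how many input SSTables contain it.
    per_key = Counter(key for keys in sstables for key in set(keys))
    # Then count how many keys share each SSTable count.
    return dict(sorted(Counter(per_key.values()).items()))

# Partition keys taken from the sstable2json output of the two inputs.
sstable_2 = ["key1", "key2", "key3", "key4", "key5"]  # jb-2: the five original rows
sstable_3 = ["key1", "key2"]                          # jb-3: the two tombstones

print(rows_merged_histogram([sstable_2, sstable_3]))  # {1: 3, 2: 2}
```

The same function applied to the single SSTable of the first compaction gives {1: 5}, matching the first history entry.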

Wow, this is a great answer! - Aaron
Totally agree. Thanks! - Ztyx
