- Use simple string keys and values: key: user, value: payload (the entire JSON blob, which can be 100-200 KB).

SET user:1 payload

- Use a hash:

HSET user:1 username "someone"
HSET user:1 location "NY"
HSET user:1 bio "STRING WITH OVER 100 lines"
Which approach is more memory-efficient: string keys and values, or a hash?
Store the entire object as a JSON-encoded string in a single key, and keep track of all objects using a set (or list, if more appropriate). For example:
INCR id:users
SET user:{id} '{"name":"Fred","age":25}'
SADD users {id}
Generally speaking, this is probably the best method in most cases. If there are a lot of fields in the Object, your Objects are not nested with other Objects, and you tend to only access a small subset of fields at a time, it might be better to go with option 2.
Advantages: considered a "good practice." Each Object is a full-blown Redis key. JSON parsing is fast, especially when you need to access many fields for this Object at once. Disadvantages: slower when you only need to access a single field.
Store each Object's properties in a Redis hash.
INCR id:users
HMSET user:{id} name "Fred" age 25
SADD users {id}
Advantages: considered a "good practice." Each Object is a full-blown Redis key. No need to parse JSON strings. Disadvantages: possibly slower when you need to access all/most of the fields in an Object. Also, nested Objects (Objects within Objects) cannot be easily stored.
Store each Object as a JSON string in a Redis hash.
INCR id:users
HMSET users {id} '{"name":"Fred","age":25}'
This allows you to consolidate a bit and only use two keys instead of lots of keys. The obvious disadvantage is that you can't set the TTL (and other stuff) on each user Object, since it is merely a field in the Redis hash and not a full-blown Redis key.
Advantages: JSON parsing is fast, especially when you need to access many fields for this Object at once. Less "polluting" of the main key namespace. Disadvantages: About same memory usage as #1 when you have a lot of Objects. Slower than #2 when you only need to access a single field. Probably not considered a "good practice."
Store each property of each Object in a dedicated key.
INCR id:users
SET user:{id}:name "Fred"
SET user:{id}:age 25
SADD users {id}
This option is almost never preferred (unless an individual property needs its own TTL or similar).
Advantages: Object properties are full-blown Redis keys, which might not be overkill for your app. Disadvantages: slow, uses more memory, and not considered "best practice." Lots of polluting of the main key namespace.
Generally speaking, option 4 is not preferred. Options 1 and 2 are very similar, and both are quite common. I usually prefer option 1, since it lets you store more complex objects (with multiple levels of nesting, etc.). Use option 3 when you really care about not polluting the main key namespace (i.e. you don't want lots of keys in your database, and you don't care about things like TTL or key sharding).

If I've gotten something wrong here, please consider leaving a comment and letting me revise the answer, rather than downvoting. Thanks! :)
Fetching n fields with the hmget command is O(n), while a get with option 1 is still O(1). In theory, the latter is faster. - Aruna Herath

What about keeping the rest of the obj in a JSON string, and storing fields like view counts, votes, and voters under separate key names? That way a single read query gets you the whole object, and you can still update its dynamic parts quickly. The relatively rarely updated fields in the JSON string can be changed by reading the whole object and writing it back in a transaction. - arun

It depends on how you access the data:
Go for option 1:
Go for option 2:
Tip: in general, go with the option that requires fewer queries for most of your use cases.
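The hybrid layout suggested in the comments above (rarely-changing fields in one JSON string, hot counters under separate keys) can be sketched as follows. This is only an illustration: a plain Ruby Hash stands in for Redis, and the key names and the `load_user` helper are my own, not an established API.

```ruby
require "json"

# A plain Hash stands in for Redis. With a real client, each store[...]
# access below would be a GET/SET/INCR, and the two reads on the read
# path could share a single MGET or pipeline.
store = {}

# Write path: static profile as a JSON string, dynamic counter as its own key.
store["user:1"]       = JSON.generate({ "name" => "Fred", "age" => 25 })
store["user:1:views"] = 0

# Fast update path: only the counter key is touched (INCR on real Redis);
# the JSON blob is never read or rewritten.
store["user:1:views"] += 1

# Read path: fetch both keys and merge them into one object.
def load_user(store, id)
  user = JSON.parse(store["user:#{id}"])
  user["views"] = store["user:#{id}:views"].to_i
  user
end

p load_user(store, 1) # name and age from the JSON blob, plus the merged counter
```

Updating a field inside the JSON part would still require the read-modify-write transaction mentioned in the comment.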
Option 1 is not a good idea when concurrent modification of the JSON payload is expected (a classic non-atomic read-modify-write problem). - Samveen

Some additions to the given set of answers:
First of all, if you're going to use Redis hashes efficiently, you must know the maximum number of fields and the maximum value size; otherwise, as soon as they break hash-max-ziplist-value or hash-max-ziplist-entries, Redis will convert the hash behind the scenes into what are effectively ordinary key/value pairs (see hash-max-ziplist-value, hash-max-ziplist-entries). Breaking out of the hash encoding like this is really bad, because each ordinary key/value pair in Redis costs +90 bytes.

This means that if you choose option two and accidentally break max-hash-ziplist-value, you pay +70 bytes for EVERY attribute inside each user model!
# you need me-redis and awesome-print gems to run exact code
redis = Redis.include(MeRedis).configure( hash_max_ziplist_value: 64, hash_max_ziplist_entries: 512 ).new
=> #<Redis client v4.0.1 for redis://127.0.0.1:6379/0>
> redis.flushdb
=> "OK"
> ap redis.info(:memory)
{
"used_memory" => "529512",
**"used_memory_human" => "517.10K"**,
....
}
=> nil
# me_set( 't:i' ... ) same as hset( 't:i/512', i % 512 ... )
# txt is some English fiction book around 56K in length,
# so we just take some random 63-symbol string from it
> redis.pipelined{ 10000.times{ |i| redis.me_set( "t:#{i}", txt[rand(50000), 63] ) } }; :done
=> :done
> ap redis.info(:memory)
{
"used_memory" => "1251944",
**"used_memory_human" => "1.19M"**, # ~ 72b per key/value
.....
}
> redis.flushdb
=> "OK"
# setting **only one value** +1 byte per hash of 512 values equal to set them all +1 byte
> redis.pipelined{ 10000.times{ |i| redis.me_set( "t:#{i}", txt[rand(50000), i % 512 == 0 ? 65 : 63] ) } }; :done
> ap redis.info(:memory)
{
"used_memory" => "1876064",
"used_memory_human" => "1.79M", # ~ 134 bytes per pair
....
}
redis.pipelined{ 10000.times{ |i| redis.set( "t:#{i}", txt[rand(50000), 65] ) } };
ap redis.info(:memory)
{
"used_memory" => "2262312",
"used_memory_human" => "2.16M", #~155 byte per pair i.e. +90 bytes
....
}
If you can't guarantee a maximum size for some user attributes, go with the first solution, and if memory is a serious concern, compress the user JSON before storing it in Redis.

If you can enforce a maximum size on all attributes, then you can set hash-max-ziplist-entries/value and use hashes, either as one hash per user representation, or as the hash memory optimization from this topic of the Redis guide: https://redis.io/topics/memory-optimization, storing each user as a JSON string. Either way, you can also compress long user attributes.
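The bucketed-hash optimization from the Redis guide linked above can be sketched like this. It is a hedged illustration: BUCKET_SIZE and the key scheme are my own choices, and the actual HSET is only printed, not sent to a server.

```ruby
require "json"

# Bucketing trick from the Redis memory-optimization guide: spread users
# across small hashes so each hash stays under hash-max-ziplist-entries
# and keeps the compact encoding. BUCKET_SIZE is an assumed setting that
# must not exceed your hash-max-ziplist-entries.
BUCKET_SIZE = 512

# For a numeric user id, derive the hash key (the bucket) and the field
# inside that bucket.
def bucket_for(id)
  ["users:#{id / BUCKET_SIZE}", (id % BUCKET_SIZE).to_s]
end

# With a real client this would be: redis.hset(key, field, json)
key, field = bucket_for(1025)  # 1025 / 512 = 2, 1025 % 512 = 1
json = JSON.generate({ "name" => "Fred", "age" => 25 })
puts "HSET #{key} #{field} '#{json}'"
```

The trade-off discussed earlier still applies: a user stored this way is a hash field, not a full key, so it cannot have its own TTL.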
https://redis.io/docs/stack/json/
https://developer.redis.com/howtos/redisjson/getting-started/
https://redis.com/blog/redisjson-public-preview-performance-benchmarking/
We hit a similar issue in our production environment, and the idea we came up with is to compress the payload whenever it exceeds some threshold (in KB).

I have a repo dedicated to this Redis client library, linked here: here

The basic idea is to detect whether the payload is larger than some threshold size; if it is, gzip it and base64-encode it, then keep the compressed string in Redis as a normal string. On retrieval, we detect whether the string is valid base64, and if so, decompress it.

The whole compression and decompression step is transparent, and it cuts network traffic by close to 50%.
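The threshold-plus-gzip-plus-base64 scheme described above can be sketched in a few lines. This is a minimal illustration of the idea, not the linked library's actual API; THRESHOLD and the pack/unpack names are my own.

```ruby
require "zlib"
require "base64"

# Payloads over THRESHOLD bytes are gzipped and base64-encoded before
# being SET as a plain Redis string; smaller ones are stored as-is.
THRESHOLD = 1024

def pack(payload)
  return payload if payload.bytesize <= THRESHOLD
  Base64.strict_encode64(Zlib.gzip(payload))
end

def unpack(stored)
  # Heuristic from the answer above: try to treat the value as
  # base64-encoded gzip; if decoding fails, it was stored uncompressed.
  Zlib.gunzip(Base64.strict_decode64(stored))
rescue ArgumentError, Zlib::Error
  stored
end

big = '{"bio":"' + "x" * 4096 + '"}'
stored = pack(big)
puts "#{big.bytesize} bytes -> #{stored.bytesize} bytes stored"
puts unpack(stored) == big # the value round-trips
```

Note the base64 check is a heuristic: a short uncompressed payload that happens to be valid base64-encoded gzip would be mis-detected, which is another reason to only compress above a threshold.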
BenchmarkDotNet=v0.12.1, OS=macOS 11.3 (20E232) [Darwin 20.4.0]
Intel Core i7-9750H CPU 2.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.201
[Host] : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT DEBUG
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|
| Benchmark with compression enabled | 668.2 ms | 13.34 ms | 27.24 ms | - | - | - | 4.88 MB |
| Benchmark with compression disabled | 1,387.1 ms | 26.92 ms | 37.74 ms | - | - | - | 2.39 MB |