Group hashes by key value in Ruby

19

I have an array that is the output of a map/reduce method run by MongoDB. It looks something like this:

[{"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>299.0}, 
{"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>244.0}, 
{"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>1.0, "count"=>204.0}, 
{"minute"=>45.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>510.0}, 
{"minute"=>45.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>437.0}, 
{"minute"=>0.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>469.0}, 
{"minute"=>0.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>477.0}, 
{"minute"=>15.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>481.0}, 
{"minute"=>15.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>401.0}, 
{"minute"=>30.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>468.0}, 
{"minute"=>30.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>448.0}, 
{"minute"=>45.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>0.0, "count"=>485.0}, 
{"minute"=>45.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "type"=>10.0, "count"=>518.0}] 

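You'll notice that type has three distinct values in this example: 0, 1, and 10. What I need to do is group this array of hashes by the value of its type key, so that the array ends up looking like this: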

{
  :type_0 => [
    {"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>299.0}, 
    {"minute"=>45.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>510.0}, 
    {"minute"=>0.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>469.0}, 
    {"minute"=>15.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>481.0}, 
    {"minute"=>30.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>468.0}, 
    {"minute"=>45.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>485.0}
  ],

  :type_1 => [
    {"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>204.0}
  ],

  :type_10 => [
    {"minute"=>30.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>244.0}, 
    {"minute"=>45.0, "hour"=>15.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>437.0},
    {"minute"=>0.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>477.0}, 
    {"minute"=>15.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>401.0}, 
    {"minute"=>30.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>448.0}, 
    {"minute"=>45.0, "hour"=>16.0, "date"=>5.0, "month"=>9.0, "year"=>2011.0, "count"=>518.0}
  ]
} 

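I know these example arrays are really large, but I think this may be a simpler problem than I'm making it out to be.

Basically, each hash in the array would be grouped by the value of its type key, and the result returned as a hash with one array per type. Any help at all would be appreciated, even just some useful hints.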


Possible duplicate of "Best way to split arrays into multiple small arrays in ruby". - akostadinov
4 Answers

38
array.group_by {|x| x['type']}

Or, if you want symbol keys, you could even do this:

array.group_by {|x| "type_#{x['type']}".to_sym}
I think this best captures "So basically each array of hashes would be grouped by the value of its type key, and then returned as a hash with an array for each type", although it leaves the "type" key in the output hashes.
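One detail that may be worth checking: the sample data stores type as a Float, so interpolating it directly would likely produce keys such as :"type_0.0" rather than :type_0. A minimal sketch of the difference, using a hypothetical three-row sample named rows (not part of the original question) in place of the full array:

```ruby
# Hypothetical sample: three rows shaped like the map/reduce output above.
rows = [
  { "minute" => 30.0, "type" => 0.0,  "count" => 299.0 },
  { "minute" => 30.0, "type" => 10.0, "count" => 244.0 },
  { "minute" => 45.0, "type" => 0.0,  "count" => 510.0 }
]

# Interpolating a Float keeps the decimal part in the resulting symbol.
float_keys = rows.group_by { |x| "type_#{x['type']}".to_sym }

# Converting to an Integer first gives the key shape the question asks for.
int_keys = rows.group_by { |x| "type_#{x['type'].to_i}".to_sym }

p float_keys.keys  # symbols built from "type_0.0" and "type_10.0"
p int_keys.keys    # symbols built from "type_0" and "type_10"
```

In both variants the grouped hashes still contain the "type" key, since group_by only reads it.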

2
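It won't run in Ruby 1.8, and it doesn't produce the desired output. - user783774
2
This does the grouping, but it doesn't remove "type" from the result. I don't mind, because it's simple, but honestly it doesn't answer the question. - pjammer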

2
by_type = {}

a.each do |h|
   type = h.delete("type").to_s
   # type = ("type_" + type ).to_sym

   by_type[ type ] ||= []
   by_type[ type ] << h      # note: h is modified, without "type" key

end

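Note: the hash keys here are slightly different, since I use the type value directly as the key.
If you must have the hash keys exactly as in your example, you can enable the line that is commented out.
P.S.: I just saw Tapio's solution, which is very nice and short! Note that it only works with Ruby >= 1.9.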
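For comparison, here is a sketch of the same loop with the symbol keys from the question switched on. It is a variant, not the answer's exact code: it assumes a hypothetical three-row sample named rows, uses to_i so that 10.0 becomes type_10, and swaps the `||= []` line for a Hash.new default block.

```ruby
# Hypothetical sample: three rows shaped like the map/reduce output above.
rows = [
  { "minute" => 30.0, "type" => 0.0,  "count" => 299.0 },
  { "minute" => 30.0, "type" => 10.0, "count" => 244.0 },
  { "minute" => 45.0, "type" => 0.0,  "count" => 510.0 }
]

# The default block creates the empty array the first time a key is read,
# which replaces the explicit `||= []` initialisation.
by_type = Hash.new { |hash, key| hash[key] = [] }

rows.each do |h|
  # delete returns the removed value; to_i turns 10.0 into 10.
  key = "type_#{h.delete('type').to_i}".to_sym
  by_type[key] << h  # note: h is modified and no longer has a "type" key
end

p by_type
```

As with the loop above, the hashes in rows are changed in place, so this only suits cases where the original data is not needed afterwards.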

1
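Why not just use a.group_by {|x| x['type']}? - Tapio Saarinen
It doesn't remove the 'type' key, but does that matter? I wouldn't think it's important, right? - Tapio Saarinen
@Tapio: in his example he expects the "type" key to be removed along the way... Yes, I agree, it doesn't really matter.. group_by() is fresh and handy, thanks! +1 - Tilo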

2
Maybe something like this?
mangled = a.group_by { |h| h['type'].to_i }.each_with_object({ }) do |(k,v), memo|
    tk = ('type_' + k.to_s).to_sym
    memo[tk] = v.map { |h| h = h.dup; h.delete('type'); h }
end

If you don't care about preserving the original data:

mangled = a.group_by { |h| h['type'].to_i }.each_with_object({ }) do |(k,v), memo|
    tk = ('type_' + k.to_s).to_sym
    memo[tk] = v.map { |h| h.delete('type'); h } # Drop the h.dup in here
end
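A possible alternative to the dup-then-delete step is Hash#reject, which returns a new hash and leaves the receiver alone. This is a sketch of that swap rather than the answer's own code, and it assumes a hypothetical three-row sample named rows:

```ruby
# Hypothetical sample: three rows shaped like the map/reduce output above.
rows = [
  { "minute" => 30.0, "type" => 0.0,  "count" => 299.0 },
  { "minute" => 30.0, "type" => 10.0, "count" => 244.0 },
  { "minute" => 45.0, "type" => 0.0,  "count" => 510.0 }
]

grouped = rows.group_by { |h| h['type'].to_i }.each_with_object({}) do |(k, v), memo|
  # reject builds a new Hash per row, so the originals are not mutated.
  memo["type_#{k}".to_sym] = v.map { |h| h.reject { |key, _| key == 'type' } }
end

p grouped
p rows.first  # still includes "type"
```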

2

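group_by collects an enumerable into groups keyed by the result of the block. You are not limited to just reading the key's value in that block, so if you want to omit 'type' from the grouped hashes, you can do it like this: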

array.group_by {|x| "type_#{x.delete('type').to_i}".to_sym}

This does exactly what you asked for.

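Advanced: This goes slightly beyond the scope of the question, but if you want to preserve the original array, you have to duplicate every object in it. This does the trick: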

array.map(&:dup).group_by {|x| "type_#{x.delete('type').to_i}".to_sym}
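To see the effect of the copy, the following sketch runs that one-liner on a hypothetical three-row sample named rows and then checks both sides. A shallow dup appears to be enough here, because delete only touches the copied hash itself:

```ruby
# Hypothetical sample: three rows shaped like the map/reduce output above.
rows = [
  { "minute" => 30.0, "type" => 0.0,  "count" => 299.0 },
  { "minute" => 30.0, "type" => 10.0, "count" => 244.0 },
  { "minute" => 45.0, "type" => 0.0,  "count" => 510.0 }
]

# Each row is copied before the block deletes "type" from the copy.
grouped = rows.map(&:dup).group_by { |x| "type_#{x.delete('type').to_i}".to_sym }

p rows.first.key?("type")               # the original row keeps its "type" key
p grouped[:type_0].first.key?("type")   # the grouped copy does not
```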
