How do I index duplicate elements in an array?

Question

6

Starting with the following array (of hashes):

[
  {:name=>"site a", :url=>"http://example.org/site/1/"}, 
  {:name=>"site b", :url=>"http://example.org/site/2/"}, 
  {:name=>"site c", :url=>"http://example.org/site/3/"}, 
  {:name=>"site d", :url=>"http://example.org/site/1/"}, 
  {:name=>"site e", :url=>"http://example.org/site/2/"}, 
  {:name=>"site f", :url=>"http://example.org/site/6/"},
  {:name=>"site g", :url=>"http://example.org/site/1/"}
]

How can I add an index that numbers each repeated URL? For example:

[
  {:name=>"site a", :url=>"http://example.org/site/1/", :index => 1}, 
  {:name=>"site b", :url=>"http://example.org/site/2/", :index => 1}, 
  {:name=>"site c", :url=>"http://example.org/site/3/", :index => 1}, 
  {:name=>"site d", :url=>"http://example.org/site/1/", :index => 2}, 
  {:name=>"site e", :url=>"http://example.org/site/2/", :index => 2}, 
  {:name=>"site f", :url=>"http://example.org/site/6/", :index => 1},
  {:name=>"site g", :url=>"http://example.org/site/1/", :index => 3}
]

- Luke

3 Answers

3

array = [
  {:name=>"site a", :url=>"http://example.org/site/1/"}, 
  {:name=>"site b", :url=>"http://example.org/site/2/"}, 
  {:name=>"site c", :url=>"http://example.org/site/3/"}, 
  {:name=>"site d", :url=>"http://example.org/site/1/"}, 
  {:name=>"site e", :url=>"http://example.org/site/2/"}, 
  {:name=>"site f", :url=>"http://example.org/site/6/"},
  {:name=>"site g", :url=>"http://example.org/site/1/"}
]

array.inject([]) { |ar, it| 
    # ar holds the elements processed so far; count those sharing this URL
    count_so_far = ar.count{|i| i[:url] == it[:url]}
    it[:index] = count_so_far+1
    ar << it
}
#=>
[
  {:name=>"site a", :url=>"http://example.org/site/1/", :index=>1}, 
  {:name=>"site b", :url=>"http://example.org/site/2/", :index=>1}, 
  {:name=>"site c", :url=>"http://example.org/site/3/", :index=>1}, 
  {:name=>"site d", :url=>"http://example.org/site/1/", :index=>2}, 
  {:name=>"site e", :url=>"http://example.org/site/2/", :index=>2}, 
  {:name=>"site f", :url=>"http://example.org/site/6/", :index=>1}, 
  {:name=>"site g", :url=>"http://example.org/site/1/", :index=>3}
]

- fl00r

Great, now I'm just trying to understand how it works... :) - Luke

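I reformatted the inject call to hopefully make it clearer. inject loops over the receiver array, and on each call of the block, ar contains the elements it has seen "so far" (with their running counts), because each one is appended at the end of the block. So at the start of the block, you count how many times the "current" URL has been seen so far and add one to it. It's a bit clunky because it's really a recursive operation in disguise. (Thanks to @fl00r for graciously letting me try to make his code understandable.) - millimoose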
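To make the explanation above concrete, here is a minimal sketch that records what the block sees on each pass. The three-element sample, the variable names (sites, seen, trace), and the shortened URLs are assumptions made for illustration; they are not part of the original answer.

```ruby
# Hypothetical sample data: names and URLs are placeholders.
sites = [
  { name: "a", url: "u1" },
  { name: "b", url: "u2" },
  { name: "c", url: "u1" }
]

trace = []
result = sites.inject([]) do |seen, item|
  # seen holds every element that was processed before item
  count_so_far = seen.count { |s| s[:url] == item[:url] }
  # Record [name, elements seen so far, earlier matches] for inspection
  trace << [item[:name], seen.size, count_so_far]
  item[:index] = count_so_far + 1
  # The block's return value becomes seen on the next pass
  seen << item
end

trace.each { |row| p row }
```

Each row of trace shows the accumulator growing by one element per pass, which is why the count of earlier matches plus one gives the running index.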

1

I would replace count_so_far = ar.count{|i| i[:url] == it[:url]} with count_so_far = ar[ar.rindex{|i| i[:url] == it[:url]}][:index]. With many elements, it will perform better. - Serabe

@Sii Why? When at G, rindex will return 3, just as I would expect. - Serabe

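@Serabe: Actually, you're right, I missed that you were retrieving the previous running count. It just took me a while to handle the case where "rindex" returns "nil". - millimoose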

@Sii Please see the documentation here: [http://www.ruby-doc.org/core/classes/Array.html#M000238] - Serabe
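The rindex suggestion and the nil case discussed in the comments above could be combined roughly as follows. This is only a sketch of one way to handle it, not code posted by any of the commenters; the sample data and the names sites, seen, and last are assumptions.

```ruby
# Hypothetical sample data: names and URLs are placeholders.
sites = [
  { name: "a", url: "u1" },
  { name: "b", url: "u2" },
  { name: "c", url: "u1" },
  { name: "d", url: "u1" }
]

result = sites.inject([]) do |seen, item|
  # Array#rindex returns nil when no earlier element matches,
  # so a first occurrence must be handled separately.
  last = seen.rindex { |s| s[:url] == item[:url] }
  # Reuse the running count stored on the most recent match
  item[:index] = last ? seen[last][:index] + 1 : 1
  seen << item
end

p result.map { |h| h[:index] }
```

rindex stops at the most recent match instead of counting across the whole array, but it still scans every earlier element when the URL has not been seen before.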

0

If I wanted it to be efficient, I would write it like this:

items_with_index = items.inject([[], {}]) do |(output, counts), h|
  new_count = (counts[h[:url]] || 0) + 1
  [output << h.merge(:index => new_count), counts.update(h[:url] => new_count)]
end[0]

- tokland

- undur_gongor · Accepted Answer

I suggest using a hash to keep track of the indexes. Repeatedly scanning the earlier entries seems inefficient.

counts = Hash.new(0)
array.each { | hash | 
  hash[:index] = counts[hash[:url]] = counts[hash[:url]] + 1
}

Or, a bit more concisely:

array.each_with_object(Hash.new(0)) { | hash, counts | 
  hash[:index] = counts[hash[:url]] = counts[hash[:url]] + 1
}
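
One detail worth noting when using the each_with_object form: the call returns the memo object (the counts hash), while the indexes are written into the original hashes in place. A small sketch under assumed sample data (sites, seen, and the shortened URLs are placeholders, not from the answer):

```ruby
# Hypothetical sample data: names and URLs are placeholders.
sites = [
  { name: "a", url: "u1" },
  { name: "b", url: "u2" },
  { name: "c", url: "u1" }
]

# Hash.new(0) makes unseen URLs start at zero
counts = sites.each_with_object(Hash.new(0)) do |site, seen|
  # Increment the per-URL counter and store the new value as the index
  site[:index] = seen[site[:url]] += 1
end

p counts                          # per-URL totals, returned by each_with_object
p sites.map { |s| s[:index] }     # indexes written into the original hashes
```

So the annotated data should be read from the original array afterwards, not from the return value of the call.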