Incrementing Python dictionary values based on a counter

Question

3

I have a dictionary that contains duplicate values.

Deca_dict = {
    "1": "2_506",
    "2": "2_506",
    "3": "2_506",
    "4": "2_600",
    "5": "2_600",
    "6": "1_650"
}

I used collections.Counter to count how many times each value occurs.

decaAdd_occurrences = {'2_506':3, '2_600':2, '1_650':1}
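For reference, the counts above can be reproduced with a short sketch along these lines. It assumes the dictionary shown earlier; duplicated_values is a name introduced here for illustration only.

```python
from collections import Counter

Deca_dict = {
    "1": "2_506",
    "2": "2_506",
    "3": "2_506",
    "4": "2_600",
    "5": "2_600",
    "6": "1_650",
}

# Count how many keys share each value.
decaAdd_occurrences = Counter(Deca_dict.values())

# Values that occur more than once are the ones that need to be made unique.
# (duplicated_values is a hypothetical helper name, not from the question.)
duplicated_values = {value for value, count in decaAdd_occurrences.items()
                     if count > 1}
```

Counter is a dict subclass, so decaAdd_occurrences[value] lookups work exactly as in the rest of the question.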

I then created a new dictionary holding the values that need to be updated.

deca_double_dict = {key: value for key, value in Deca_dict.items()
                        if decaAdd_occurrences[value] > 1}
deca_double_dict = {
    "1": "2_506",
    "3": "2_506",
    "2": "2_506",
    "4": "2_600",
    "5": "2_600"
}

In this case, it is the original dictionary without the last item.

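I am trying to work out how to increment num once for each duplicate, that is, the value in the counter dict minus 1 times. This would update all of the values except one, which can stay the same. In the goal output, one of the duplicates keeps its value, while the first number in the value string of each remaining duplicate is incremented progressively (based on how many duplicates were counted). I am trying to end up with unique values for the data that Deca_dict represents.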

Goal output = {'1':'3_506', '2':'4_506', '3':'2_506', '4':'3_600', '5':'2_600'}

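I initially took the approach below, but it ended up incrementing every duplicated item, so the result was simply each value plus one. For context: the values in the original Deca_dict were built by concatenating two numbers (deca_address_num and deca_num_route). Also, homesLayer is a QGIS vector layer in which deca_address_num and deca_num_route are stored in fields with the indices d_address_idx and id_route_idx.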

for key in deca_double_dict.keys():
    for home in homesLayer.getFeatures():
        if home.id() == key:
            deca_address_num = home.attributes()[d_address_idx]
            deca_num_route = home.attributes()[id_route_idx]
            deca_address_plus = deca_address_num + increment
            next_deca_address = (str(deca_address_plus) + '_' +
                                 str(deca_num_route))
            if not next_deca_address in Deca_dict.values():
                update_deca_dbl_dict[key] = next_deca_address

The result is useless:

Update_deca_dbl_dict = {
    "1": "3_506",
    "3": "3_506",
    "2": "3_506",
    "5": "3_600",
    "4": "3_600"
}

My second attempt tried to add a counter, but things ended up in the wrong place.

for key, value in deca_double_dict.iteritems():
    iterations = decaAdd_occurrences[value] - 1
    for home in homesLayer.getFeatures():
        if home.id() == key:
            #deca_homeID_list.append(home.id())
            increment = 1
            deca_address_num = home.attributes()[d_address_idx]
            deca_num_route = home.attributes()[id_route_idx]
            deca_address_plus = deca_address_num + increment
            next_deca_address = (str(deca_address_plus) + '_' +
                                 str(deca_num_route))
            #print deca_num_route
            while iterations > 0:
                if not next_deca_address in Deca_dict.values():
                    update_deca_dbl_dict[key] = next_deca_address
                    iterations -= 1
                    increment += 1

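Update: Although one of the answers below works for incrementing all of the duplicated items in the dictionary, I am reworking my code because I need to check this condition against the original data before incrementing. I still get the same (useless) result as in my first attempt.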

for key, value in deca_double_dict.iteritems():
    for home in homesLayer.getFeatures():
        if home.id() == key:
            iterations = decaAdd_occurrences[value] - 1
            increment = 1
            while iterations > 0:
                deca_address_num = home.attributes()[d_address_idx]
                deca_num_route = home.attributes()[id_route_idx]
                deca_address_plus = deca_address_num + increment
                current_address = str(deca_address_num) + '_' + str(deca_num_route)
                next_deca_address = (str(deca_address_plus) + '_' +
                                 str(deca_num_route))
                if not next_deca_address in Deca_dict.values():
                    update_deca_dbl_dict[key] = next_deca_address
                    iterations -= 1
                    increment += 1
                else:
                    alpha_deca_dbl_dict[key] = current_address
                    iterations = 0

- user25976

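@Elric I look at the full string to find duplicates, but I only increment the first number in the string. Originally, the string is a concatenation of two attribute values of a QGIS feature. So I iterate over the features to recover the feature whose ID is the dictionary key, along with its two original attribute numbers. - user25976

@Xenomorph Do you mean the concatenation for next_deca_address? If that is what you mean: no, I don't get a syntax error. - user25976

@user25976 Strange, normally you can just use ' or ". - nbro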

2

Your dictionary is interesting: it has duplicate keys. How did you manage that? What is decaAdd_occurrences? Are Deca_dict and deca_dict supposed to be the same? - Paul Cornelius

1

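Your changes to the variable increment never affect the value of deca_address_plus. What I mean is that nothing is assigned after the while loop; you simply reset increment to 1 each time a key is matched. Sorry, I don't have a computer right now, I just found your code interesting. - eri0o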
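To illustrate the point made in the last comment, here is a minimal sketch that does not depend on QGIS. The numbers and the taken set are made-up stand-ins for the feature attributes and the existing addresses, so treat it as an illustration rather than a fix for the original code.

```python
# Hypothetical stand-ins for the QGIS attribute values in the question.
deca_address_num = 2
deca_num_route = 506
taken = {"2_506", "3_506"}  # addresses already in use (made up for this sketch)

# Computed once, outside the loop: later changes to `increment` never reach it.
increment = 1
stale_candidate = "{}_{}".format(deca_address_num + increment, deca_num_route)
increment += 1
# stale_candidate still holds the string built with increment == 1.

# Recomputed on every pass: each change to `increment` yields a new candidate.
increment = 1
candidate = "{}_{}".format(deca_address_num + increment, deca_num_route)
while candidate in taken:
    increment += 1
    candidate = "{}_{}".format(deca_address_num + increment, deca_num_route)
```

The second half rebuilds the candidate string after every change to increment, which is the step the attempts in the question leave out.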


3 Answers

1

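Here is one solution: basically, it keeps the first of each group of duplicate values and increments the leading number on the remaining duplicates.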

from collections import OrderedDict, defaultdict
orig_d = {'1':'2_506', '2':'2_506', '3':'2_506', '4':'2_600', '5':'2_600'}
orig_d = OrderedDict(sorted(orig_d.items(), key=lambda x: x[0]))

counter = defaultdict(int)
for k, v in orig_d.items():
    counter[v] += 1
    if counter[v] > 1:
        pre, post = v.split('_')
        pre = int(pre) + (counter[v] - 1)
        orig_d[k] = "%s_%s" % (pre, post)

print(orig_d)

Result:

OrderedDict([('1', '2_506'), ('2', '3_506'), ('3', '4_506'), ('4', '2_600'), ('5', '3_600')])

- junnytony

1

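I think this code does what you want. I modified your input dictionary slightly to better illustrate what is happening. The main difference from what you were doing is that decaAdd_occurrences, which is built from the Counter dictionary, tracks not only the count but also the current value of the address num prefix. As Deca_dict is modified, both the count and the next num value to use are kept up to date, because both are maintained through decaAdd_occurrences.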

from collections import Counter

Deca_dict = {
    "1": "2_506",
    "2": "2_506",
    "3": "2_506",
    "4": "2_600",
    "5": "1_650",
    "6": "2_600"
}

decaAdd_occurrences = {k: (int(k.split('_')[0]), v) for k,v in
                                Counter(Deca_dict.values()).items()}

for key, value in Deca_dict.items():
    num, cnt = decaAdd_occurrences[value]
    if cnt > 1:
        route = value.split('_')[1]
        next_num = num + 1
        Deca_dict[key] = '{}_{}'.format(next_num, route)
        decaAdd_occurrences[value] = next_num, cnt-1  # update values

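Updated dictionary (which duplicate keeps its original value depends on the dictionary's iteration order):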

Deca_dict -> {
    "1": "3_506",
    "2": "2_506",
    "3": "4_506",
    "4": "3_600",
    "5": "1_650",
    "6": "2_600"
}

- martineau

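I have a question about the line decaAdd_occurrences[value] = next_num, cnt - 1. If I'm not mistaken, value is one of the values in Deca_dict. If value = '2_506' initially, then we have {'2_506': (2, 3)}. Does this line change that item to {'2_506': (3, 1)}? If so, why? If not, what does it do? - user25976

Not quite. In decaAdd_occurrences, the key '2_506' initially has the value (2, 3), but inside the loop, after the first item has been updated, it becomes (3, 2). After the second item has been updated it becomes (4, 1), and it stays that way from then on because cnt is no longer > 1. This lets one of the duplicates keep its original value, as you wanted. The result shows that this is indeed what happens. - martineau

Is this what the OP actually wants? If you add one more element to the input Deca_dict, 7:3_600, your output dictionary will actually contain duplicate values: the original "3_600" and one produced by incrementing "2_600". - Paul Cornelius

@Paul: Good point. It is what the OP said they wanted. From the question, I can't tell whether input data like that can occur in the dictionary. If it can, then the algorithm the OP proposed will not solve the problem; all I did was try to implement it efficiently in Python. - martineau

@martineau Of course, and you did. I can't really understand what the OP is trying to do, but for some reason he doesn't seem to like duplicates. - Paul Cornelius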
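The edge case raised in the comments above can be checked with a short sketch. It reuses the answer's bookkeeping on the answer's sample data plus the extra "7": "3_600" entry from the comment; the names occurrences and collisions are introduced here for illustration.

```python
from collections import Counter

# The answer's sample data plus the extra entry suggested in the comment.
Deca_dict = {
    "1": "2_506",
    "2": "2_506",
    "3": "2_506",
    "4": "2_600",
    "5": "1_650",
    "6": "2_600",
    "7": "3_600",
}

# Same bookkeeping as the answer:
# value -> (current leading number, remaining count).
occurrences = {value: (int(value.split("_")[0]), count)
               for value, count in Counter(Deca_dict.values()).items()}

# Only existing keys are reassigned, so the dict's size never changes
# while it is being iterated.
for key, value in Deca_dict.items():
    num, cnt = occurrences[value]
    if cnt > 1:
        route = value.split("_")[1]
        next_num = num + 1
        Deca_dict[key] = "{}_{}".format(next_num, route)
        occurrences[value] = (next_num, cnt - 1)

# Any value that still occurs more than once is a collision.
collisions = [value for value, count in Counter(Deca_dict.values()).items()
              if count > 1]
```

With this input, collisions ends up containing "3_600": one copy is the original entry for key "7" and the other comes from incrementing a "2_600".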


- Paul Cornelius · Accepted Answer

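Does this do roughly what you want? I'm assuming you can handle the function that changes 2_506 to 3_506 and so on. Instead of your Counter I use a set, which ensures there are no duplicate values.

In the original post I cut off a line at the bottom, sorry.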

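values_so_far = set()
d1 = {}  # ---your original dictionary with duplicate values---
d2 = {}  # d1 with all the duplicates changed

def increment_value(old_value):
    # One possible implementation, assuming values look like "2_506":
    # add 1 to the number before the underscore and return the modified string.
    num, route = old_value.split('_')
    return '{}_{}'.format(int(num) + 1, route)

for k, v in d1.items():
    # Keep bumping the value until it has not been used yet.
    while v in values_so_far:
        v = increment_value(v)
    d2[k] = v
    values_so_far.add(v)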
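As a worked example of the set-based idea, the sketch below runs it on the question's data plus the "7": "3_600" entry from the comments. Both make_unique and this particular increment_value are assumptions made for illustration (the answer leaves increment_value to the reader), and values are assumed to have the form "<number>_<route>".

```python
def increment_value(old_value):
    # Assumed format "<number>_<route>": bump the leading number by one.
    num, route = old_value.split("_")
    return "{}_{}".format(int(num) + 1, route)


def make_unique(d):
    # make_unique is a hypothetical helper name, not taken from the answer.
    seen = set()
    result = {}
    for key, value in d.items():
        # Keep incrementing until the value has not been handed out yet.
        while value in seen:
            value = increment_value(value)
        result[key] = value
        seen.add(value)
    return result


# The question's data plus the "7": "3_600" entry suggested in the comments.
d1 = {"1": "2_506", "2": "2_506", "3": "2_506",
      "4": "2_600", "5": "2_600", "6": "1_650", "7": "3_600"}
d2 = make_unique(d1)
```

Every value in d2 is unique. Note that the set only records values handed out so far, so a value that was unique in the input can still be bumped if an earlier duplicate was incremented onto it, as happens to key "7" here.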