Creating a list of inner values from a dictionary of dictionaries

3

I am trying to find the maximum and minimum of an inner value in a dictionary whose values are themselves dicts.

The dict looks like this:

{'ALLEN PHILLIP K': {'bonus': 4175000,
                     'exercised_stock_options': 1729541,
                     'expenses': 13868},
 'BADUM JAMES P': {'bonus': 'NaN',
                   'exercised_stock_options': 257817,
                   'expenses': 3486},
 ...
}

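I want to determine the minimum and maximum of exercised_stock_options across all of the inner dictionaries.

I tried to do this with pandas, but could not find a suitable way to arrange the data. I then tried a simple for loop in Python. My for loop does not work, and I cannot work out why (the dictionary of dictionaries is called data_dict):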

stock_options=[]
for person in range(len(data_dict)):
    stock_options.append(data_dict[person]['exercised_stock_options'])
print stock_options

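Next I would take the maximum and minimum of the list.

Why does this code not work? Are there other ways to find the maximum and minimum of an inner value in nested dictionaries?

4 Answers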

4
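Here is one approach: use a list comprehension to collect exercised_stock_options from each inner dictionary, then print the minimum and maximum of the result. The sample data is only a placeholder; substitute your own.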
d = {'John Smith':{'exercised_stock_options':99},
     'Roger Park':{'exercised_stock_options':50},
     'Tim Rogers':{'exercised_stock_options':10}}
data = [d[person]['exercised_stock_options'] for person in d]
print min(data), max(data)
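If you also need to know who holds the smallest and largest value, one possible sketch is to pass a key function to min and max. It assumes every inner dictionary has a numeric exercised_stock_options, reuses the made-up sample data above, and is written for Python 3; the helper name stock_options is only illustrative.

```python
# Sample data only; the names and numbers are made up for illustration.
d = {'John Smith': {'exercised_stock_options': 99},
     'Roger Park': {'exercised_stock_options': 50},
     'Tim Rogers': {'exercised_stock_options': 10}}


def stock_options(person):
    """Return the exercised_stock_options value for one person."""
    return d[person]['exercised_stock_options']


# Iterating over a dict yields its keys, so min/max return a person's name.
lowest_person = min(d, key=stock_options)
highest_person = max(d, key=stock_options)

print(lowest_person, stock_options(lowest_person))
print(highest_person, stock_options(highest_person))
```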

2
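You are using range to generate index numbers for the outer dictionary. What you actually need are the dictionary's keys, not indices. In other words, person should be each person's name, so that when person == 'ALLEN PHILLIP K', data_dict[person] returns the inner dictionary for that key.
Note that the best practice for iterating over a dictionary is to use items(), as in for person, stock_data in data_dict.items(), rather than looping over the dictionary itself. Also note the difference between Python 2 and Python 3.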
people=[]
stock_options=[]
for person, stock_data in data_dict.items():
    people.append(person)
    stock_options.append(stock_data['exercised_stock_options'])
    # This lets you keep track of the people as well for future use
print stock_options
mymin = min(stock_options)
mymax = max(stock_options)
# process min and max values.
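To see why the original loop fails, here is a small sketch using two records from the question (Python 3 syntax). range(len(data_dict)) produces the integers 0, 1, and so on, which are not keys of the dictionary, so the lookup raises KeyError.

```python
data_dict = {
    'ALLEN PHILLIP K': {'exercised_stock_options': 1729541},
    'BADUM JAMES P': {'exercised_stock_options': 257817},
}

# Indexing by position fails: 0 is not one of the dictionary's keys.
try:
    data_dict[0]['exercised_stock_options']
    failed = False
except KeyError as error:
    failed = True
    print('KeyError:', error)

# Iterating over the dictionary itself yields the keys (the names).
stock_options = [data_dict[person]['exercised_stock_options']
                 for person in data_dict]
print(min(stock_options), max(stock_options))
```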

Best-practice

Use items() to iterate across dictionary

The updated code below demonstrates the Pythonic style for iterating through a dictionary. When you define two variables in a for loop in conjunction with a call to items() on a dictionary, Python automatically assigns the first variable as the name of a key in that dictionary, and the second variable as the corresponding value for that key.

d = {"first_name": "Alfred", "last_name":"Hitchcock"}

for key,val in d.items():
    print("{} = {}".format(key, val))

Difference between Python 2 and Python 3

In Python 2.x, items() returns a list of tuples holding copies of the dictionary's key-value pairs. To avoid copying all of the dictionary's keys and values into a list in memory, prefer the iteritems() method, which returns an iterator instead of a list. In Python 3.x, iteritems() is removed and items() returns a view object. The benefit of a view object over a copied list of tuples is that every change made to the dictionary is reflected in the view.
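The view behaviour in Python 3 can be checked with a short sketch. It reuses the dictionary from the example above; the added "profession" entry is made up for illustration.

```python
d = {"first_name": "Alfred", "last_name": "Hitchcock"}

items_view = d.items()         # Python 3: a view, not a copied list
before = len(items_view)

d["profession"] = "director"   # change the dictionary afterwards
after = len(items_view)        # the same view now sees the new entry

print(before, after)
print(("profession", "director") in items_view)
```

In Python 2, the list returned by items() would keep its original length after the dictionary changes.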


1
You need to iterate over the dictionary's .values() and pull out the value of "exercised_stock_options" from each one. A simple list comprehension retrieves these values.
>>> values = [value['exercised_stock_options'] for value in d.values()]
>>> values
[257817, 1729541]
>>> min(values)
257817
>>> max(values)
1729541
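The sample in the question uses the string 'NaN' for a missing bonus. If exercised_stock_options can hold the same placeholder (an assumption; the excerpt does not show one), min and max would raise TypeError in Python 3 when comparing strings with numbers. A hedged sketch that skips such entries follows; the third record is hypothetical.

```python
d = {
    'ALLEN PHILLIP K': {'exercised_stock_options': 1729541},
    'BADUM JAMES P': {'exercised_stock_options': 257817},
    # Hypothetical record, added only to show a missing value.
    'EXAMPLE PERSON': {'exercised_stock_options': 'NaN'},
}

# Keep only real numbers; the 'NaN' placeholder is a string and is dropped.
values = [
    value['exercised_stock_options']
    for value in d.values()
    if isinstance(value['exercised_stock_options'], (int, float))
]

print(min(values), max(values))
```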

0

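I released lifter a few weeks ago specifically for this kind of task, and I think you may find it useful.

The only catch is that you have a mapping (a dictionary of dictionaries) rather than a regular iterable.

Here is an answer using lifter: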

from lifter.models import Model

# We create a model representing our data
Person = Model('Person')

# We convert your data to a regular iterable
iterable = []
for name, data in your_data.items():
    data['name'] = name
    iterable.append(data)

# we load this into lifter
manager = Person.load(iterable)

# We query the data
results = manager.aggregate(
    (Person.exercised_stock_options, min),
    (Person.exercised_stock_options, max),
)

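Of course, you can get the same result with a list comprehension, but a dedicated library is sometimes convenient if you want to filter the data with complex queries before computing the results. For example, you could get the minimum and maximum only for people whose expenses are below 10000: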

# We filter the data
queryset = manager.filter(Person.expenses < 10000)

# we apply our aggregate on the filtered queryset
results = queryset.aggregate(
    (Person.exercised_stock_options, min),
    (Person.exercised_stock_options, max),
)
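For comparison, the same filter-then-aggregate step can be sketched in plain Python 3 without lifter. This uses the two records shown in the question and assumes expenses and exercised_stock_options are numeric for every record that is compared.

```python
your_data = {
    'ALLEN PHILLIP K': {'bonus': 4175000,
                        'exercised_stock_options': 1729541,
                        'expenses': 13868},
    'BADUM JAMES P': {'bonus': 'NaN',
                      'exercised_stock_options': 257817,
                      'expenses': 3486},
}

# Keep only the people whose expenses are below 10000.
filtered = [
    record['exercised_stock_options']
    for record in your_data.values()
    if record['expenses'] < 10000
]

# default=None avoids a ValueError when nothing passes the filter.
lowest = min(filtered, default=None)
highest = max(filtered, default=None)
print(lowest, highest)
```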
