Python: Can a dictionary be used for indexing?

Question

Tags: python, dictionary, indexing

5

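This is my first question on StackOverflow. I have searched many sites but couldn't find what I was looking for (or didn't notice it). Please don't discourage me :)

Also, this is my first time programming in Python, and I'm confused.

I have a text file with three space-separated columns: "DeptID", "CourseID" and "NumberofStudentsEnrolled".

Here is the sample data: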

101 10001 23
102 10002 30
102 10004 5
102 10005 13
105 10006 59
105 10007 77

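So whenever I pass a "DeptID" index and a "CourseID" index, the program should give the number of enrolled students.

For example: NumberofEnrolled("101", "10001") should give 23 as the answer.

Should I try using a matrix? I'm a bit lost. I know what I want, but I don't know what it is called in Python.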

import numpy

depts = []
courses = []

file = open("C:\\Info.txt", "r")

# SPLIT EVERY LINE INTO 3 PIECES : DeptID , CourseID , Enrolled
for line in file:
    depts.append(line.split()[0]) # ADD Depts
    courses.append(line.split()[1])  # ADD Courses

# CLOSE THE FILE
file.close()  

# I HAVE TRIED NUMPY BUT COULDN'T HANDLE WITH IT.
numpyList = numpy.zeros((57, 57), dtype = numpy.int32)    

dept_array = numpy.array(depts)
course_array = numpy.array(courses)


test_dict = {}
for i in range(len(dept_array)):
    test_dict[dept_array[i]] = course_array[i]

The test_dict output is as follows:

{'101': '10001', '102': '10005', '105': '10007'}

This output keeps only the last entry for each DeptID. I guess I need a type that can store multiple key-value pairs.

- Jason K.
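
A minimal sketch of why test_dict ends up with only three entries: a dictionary holds one value per key, so each later assignment to the same DeptID replaces the earlier one. The rows are the sample data from the question; the names rows and by_dept are illustrative, not from the original post.

```python
# Sample rows from the question: (DeptID, CourseID, NumberofStudentsEnrolled)
rows = [
    ("101", "10001", 23),
    ("102", "10002", 30),
    ("102", "10004", 5),
    ("102", "10005", 13),
    ("105", "10006", 59),
    ("105", "10007", 77),
]

# Keyed by DeptID alone: every repeated DeptID overwrites the previous course.
by_dept = {}
for dept, course, _enrolled in rows:
    by_dept[dept] = course

print(by_dept)       # {'101': '10001', '102': '10005', '105': '10007'}
print(len(by_dept))  # 3 entries, although there are 6 rows
```

Any fix therefore has to make the key unique per row (for example, by including the CourseID) or store a collection of courses under each DeptID.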

I'd suggest looking into DataFrames and pandas. - Vinícius Figueiredo

1

Can't I do this with numpy? - Jason K.

3

This can be done easily with a dictionary of dictionaries; there is no need for a heavyweight (numpy or pandas) solution. - donkopotamus

4 Answers

3

Others have already given you some options.

Since each (deptID, courseID) pair is unique, I'd suggest using a tuple as your key.

depts = dict()

depts[(101,10001)] = 23
depts[(102,10002)] = 30
depts[(102,10004)] = 5
depts[(102,10005)] = 13
depts[(105,10006)] = 59
depts[(105,10007)] = 77


print(depts)
# output (Python 3.7+, where dicts keep insertion order):
# {(101, 10001): 23, (102, 10002): 30, (102, 10004): 5, (102, 10005): 13, (105, 10006): 59, (105, 10007): 77}

print(depts.keys())
# output: dict_keys([(101, 10001), (102, 10002), (102, 10004), (102, 10005), (105, 10006), (105, 10007)])

# Should you ever need all the courses associated with a department ID, you
# can use filter() with a lambda or, more easily, a list comprehension.
# This scans every key, so it is an O(n) lookup, whereas a dictionary of
# dictionaries reaches a department's courses in O(1).
print([catalogue[1] for catalogue in depts.keys() if catalogue[0] == 102])
# output: [10002, 10004, 10005]


for (i, j) in depts.keys():
    print(depts[(i, j)])
# output:   23
# output:   30
# output:   5
# output:   13
# output:   59
# output:   77

- FredMan

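Sorry, I'm new to Python :) So can I do something like depts[(i, j)]? - Jason K.

Nothing to apologize for, man. If you plan to use Python more in the future, you may want to buy a copy of the Python Cookbook published by O'Reilly. It has helped a lot of people, and in my opinion it is a pretty good read. - FredMan

Thanks @FredMan. Can I do something like depts[(i, j)]? - Jason K.

Sure. Set i to some variable and j to some variable, then call depts[(i,j)] and it will find the number of students. I'll edit my post to add an example. - FredMan

Thank you very much, that was helpful :) - Jason K.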
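
The depts[(i, j)] lookup discussed in these comments can be sketched end to end, assuming the file has the three space-separated columns from the question. The helper names load_enrollment and number_of_enrolled are hypothetical, and io.StringIO stands in for the real file so the snippet runs on its own. Note that the keys here are strings, because they come straight from line.split(); the answer above uses integers.

```python
import io

# Stand-in for the real file; the contents are the sample rows from the question.
sample = io.StringIO(
    "101 10001 23\n"
    "102 10002 30\n"
    "102 10004 5\n"
    "102 10005 13\n"
    "105 10006 59\n"
    "105 10007 77\n"
)

def load_enrollment(lines):
    """Build a dict keyed by (dept_id, course_id) tuples."""
    enrollment = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        dept_id, course_id, enrolled = line.split()
        enrollment[(dept_id, course_id)] = int(enrolled)
    return enrollment

enrollment = load_enrollment(sample)

def number_of_enrolled(dept_id, course_id):
    """Return the enrolled count, or None if the pair is unknown."""
    return enrollment.get((dept_id, course_id))

i, j = "101", "10001"
print(enrollment[(i, j)])                  # 23
print(number_of_enrolled("102", "10004"))  # 5
print(number_of_enrolled("999", "10001"))  # None
```

Using dict.get avoids a KeyError for unknown pairs; indexing directly with enrollment[(i, j)] is fine when the pair is known to exist.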

1

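If you convert your data into a dictionary, this becomes much easier.

Open your info.txt file and save it as info.csv. The reason is that csv can easily handle spaces, commas and other delimiters.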

import csv

data_dict = {}
# You can change the delimiter if it's something other than a space.
with open("C:\\Info.txt", "r") as fobj:
    data = csv.reader(fobj, delimiter=' ')

    # Read the rows/lines of the file.
    for row in data:
        if row[0] in data_dict:
            # Department already seen: add the course to its dictionary.
            data_dict[row[0]][row[1]] = row[2]
        else:
            # First course for this department: create its dictionary.
            data_dict[row[0]] = {row[1]: row[2]}

def func(dept_id, course_id):
    # Check whether dept_id exists in the dictionary.
    if dept_id in data_dict:
        # Check whether course_id exists for that department.
        if course_id in data_dict[dept_id]:
            return data_dict[dept_id][course_id]
        else:
            print('Invalid course id')
    else:
        print('Invalid department id')

# Values are stored as strings, so this prints 23.
print(func('101', '10001'))

- Abhinav Anand

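As it stands this code is wrong: data_dict[row[0]] = {row[1]: row[2]} overwrites the value associated with row[0] instead of adding new data to it. - donkopotamus

@donkopotamus Thanks for pointing that out. I have corrected it. - Abhinav Anand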

0

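If you really want to use both DeptID and CourseID, it seems you need a two-dimensional lookup table (not something built into Python): first look up the DeptID in a dictionary, which gives you a table (a dictionary) of the courses in that department and their enrollment numbers.

That is a little inefficient, though. I have a feeling all the CourseIDs are unique; if so, could you just look up by CourseID alone?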

- Joseph Perez
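
A minimal sketch of the CourseID-only lookup suggested above. It relies on the assumption that every CourseID is unique, which holds for the sample data but is not confirmed for the full file, so the loop raises an error if that assumption is violated. The name by_course is illustrative.

```python
# Sample rows from the question: (DeptID, CourseID, NumberofStudentsEnrolled)
rows = [
    ("101", "10001", 23),
    ("102", "10002", 30),
    ("102", "10004", 5),
    ("102", "10005", 13),
    ("105", "10006", 59),
    ("105", "10007", 77),
]

# Key by CourseID alone and keep the department next to the count.
by_course = {}
for dept, course, enrolled in rows:
    if course in by_course:
        # The uniqueness assumption is broken; fail loudly instead of overwriting.
        raise ValueError(f"CourseID {course} appears more than once")
    by_course[course] = (dept, enrolled)

dept, enrolled = by_course["10004"]
print(dept, enrolled)  # 102 5
```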

- donkopotamus · Accepted Answer

You can easily read the data into a dictionary of dictionaries:

data = {}
for line in file:
    dept, course, num_students = line.split()
    data.setdefault(dept, {})[course] = int(num_students)

Now you can look things up like this:

>>> data["101"]["10001"]
23
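
For a version that runs without the original file, here is a sketch of the same setdefault approach with io.StringIO standing in for the open file (an assumption made only so the snippet is self-contained). It also shows two things the nested layout makes cheap: listing one department's courses and looking up a missing course without raising KeyError.

```python
import io

# Stand-in for the open file; the contents are the sample rows from the question.
sample = io.StringIO(
    "101 10001 23\n"
    "102 10002 30\n"
    "102 10004 5\n"
    "102 10005 13\n"
    "105 10006 59\n"
    "105 10007 77\n"
)

data = {}
for line in sample:
    dept, course, num_students = line.split()
    # setdefault returns the department's inner dict, creating it on first use.
    data.setdefault(dept, {})[course] = int(num_students)

print(data["101"]["10001"])              # 23
print(data["102"])                       # {'10002': 30, '10004': 5, '10005': 13}
print(data.get("102", {}).get("99999"))  # None (unknown course, no KeyError)
print(sum(data["105"].values()))         # 136 students across department 105
```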