递归函数实现深度优先搜索

Question

递归函数实现深度优先搜索

4

我将尝试编写一个可以从命令行初始化的递归函数，它接受一个文本文件，其中包含一系列起始点、终点和这些点之间的距离，并找到从特定起始点到特定终点的最短距离。

例如，一个文本文件可能如下所示：

a,b,5
a,c,8
b,d,6
c,d,2
d,e,12
d,f,2
e,g,3
f,g,7

并且将类似以下方式调用：

python program_name.py a b text_file_name.txt

这个想法基于我即将参加的下一学期的一个项目，我希望提前开始。教授提供了一个广泛的“起始代码”，可以在这里找到：here。

我已经尝试过多种不同的方法来实现递归函数，以便遍历文本文件并记录和比较点之间的距离，但是我似乎做不对。现在我的代码（非常明显是错误的）是：

if place not in distances:
    print('not in')
    distances[place] =roads[place]
    dist_so_far = distances[place]     
    dfs(place, 0.0, roads, distances)
elif place in distances and distances[place] <= dist_so_far:
    print('less than')
    #dfs(place,0.0, roads, distances)
elif place in distances and distances[place] > dist_so_far:
    print('greater than')
    distances[place] = dist_so_far
    dfs(place, 0.0, roads, distances)

我知道这不是正确的做法，但我认为它的格式是一个很好的起点。我只是无法理解哪些字典包含什么以及要比较哪些索引。

- Crakajaxz

1

提醒一下，你不应该在stackoverflow上发布课程项目。教授会找到你的。 - Kristopher Wagner

1

但由于这不是一项课堂作业，而是他自己处理的个人项目，以便提前开始课程，因此它实际上是 Stack Overflow 上的主题。与我们收到的大多数“请帮我做作业”的问题不同，这里的 OP 实际上正在尝试独立完成工作，并询问他不理解的事情的帮助。干得好，@Crakajaxz。 - rmunn

1

@rmunn非常准确，谢谢。我还注意到在那个任务页面上有一个作者的位置，所以如果这确实成为下学期的一个项目，我计划链接这个问题以提供适当的文档。并不是想隐藏什么或作弊。 - Crakajaxz

@Crakajaxz - 那是一种极好的态度，这正是我感到有动力给你写一份详细答案的原因。此外，如果你需要比我刚刚给你的更多帮助，你应该将其作为一个单独的问题提出来，因为Stack Overflow更喜欢每个问题都只涉及一个主题，并且每个人只发布一个答案。如果你确实提出了一个单独的问题，请随意在这里发表评论提到我的用户名（就像你刚才做的那样），并提供一个链接到你的新问题。我也很乐意回答它。 - rmunn

1个回答

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- rmunn · Accepted Answer

我似乎无法理解哪些字典包含什么内容以及需要比较哪些索引，这让我感到十分疑惑！我将尝试解释教授发布的起始代码及其作用，但不会直接给出解决方案（因为通过自己解决问题可以学到更多）。请看此处的起始代码。

def read_distances(map_file):
    connections = dict()
    # ....
    return connections

该函数正在构建一个字典，您可以使用它来查找任何两个点之间是否连接以及它们之间的距离。因此，如果输入文本文件中有一条线路"A,B,5"，则read_distances的结果将创建两个条目，一个为A，其值为("B",5)，另一个为B，其值为("A",5)。还请注意，该字典中的每个条目都将是一个连接列表。换句话说：

# Assume that map_file contained one line: "A,B,5"
distances = read_distances(map_file)
print(distances["A"])  # Will print "[('B',5)]"
print(distances["B"])  # Will print "[('A',5)]"

如果有多个连接，例如文本文件包含以下内容：

A,B,5
A,C,3
B,C,4

然后，您会得到类似于以下内容的东西：

distances = read_distances(map_file)
print(distances["A"])  # Will print "[('B',5),('C',3)]"
print(distances["B"])  # Will print "[('A',5),('C',4)]"
print(distances["C"])  # Will print "[('A',3),('B',4)]"

当你获取distances[starting_point]时，你会得到一个列表，其中包含所有与你的起点有单个连接的点。该列表由2元组（即具有2个元素的元组）组成，每个元组都具有结构（other_point，distance_as_int）。

我认为我应该在这里停下了，因为这可能是刚好足够帮助你解决目前的问题而不为你解决练习的内容。（我刚刚删除了一段我写的“这是我建议如何解决这个问题”的部分，因为我意识到除非你要求，否则我不应该给你那样的帮助。）如果你需要更多的帮助，请在这个答案上或者你的问题上留言，我应该能收到通知。我很乐意为你提供更多帮助，特别是因为你在课程开始之前就尝试自己解决它，这是正确的方法。

更新1： 好的，还有一个提示不能为你解决问题。当你查找distances[starting_point]并获得一个元组列表时，你将想要遍历该列表并对每个元组执行某些操作。例如，

connections = distances[start_point]
for connection in connections:
    end_point = connection[0]
    distance = connection[1]
    # Now do something with start_point, end_point, and distance
    # Precisely *what* you'll do with them is up to you

这段代码可以简化一下，因为Python有一个很好的“元组拆包”功能：如果你正在循环遍历像“connections”列表这样生成元组的东西，你可以这样做：

connections = distances[start_point]
for end_point, distance in connections:
    # Now do something with start_point, end_point, and distance
    # Precisely *what* you'll do with them is up to you

这将自动为您“解包”元组。请注意，仅当您知道所有元组都具有相同数量的项目（在本例中为2）时才起作用。还有一件事情可以做，那就是注意到我们实际上不需要那个connections变量，因为我们除了循环之外没有使用它。消除它，代码变成：

for end_point, distance in distances[start_point]:
    # Now do something with start_point, end_point, and distance
    # Precisely *what* you'll do with them is up to you

这是编写该特定循环的最清晰、最“Pythonic”的方式。

注意：上面的代码实际上不能直接运行，因为Python要求任何循环都必须至少有一个语句在其中，而注释不算作语句。为了使该代码实际运行，您需要在循环中包含pass语句。pass语句是一种特殊语句，它什么也不做。所以要真正运行上面的空循环代码，您需要编写：

for end_point, distance in distances[start_point]:
    # Now do something with start_point, end_point, and distance
    # Precisely *what* you'll do with them is up to you
    pass

那个循环将被允许，而没有单词pass的代码会产生一个IndentationError。为了简化，我在上面的所有示例中都省略了pass，但我认为值得提一下为什么这段代码不能按原样运行。

更新2：

根据要求，这里是一个解决方案函数，以便您可以逐步进行并理解正在发生的情况。我放了大量注释，但没有注释，这只有八行代码。

def dfs(place, dist_so_far, roads, distances):
    """Depth-first search, which may continue from from_place if dist_so_far
        is the shortest distance at which it has yet been reached.
       Args:
          place: Currently searching from here
          dist_so_far:  Distance at which from_place has been reached
              this time (which may not be the shortest path to from_place)
          roads:  dict mapping places to lists of hops of the form (place, hop-distance)
          distances: dict mapping places to the shortest distance at which they
               have been reached so far (up to this time).
    """
    #FIXME
    #   Consider cases:
    #      - We've never been at place before (so it's not in distances)
    #      - We've been at place before, on a path as short as this one (in distances)
    #      - We've been here before, but this way is shorter (dist_so_far)
    #    Consider which are base cases, and which require recursion.
    #    For the cases that require recursion, what is the progress step?

    # First scenario: we've never reached this place before
    if place not in distances:
        # Right now we only know one way to get to this place,
        # so that's automatically the shortest known distance.
        distances[place] = dist_so_far

    # Second scenario: we've been here before, via a route
    # that was shorter than dist_so_far. If so, then any
    # roads from here lead to places we've also already
    # visited via a shorter route. Any distance we calculate
    # right now would just be longer than the distance we've
    # already found, so we can just stop right now!
    if dist_so_far > distances[place]:
        return

    # Third scenario: dist_so_far is actually the shortest
    # path we've found yet. (The first scenario is actually
    # a special case of this one!) We should record the
    # shortest distance to this place, since we'll want to
    # use that later. Then we'll look at all the roads from
    # this place to other places, and for each of those
    # other places, we'll make a recursive call to figure
    # out more paths.

    # Note no "if" statement needed: because of the return
    # statement earlier, if we get here, we know that the
    # current route is the best one yet known.
    distances[place] = dist_so_far

    # Now for some recursion:
    for other_place, hop_distance in roads[place]:
        dist_to_other_place = dist_so_far + hop_distance
        dfs(other_place, dist_to_other_place, roads, distances)

    # That's it!

没错，就是这样。我对你提供的样本距离文件运行了这个算法，并找到了每对点之间的最短距离。尝试在脑海中逐步运行算法，看看你是否理解它的原理。

然而，有一个关键的概念可能在你第一次看到时并不明显。这个关键概念是： Python 字典是持久的和可变的。也就是说，如果你将一个字典对象（例如本示例中的 distances 字典）传递给一个函数，并且该函数修改了字典，则 你传递到函数的字典将被修改。

顺便说一下，专业程序员倾向于认为这是一件坏事，因为调用函数不应该意外地修改参数。这往往会导致程序中难以追踪的微妙错误，并且通常应该避免。（请注意上述句子中的意外地一词。但是，如果你调用的函数名为add_value_to_dict，那么你很可能期望你的字典被修改。）

然而，在这种特殊情况下，通常被认为是一个不好的副作用的字典修改效应，对于编写高效的代码至关重要。 distances 字典被用来跟踪我们已经找到了什么，并查看是否有剩余的工作要做。由于你期望它会被递归调用的 dfs() 函数修改，所以你不会为自己创建微妙的错误。但是我不希望你认为，在函数参数中修改字典的技巧在任何时候都是一个好主意。大多数情况下，它会导致你发现直到几个月后才能解决的微妙错误。

好了，这可能已经足够的解释了。尝试在脑海中逐步运行此代码并理解它。如果有什么困惑，请告诉我。