Python最短距离算法

3

我希望创建一个简单的广度优先搜索算法,该算法返回最短路径。

演员信息字典将演员映射到演员出现的电影列表:

actor_info = { "act1" : ["movieC", "movieA"], "act2" : ["movieA", "movieB"], 
     "act3" :["movieA", "movieB"], "act4" : ["movieC", "movieD"], 
     "act5" : ["movieD", "movieB"], "act6" : ["movieE"], 
     "act7" : ["movieG", "movieE"], "act8" : ["movieD", "movieF"], 
     "KevinBacon" : ["movieF"], "act10" : ["movieG"], "act11" : ["movieG"] }

这个的反函数将电影映射到出现在它们中的演员列表:
movie_info = {'movieB': ['act2', 'act3', 'act5'], 'movieC': ['act1', 'act4'], 
      'movieA': ['act1', 'act2', 'act3'], 'movieF': ['KevinBacon', 'act8'], 
      'movieG': ['act7', 'act10', 'act11'], 'movieD': ['act8', 'act4', 'act5'], 
      'movieE': ['act6', 'act7']}

所以针对一个调用
shortest_dictance("act1", "Kevin Bacon", actor_info, movie_info)

我应该得到 3,因为act1出现在movieC中,并与出现在movieD中的Act4一起出现,而Act8又出现在与KevinBacon一起出现的movie F中。所以最短距离是3。

到目前为止,我有以下内容:

def shotest_distance(actA, actB, actor_info, movie_info):
   '''Return the number of movies required to connect actA and actB. 
   If theres no connection return -1.'''

    # So we keep 2 lists of actors:
    #   1.The actors that we have already investigated.
    #   2.The actors that need to be investigated because we have found a 
    #      connection beginning at actA. This list must be 
    #      ordered, since we want to investigate actors in the order we 
    #      discover them.
    #      -- Each time we put an actor in this list, we also store
    #         her distance from actA.
    investigated = []
    to_investigate = [actA]
    distance = 0
    while actB not in to_investigate and to_investigate!= []:
        for actor in to_investigate:
            to_investigated.remove(actA)
            investigated.append(act)

            for movie in actor_info[actor]:
                for co_star in movie_info[movie]:
                    if co_star not in (investigated and to_investigate):
                        to_investigate.append(co_star)

 ....
 ....

 return d    

我想不出一个合适的方法来跟踪代码每次迭代中发现的距离。另外,代码在时间上似乎非常低效。


这是一道面试题吗? - anijhaw
2个回答

2

首先,将所有节点连接成一个图形,然后运行 shortest_path 代码(可能有一种高效的图形库可以代替下面提到的函数,但是这个函数很优雅),然后从最短路径中找出所有电影名称的数量。

for i in movie_info:
    actor_info[i] = movie_info[i]

def find_shortest_path(graph, start, end, path=[]):
    path = path + [start]
    if start == end:
        return path
    if not start in graph:
        return None
    shortest = None
    for node in graph[start]:
        if node not in path:
            newpath = find_shortest_path(graph, node, end, path)
            if newpath:
                if not shortest or len(newpath) < len(shortest):
                    shortest = newpath
    return shortest

L = find_shortest_path(actor_info, 'act1', 'act2')
print len([i for i in L if i in movie_info])

find_shortest_path 源代码:http://www.python.org/doc/essays/graphs/


该函数用于查找图中两个节点之间的最短路径。

2
编程提示:使用“start in graph”而不是“graph.has_key(start)”。 - Raymond Hettinger

1

这看起来像是有效的。它跟踪当前一组电影。对于每个步骤,它查看所有尚未被考虑(“seen”)的一步之遥的电影。

actor_info = { "act1" : ["movieC", "movieA"], "act2" : ["movieA", "movieB"], 
     "act3" :["movieA", "movieB"], "act4" : ["movieC", "movieD"], 
     "act5" : ["movieD", "movieB"], "act6" : ["movieE"], 
     "act7" : ["movieG", "movieE"], "act8" : ["movieD", "movieF"], 
     "KevinBacon" : ["movieF"], "act10" : ["movieG"], "act11" : ["movieG"] }

movie_info = {'movieB': ['act2', 'act3', 'act5'], 'movieC': ['act1', 'act4'], 
      'movieA': ['act1', 'act2', 'act3'], 'movieF': ['KevinBacon', 'act8'], 
      'movieG': ['act7', 'act10', 'act11'], 'movieD': ['act8', 'act4', 'act5'], 
      'movieE': ['act6', 'act7']}

def shortest_distance(actA, actB, actor_info, movie_info):
    if actA not in actor_info:
        return -1  # "infinity"
    if actB not in actor_info:
        return -1  # "infinity"
    if actA == actB:
        return 0

    dist = 1
    movies = set(actor_info[actA])
    end_movies = set(actor_info[actB])
    if movies & end_movies:
        return dist

    seen = movies.copy()
    print "All movies with", actA, seen
    while 1:
        dist += 1
        next_step = set()
        for movie in movies:
            for actor in movie_info[movie]:
                next_step.update(actor_info[actor])
        print "Movies with actors from those movies", next_step
        movies = next_step - seen 
        print "New movies with actors from those movies", movies
        if not movies:
            return -1 # "Infinity"

        # Has actorB been in any of those movies?
        if movies & end_movies:
            return dist

        # Update the set of seen movies, so I don't visit them again
        seen.update(movies)

if __name__ == "__main__":
    print shortest_distance("act1", "KevinBacon", actor_info, movie_info)

输出结果为

All movies with act1 set(['movieC', 'movieA'])
Movies with actors from those movies set(['movieB', 'movieC', 'movieA', 'movieD'])
New movies with actors from those movies set(['movieB', 'movieD'])
Movies with actors from those movies set(['movieB', 'movieC', 'movieA', 'movieF', 'movieD'])
New movies with actors from those movies set(['movieF'])
3

这里有一个版本,它返回一个电影列表,组成最小连接(如果actA和actB相同,则为无连接,如果没有连接则为空列表)。
def connect(links, movie):
    chain = []
    while movie is not None:
        chain.append(movie)
        movie = links[movie]
    return chain

def shortest_distance(actA, actB, actor_info, movie_info):
    if actA not in actor_info:
        return None  # "infinity"
    if actB not in actor_info:
        return None  # "infinity"
    if actA == actB:
        return []

    # {x: y} means that x is one link outwards from y
    links = {}

    # Start from the destination and work backward
    for movie in actor_info[actB]:
        links[movie] = None
    dist = 1
    movies = links.keys()

    while 1:
        new_movies = []
        for movie in movies:
            for actor in movie_info[movie]:
                if actor == actA:
                    return connect(links, movie)
                for other_movie in actor_info[actor]:
                    if other_movie not in links:
                        links[other_movie] = movie
                        new_movies.append(other_movie)
        if not new_movies:
            return None # Infinity
        movies = new_movies

if __name__ == "__main__":
    dist = shortest_distance("act1", "KevinBacon", actor_info, movie_info)
    if dist is None:
        print "Not connected"
    else:
        print "The Kevin Bacon Number for act1 is", len(dist)
        print "Movies are:", ", ".join(dist)

这是输出结果:
The Kevin Bacon Number for act1 is 3
Movies are: movieC, movieD, movieF

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接