Compute L2 distance with NumPy using matrix multiplication

I'm working through the assignments of Stanford's CS231n 2017 course on convolutional neural networks on my own.
I'm trying to compute the L2 distance with NumPy using only matrix multiplication and sum broadcasting. The L2 distance is:

d(x, y) = sqrt( sum_k (x_k - y_k)^2 )

If I expand this formula, I think I can compute it as:

d(x, y) = sqrt( sum_k x_k^2 - 2 * sum_k x_k*y_k + sum_k y_k^2 )
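
A minimal sketch (with small made-up arrays, not the assignment's data) that checks the expanded form against the direct definition:

import numpy as np

a = np.random.rand(3, 5)   # e.g. 3 "test" points of dimension 5
b = np.random.rand(4, 5)   # e.g. 4 "training" points of dimension 5

# Direct definition: sqrt of the summed squared differences (3-D broadcast).
direct = np.sqrt(np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1))

# Expanded form: ||a||^2 - 2*a.b + ||b||^2 per pair, then sqrt.
expanded = np.sqrt(np.sum(a ** 2, axis=1)[:, None]
                   - 2 * a.dot(b.T)
                   + np.sum(b ** 2, axis=1))

print(np.allclose(direct, expanded))  # True, up to floating-point error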

The code below shows three ways of computing the L2 distance. If I compare the output of compute_distances_two_loops with the output of compute_distances_one_loop, they are equal. However, if I compare the output of compute_distances_two_loops with the output of compute_distances_no_loops, where I implement the L2 distance using only matrix multiplication and sum broadcasting, they differ.
def compute_distances_two_loops(self, X):
    """
Compute the distance between each test point in X and each training point
in self.X_train using a nested loop over both the training data and the 
test data.

Inputs:
- X: A numpy array of shape (num_test, D) containing test data.

Returns:
- dists: A numpy array of shape (num_test, num_train) where dists[i, j]
  is the Euclidean distance between the ith test point and the jth training
  point.
"""
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in xrange(num_test):
        for j in xrange(num_train):
            #####################################################################
            # TODO:                                                             #
            # Compute the l2 distance between the ith test point and the jth    #
            # training point, and store the result in dists[i, j]. You should   #
            # not use a loop over dimension.                                    #
            #####################################################################
            #dists[i, j] = np.sqrt(np.sum((X[i, :] - self.X_train[j, :]) ** 2))
            dists[i, j] = np.sqrt(np.sum(np.square(X[i, :] - self.X_train[j, :])))
            #####################################################################
            #                       END OF YOUR CODE                            #
            #####################################################################
    return dists

def compute_distances_one_loop(self, X):
    """
Compute the distance between each test point in X and each training point
in self.X_train using a single loop over the test data.

Input / Output: Same as compute_distances_two_loops
"""
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in xrange(num_test):
        #######################################################################
        # TODO:                                                               #
        # Compute the l2 distance between the ith test point and all training #
        # points, and store the result in dists[i, :].                        #
        #######################################################################
        dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis = 1))
        #######################################################################
        #                         END OF YOUR CODE                            #
        #######################################################################
    print(dists.shape)
    return dists

def compute_distances_no_loops(self, X):
    """
Compute the distance between each test point in X and each training point
in self.X_train using no explicit loops.

Input / Output: Same as compute_distances_two_loops
"""
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #########################################################################
    # TODO:                                                                 #
    # Compute the l2 distance between all test points and all training      #
    # points without using any explicit loops, and store the result in      #
    # dists.                                                                #
    #                                                                       #
    # You should implement this function using only basic array operations; #
    # in particular you should not use functions from scipy.                #
    #                                                                       #
    # HINT: Try to formulate the l2 distance using matrix multiplication    #
    #       and two broadcast sums.                                         #
    #########################################################################
    dists = np.sqrt(-2 * np.dot(X, self.X_train.T) +
                    np.sum(np.square(self.X_train), axis=1) +
                    np.sum(np.square(X), axis=1)[:, np.newaxis])
    print(dists.shape)
    #########################################################################
    #                         END OF YOUR CODE                              #
    #########################################################################
    return dists

You can find the complete testable code here.

Do you know what I'm doing wrong in compute_distances_no_loops, or anywhere else?

UPDATE:

The code that throws the error message is:

dists_two = classifier.compute_distances_no_loops(X_test)

# check that the distance matrix agrees with the one we computed before:
difference = np.linalg.norm(dists - dists_two, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')

And the error message:

Difference was: 372100.327569
Uh-oh! The distance matrices are different

Could you double-check the sign in the dot product? I think it should be something like X.dot(self.X_train.T). Could that be the cause? - sharathnatraj
I can't reproduce your error; when I try np.allclose(compute_distances_no_loops(Y, Z), compute_distances_one_loop(Y, Z)) it returns True. - Dani Mesejo
I get the error after running the compute_distances_no_loops method. - VansFannel
@DaniMesejo I have updated the question with the code that raises the error and the error message. - VansFannel
I think you need to use broadcast multiplication instead of the dot product. - Ilia Novoselov
5 Answers


Here is how you can compute the pairwise distances between the rows of X and Y without creating any 3-dimensional matrices:

def dist(X, Y):
    sx = np.sum(X**2, axis=1, keepdims=True)
    sy = np.sum(Y**2, axis=1, keepdims=True)
    return np.sqrt(-2 * X.dot(Y.T) + sx + sy.T)
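
A quick way to sanity-check this (with small made-up arrays, not the CIFAR data) is to compare it against a plain double loop:

X = np.random.rand(5, 3)
Y = np.random.rand(7, 3)
naive = np.array([[np.sqrt(np.sum((x - y) ** 2)) for y in Y] for x in X])
print(np.allclose(dist(X, Y), naive))  # True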

Thank you very much! Your code works perfectly. However, I don't know why, I still get the same message: "Difference was: 372100.327569 - Uh-oh! The distance matrices are different". - VansFannel
One thing that helped me understand the numpy "magic" happening in the operation sx + sy.T: sx = np.sum(X**2, axis=1, keepdims=True); print("sx.shape", sx.shape); # sx.shape (500, 1); sy = np.sum(Y**2, axis=1, keepdims=True); print("sy.T.shape", sy.T.shape); # sy.T.shape (1, 5000); square_sums = sx + sy.T; print("square_sums.shape", square_sums.shape); # square_sums.shape (500, 5000); - Moshe Jonathan Gordon Radian
Note that this works because, if you expand (X-Y)^2 = (X-Y)*(X-Y), you get X^2 + Y^2 - 2XY. - Luke


Late reply, but I solved it in a different way and want to post it. When I was working on this, I didn't know that numpy can subtract a column or a row vector from a matrix. As it turns out, you can subtract an nx1 or a 1xm vector from an nxm matrix, and it gets subtracted from every row or column vector accordingly (a short sketch of that broadcasting behavior follows after the code below). If the library you are working with doesn't support this behavior, you can use the approach here. For this case I worked out the math, and the result is the following:

sum_x_train=np.sum(self.X_train**2,axis=1, keepdims=True)
sum_x_test=np.sum(X**2,axis=1, keepdims=True)
sum_2x_tr_te=np.dot(self.X_train,X.T)*2
sum_x_train=np.dot(sum_x_train,np.ones((1,X.shape[0])))
sum_x_test=np.dot(sum_x_test,np.ones((1,self.X_train.shape[0])))
dists=np.sqrt(sum_x_test.T+sum_x_train-sum_2x_tr_te).T

The downside of this approach is that it uses more memory.
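
For reference, a small sketch of the broadcasting behavior mentioned above (a made-up 3x4 matrix, not part of this answer's code):

M = np.arange(12, dtype=float).reshape(3, 4)   # an n x m matrix (3 x 4)
col = np.array([[10.0], [20.0], [30.0]])       # an n x 1 column vector
row = np.array([[1.0, 2.0, 3.0, 4.0]])         # a 1 x m row vector

print(M - col)   # col is subtracted from every column of M
print(M - row)   # row is subtracted from every row of M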


I think you are looking for pairwise distances.

There is an amazing trick to do it in a single line of code. You have to cleverly use the broadcasting feature:

X_train = np.expand_dims(self.X_train, 1) # shape: [num_train, 1, D]
X_test = np.expand_dims(X, 0) # shape: [1, num_tests, D]
dists = np.square(X_train - X_test) # Thanks to broadcast [num_train, num_tests, D]
dists = np.sqrt(np.sum(dists, axis=-1)) # [num_train, num_tests]
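
Note that the result above has shape [num_train, num_tests], while the assignment's dists is (num_test, num_train), so a transpose is needed at the end. A toy usage sketch (tiny made-up arrays, to sidestep the memory issue discussed in the comments below):

X_train_toy = np.random.rand(6, 4)   # 6 "training" points
X_test_toy = np.random.rand(3, 4)    # 3 "test" points

d = np.square(np.expand_dims(X_train_toy, 1) - np.expand_dims(X_test_toy, 0))
d = np.sqrt(np.sum(d, axis=-1))
print(d.shape)   # (6, 3); use d.T for the (num_test, num_train) layout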

Thank you for your answer. I have tested it, but the line dists = np.square(X_train - X_test) # Thanks to broadcast [num_train, num_tests, D] throws the error MemoryError: Unable to allocate 7.15 GiB for an array with shape (5000, 500, 3072) and data type uint8 - VansFannel
That is because you are loading a lot of data into memory. However, if you try any other method to compute the pairwise distances, you will run into the same problem. - Guillem
This code is part of a university assignment, and I don't think the professors would make it impossible to run unless you have a lot of memory. - VansFannel


I believe the problem is inconsistent array shapes.

#a^2 matrix (500, 1) 
alpha = np.sum(np.square(X), axis=1)
alpha = alpha.reshape(len(alpha), 1)
print(alpha.shape)
#b^2 matrix (1, 5000) 
beta  = np.sum(np.square(self.X_train.T), axis=0)
beta = beta.reshape(1, len(beta))
print(beta.shape)
#ab matrix (500, 5000)
alphabeta = np.dot(X, self.X_train.T)
print(alphabeta.shape)

dists = np.sqrt(-2 * alphabeta + alpha + beta)

This is my solution for the compute_distances_no_loops() function that the OP asked about. I don't use the sqrt() function for performance reasons:
def compute_distances_no_loops(self, X):
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #---------------
    # Get square of X and X_train
    X_sq = np.sum(X**2, axis=1, keepdims=True) 
    Xtrain_sq = np.sum(self.X_train**2, axis=1, keepdims=True)
    # Calculate (squared) dists as (X_train - X)**2 = X_train**2 - 2*X_train*X + X**2
    dists = -2*X.dot(self.X_train.T) + X_sq + Xtrain_sq.T
    #---------------  
    return dists
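
Since sqrt() is skipped here, the returned values are squared distances; to compare against the two-loop output (as in the check from the question), the square root would still have to be applied at the end. A minimal sketch of that last step:

dists = classifier.compute_distances_no_loops(X_test)
dists = np.sqrt(dists)   # convert squared distances back to L2 distances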

Please provide additional details in your answer. As it is currently written, it is hard to understand your solution. - Community
Thank you @Community! I have edited my answer. I hope it is clearer now. - tqbinh
