How to send/receive data in MPI using all processors


This program is written in the C language with MPI. I am fairly new to MPI and want to use all processors for some calculations, including process 0. To learn the concept I wrote the simple program below. But the program hangs after receiving input from process 0 and does not send the results back to process 0.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {    
    MPI_Init(&argc, &argv);
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int number;
    int result;
    if (world_rank == 0) 
    {
        number = -2;
        int i;
        for(i = 0; i < 4; i++)
        {
            MPI_Send(&number, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
        }
        for(i = 0; i < 4; i++)
        {           /* Error: can't get the results sent by the other processes below */
            MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("Process 0 received number %d from i:%d\n", number, i);
        }
    } 
    /*I want to do this without using an else statement here, so that I can use process 0 to do some calculations as well*/

    MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE); 
    printf("*Process %d received number %d from process 0\n",world_rank, number);
    result = world_rank + 1;
    MPI_Send(&result, 1, MPI_INT, 0, 99, MPI_COMM_WORLD);  /* problem happens here when trying to send result back to process 0*/

    MPI_Finalize();
}

Compiling and running gives:

:$ mpicc test.c -o test
:$ mpirun -np 4 test

*Process 1 received number -2 from process 0
*Process 2 received number -2 from process 0
*Process 3 received number -2 from process 0
/* hangs here and will not continue */

Please show me with an example or edit the code above, if possible.
2 Answers

I don't really see what the problem would be with using two if statements around the work sections; a minimal sketch of that structure is shown right below.
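Just to illustrate what I mean (this sketch is mine, not taken from your code): process 0 keeps its own copy of the value and only exchanges messages with ranks 1 and up, while the work section in the middle is executed by every rank, including 0.

#include <mpi.h>
#include <stdio.h>

int main( int argc, char *argv[] ) {

    MPI_Init( &argc, &argv );
    int world_rank;
    MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );
    int world_size;
    MPI_Comm_size( MPI_COMM_WORLD, &world_size );

    // First block: process 0 distributes the input, the other ranks receive it
    int number = -2;
    if ( world_rank == 0 ) {
        for ( int i = 1; i < world_size; i++ ) {   // skip rank 0: it already has the value
            MPI_Send( &number, 1, MPI_INT, i, 0, MPI_COMM_WORLD );
        }
    } else {
        MPI_Recv( &number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
    }

    // Shared work section: every rank, including 0, computes its result
    int result = world_rank + 1;

    // Second block: process 0 collects the results, the other ranks send theirs
    if ( world_rank == 0 ) {
        printf( "Process 0 computed result %d itself\n", result );
        for ( int i = 1; i < world_size; i++ ) {
            int r;
            MPI_Recv( &r, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
            printf( "Process 0 received result %d from process %d\n", r, i );
        }
    } else {
        MPI_Send( &result, 1, MPI_INT, 0, 99, MPI_COMM_WORLD );
    }

    MPI_Finalize();

    return 0;
}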
But anyway, here is an example. I modified your code to use collective communications, since they make much more sense than the series of sends/receives you used. Since the initial communication sends a uniform value to everybody, I used an MPI_Bcast(), which does it in a single call. Conversely, since the result values are all different, an MPI_Gather() call is the right fit for collecting them. I also introduced a call to sleep(), just to simulate that the processes work for a while before sending back their results.
The code now looks like this:
#include <mpi.h>
#include <stdlib.h>   // for malloc and free
#include <stdio.h>    // for printf
#include <unistd.h>   // for sleep

int main( int argc, char *argv[] ) {

    MPI_Init( &argc, &argv );
    int world_rank;
    MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );
    int world_size;
    MPI_Comm_size( MPI_COMM_WORLD, &world_size );

    // sending the same number to all processes via broadcast from process 0
    int number = world_rank == 0 ? -2 : 0;
    MPI_Bcast( &number, 1, MPI_INT, 0, MPI_COMM_WORLD );
    printf( "Process %d received %d from process 0\n", world_rank, number );

    // Do something useful here
    sleep( 1 );
    int my_result = world_rank + 1;

    // Now collecting individual results on process 0
    int *results = world_rank == 0 ? malloc( world_size * sizeof( int ) ) : NULL;
    MPI_Gather( &my_result, 1, MPI_INT, results, 1, MPI_INT, 0, MPI_COMM_WORLD );

    // Process 0 prints what it collected
    if ( world_rank == 0 ) {
        for ( int i = 0; i < world_size; i++ ) {
            printf( "Process 0 received result %d from process %d\n", results[i], i );
        }
        free( results );
    }

    MPI_Finalize();

    return 0;
}

It compiles as follows:
$ mpicc -std=c99 simple_mpi.c -o simple_mpi

And runs, giving the following result:
$ mpiexec -n 4 ./simple_mpi
Process 0 received -2 from process 0
Process 1 received -2 from process 0
Process 3 received -2 from process 0
Process 2 received -2 from process 0
Process 0 received result 1 from process 0
Process 0 received result 2 from process 1
Process 0 received result 3 from process 2
Process 0 received result 4 from process 3

Actually, processes 1-3 do send their results back to process 0. However, process 0 gets stuck in the first iteration of this loop:
for(i=0; i<4; i++)
{      
    MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("Process 0 received number %d from i:%d\n", number, i);
}

In the first MPI_Recv call, process 0 blocks waiting to receive a message with tag 99 from itself, which it has not sent yet.
In general, it is a bad idea for a process to send/receive messages to itself, especially with blocking calls. Process 0 already has the value in memory; it does not need to send it to itself. (See the note below the output for a safe way to do a self-exchange when one is genuinely needed.)
One fix, however, is to start the receive loop at i = 1:
for(i=1; i<4; i++)
{           
    MPI_Recv(&number, 1, MPI_INT, i, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("Process 0 received number %d from i:%d\n", number, i);
}

Running the code now gives the following result:
Process 1 received number -2 from process 0
Process 2 received number -2 from process 0
Process 3 received number -2 from process 0
Process 0 received number 2 from i:1
Process 0 received number 3 from i:2
Process 0 received number 4 from i:3
Process 0 received number -2 from process 0
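As a side note: if a process ever does need to exchange a message with itself using blocking semantics, MPI_Sendrecv is one way to do it safely, because the send and the receive behave as if they were issued concurrently and therefore cannot deadlock on each other. A minimal, purely illustrative sketch (each rank simply passes a value to itself through distinct buffers):

#include <mpi.h>
#include <stdio.h>

int main( int argc, char *argv[] ) {

    MPI_Init( &argc, &argv );
    int world_rank;
    MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );

    // Each rank sends a value to itself and receives it back in one call.
    // Unlike a blocking MPI_Recv posted before the matching MPI_Send,
    // MPI_Sendrecv does not deadlock on a self-message.
    int sent = world_rank + 1;
    int received = -1;
    MPI_Sendrecv( &sent, 1, MPI_INT, world_rank, 99,
                  &received, 1, MPI_INT, world_rank, 99,
                  MPI_COMM_WORLD, MPI_STATUS_IGNORE );
    printf( "Process %d sent %d to itself and received %d\n",
            world_rank, sent, received );

    MPI_Finalize();

    return 0;
}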

Note that, as Gilles mentioned, using MPI_Bcast and MPI_Gather for distributing/collecting the data is the more efficient and standard approach.
