我正在使用OpenMP来并行化for循环。我尝试通过线程ID访问C++ Armadillo向量,但我想知道即使不同的线程访问不重叠的内存区域,我是否必须将访问放在关键部分中。
这是我的代码:
#include <armadillo>
#include <omp.h>
#include <iostream>
int main()
{
arma::mat A = arma::randu<arma::mat>(1000,700);
arma::rowvec point = A.row(0);
arma::vec distances = arma::zeros(omp_get_max_threads());
#pragma omp parallel shared(A,point,distances)
{
arma::vec local_distances = arma::zeros(omp_get_num_threads());
int thread_id = omp_get_thread_num();
for(unsigned int l = 0; l < A.n_rows; l++){
double temp = arma::norm(A.row(l) - point,2);
if(temp > local_distances[thread_id])
local_distances[thread_id] = temp;
}
// Is it necessary to put a critical section here?
#pragma omp critical
if(local_distances[thread_id] > distances[thread_id]){
distances[thread_id] = local_distances[thread_id];
}
}
std::cout << distances[distances.index_max()] << std::endl;
}
在我的情况下,把读/写操作放到distances
向量中是必要的吗?