How can I implement thrust::transform with a custom functor so that it skips some elements of a device_vector?

Question

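I'm working on a project (essentially a physics simulation) that has to run computations on a large number of nodes over many time steps. At the moment I implement each kind of computation as a custom functor and call it through thrust::transform.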

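As the simplest example (pseudocode), suppose I have data that all share a common structure but can be split into different types (A, B, and C). For instance, every item has a...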

double value.

So I store all of the data in a single device_vector, like this:

struct Data {
    thrust::device_vector<double> values;
    // Each end index is one past the last element of its range, as used below.
    unsigned values_begin_A, values_end_A;
    unsigned values_begin_B, values_end_B;
    unsigned values_begin_C, values_end_C;
};

The first part of the vector holds type A, followed by type B, and then type C. To keep track of them, I store the begin/end index of each type.

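Different functors need to act on different types of data (for example, functor1 applies to types A and B; functor2 applies to A, B, and C; functor3 applies to A and C). Each functor needs the index of the value within the vector, which a counting_iterator supplies, and writes its result to a separate vector.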

struct my_functor {

    __host__ __device__
    double operator() (const thrust::tuple<unsigned, double>& index_value) const {

        const unsigned index = thrust::get<0>(index_value);
        const double value = thrust::get<1>(index_value);

        // Do something with the index and value (placeholder: pass the value through).
        (void)index;
        double result = value;
        return result;
    }
};

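My problem is that I don't know the best way to apply that last functor to the type A and C values while skipping B. In particular, I'm looking for a Thrust-friendly solution that scales reasonably as I add more node types and more functors (acting on new combinations of old and new types) and that still benefits from parallelization.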

I've thought of four options:

Option 1:

Call a separate transform for each data type, e.g.

void Option_One(thrust::device_vector<double>& result) {
    // Multiple transform calls.

    thrust::counting_iterator<unsigned> index(0);

    // Apply functor to 'A' values.
    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())) + values_end_A,
        result.begin(),
        my_functor());

    // Apply functor to 'C' values.
    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())) + values_begin_C,
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())) + values_end_C,
        result.begin() + values_begin_C,
        my_functor());
}

This seems fairly simple, but it costs efficiency because I give up the ability to evaluate A and C at the same time.

Option 2:

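Copy the values into a temporary vector, call transform on the temporary vector, then copy the temporary results back into result. This involves a lot of copying back and forth, but it allows a single transform call over both A and C.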

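void Option_Two(thrust::device_vector<double>& result) {

    const unsigned size_A = values_end_A - values_begin_A;
    const unsigned size_C = values_end_C - values_begin_C;

    // Note: this index counts positions in the temporary vector, not in 'values'.
    thrust::counting_iterator<unsigned> index(0);

    // Copy 'A' and 'C' values into temporary vector
    thrust::device_vector<double> temp_values_A_and_C(size_A + size_C);
    thrust::copy(values.begin() + values_begin_A, values.begin() + values_end_A, temp_values_A_and_C.begin());
    thrust::copy(values.begin() + values_begin_C, values.begin() + values_end_C, temp_values_A_and_C.begin() + size_A);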

    // Store results in temporary vector.
    thrust::device_vector<double> temp_results_A_and_C(size_A + size_C);

    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(index, temp_values_A_and_C.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(index, temp_values_A_and_C.begin())) + size_A + size_C,
        temp_results_A_and_C.begin(),
        my_functor());


    // Copy temp results back into result
    // ....
}

Option 3:

Call transform over all of the values, but modify the functor so that it checks the index and only acts on indices inside the A or C ranges.

struct my_functor_with_index_checking {

    // Range bounds are copied into the functor so they are available on the device.
    unsigned begin_A, end_A, begin_C, end_C;

    __host__ __device__
    double operator() (const thrust::tuple<unsigned, double>& index_value) const {

        const unsigned index = thrust::get<0>(index_value);

        // Ranges are half-open, [begin, end), so the end index itself is excluded.
        if ( (index >= begin_A && index < end_A ) ||
             (index >= begin_C && index < end_C ) ) {

            // Do something with the index and value (placeholder: pass the value through).
            double result = thrust::get<1>(index_value);
            return result;
        }
        else {
            // Do nothing;
            return 0; //Result is 0 by default.
        }
    }
};

void Option_Three(thrust::device_vector<double>& result) {

    thrust::counting_iterator<unsigned> index(0);

    // Apply functor to all values, but check index inside functor.
    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(index, values.begin())) + values.size(),
        result.begin(),
        my_functor_with_index_checking{values_begin_A, values_end_A, values_begin_C, values_end_C});
}

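Option 4: The last approach I came up with is to build a custom iterator based on counting_iterator that counts normally within the A range but jumps to the beginning of C once it reaches the end of A. This seems like an elegant solution, but I don't know how to implement it.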

void Option_Four(thrust::device_vector<double>& result) {

    // Create my own version of a counting iterator
    // that skips from the end of 'A' to the beginning of 'C'
    // I don't know how to do this!
    FancyCountingIterator fancyIndex(0); 

    thrust::transform( 
        thrust::make_zip_iterator(thrust::make_tuple(fancyIndex, values.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(fancyIndex, values.begin())) + values.size(),
        result.begin(),
        my_functor());
}

- Charlie H.

1 Answer

Content provided by Stack Overflow.

- lxkarthi · Accepted Answer

Use a permutation_iterator combined with a custom transform_iterator (this is the fancy iterator you are looking for).

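Data d; //assuming this has values.
thrust::device_vector<double> result(d.values.size()); // assumed: one result slot per value
// Copy the bounds into locals so the lambda captures plain values, not the host object d.
const unsigned A_begin = d.values_begin_A, C_begin = d.values_begin_C;
const unsigned A_size = d.values_end_A - d.values_begin_A;
const unsigned C_size = d.values_end_C - d.values_begin_C;
// Maps a compact index in [0, A_size + C_size) to its position in d.values.
// Requires compiling with nvcc --extended-lambda.
auto A_C_index_iter = thrust::make_transform_iterator( thrust::make_counting_iterator(0u), 
[=] __host__ __device__ (unsigned i) -> unsigned {
  if (i<A_size)
    return i+A_begin; 
  else 
    return (i-A_size)+C_begin;
});
auto permuted_input_iter = thrust::make_permutation_iterator(d.values.begin(), A_C_index_iter);
auto permuted_output_iter = thrust::make_permutation_iterator(result.begin(), A_C_index_iter);
// Pair each value with its original index, since my_functor expects (index, value).
auto first = thrust::make_zip_iterator(thrust::make_tuple(A_C_index_iter, permuted_input_iter));
thrust::transform(first, first + A_size + C_size, permuted_output_iter, my_functor());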

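This exposes the full parallelism: a single transform runs over all A_size + C_size elements, and the B elements are never visited.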
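If extended lambdas are not an option, the same index mapping can be written as a named functor. The following is a minimal sketch, not part of the original answer: skip_b_index, visited_indices, and the SKIP_HD macro are hypothetical names introduced here for illustration. It assumes half-open ranges, as in the question, and the macro lets the same source be checked with a plain host compiler before it is used on the device.

```cpp
#include <vector>

// Expands to the CUDA qualifiers under nvcc and to nothing under a host compiler.
#ifdef __CUDACC__
#define SKIP_HD __host__ __device__
#else
#define SKIP_HD
#endif

// Hypothetical helper (not part of Thrust): maps a compact index in
// [0, a_size + c_size) onto the original vector, skipping everything
// that lies between the end of A and the beginning of C.
struct skip_b_index {
    unsigned a_begin;  // first index of range A
    unsigned a_size;   // number of elements in A
    unsigned c_begin;  // first index of range C

    SKIP_HD unsigned operator()(unsigned i) const {
        return (i < a_size) ? (a_begin + i) : (c_begin + (i - a_size));
    }
};

// Host-side check of the mapping: lists the original indices that a
// transform over `count` compact indices would visit, in order.
inline std::vector<unsigned> visited_indices(const skip_b_index& map, unsigned count) {
    std::vector<unsigned> out;
    out.reserve(count);
    for (unsigned i = 0; i < count; ++i) {
        out.push_back(map(i));
    }
    return out;
}
```

For example, with A occupying indices [0, 3) and C occupying [5, 7), visited_indices(skip_b_index{0, 3, 5}, 5) lists 0, 1, 2, 5, 6. Under nvcc, an instance of skip_b_index should be usable in place of the lambda as the second argument to thrust::make_transform_iterator. Supporting more than two ranges would need a different mapping, for example a lookup over per-range offsets.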