Is it safe to replace elements of an Rcpp::List inside an Rcpp function?

Question

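I need to overwrite elements of an Rcpp::List object that is passed as an argument to an Rcpp function. My concern is memory safety. If I reassign a non-empty list element, am I effectively rewiring a pointer to the original content without ever freeing the memory that stores that content? If so, is there a workaround?

I know I can easily modify an Rcpp object that is an element of an Rcpp::List (for example an Rcpp::NumericVector), because Rcpp::NumericVector makes shallow copies. However, that does not meet my requirement, which is to replace the element entirely with something else.

Below I include a C++ code snippet that shows the scenario I am referring to.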

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void replaceListElement(List l)
{
    std::vector<int> v;
    v.push_back(4);
    v.push_back(5);
    v.push_back(6);
    l["a"] = v;
}

/*** R
l <- list()
l$a <- c(1,2,3)
replaceListElement(l)
print(l)
*/

When this is sourced through Rcpp in RStudio, the print(l) command outputs the following.

$a
[1] 4 5 6

This is the desired result, so my question is only about memory safety.

- davnovak

1 Answer

Content provided by Stack Overflow.

- Ralf Stubner · Accepted Answer

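An Rcpp::List is a Vector<VECSXP>, i.e. a vector of pointers to other vectors. If you assign a new vector to some element of this list, you are indeed only changing the pointer, without freeing the memory the pointer used to point to. However, R still knows about this memory and frees it via its garbage collector. We can see this in action with a simple experiment in which I use your C++ code with slightly changed R code: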

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
void replaceListElement(List l)
{
  std::vector<int> v;
  v.push_back(4);
  v.push_back(5);
  v.push_back(6);
  l["a"] = v;
}

/*** R
l <- list()
l$a <- runif(1e7)
replaceListElement(l)
print(l)
gc() # optional
*/

Here a larger vector is used to make the effect more pronounced. If I now run R -d valgrind -e 'Rcpp::sourceCpp("<filename>")', I get the following result with the call to gc():

==13827==
==13827== HEAP SUMMARY:
==13827==     in use at exit: 48,125,775 bytes in 9,425 blocks
==13827==   total heap usage: 34,139 allocs, 24,714 frees, 173,261,724 bytes allocated
==13827==
==13827== LEAK SUMMARY:
==13827==    definitely lost: 0 bytes in 0 blocks
==13827==    indirectly lost: 0 bytes in 0 blocks
==13827==      possibly lost: 0 bytes in 0 blocks
==13827==    still reachable: 48,125,775 bytes in 9,425 blocks
==13827==                       of which reachable via heuristic:
==13827==                         newarray           : 4,264 bytes in 1 blocks
==13827==         suppressed: 0 bytes in 0 blocks
==13827== Rerun with --leak-check=full to see details of leaked memory
==13827==
==13827== For counts of detected and suppressed errors, rerun with: -v
==13827== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

And without the call to gc():

==13761==
==13761== HEAP SUMMARY:
==13761==     in use at exit: 132,713,314 bytes in 10,009 blocks
==13761==   total heap usage: 34,086 allocs, 24,077 frees, 173,212,886 bytes allocated
==13761==
==13761== LEAK SUMMARY:
==13761==    definitely lost: 0 bytes in 0 blocks
==13761==    indirectly lost: 0 bytes in 0 blocks
==13761==      possibly lost: 0 bytes in 0 blocks
==13761==    still reachable: 132,713,314 bytes in 10,009 blocks
==13761==                       of which reachable via heuristic:
==13761==                         newarray           : 4,264 bytes in 1 blocks
==13761==         suppressed: 0 bytes in 0 blocks
==13761== Rerun with --leak-check=full to see details of leaked memory
==13761==
==13761== For counts of detected and suppressed errors, rerun with: -v
==13761== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

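So in both cases valgrind does not detect any memory leak. The amount of still reachable memory differs by about 8x10^7 bytes, i.e. the size of the original vector in l$a. This demonstrates that R indeed knows about the original vector and frees it when told to do so, but the same would also happen whenever R decides to run the garbage collector on its own.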
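
As a quick sanity check of that figure, the arithmetic can be sketched in a few lines of standalone C++ (no Rcpp needed). This is only an illustration: the helper names numericPayloadBytes and reachableDifference are hypothetical, the sketch assumes 8-byte doubles, and it ignores the small header R attaches to every vector. The two byte counts used in the usage note are copied from the "still reachable" lines of the valgrind output above.

```cpp
#include <cassert>
#include <cstdint>

// Payload size in bytes of an R numeric vector with n elements.
// Assumption: R numerics are stored as 8-byte doubles; the small
// per-vector header is ignored.
inline std::int64_t numericPayloadBytes(std::int64_t n) {
    assert(n >= 0);
    return n * static_cast<std::int64_t>(sizeof(double));
}

// Difference between two "still reachable" figures reported by valgrind
// (run without gc() minus run with gc()).
inline std::int64_t reachableDifference(std::int64_t withoutGc,
                                        std::int64_t withGc) {
    return withoutGc - withGc;
}
```

Calling numericPayloadBytes(10000000) gives the payload of runif(1e7), and reachableDifference(132713314, 48125775) gives the gap between the two runs. The gap is somewhat larger than the payload alone, presumably because gc() also collected other unreferenced objects in the session.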