何时使用数组而不是向量/字符串？

Question

何时使用数组而不是向量/字符串？

c++arraysstringvector

12

我是一名初学者的C++程序员，所以我学习使用数组而不是向量（这似乎是一般做法，然后再转向向量）。

我注意到SO上很多答案建议使用向量而不是数组，使用字符串而不是字符数组。似乎这是在C++中编码的“正确”方式。

尽管如此，什么情况下仍值得使用经典的数组/char*（如果有的话）？

- Skilldrick

8个回答

16

永远不要使用裸数组。

如果出于其他原因，裸数组似乎比向量更好，则可以在C++11编译器中使用 std::tr1::array 或 std::array（或者在boost::array中使用）。它只是执行了我会执行的检查以确保数组的正确性，并且它自动实现了DRY（例如，在循环中使用大小，因此将来对数组声明进行的更改将自动工作）。

数组实现“就是”具有检查和提供大小常量的裸数组，因此很容易将数组代码嵌入到嵌入式代码中，因为该代码并不是真正“太聪明”适用于任何编译器。只要编译器支持模板，我就会将boost头文件复制到我的代码中，以便我可以使用这个而不是裸数组，因为使用裸数组很容易出错。裸数组是邪恶的。它们容易出错。

并且它与STL算法非常配合得好（如果可用）。

现在，有两种情况需要使用裸数组（必须使用）：当您处于仅C代码的情况下（不与C代码通信，而是编写仅C部分的代码，例如C库）时。但那是另一种语言。另一个原因是编译器根本不支持模板......

- Klaim

我非常同意。原始的STL应该有一个非常类似boost :: array的东西，这是毫无疑问的；我从来没有理解为什么没有包含它（除非整数模板参数是语言的后期添加？） - timday

是的，看起来他们试图通过TR1解决这个问题，但编译器没有快速达到让它流行的程度。 :/ 无论如何，今天没有理由使用原始数组，因为std::tr1::array或boost::array是可编译的。C++0x将有助于以后使其成为“标准方式”，我猜。 - Klaim

5

这个问题实际上可以分成两个部分：

如何管理平坦数组数据的内存？
如何访问平坦数组的元素？

我个人喜欢使用std::vector来管理内存，除非需要与不使用STL的代码保持兼容性（例如与纯C代码进行接口）。使用new或malloc分配原始数组的内存（部分原因是很容易忘记需要担心它）使得编写异常安全的代码更加困难。有关原因，请参见任何一篇RAII文章。

在实践中，std::vector被实现为平坦数组。因此，总是可以提取出原始数组并使用C风格的访问模式。我通常从向量下标操作符语法开始。对于某些编译器，在生成调试版本时，向量提供自动边界检查。这很慢（通常会导致紧密循环的10倍减速），但有助于发现某些类型的错误。

如果在特定平台上进行分析表明operator[]是瓶颈，那么我会切换到直接访问原始数组。有趣的是，根据编译器和操作系统的不同，使用STL向量有时比使用原始数组更快。下面是一个简单测试应用程序的结果。它是在32位Release模式下使用/O2优化编译的Visual Studio 2008中运行在Vista x64上的。在64位测试应用程序中也可以获得类似的结果。

Binary search...
           fill vector (for reference) :  0.27 s
                   array with ptr math :  0.38 s <-- C-style pointers lose
                  array with int index :  0.23 s <-- [] on raw array wins
            array with ptrdiff_t index :  0.24 s
                 vector with int index :  0.30 s  <-- small penalty for vector abstraction
           vector with ptrdiff_t index :  0.30 s

Counting memory (de)allocation...
                memset (for reference) :  2.85 s
      fill malloc-ed raw array with [] :  2.66 s
     fill malloc-ed raw array with ptr :  2.81 s
         fill new-ed raw array with [] :  2.64 s
        fill new-ed raw array with ptr :  2.65 s
                  fill vector as array :  3.06 s  \ something's slower 
                           fill vector :  3.05 s  / with vector!

NOT counting memory (de)allocation...
                memset (for reference) :  2.57 s
      fill malloc-ed raw array with [] :  2.86 s
     fill malloc-ed raw array with ptr :  2.60 s
         fill new-ed raw array with [] :  2.63 s
        fill new-ed raw array with ptr :  2.78 s
                  fill vector as array :  2.49 s \ after discounting the  
                           fill vector :  2.54 s / (de)allocation vector is faster!

代码：

#define WINDOWS_LEAN_AND_MEAN
#include <windows.h>
#include <string>
#include <vector>
#include <stdio.h>

using namespace std;

__int64 freq; // initialized in main
int const N = 1024*1024*1024/sizeof(int)/2; // 1/2 GB of data
int const nIter = 10;

class Timer {
public:
  Timer(char *name) : name(name) {
    QueryPerformanceCounter((LARGE_INTEGER*)&start);
  }
  ~Timer() {
    __int64 stop;
    QueryPerformanceCounter((LARGE_INTEGER*)&stop);
    printf("  %36s : % 4.2f s\n", name.c_str(), (stop - start)/double(freq));
  }
private:
  string const name;
  __int64 start;
};


template <typename Container, typename Index>
int binarySearch_indexed(Container sortedArray, Index first, Index last, int key) {
  while (first <= last) {
    Index mid = (first + last) / 2; // NOT safe if (first+last) is too big!
    if (key > sortedArray[mid])      first = mid + 1;
    else if (key < sortedArray[mid])  last = mid - 1; 
    else return mid;  
  }
  return 0; // Use "(Index)-1" in real code
}

int Dummy = -1;
int const *binarySearch_ptr(int const *first, int const *last, int key) {
  while (first <= last) {
    int const *mid = (int const *)(((unsigned __int64)first + (unsigned __int64)last) / 2);  
    if (key > *mid)      first = mid + 1;
    else if (key < *mid)  last = mid - 1; 
    else return mid;  
  }
  return &Dummy; // no NULL checks: don't do this for real
}

void timeFillWithAlloc() {
  printf("Counting memory (de)allocation...\n");
  { 
    Timer tt("memset (for reference)");
    int *data = (int*)malloc(N*sizeof(int));
    for (int it=0; it<nIter; it++) memset(data, 0, N*sizeof(int));
    free(data);
  }
  { 
    Timer tt("fill malloc-ed raw array with []");
    int *data = (int*)malloc(N*sizeof(int));
    for (int it=0; it<nIter; it++) for (size_t i=0; i<N; i++) data[i] = (int)i;
    free(data);
  }
  { 
    Timer tt("fill malloc-ed raw array with ptr");
    int *data = (int*)malloc(N*sizeof(int));
    for (int it=0; it<nIter; it++) {
    int *d = data;
    for (size_t i=0; i<N; i++) *d++ = (int)i;
    }
    free(data);
  }
  { 
    Timer tt("fill new-ed raw array with []");
    int *data = new int[N];
    for (int it=0; it<nIter; it++) for (size_t i=0; i<N; i++) data[i] = (int)i;
    delete [] data;
  }
  { 
    Timer tt("fill new-ed raw array with ptr");
    int *data = new int[N];
    for (int it=0; it<nIter; it++) {
    int *d = data;
    for (size_t i=0; i<N; i++) *d++ = (int)i;
    }
    delete [] data;
  }
  { 
    Timer tt("fill vector as array");
    vector<int> data(N); 
    for (int it=0; it<nIter; it++) {
      int *d = &data[0]; 
    for (size_t i=0; i<N; i++) *d++ = (int)i;
    }
  }
  { 
    Timer tt("fill vector");
    vector<int> data(N); 
    for (int it=0; it<nIter; it++) for (size_t i=0; i<N; i++) data[i] = (int)i;
  }
  printf("\n");
}

void timeFillNoAlloc() {
  printf("NOT counting memory (de)allocation...\n");

  { 
    int *data = (int*)malloc(N*sizeof(int));
    {
      Timer tt("memset (for reference)");
      for (int it=0; it<nIter; it++) memset(data, 0, N*sizeof(int));
    }
    free(data);
  }
  { 
    int *data = (int*)malloc(N*sizeof(int));
    {
      Timer tt("fill malloc-ed raw array with []");
      for (int it=0; it<nIter; it++) for (size_t i=0; i<N; i++) data[i] = (int)i;
    }
    free(data);
  }
  { 
    int *data = (int*)malloc(N*sizeof(int));
    {
      Timer tt("fill malloc-ed raw array with ptr");
      for (int it=0; it<nIter; it++) {
        int *d = data;
        for (size_t i=0; i<N; i++) *d++ = (int)i;
      }
    }
    free(data);
  }
  { 
    int *data = new int[N];
    {
      Timer tt("fill new-ed raw array with []");
      for (int it=0; it<nIter; it++) for (size_t i=0; i<N; i++) data[i] = (int)i;
    }
    delete [] data;
  }
  { 
    int *data = new int[N];
    {
      Timer tt("fill new-ed raw array with ptr");
      for (int it=0; it<nIter; it++) {
        int *d = data;
        for (size_t i=0; i<N; i++) *d++ = (int)i;
      }
    }
    delete [] data;
  }
  { 
    vector<int> data(N); 
    {
      Timer tt("fill vector as array");
      for (int it=0; it<nIter; it++) {
        int *d = &data[0]; 
        for (size_t i=0; i<N; i++) *d++ = (int)i;
      }
    }
  }
  { 
    vector<int> data(N); 
    {
      Timer tt("fill vector");
      for (int it=0; it<nIter; it++) for (size_t i=0; i<N; i++) data[i] = (int)i;
    }
  }
  printf("\n");
}

void timeBinarySearch() {
  printf("Binary search...\n");
  vector<int> data(N); 
  {
    Timer tt("fill vector (for reference)");
    for (size_t i=0; i<N; i++) data[i] = (int)i;
  }

  {
    Timer tt("array with ptr math");
    int sum = 0;
    for (int i=-1000000; i<1000000; i++) {
      sum += *binarySearch_ptr(&data[0], &data[0]+data.size(), i);
    }
  }
  {
    Timer tt("array with int index");
    int sum = 0;
    for (int i=-1000000; i<1000000; i++) {
      sum += data[binarySearch_indexed<int const *, int>(
        &data[0], 0, (int)data.size(), -1)];
    }
  }
  {
    Timer tt("array with ptrdiff_t index");
    int sum = 0;
    for (int i=-1000000; i<1000000; i++) {
      sum += data[binarySearch_indexed<int const *, ptrdiff_t>(
        &data[0], 0, (ptrdiff_t)data.size(), -1)];
    }
  }
  {
    Timer tt("vector with int index");
    int sum = 0;
    for (int i=-1000000; i<1000000; i++) {
      sum += data[binarySearch_indexed<vector<int> const &, int>(
        data, 0, (int)data.size(), -1)];
    }
  }
  {
    Timer tt("vector with ptrdiff_t index");
    int sum = 0;
    for (int i=-1000000; i<1000000; i++) {
      sum += data[binarySearch_indexed<vector<int> const &, ptrdiff_t>(
        data, 0, (ptrdiff_t)data.size(), -1)];
    }
  }

  printf("\n");
}

int main(int argc, char **argv)
{
  QueryPerformanceFrequency((LARGE_INTEGER*)&freq);

  timeBinarySearch();
  timeFillWithAlloc();
  timeFillNoAlloc();

  return 0;
}

- Mr Fooz

标准并不保证std::vector是一个平坦数组，这让很多C++标准委员会成员感到惊讶。我不知道是否有任何实现不包含平坦数组，而即将发布的C++标准将保证它。 - David Thornley

4

当兼容性或性能至关重要时，数组/char*非常有用。向量和字符串是更高级别的对象，对于代码可维护性、可读性和总体易用性更好。几乎总是如此。

- mad-j

3

我建议在编译时就知道大小的情况下使用数组。虽然可以使用向量，但必须记住向量带有与在堆上进行的内存分配相关的开销。如果大小未知，则当然要使用向量。

- Naveen

你也可以使用 boost::array 或 tr1::array。 - Anonymous

复制开销与数组有何不同？我所能想到的唯一情况是，如果你有“Foo arr[] = {Foo(...),Foo(...),...};”这样的数组，那么数组可能更有效率。 - Mr Fooz

@Mr Fooz：是的，你说得对。没有区别。我编辑了我的答案。 - Naveen

还要记住，对于具有非平凡析构函数的数据类型来说，数组是危险的。除非你真的非常需要，否则永远不要使用它们的数组。 - David Thornley

我可以想到一些情况，我知道大小，但是使用数组而不是vector是错误的。第一个例子，大小为5000万。我没有那么多堆栈。第二个例子，我想要一个快速或nothrow swap操作 - 仅向量具有此功能，而数组不具备（不仅因为std :: swap不适用于数组，您也无法编写自己的快速数组swap，或者针对没有自己的nothrow交换的类型的数组进行nothrow交换）。因此，有时我会使用vector，尽管知道大小。 - Steve Jessop

1

我能想到的唯一原因可能是速度。你可以对数组/指针类型进行更好的优化，而不是对相应的对象进行优化。但是，即使我绝对知道我的数据结构需要容纳多少数据，我也会使用STL。在优化步骤中从STL更改为基本类型比以难以阅读的代码开始项目要好得多。

- soulmerge

好的观点。从我所看到的情况来看，英特尔的C++编译器 tend to output way more LOOP WAS VECTORIZED 在我们的C代码中，而不是我们的C++代码（后者使用了大量的STL）。 - Laserallan

std::vector 保证使用平坦数组进行存储。您始终可以在它们上执行所有 C 风格的指针算术运算。 - Mr Fooz

1

我认为有两个原因：

兼容性（旧代码中没有STL）。
速度。（我比较了使用vector/binary_search和array/handwritten二分查找的速度。对于最后一个，得到了丑陋的代码（需要重新分配内存），但是它大约比第一个快1.2-1.5倍，我使用的是MS VC++ 8）

- sergdev

1

我正在开发一个需要访问结构化数据的共享库。这些数据在编译时已知，因此使用文件范围常量数组POD（普通旧数据）结构来保存数据。

这会导致编译器和链接器将大部分数据放在只读区域中，有两个好处：

可以直接从磁盘映射到内存，而无需运行任何特殊的初始化代码。
可以在进程之间共享。

唯一的例外是编译器仍然会生成初始化代码来加载浮点常量，因此任何包含浮点数的结构最终都会在可写区域中。我怀疑这与浮点异常或浮点舍入模式有关，但我不确定如何验证这两个假设。

如果我使用向量和字符串对象，则编译器将生成更多的初始化代码，每当我的共享库被加载时就会执行。常量数据将分配在堆上，因此不能在进程之间共享。

如果我从磁盘中读取数据，则必须处理数据格式的检查，而不是让C++编译器为我完成。我还必须在一个自始至终都将全局数据“烤入”其中的代码库中管理数据的生命周期。

- bk1e

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Laserallan · Accepted Answer

当编写代码以供其他项目使用时，特别是针对一些特殊平台（如嵌入式系统、游戏主机等）的代码，可能没有STL库可用。旧项目或具有特殊要求的项目可能不希望引入STL库的依赖。依赖于数组、char*或任何其他东西的接口将与任何内容兼容，因为它是语言的一部分。然而，不能保证在所有构建环境中都存在STL。