排序数组的最快搜索方法是什么？

Question

排序数组的最快搜索方法是什么？

24

回答另一个问题后，我编写了下面的程序来比较排序数组中不同搜索方法的效率。基本上，我比较了两种插值搜索实现和一种二分搜索方法。我通过计算每个变体在相同数据集上花费的循环次数来比较性能。

然而，我确信有办法可以优化这些函数，使它们运行得更快。有人有什么想法可以让这个搜索函数更快吗？C或C++的解决方案可接受，但我需要它处理100000个元素的数组。

#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <stdint.h>
#include <assert.h>

static __inline__ unsigned long long rdtsc(void)
{
  unsigned long long int x;
     __asm__ volatile (".byte 0x0f, 0x31" : "=A" (x));
     return x;
}

int interpolationSearch(int sortedArray[], int toFind, int len) {
    // Returns index of toFind in sortedArray, or -1 if not found
    int64_t low = 0;
    int64_t high = len - 1;
    int64_t mid;

    int l = sortedArray[low];
    int h = sortedArray[high];

    while (l <= toFind && h >= toFind) {
        mid = low + (int64_t)((int64_t)(high - low)*(int64_t)(toFind - l))/((int64_t)(h-l));

        int m = sortedArray[mid];

        if (m < toFind) {
            l = sortedArray[low = mid + 1];
        } else if (m > toFind) {
            h = sortedArray[high = mid - 1];
        } else {
            return mid;
        }
    }

    if (sortedArray[low] == toFind)
        return low;
    else
        return -1; // Not found
}

int interpolationSearch2(int sortedArray[], int toFind, int len) {
    // Returns index of toFind in sortedArray, or -1 if not found
    int low = 0;
    int high = len - 1;
    int mid;

    int l = sortedArray[low];
    int h = sortedArray[high];

    while (l <= toFind && h >= toFind) {
        mid = low + ((float)(high - low)*(float)(toFind - l))/(1+(float)(h-l));
        int m = sortedArray[mid];

        if (m < toFind) {
            l = sortedArray[low = mid + 1];
        } else if (m > toFind) {
            h = sortedArray[high = mid - 1];
        } else {
            return mid;
        }
    }

    if (sortedArray[low] == toFind)
        return low;
    else
        return -1; // Not found
}

int binarySearch(int sortedArray[], int toFind, int len) 
{
    // Returns index of toFind in sortedArray, or -1 if not found
    int low = 0;
    int high = len - 1;
    int mid;

    int l = sortedArray[low];
    int h = sortedArray[high];

    while (l <= toFind && h >= toFind) {
        mid = (low + high)/2;

        int m = sortedArray[mid];

        if (m < toFind) {
            l = sortedArray[low = mid + 1];
        } else if (m > toFind) {
            h = sortedArray[high = mid - 1];
        } else {
            return mid;
        }
    }

    if (sortedArray[low] == toFind)
        return low;
    else
        return -1; // Not found
}

int order(const void *p1, const void *p2) { return *(int*)p1-*(int*)p2; }

int main(void) {
    int i = 0, j = 0, size = 100000, trials = 10000;
    int searched[trials];
    srand(-time(0));
    for (j=0; j<trials; j++) { searched[j] = rand()%size; }

    while (size > 10){
        int arr[size];
        for (i=0; i<size; i++) { arr[i] = rand()%size; }
        qsort(arr,size,sizeof(int),order);

        unsigned long long totalcycles_bs = 0;
        unsigned long long totalcycles_is_64 = 0;
        unsigned long long totalcycles_is_float = 0;
        unsigned long long totalcycles_new = 0;
        int res_bs, res_is_64, res_is_float, res_new;
        for (j=0; j<trials; j++) {
            unsigned long long tmp, cycles = rdtsc();
            res_bs = binarySearch(arr,searched[j],size);
            tmp = rdtsc(); totalcycles_bs += tmp - cycles; cycles = tmp;

            res_is_64 = interpolationSearch(arr,searched[j],size);
            assert(res_is_64 == res_bs || arr[res_is_64] == searched[j]); 
            tmp = rdtsc(); totalcycles_is_64 += tmp - cycles; cycles = tmp;

            res_is_float = interpolationSearch2(arr,searched[j],size);
            assert(res_is_float == res_bs || arr[res_is_float] == searched[j]); 
            tmp = rdtsc(); totalcycles_is_float += tmp - cycles; cycles = tmp;
        }
        printf("----------------- size = %10d\n", size);
        printf("binary search          = %10llu\n", totalcycles_bs);
        printf("interpolation uint64_t = %10llu\n",  totalcycles_is_64);
        printf("interpolation float    = %10llu\n",  totalcycles_is_float);
        printf("new                    = %10llu\n",  totalcycles_new);
        printf("\n");
        size >>= 1;
    }
}

- kriss

3

像这样的问题既有趣又富有启发性。学习如何让代码更快是目标。而且这个问题是用C语言编写的，其中STL不存在。 - Andrew Bainbridge

2

某件事是否有答案只是问题在SO上存在的有用性的一部分。这里显然有可以学习的东西。此外，这个问题是可以回答的。我有一个更快的实现，但现在无法发布。 - Andrew Bainbridge

1

@Andrew：我并不怀疑在这里有一些有用的东西值得学习，但这更适合作为社区维基。因为没有明确的答案（尤其是因为“最快”的定义取决于平台和其他因素），它很容易陷入到争论谁更好的境地。 - Oliver Charlesworth

1

我投了最后一票来重新开放这个问题。我同意Oli的看法，这可能主要是一个学术练习，当情况允许时，您应该使用您选择的库中提供的标准排序之一。但这并不意味着它“不是一个真正的问题”。我还编辑了问题，删除了“代码高尔夫”方面，并将其转换为严格基于代码的优化问题。我没有看到其他人关闭的理由让我信服。 - Cody Gray

2

最快的搜索方法将取决于数组中的值。当数据平滑（值等间隔）时，二分法可能比插值更快，但当数据是指数或对数时，插值方法会更好。您还可以考虑像这个算法（https://dev59.com/sm855IYBdhLWcg3wvXDd）一样的算法，它将保证在X次访问数组内找到结果。 - oosterwal

显示剩余17条评论

8个回答

9

在Win32 Core2 Quad Q6600，gcc v4.3 msys上进行基准测试。使用g++ -O3编译，没有任何花哨的东西。

观察-断言，计时和循环开销约为40％，因此下面列出的任何收益都应除以0.6，以获得测试中算法的实际改进。

简单答案：

在我的机器上，将插值搜索中的int64_t替换为int（用于“low”，“high”和“mid”），可加速20％至40％。这是我能找到的最快速且易于实现的方法。对于数组大小为100000，它在我的机器上每次查找大约需要150个周期。这大致相当于缓存未命中的周期数。因此，在实际应用中，关注缓存可能是最重要的因素。
将二分搜索中的“/ 2”替换为“>>1”可加速4％。
在包含与“arr”相同数据的向量上使用STL的binary_search算法，与手工编码的binarySearch速度大致相同。尽管在较小的“size”上，STL要慢得多-约为40％。

- Andrew Bainbridge

很棒的统计数据，请指引我使用的基准测试工具。我对这个领域完全陌生。 - Anupam Saini

我没有使用任何工具，只用了编译器。原帖的程序已经进行了定时。我只是用各种修改运行了它多次。我似乎记得我也改变了被计时的内容，但我已经忘记了细节。 - Andrew Bainbridge

啊..那我会修改程序本身。感谢你的帮助，安德鲁。 - Anupam Saini

1

+1 为基准测试并抽出时间发布结果。 - SmacL

4

我有一个非常复杂的解决方案，需要使用专门的排序函数。这种排序比良好的快速排序略慢，但是我的所有测试都表明搜索函数比二进制或插值搜索要快得多。在我发现这个名字已经被占用之前，我称它为回归排序，但没有想到一个新名字（有什么想法吗？）。

有三个文件可以演示。

回归排序/搜索代码：

#include <sstream>
#include <math.h>
#include <ctime>
#include "limits.h"

void insertionSort(int array[], int length) {
   int key, j;
   for(int i = 1; i < length; i++) {
      key = array[i];
      j = i - 1;
      while (j >= 0 && array[j] > key) {
         array[j + 1] = array[j];
         --j;
      }
      array[j + 1] = key;
   }
}

class RegressionTable {
   public:
      RegressionTable(int arr[], int s, int lower, int upper, double mult, int divs);
      RegressionTable(int arr[], int s);
      void sort(void);
      int find(int key);
      void printTable(void);
      void showSize(void);
   private:
      void createTable(void);
      inline unsigned int resolve(int n);
      int * array;
      int * table;
      int * tableSize;
      int size;
      int lowerBound;
      int upperBound;
      int divisions;
      int divisionSize;
      int newSize;
      double multiplier;
};

RegressionTable::RegressionTable(int arr[], int s) {
   array = arr;
   size = s;
   multiplier = 1.35;
   divisions = sqrt(size);
   upperBound = INT_MIN;
   lowerBound = INT_MAX;
   for (int i = 0; i < size; ++i) {
      if (array[i] > upperBound)
         upperBound = array[i];
      if (array[i] < lowerBound)
         lowerBound = array[i];
   }
   createTable();
}

RegressionTable::RegressionTable(int arr[], int s, int lower, int upper, double mult, int divs) {
   array = arr;
   size = s;
   lowerBound = lower;
   upperBound = upper;
   multiplier = mult;
   divisions = divs;
   createTable();
}

void RegressionTable::showSize(void) {
   int bytes = sizeof(*this);
   bytes = bytes + sizeof(int) * 2 * (divisions + 1);
}

void RegressionTable::createTable(void) {
   divisionSize = size / divisions;
   newSize = multiplier * double(size);
   table = new int[divisions + 1];
   tableSize = new int[divisions + 1];

   for (int i = 0; i < divisions; ++i) {
      table[i] = 0;
      tableSize[i] = 0;
   }

   for (int i = 0; i < size; ++i) {
      ++table[((array[i] - lowerBound) / divisionSize) + 1];
   }

   for (int i = 1; i <= divisions; ++i) {
      table[i] += table[i - 1];
   }
   table[0] = 0;

   for (int i = 0; i < divisions; ++i) {
      tableSize[i] = table[i + 1] - table[i];
   }
}

int RegressionTable::find(int key) {
   double temp = multiplier;
   multiplier = 1;

   int minIndex = table[(key - lowerBound) / divisionSize];
   int maxIndex = minIndex + tableSize[key / divisionSize];
   int guess = resolve(key);
   double t;
   while (array[guess] != key) {
      // uncomment this line if you want to see where it is searching.
      //cout << "Regression Guessing " << guess << ", not there." << endl;
      if (array[guess] < key) {
         minIndex = guess + 1;
      }
      if (array[guess] > key) {
         maxIndex = guess - 1;
      }
      if (array[minIndex] > key || array[maxIndex] < key) {
         return -1;
      }
      t = ((double)key - array[minIndex]) / ((double)array[maxIndex] - array[minIndex]);
      guess = minIndex + t * (maxIndex - minIndex);
   }

   multiplier = temp;

   return guess;
}

inline unsigned int RegressionTable::resolve(int n) {
   float temp;
   int subDomain = (n - lowerBound) / divisionSize;
   temp = n % divisionSize;
   temp /= divisionSize;
   temp *= tableSize[subDomain];
   temp += table[subDomain];
   temp *= multiplier;
   return (unsigned int)temp;
}

void RegressionTable::sort(void) {
   int * out = new int[int(size * multiplier)];
   bool * used = new bool[int(size * multiplier)];
   int higher, lower;
   bool placed;

   for (int i = 0; i < size; ++i) {

      /* Figure out where to put the darn thing */
      higher = resolve(array[i]);
      lower = higher - 1;

      if (higher > newSize) {
         higher = size;
         lower = size - 1;
      } else if (lower < 0) {
         higher = 0;
         lower = 0;
      }
      placed = false;
      while (!placed) {
         if (higher < size && !used[higher]) {
            out[higher] = array[i];
            used[higher] = true;
            placed = true;
         } else if (lower >= 0 && !used[lower]) {
            out[lower] = array[i];
            used[lower] = true;
            placed = true;
         }
         --lower;
         ++higher;
      }
   }
   int index = 0;
   for (int i = 0; i < size * multiplier; ++i) {
      if (used[i]) {
         array[index] = out[i];
         ++index;
      }
   }

   insertionSort(array, size);
}

此外还有常规的搜索功能：

#include <iostream>
using namespace std;

int binarySearch(int array[], int start, int end, int key) {
   // Determine the search point.
   int searchPos = (start + end) / 2;
   // If we crossed over our bounds or met in the middle, then it is not here.
   if (start >= end)
      return -1;
   // Search the bottom half of the array if the query is smaller.
   if (array[searchPos] > key)
      return binarySearch (array, start, searchPos - 1, key);
   // Search the top half of the array if the query is larger.
   if (array[searchPos] < key)
      return binarySearch (array, searchPos + 1, end, key);
   // If we found it then we are done.
   if (array[searchPos] == key)
      return searchPos;
}

int binarySearch(int array[], int size, int key) {
   return binarySearch(array, 0, size - 1, key);
}

int interpolationSearch(int array[], int size, int key) {
   int guess = 0;
   double t;
   int minIndex = 0;
   int maxIndex = size - 1;
   while (array[guess] != key) {

      t = ((double)key - array[minIndex]) / ((double)array[maxIndex] - array[minIndex]);
      guess = minIndex + t * (maxIndex - minIndex);

      if (array[guess] < key) {
         minIndex = guess + 1;
      }
      if (array[guess] > key) {
         maxIndex = guess - 1;
      }
      if (array[minIndex] > key || array[maxIndex] < key) {
         return -1;
      }
   }

   return guess;
}

然后我编写了一个简单的主程序来测试不同的排序算法。

    #include <iostream>
    #include <iomanip>
    #include <cstdlib>
    #include <ctime>
    #include "regression.h"
    #include "search.h"
    using namespace std;

    void randomizeArray(int array[], int size) {
       for (int i = 0; i < size; ++i) {
          array[i] = rand() % size;
       }
    }

    int main(int argc, char * argv[]) {

       int size = 100000;
       string arg;
       if (argc > 1) {
          arg = argv[1];
          size = atoi(arg.c_str());
       }
       srand(time(NULL));
       int * array;
       cout << "Creating Array Of Size " << size << "...\n";
       array = new int[size];

       randomizeArray(array, size);
       cout << "Sorting Array...\n";
       RegressionTable t(array, size, 0, size*2.5, 1.5, size);
       //RegressionTable t(array, size);
       t.sort();
       int trials = 10000000;
       int start;

       cout << "Binary Search...\n";
       start = clock();
       for (int i = 0; i < trials; ++i) {
          binarySearch(array, size, i % size);
       }
       cout << clock() - start << endl;

       cout << "Interpolation Search...\n";
       start = clock();
       for (int i = 0; i < trials; ++i) {
          interpolationSearch(array, size, i % size);
       }
       cout << clock() - start << endl;

       cout << "Regression Search...\n";
       start = clock();
       for (int i = 0; i < trials; ++i) {
          t.find(i % size);
       }
       cout << clock() - start << endl;

       return 0;
}

试一下，告诉我它是否更快。这非常复杂，如果你不知道自己在做什么，很容易破坏它。修改时要小心。

我在Ubuntu上使用g++编译了主程序。

- regality

看起来很有趣。我今晚会测试一下。但为了明确事情，测试运行是在已经排序的数组上进行的，因此排序性能不会包括在测试中，只测试插值函数。 - kriss

我尝试了一下，它比插值快了两倍。但是我还需要查看细节以了解它是否真的是一个答案（如果是的话，我也可以接受集合或哈希表实现，其中查找也会更快）。无论如何，+1。 - kriss

3

一种处理方式是进行时间和空间的权衡。有许多方法可以实现这一点。最极端的方式是简单地创建一个数组，其最大尺寸为排序数组的最大值。将每个位置初始化为sortedArray中的索引。然后搜索将简单地实现为O(1)。

然而，以下版本可能更加现实，并且可能在现实世界中有用。它使用了一个“助手”结构，在第一次调用时初始化。它通过除以我从未经过太多测试的数字将搜索空间映射到较小的空间中。它将排序数组值的下限索引存储到helper map中，以缩小边界。实际的搜索将toFind数除以选定的除数并提取sortedArray的缩小范围，以进行常规二分搜索。

例如，如果排序值的范围为1到1000，而除数为100，则查找数组可能包含10个“部分”。要查找值250，它将其除以100得到整数索引位置250/100=2. map [2]将包含值大于等于200的sortedArray索引位置。map[3]将具有值大于等于300的索引位置，从而为正常的二进制搜索提供了较小的边界位置。然后，函数的其余部分就是您的二进制搜索函数的完全副本。

助手地图的初始化可能更有效，即使用二进制搜索填充位置而不是简单的扫描，但这是一次性成本，所以我没有测试它。这种机制在给定的均匀分布的测试数中运行良好。如写所述，如果分布不均匀，则不会像这样好。我认为这种方法也可以用于浮点搜索值。但是，将其推广到通用搜索键可能更难。例如，我不确定用于字符数据键的方法是什么。它需要一种O(1)查找/哈希，将其映射到特定的数组位置以查找索引边界。目前对我来说不清楚该函数是什么或者是否存在。

我很快地在以下实现中修补了helper map的设置。它不太好看，我不100%确定它在所有情况下都是正确的，但它确实展示了这个想法。我使用调试测试运行它，以便与您现有的binarySearch函数进行比较，以确保其正确性。

以下是示例数字：

100000 * 10000 : cycles binary search          = 10197811
100000 * 10000 : cycles interpolation uint64_t = 9007939
100000 * 10000 : cycles interpolation float    = 8386879
100000 * 10000 : cycles binary w/helper        = 6462534

这是一个快速而简单的实现：

#define REDUCTION 100  // pulled out of the air
typedef struct {
    int init;  // have we initialized it?
    int numSections;
    int *map;
    int divisor;
} binhelp;

int binarySearchHelp( binhelp *phelp, int sortedArray[], int toFind, int len)
{
    // Returns index of toFind in sortedArray, or -1 if not found
    int low;
    int high;
    int mid;

    if ( !phelp->init && len > REDUCTION ) {
        int i;
        int numSections = len / REDUCTION;
        int divisor = (( sortedArray[len-1] - 1 ) / numSections ) + 1;
        int threshold;
        int arrayPos;

        phelp->init = 1;
        phelp->divisor = divisor;
        phelp->numSections = numSections;
        phelp->map = (int*)malloc((numSections+2) * sizeof(int));
        phelp->map[0] = 0;
        phelp->map[numSections+1] = len-1;
        arrayPos = 0;
        // Scan through the array and set up the mapping positions.  Simple linear
        // scan but it is a one-time cost.
        for ( i = 1; i <= numSections; i++ ) {
            threshold = i * divisor;
            while ( arrayPos < len && sortedArray[arrayPos] < threshold )
                arrayPos++;
            if ( arrayPos < len )
                phelp->map[i] = arrayPos;
            else
                // kludge to take care of aliasing
                phelp->map[i] = len - 1;
        }
    }

    if ( phelp->init ) {
        int section = toFind / phelp->divisor;
        if ( section > phelp->numSections )
            // it is bigger than all values
            return -1;

        low = phelp->map[section];
        if ( section == phelp->numSections )
            high = len - 1;
        else
            high = phelp->map[section+1];
    } else {
        // use normal start points
        low = 0;
        high = len - 1;
    }

    // the following is a direct copy of the Kriss' binarySearch
    int l = sortedArray[low];
    int h = sortedArray[high];

    while (l <= toFind && h >= toFind) {
        mid = (low + high)/2;

        int m = sortedArray[mid];

        if (m < toFind) {
            l = sortedArray[low = mid + 1];
        } else if (m > toFind) {
            h = sortedArray[high = mid - 1];
        } else {
            return mid;
        }
    }

    if (sortedArray[low] == toFind)
        return low;
    else
        return -1; // Not found
}

需要初始化帮助程序结构（并释放内存）：

    help.init = 0;
    unsigned long long totalcycles4 = 0;
    ... make the calls same as for the other ones but pass the structure ...
        binarySearchHelp(&help, arr,searched[j],length);
    if ( help.init )
        free( help.map );
    help.init = 0;

- Mark Wilkins

3

首先，看一下数据，确定是否可以通过数据特定方法比一般方法获得更大的收益。

对于大型静态排序数据集，您可以创建一个额外的索引，根据您愿意使用的内存量提供部分鸽巢。例如，我们创建了一个256x256的二维范围数组，并将其填充为具有相应高位字节的搜索数组中元素的起始和结束位置。当我们进行搜索时，我们使用关键字上的高位字节来查找我们需要搜索的数组的范围/子集。如果我们在100,000个元素的二进制搜索中进行了约20次比较O(log2(n))，那么现在对于16个元素，我们只需进行约4次比较，或者O(log2(n/15))。这里的内存成本约为512k。

另一种方法，同样适用于数据变化不大的情况，是将数据分成常见项数组和很少使用项数组。例如，如果您保留现有的搜索方案，在长时间的测试期间运行广泛的真实案例，并记录要查找的项的详细信息，您可能会发现分布非常不均匀，即某些值比其他值更常被查找。如果是这种情况，请将数组分成一个很小的常见值数组和一个较大的剩余数组，并首先搜索较小的数组。如果数据正确（很大的“如果”！），您通常可以实现与第一种解决方案类似的改进，而无需内存成本。

还有许多其他数据特定优化，这些优化比尝试改进经过验证、测试和广泛使用的一般解决方案要好得多。

- SmacL

3

除非您的数据已知具有特殊属性，否则纯插值搜索有可能需要线性时间。如果您期望插值法在大多数数据上有所帮助，但不想让它在病态数据的情况下产生负面影响，我建议使用插值猜测和中点的（可能加权的）平均值，以确保运行时间具有对数界限。

- R.. GitHub STOP HELPING ICE

在这种情况下，我只是好奇实际情况下，遇到二分搜索通常已经足够好了。正如您所看到的，这里测试的数据集是伪随机的，我并不真正担心看到插值退化，但也许比较插值点和中点可以提供一种选择仅使用二分搜索继续下一步的方法。 - kriss

作为插值退化的真实世界示例，搜索的一个应用是在未索引视频文件中寻找时间戳。对于任何典型比特率低但峰值比特率非常高的视频，基于插值的搜索可能导致底层文件（或更糟糕的网络流）上出现病态的字节搜索操作。 - R.. GitHub STOP HELPING ICE

我认为，使用类似于introsort的方法更容易确保在退化情况下具有对数运行时界限。也就是说，计算迄今为止执行的迭代次数，并且如果寻找正确位置需要太多的迭代次数（定义为对数的某个倍数），则切换到二分搜索。我认为这可能会对性能产生较小的影响，而不是通常修改非退化情况下的搜索位置（假设正确预测二进制转换的if语句，这应该是可以的）。当然，这种预测纯粹是理论上的... - Grizzly

我认为你没有理解混合算法有多容易。在你的插值版本中，在mid = ...这一行后面，只需添加mid = (2*mid + low + hi)/4;即可。 - R.. GitHub STOP HELPING ICE

@R..：我尝试了混合版本。在我的测试集中，它在几乎所有情况下都比插值查找和二分查找慢，包括退化情况（这种情况通常发生在随机数据中）。但是，我应该准备一个最坏情况的测试集来看它的真正性能。 - kriss

尝试搜索指数间隔的数组，例如1、2、4、8、16、32、64、128...使用基数1.03125或者类似的数值，而不是2，如果你想要使其足够大以进行充分测试。 - R.. GitHub STOP HELPING ICE

2

在这个问题被关闭之前，我想发布一下我的当前版本（希望以后能够改进）。目前它比其他所有版本都要糟糕（如果有人能理解为什么我对循环结束的更改会产生这种影响，请留言评论）。

int newSearch(int sortedArray[], int toFind, int len) 
{
    // Returns index of toFind in sortedArray, or -1 if not found
    int low = 0;
    int high = len - 1;
    int mid;

    int l = sortedArray[low];
    int h = sortedArray[high];

    while (l < toFind && h > toFind) {
        mid = low + ((float)(high - low)*(float)(toFind - l))/(1+(float)(h-l));

        int m = sortedArray[mid];

        if (m < toFind) {
            l = sortedArray[low = mid + 1];
        } else if (m > toFind) {
            h = sortedArray[high = mid - 1];
        } else {
            return mid;
        }
    }

    if (l == toFind)
        return low;
    else if (h == toFind)
        return high;
    else
        return -1; // Not found
}

- kriss

0

用于比较的二分查找实现可以改进。关键思想是最初“规范化”范围，以便目标始终在第一步后大于最小值且小于最大值。这增加了终止差量大小。它还具有特殊处理小于排序数组的第一个元素或大于排序数组的最后一个元素的目标的效果。搜索时间预计会提高约15％。以下是C ++中代码的示例。

int binarySearch(int * &array, int target, int min, int max)
{ // binarySearch
  // normalize min and max so that we know the target is > min and < max
  if (target <= array[min]) // if min not normalized
  { // target <= array[min]
      if (target == array[min]) return min;
      return -1;
  } // end target <= array[min]
  // min is now normalized

  if (target >= array[max]) // if max not normalized
  { // target >= array[max]
      if (target == array[max]) return max;
      return -1;
  } // end target >= array[max]
    // max is now normalized

  while (min + 1 < max)
  { // delta >=2
    int tempi = min + ((max - min) >> 1); // point to index approximately in the middle between min and max
    int atempi = array[tempi]; // just in case the compiler does not optimize this
    if (atempi > target)max = tempi; // if the target is smaller, we can decrease max and it is still normalized        
    else if (atempi < target)min = tempi; // the target is bigger, so we can increase min and it is still normalized        
        else return tempi; // if we found the target, return with the index
        // Note that it is important that this test for equality is last because it rarely occurs.
  } // end delta >=2
  return -1; // nothing in between normalized min and max
} // end binarySearch

- Peter Baum

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Ben Voigt · Accepted Answer

如果您能够在内存中控制数据的布局，您可能会想查看Judy数组。

或者说一个更简单的思路：二分查找总是将搜索空间减半。可以使用内插法找到最优切点（切点不应该是期望键值所在的位置，而是最小化下一步搜索空间的统计期望的点）。这样做可以最小化步骤数，但是......并非所有步骤的成本都相同。如果可以保持局部性，分层内存允许在相同时间内执行多个测试，而不是单个测试。由于二分查找的前M个步骤仅涉及最多2**M个唯一元素，因此将它们存储在一起可以获得每个缓存行获取的搜索空间显着减少（而不是每个比较），这在实际世界中具有更高的性能。

n叉树就是基于这个原理工作的，然后Judy数组增加了一些不那么重要的优化。

底线：即使是“随机访问内存”（RAM）也比随机访问更快。搜索算法应该利用这一事实。