面试谜题：数组大小为n，包含从[i，i + n)的数字

Question

面试谜题：数组大小为n，包含从[i，i + n)的数字

8

给定一个未排序的数字数组，编写一个函数，如果数组由连续的数字组成，则返回true。

示例：

1. 如果数组是{5, 2, 3, 1, 4}，则函数应该返回true，因为数组中有从1到5的连续数字。 2. 如果数组是{83, 78, 80, 81, 79, 82}，则函数应该返回true，因为数组中有从78到83的连续数字。 3. 如果数组是{34, 23, 52, 12, 3}，则函数应该返回false，因为元素不是连续的。 4. 如果数组是{7, 6, 5, 5, 3, 4}，则函数应该返回false，因为5和5不是连续的。

我想出了以下算法：

1. 找到数组的最大值和最小值 2. max-min+1应该是数组的大小 3. 检查是否有重复项 4. 检查中间所有连续的数字

如何实现第四步？（复杂度应为O(n)）

欢迎提出其他建议。

- garima

1

你打算如何在未排序的数组中以O(n)时间执行第三步？ - NPE

@aix，如果这些数字是随机的，你就不能做到。但是你可以利用它们的特殊属性（连续性）来得出解决方案。 - Mark Ransom

2

如果您已经验证了第二条路径和第三条路径，那么您就不需要第四条路径：它是自动的。 - pmg

算法以确定数组是否包含 n，nm 的确切重复项 - xan

8个回答

2

bool consecutive(int a[], size_t n)
{
    int min = find_min(a,n);
    int max = find_max(a,n);

    if (max - min + 1 == n) {
        // variant of counting sort, O(n)
        // note: never freed, don't use in production
        int *freq = calloc(sizeof(int), n);

        for (size_t i=0; i<n; i++)
            if (++freq[a[i] - min] > 1)
                return false;
        return true;
    }
    return false;
}

- Fred Foo

如果您使用 freq 作为位向量而不是计数，可以在您的代码中节省32倍（或是字长大小）。更简单地说，如果您只使用 uint8_t 数组，那么您可以节省约4倍的空间。 - Thomas M. DuBuisson

@TomMD：没错。但我只是想达到O(n)的复杂度。 - Fred Foo

2

int visited[max - min + 1];

for(c = min; c <= max; c++) {
    visited[array[c] - min] += 1;

    if(visited[array] > 1)
        return false;               // since this is a duplicate
}

这应该是没问题的。visited[] 跟踪原始数组中每个数字出现的次数。如果它的任何元素大于2，则存在重复项，因此返回false；由于循环结束时两个数组的大小为max-min+1，所以 visited[] 已满，因为我们访问了 array[] 的所有元素。因此，如果 visited 为空，则必须在某个地方存在重复项，但是我们不需要去烦恼，因为此时我们仍然返回 false。

- BlackBear

2

我认为这可以在O(n)时间内完成，且仅需额外的O(1)空间。

确定数组的最小值和最大值。如果(max - min + 1) != n，则返回false。

从数组中减去最小值。现在我们有一个来自范围[0, n)的n个元素的数组。我们现在只需要检查重复项。通过以下代码，可以在线性时间内完成此操作（每个元素最多移动一次）：

for (int i = 0; i != n; ++i)
{
    if (A[i] != i)
    {
        int temp = A[i];
        for (;;)
        {
            int index = temp;
            if (A[index] == index)
            {
                // We have a duplicate
                return false;
            }
            std::swap(temp, A[index]);
            if (index == i)
            {
                // A[i] now contains i
                break;
            }
        }
    }
}
// If we get to here, there are no duplicates

- user515430

正确，但是在第一个 if 中你不需要这么努力。只需放置 a[i]。if (a[i]!=i) { if (a[a[i]]==i) return false; else std::swap(a[i], a[a[i]]); } 没有循环内嵌循环，但可能会有更多的数组下标。总体上看起来更好的方法是始终进行交换并进行另一次遍历以检查重复项。 - xan

@xan - 我觉得你是指 if (a[a[i]] == a[i]) return false; 我还是认为你需要一个内部循环。试试用 A = {2, 0, 3, 4, 1}。 - user515430

@user515430 你说得对，感谢提供反例。再试一下单循环代码：i=0; while(i<n) { if (a[i] == i) i++; else if (a[a[i]] == a[i]) return false; else std::swap(a[i], a[a[i]]); } 目前这个循环语法只是我认为更接近您原来的结构。 - xan

0

根据您的要求，您算法中的第四步根本不需要（因为第二步和第三步将保证它们是连续的）

我们可以通过在单次遍历中执行所有步骤来节省更多比较（以改善其时间复杂度）：

int is_consecutive(int *a, int size)  // O(n) algo
{   
 int min, max;
 hash H;
 if(!a || size == 0 ) return -1;
 max=min=a[0];
 H.insert(a[0]);

 for(i=1; i<size; i++) {
   if(H.find(a[i]) { delete H; return 0; }  // duplicacy check
   else H.insert(a[i]);
   if(a[i] < min) min = a[i];
   else if(a[i] > max) max = a[i];
 }
 if(size != (max - min)) { delete H; return 0; }  //range = size check
 delete H;   
 return 1;
}

- pankiii

0

我不太擅长C语言，但是你需要执行第四步吗？如果2为真且没有重复项，则它包含该序列，否则它就不包含。

- Dean Barnes

当数字为负数时怎么处理？ - garima

你说得没错，但是在不排序的情况下检查重复并不容易。 - xan

-2

在数组中查找最小元素、最大元素和元素之和。

它们形成一个等差数列，元素之和为：(元素个数/2)*(2*minElement+(n-1)*difference)

在这种情况下，difference为1

如果sum==(array.length/2)*(2*minElement+(array.length-1)*1) && maxElement==(minElement+(lenght ofArray-1)*1)

那么这些数字就是连续的。

没有额外的空间复杂度，时间复杂度为O(n)

感谢@jpalecek的纠正。现在应该没问题了

- Tejas

不正确，反例为 1 1 4 4。 - jpalecek

-2

这里有一个适用于1..N的方法，你可以使用等差数列公式来调整[i,i+n]。

// gcc -Wall q1.cc -o q1 && q1                                                                          

//Given an array of size N in which every number is between 1 and N, write a simple program to determine
//if there are any duplicates in it.                                                                    

//answer                                                                                                
/* The numbers can be assumed to not be stored in sorted order.  We also assume that all numbers are    
between 1 and N inclusive.  If for example they are not, then we will return true indicating that there 
is a possibility of a duplicate.                                                                        

This can be implemented using a bit array of size N all initialized to 0.  as we iterate the input array
of numbers, we simply check if the bit array at that number index is set, if so, we return true.  if not
we set the bit.  we loop until we get to the end of the list, and if we get there, that means no        
duplicate was found, and we can return false.                                                           

However, the above algorithm has the problem of using extra space.  A better idea is to take advantage  
of the fact that we are guaranteed that the numbers are between 1 and N.  if we were to use the fact    
that the arithmetic sum of these numbers is equal to n(n+1)/2, then we can just sum up all the numbers  
and ensure that it equals the above sum.  If it does, we have no duplicate, else we do.                 
*/                                                                                                      

#include <stdio.h>                                                                                      
#include <assert.h>                                                                                     

bool any_dup(const int *a, int n)                                                                       
{                                                                                                       
        int req_sum = n*(n+1)/2;                                                                        

        int s = 0;                                                                                      
        for (int i = 0;i<n;i++)                                                                         
                s += a[i];                                                                              

        return (s != req_sum);                                                                          
}                                                                                                       


int main()                                                                                              
{                                                                                                       
        int a[]={1,2,3,4,5};                                                                            
        assert(!any_dup(a,5));                                                                          

        int b[]={1,2,2,3,3,4,5,6};                                                                      
        assert(any_dup(b,8));                                                                           

        int c[]={5,2,3,1,4};                                                                            
        assert(!any_dup(c,5));                                                                          

        int d[]={34,21,1,4,2};                                                                          
        assert(any_dup(d,5));                                                                           

        return 0;                                                                                       
}

- lateralpunk

不行，反例是 1 1 4 4。 - jpalecek

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Seth · Accepted Answer

18

如果输入的数组为A：

找到A的最小值和最大值，如果数组大小错误，则返回False。
创建一个相同大小的新数组B，初始全部为零。
对于每个索引i，让B[A[i] - min] = 1。
检查B是否仍然包含任何零。

每个步骤都需要O(n)时间。

- Seth

不错。在步骤3中需要进行边界检查。同时，可以稍微改进一下步骤3，即将其更改为“增加B[A[i]-min]并返回false，如果结果超过1”。 - NPE

@aix - 好的建议。我在第一步中添加了边界检查，但任何一个地方都可以。 - Seth

只是一个小提示...在第三步中，您会想要检查是否使用了超出数组索引的范围。我知道这很挑剔，但我曾经因为忽略简单的检查而受到面试官的批评。 - gnomed

1

@aix，@gnomed：不需要边界检查，因为B的大小为max-min+1，所以任何索引A[i]-min都在范围内。但是，在第2步中，您需要检查一下是否要分配大小为0的数组，因为max = (U)INT_MAX和min = (U)INT_MIN。 - Steve Jessop

@Steve - 在aix发表评论后，我添加了“查找最大值并检查数组大小是否错误”的部分。 - Seth

显示剩余2条评论