Maximizing AND

Question

3

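Given an array of N non-negative integers A1, A2, …, AN, how do you find a pair of integers Au, Av (1 ≤ u < v ≤ N) such that (Au AND Av) is as large as possible?

Example: let N = 4 and the array be [2 4 8 10]. The answer here is 8.

Explanation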

2 and 4 = 0
2 and 8 = 0
2 and 10 = 2
4 and 8 = 0
4 and 10 = 0
8 and 10 = 8

How should this be handled if N can be as large as 10^5? I have an O(N^2) solution, but it is not efficient.

Code:

for(int i=0;i<n;i++){
    for(int j=i+1;j<n;j++){
        if((arr[i] & arr[j]) > ans) // parentheses needed: > binds tighter than &
        {
            ans=arr[i] & arr[j];
        }
    }
}

- user3878046

5

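Your task is to... No, it's your task. - SomeWittyUsername

3

This could be a problem from any ongoing programming contest! - Kaidul

1

@Jim Lewis @wallyk Well, that was my mistake. But consider the array [3, 4, 6, 7, 8, 9, 17]. It yields 8 (8 and 9). - Kaidul

1

@wallyk Wrong. For [2 3 8 16 32 64], the matching numbers are 2 and 3. - SomeWittyUsername

2

@KaidulIslamSazal Looks like someone else is taking part in the same contest. https://dev59.com/6oHba4cB1Zd3GeqPXemw - Raymond Chen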

4 Answers

3

I haven't tested this, and I'm not going to. It uses O(N) memory and runs in O(N) time.

#include <vector>
#include <utility>
#include <algorithm>

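using namespace std;

// Forward declarations: both helpers are defined further down.
int partial_sort_for_bit(vector<unsigned> &v, unsigned importantBit, unsigned vSize);
unsigned highest_bit_index(unsigned i);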


/*
 * The idea is as follows:
 * 1.) Create a mathematical set A that holds integers.
 * 2.) Initialize importantBit = highest bit in any integer in v
 * 3.) Put into A all integers that have importantBit set to 1.
 * 4.) If |A| = 2, that is our answer. If |A| < 2, --importantBit and try again. If |A| > 2, basically
 *     redo the problem but only on the integers in set A.
 *
 * Keep "set A" at the beginning of v.
 */
pair<unsigned, unsigned> find_and_sum_pair(vector<unsigned> v)
{
    // Find highest bit in v (1-based index; 0 means every element is 0).
    unsigned importantBit = 0;
    for(auto num : v)
        importantBit = max(importantBit, highest_bit_index(num));

    // Move all elements with importantBit to front of vector until doing so gives us at least 2 in the set.
    // importantBit is tested first so partial_sort_for_bit is never called with bit 0.
    int setEnd = 0;
    while(importantBit > 0 && (setEnd = partial_sort_for_bit(v, importantBit, v.size())) < 2)
        --importantBit;

    // If the set is never sufficient, no answer exists
    if(importantBit == 0)
        return pair<unsigned, unsigned>();

    // Repeat the problem only on the subset defined by A until |A| = 2 and impBit > 0 or impBit  = 0
    while(importantBit > 1)
    {
        unsigned secondSetEnd = partial_sort_for_bit(v, --importantBit, setEnd);
        if(secondSetEnd >= 2)
            setEnd = secondSetEnd;
    }

    return pair<unsigned, unsigned>(v[0], v[1]);
}

// Returns end index (1 past last) of set A
int partial_sort_for_bit(vector<unsigned> &v, unsigned importantBit, unsigned vSize)
{
    unsigned setEnd = 0;
    
    unsigned mask = 1u<<(importantBit-1); // importantBit is 1-based
    for(decltype(v.size()) index = 0; index < vSize; ++index)
        if((v[index]&mask) != 0) // parentheses needed: comparison binds tighter than &
            swap(v[index], v[setEnd++]);
    
    return setEnd;
}


unsigned highest_bit_index(unsigned i)
{
    unsigned ret = i != 0;
    while(i >>= 1)
        ++ret;
    return ret;
}

I ran into this problem again and solved it a different way (one that is easier for me to understand):

unsigned findMaxAnd(vector<unsigned> &input) {
    vector<unsigned> candidates;
    for(unsigned mask = 1u<<31; mask; mask >>= 1) {
        for(unsigned i : input)
            if(i&mask)
                candidates.push_back(i);
        if (candidates.size() >= 2)
            input = move(candidates);
        candidates = vector<unsigned>();
    }
    
    if(input.size() < 2)
        return 0;

    return input[0]&input[1]; 
}

- user904963

+1. But it's easier with std::partition; it even fits in a comment:

unsigned maxand(std::vector<unsigned> v) { auto begin = v.begin(), end = v.end(); for (unsigned bit = 1U<<31; bit; bit >>= 1) { auto part = std::partition(begin, end, [&](unsigned a)->bool{return a & bit;}); if (part > begin + 1) end = part; } return *begin & *(begin + 1); }

- rici

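Are you sure this is O(n)? - Abhishek Bansal

1

@user1990169: It does one partition for each bit in the element representation of the values. Each partition runs over a range of size at most n and uses at most n/2 swaps, so it is O(n). Since the values are not bignums, the RAM model lets us treat the bit length as a constant. And O(cN) is O(N), so the proposed algorithm is O(n). - rici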
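
For readers who find the one-liner hard to follow, here is the same partition idea spread over several lines. It is only a sketch: the name max_and_by_partition is made up for illustration, unsigned is assumed to be 32 bits wide, and returning 0 for fewer than two values is an added assumption.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: largest (a & b) over all pairs in v.
// Assumes a 32-bit unsigned; returns 0 when v has fewer than two values.
unsigned max_and_by_partition(std::vector<unsigned> v) {
    if (v.size() < 2) return 0;
    auto begin = v.begin();
    auto end = v.end();
    for (unsigned bit = 1u << 31; bit != 0; bit >>= 1) {
        // Move every remaining candidate that has `bit` set to the front.
        auto part = std::partition(begin, end,
                                   [bit](unsigned a) { return (a & bit) != 0; });
        // Narrow the candidate range only if at least two values survive.
        if (part - begin >= 2) end = part;
    }
    // Any two remaining candidates share exactly the bits that were kept.
    return *begin & *(begin + 1);
}
```

Each of the 32 passes touches at most n elements, which is the constant factor c in the O(cN) argument above.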

3

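1. Sort the array in descending order.
2. Take the first two numbers. If both lie between two consecutive powers of 2 (that is, in the range from 2^k up to 2^(k+1)), you can discard every element smaller than 2^k.
3. Subtract 2^k from the remaining elements.
4. Repeat steps 2 and 3 until the array contains 2 elements.

Note: if only the largest element lies between 2^k and 2^(k+1) and the second largest is smaller than 2^k, do not discard anything; just subtract 2^k from the largest element.

Also, finding where an element falls in the sequence {1, 2, 4, 8, 16, ...} can be done in O(log(log(MAX))) time, where MAX is the largest number in the array.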
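
One possible reading of these steps, as a rough sketch rather than the answerer's own code. Several details are assumptions made here: the name max_and_by_sorting, re-sorting on every pass instead of reinserting one element, accumulating the removed powers of two into the result, and stopping once the second-largest value reaches 0 (so that inputs such as [1 1 1] terminate).

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical helper: largest (a & b) over all pairs, following the
// sort-and-strip steps. Returns 0 when fewer than two values are given.
unsigned max_and_by_sorting(std::vector<unsigned> v) {
    unsigned ans = 0;
    while (v.size() >= 2) {
        std::sort(v.begin(), v.end(), std::greater<unsigned>());
        if (v[1] == 0) break;  // no remaining pair shares a bit

        // top = largest power of two not exceeding the largest element.
        unsigned top = 1;
        while (top <= v[0] / 2) top <<= 1;

        if (v[1] >= top) {
            // Both leaders have this bit: drop everything below it,
            // strip the bit from the survivors, and record it.
            auto first_small = std::find_if(
                v.begin(), v.end(), [top](unsigned x) { return x < top; });
            v.erase(first_small, v.end());
            for (unsigned &x : v) x -= top;
            ans += top;
        } else {
            // Only the largest element has this bit, so it cannot
            // appear in any AND; strip it and re-sort on the next pass.
            v[0] -= top;
        }
    }
    return ans;
}
```

Every pass lowers the highest bit of the maximum, so the loop runs at most once per bit position. The repeated std::sort keeps the sketch short at the cost of extra work per pass.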

- Abhishek Bansal

2

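There is one small flaw, though: you need to subtract 2^k from the first element and then reinsert that element at its correct position, otherwise you will get the wrong answer. Depending on your array implementation (the cost of insertion), this may break the O(n log n) running time. - Vincent van der Weele

Yes, we would have to insert that element in O(n) time. Even so, I think the algorithm's complexity is O(n log n), because there are at most log(MAX) iterations. Each time, we drop the most significant bit from the largest element. - Abhishek Bansal

Absolutely right. Given the default implementation, you could even treat that as a constant, since the size of an integer is bounded. - Vincent van der Weele

1

Why is this O(n log n) solution being praised over my O(n) solution? - user904963

@user1990169, I have now upvoted your answer as well, but the reason I didn't yesterday is that I simply hadn't looked at it. I tend to skip "code-only" answers, because I find answers like this one easier to read. No offense intended, just explaining why a suboptimal answer can float to the top. - Vincent van der Weele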

0

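Here is an O(N * log MAX_A) solution:

1) We build the answer greedily, iterating from the highest bit to the lowest.

2) To do that, maintain a set S of the numbers that are still suitable. Initially it contains every number in the array. Let's also assume that ANS = 0 initially.

3) Now iterate over all bits from highest to lowest. Call the current bit B.

4) If more than one element of S has a 1 in bit B, then it is possible to have a 1 in this position without changing the higher bits of ANS, so we add 2^B to ANS and remove from S every element that has a 0 in this bit (those elements are no longer suitable).

5) Otherwise, it is impossible to get a 1 in this position, so we leave S and ANS unchanged and move on to the next bit.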
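
Steps 1 to 5 might be sketched as follows. The function name max_and_greedy and the 32-bit width of unsigned are assumptions for illustration; the variables s and ans correspond to S and ANS above.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper implementing the greedy bit-by-bit construction.
// Assumes a 32-bit unsigned; returns 0 when fewer than two values are given.
unsigned max_and_greedy(const std::vector<unsigned> &a) {
    std::vector<unsigned> s(a);  // S: numbers that are still suitable
    unsigned ans = 0;            // ANS: answer built so far
    for (int b = 31; b >= 0; --b) {
        const unsigned bit = 1u << b;
        // Collect the elements of S that have a 1 in bit B.
        std::vector<unsigned> with_bit;
        for (unsigned x : s)
            if (x & bit) with_bit.push_back(x);
        // Step 4: more than one such element, so bit B can be set.
        if (with_bit.size() > 1) {
            ans |= bit;
            s = std::move(with_bit);
        }
        // Step 5: otherwise S and ANS stay as they are.
    }
    return ans;
}
```

Because ANS is accumulated along the way, the function never has to pick a concrete pair at the end.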

- kraskevich

- Stian Svedenborg · Accepted Answer

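One way to speed this up is to use the following fact: if two numbers both have some high bit set, their AND will always be larger than the AND of any combination that only shares lower bits.

So if you sort the numbers by the bits they have set, you can greatly reduce the number of operations.

To find the most significant bit efficiently, GCC has a built-in intrinsic: __builtin_clz(unsigned int x). It returns the number of leading zero bits in x (the result is undefined for x = 0), so the index of the most significant set bit is BITS - 1 minus that count. (Other compilers have similar intrinsics, which translate to a single instruction on x86 at least.)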
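
Since __builtin_clz counts leading zeros rather than returning a bit index directly, a small wrapper makes the conversion explicit. This is a minimal sketch that assumes GCC or Clang; the name highest_set_bit is hypothetical.

```cpp
#include <climits>

// Hypothetical helper: index (0 = least significant) of the highest set bit.
// Precondition: x != 0, because __builtin_clz(0) is undefined.
inline unsigned highest_set_bit(unsigned x) {
    const unsigned bits = sizeof(unsigned) * CHAR_BIT;
    return bits - 1 - static_cast<unsigned>(__builtin_clz(x));
}
```

For example, highest_set_bit(10) is 3, because 10 is 1010 in binary.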

#include <vector>

const unsigned int BITS = sizeof(unsigned int)*8; // Assuming 8 bit bytes.

// Your implementation over.
unsigned int max_and_trivial( const std::vector<unsigned int> & input);    

// Partition the set.
unsigned int max_and( const std::vector<unsigned int> & input ) {
    // For small input, just use the trivial algorithm.
    if ( input.size() < 100 ) { 
        return max_and_trivial(input);
    }        

    std::vector<unsigned int> by_bits[BITS];

    for ( auto elem : input ) {
         unsigned int mask = elem;
         while (mask) { // Ignore elements that are 0.
             // __builtin_clz counts leading zeros; convert that to a bit index.
             unsigned int most_sig = BITS - 1 - __builtin_clz(mask);
             by_bits[ most_sig ].push_back(elem);
             mask ^= 1u << most_sig; // Clear that bit and look for the next one.
         }
    }

    // Now, if any of the vectors in by_bits have more 
    // than one element, the one with the highest index 
    // will include the largest AND-value.

    // Count down without testing i >= 0, which is always true for unsigned.
    for ( unsigned int i = BITS; i-- > 0; ) {
        if ( by_bits[i].size() > 1 ) {
             return max_and_trivial( by_bits[i]);
        }
    }

    // If you get here, the largest value is 0.
    return 0;
}

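The worst-case running time of this algorithm is still O(N*N), but it should do better on average. You can improve performance further by repeating the partition step when searching the smaller vector (just remember to ignore the most significant bit in the partition step); doing so should bring the worst case down to O(N).

Making sure the input data contains no duplicates will also improve performance further.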