寻找快速计算h指数的算法

Question

寻找快速计算h指数的算法

algorithmsorting

15

http://en.wikipedia.org/wiki/H-index

这个维基页面是解释什么是h指数。

如果我有一个数组 [ 0 3 4 7 8 9 10 ]，我的h指数将会是4，因为我有4个数字大于4。如果我有5个数字大于5，那么我的h指数将会是5，依此类推。给定一个大于或等于0的整数数组，有哪些有效地计算h指数的方法呢？

编辑：数组不一定是排序的。

- cakester

7个回答

2

这可以在O(n)时间内完成。

找到数组的中位数。
如果中位数>(n-1)/2，则数字出现在中位数之前。迭代查找它。
如果中位数<(n-1)/2，则数字出现在中位数之后。迭代查找它。
如果中位数==(n-1)/2，则中位数是解。

在这里我假设n为奇数。对于偶数n稍微改变算法（将(n+1)/2替换为n/2，假设中位数的排名为n/2）。另外，在O(n)时间内找到实际中位数很复杂。使用一个好的枢轴代替它（如快速排序中那样）。

复杂度：n+n/2 +n/4... = O(n)

- ElKamina

太棒了=D，我想线性是最好的结果。 - cakester

1

这是O(NlogN)，因为需要在整个数组中搜索中位数。 - Толя

当答案不在数组中怎么办？例如 [0 0 0 9 9 9]。 - Толя

@Толя 当你只剩下一个元素，而这不是正确的解决方案时，那么就没有解决方案了。 - ElKamina

@Vikram Bhat 1/2 + 1/4 + ... (up to infinity) = 1，所以正如ElKamina所写：O(n)。 - artur grzesiak

这个算法是错误的，以例子 [2,2,2,2,2] 为例。这里的中位数是 2，因此 median == (n-1)/2 导致解决方案为 2。这意味着应该恰好有两个大于2的数字。检查数组证明了这个陈述是错误的。 - a_guest

1

回答用c#语言，但也容易转换为java语言

    public int HIndex(int[] citations) {
            Array.Sort(citations);

            var currentCount = 0;
            var length = citations.Length;
            for (var i = citations.Length - 1; i >= 0; i--)
            {
                currentCount = length - i;
                // if the count of items to the right is larger than current value it means thats the max we can expect for hindex

                if (currentCount - 1 >= citations[i])
                {
                    return currentCount - 1;
                }
            }

            return currentCount;
    }

- Dasith Wijes

0

这是我能想到的一个解决方案，不确定它是否是最好的。

将数组按升序排序。复杂度为nlog(n)。

从索引0到n迭代数组。复杂度为n。

对于每次迭代，假设索引为i。

if (arr[i] == (arr.length - (i+1))
    return arr[i]

e.g.,

arr =[ 0 3 4 7 8 9 10 ]
arr[2] = 4
i = 2
arr.length = 7
4 = (7- (2+1))

- vishnu viswanath

不需要排序（O（nlogn））。看看我的解决方案，它是O（n）。 - ElKamina

抱歉。 :) 那是一个打字错误。 - vishnu viswanath

@ElKamina。是的，你的答案更好。 - vishnu viswanath

这个算法是错误的，因为它只会在实际的h指数是数组的一部分时返回结果。将数字“11”添加到您的示例（[0, 3, 4, 7, 8, 9, 10, 11]）应该返回解决方案“5”，因为有五个数字大于5，但是这个算法找不到解决方案。 - a_guest

0

n为数组大小

对数组进行排序

然后h指数= max(min(f(i),i) for i=1:n)

由于h指数永远不会超过n，因此将数组中所有大于n的数字替换为n。

现在使用计数排序对数组进行排序。

时间复杂度O(n) 空间复杂度O(n)

- Shubham Jain

0

这是O(nlogn)时间复杂度，同时简洁明了。


public static int hindex(int[] array) {
        Arrays.sort(array);
        int pos = 0;
        while (pos < array.length && array[pos] <= array.length - pos) {
            pos++;
        }
        return array[pos - 1];
    }

- rbsomeg

这将返回数组中的一个元素，但并不总是正确的，例如 [0, 2]，答案应该是1。此外，这种解决方案对于像 [4, 5] 这样的情况无效。 - Justin Harris

-1

我对之前的实现不满意，所以我用Java编写了一个更快的解决方案来替换它。

public int hIndex(int[] citations) {
    if(citations == null || citations.length == 0)
    {
        return 0;
    }

    Arrays.sort(citations);

    int hIndex = 0;

    for(int i=0;i<citations.length;i++)
    {
        int hNew;

        if(citations[i]<citations.length-i)
        {
            hNew = citations[i];

            if(hNew>hIndex)
            {
                hIndex = hNew;
            }
        }
        else if(citations[i]>=citations.length-i)
        {
            hNew = citations.length-i;

            if(hNew>hIndex)
            {
                hIndex = hNew;
            }

            break;
        }
    }

    return hIndex;
}

- Hüseyin Burhan ÖZKAN

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Толя · Accepted Answer

这是我的思路，使用表格来实现O(N)时间复杂度，非常简单并且速度飞快：

private static int GetHIndex(int[] m)
{
    int[] s = new int[m.Length + 1];
    for (int i = 0; i < m.Length; i++) s[Math.Min(m.Length, m[i])]++;

    int sum = 0;
    for (int i = s.Length - 1; i >= 0; i--)
    {
        sum += s[i];
        if (sum >= i)
            return i;
    }

    return 0;
}