Find an array (byte[]) inside another array?

Question

c# arrays search

10

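What is the simplest way to find a byte[] inside another byte[]? I have a feeling it could be done with LINQ, but I don't know how.

Note: I searched with [c#] and found nothing, which surprised me.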

- user34537

I think we need more information. Are you trying to find a subsequence of bytes within a byte array? Can you give an example? - Andrew

3

See, for example, the Knuth-Morris-Pratt algorithm. - jason

6 Answers

9

Here's a simple (naive?) way to do it:

static int search(byte[] haystack, byte[] needle)
{
    for (int i = 0; i <= haystack.Length - needle.Length; i++)
    {
        if (match(haystack, needle, i))
        {
            return i;
        }
    }
    return -1;
}

static bool match(byte[] haystack, byte[] needle, int start)
{
    if (needle.Length + start > haystack.Length)
    {
        return false;
    }
    else
    {
        for (int i = 0; i < needle.Length; i++)
        {
            if (needle[i] != haystack[i + start])
            {
                return false;
            }
        }
        return true;
    }
}

- Ergwun

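Perfect, exactly what I needed. Too bad I can't use LINQ or something else built in to do this. Did you just write this, or did you copy/paste it from somewhere? - user34537

Note that this can be very slow, depending on the input. - jason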

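@acidzombie24: It's easy to construct inputs on which this is extremely slow. You can make it repeatedly run deep into the matching part of the algorithm, almost succeed, then fail and have to start all over again. - jason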
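
To make the slow case concrete, here is a small sketch that counts the byte comparisons made by a naive search like the one in this answer. The class and method names (NaiveSearchCost, AlmostMatchingNeedle) are made up for illustration. The assumed worst case is a haystack of all zero bytes and a needle of zeros ending in a single 1, so almost every start position matches all but the last byte before failing.

```csharp
using System;

static class NaiveSearchCost
{
    // Same logic as the naive search above, but also counts byte comparisons.
    public static int Search(byte[] haystack, byte[] needle, out long comparisons)
    {
        comparisons = 0;
        for (int i = 0; i <= haystack.Length - needle.Length; i++)
        {
            int k = 0;
            while (k < needle.Length)
            {
                comparisons++;
                if (needle[k] != haystack[i + k]) break;
                k++;
            }
            if (k == needle.Length) return i;
        }
        return -1;
    }

    // Builds a needle of (length - 1) zero bytes followed by a single 1.
    public static byte[] AlmostMatchingNeedle(int length)
    {
        var needle = new byte[length];
        needle[length - 1] = 1;
        return needle;
    }
}
```

With a 1000-byte zero haystack and a 100-byte needle from AlmostMatchingNeedle, the search tries 901 start positions and makes 100 comparisons at each, 90,100 in total, before returning -1. The cost grows roughly with haystack length times needle length.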

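@acidzombie24: Slow is a relative term. There are faster ways to do this (see Jason's link). Depending on how you expect to use it, that may not matter. - Ergwun

@jason, @Ergwun: Ah, I see. OK, +1 for Jason's answer. What I was really looking for was a simple solution that didn't require writing the code myself (I was hoping to learn something new about the .NET libraries). So +1 to both answers, and I'll leave it as is. - user34537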
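
On the wish for something built in: this is not mentioned anywhere in the thread, but newer runtimes (.NET Core 2.1 and later, or older frameworks with the System.Memory package) ship a span-based IndexOf in System.MemoryExtensions that searches for a subsequence. The sketch below assumes such a runtime; the wrapper name BuiltInSearch is hypothetical.

```csharp
using System;

static class BuiltInSearch
{
    // Returns the index of the first occurrence of needle in haystack, or -1.
    public static int IndexOf(byte[] haystack, byte[] needle)
    {
        // Arrays convert implicitly to ReadOnlySpan<T>; no copy is made.
        ReadOnlySpan<byte> h = haystack;
        ReadOnlySpan<byte> n = needle;
        return h.IndexOf(n);
    }
}
```

For example, searching { 1, 2, 3, 20, 30, 40 } for { 20, 30, 40 } should give 3, and searching { 1, 2, 3 } for { 1, 3 } should give -1.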


0

Try using a lambda expression:

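private bool CheckPatternInArray(byte[] array, byte[] pattern)
{
    // An empty pattern, or one longer than the array, cannot match.
    if (pattern.Length == 0 || pattern.Length > array.Length) return false;

    int start = 0; // index of the element the predicate is currently visiting
    int result = Array.FindIndex(array, 0, array.Length - pattern.Length + 1, (byte b) =>
            {
                int i = start++;
                // Compare the whole pattern at this start position, so that
                // overlapping partial matches such as {1, 2} in {1, 1, 2} are found.
                for (int k = 0; k < pattern.Length; k++)
                {
                    if (array[i + k] != pattern[k]) return false;
                }
                return true;
            });
    return result >= 0;
}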

If you want the fastest solution, see here.

- Alex Klaus

0

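This is an old question, but since it still lacks a LINQ-style answer even though it's a common scenario, I've added a LINQ extension method below, based on Michael's answer. It is written in the spirit of the string/byte[] IndexOf.

It also handles an empty needle explicitly: the previous solution reported a match (index 0) in that case, whereas this one reports it as not found (index -1).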

using System.Collections.Generic;
using System.Linq;

public static class LinqExtensions
{
    public static int IndexOf(this IEnumerable<byte> haystack, IEnumerable<byte> needle)
    {
        var needleArray = needle as byte[] ?? needle.ToArray();
        var haystackArray = haystack as byte[] ?? haystack.ToArray();

        var needleLength = needleArray.Length;
        var haystackLengthLimit = haystackArray.Length - needleLength;

        if (needleLength > 0)
        {
            for (var i = 0; i <= haystackLengthLimit; i++)
            {
                var j = 0;
                for (; j < needleLength; j++)
                {
                    if (needleArray[j] != haystackArray[i + j])
                        break;
                }

                if (j == needleLength)
                    return i;
            }
        }

        return -1;
    }
}

And here are some tests to show how it behaves.

    [Test]
    [TestCase(new byte[] { 1, 2, 3}, new byte[] {1, 3}, -1)]
    [TestCase(new byte[] { 1, 2, 3}, new byte[] {}, -1)]
    [TestCase(new byte[] { 1, 2, 3}, new byte[] {1}, 0)]
    [TestCase(new byte[] { 1, 2, 3}, new byte[] {2, 3}, 1)]
    [TestCase(new byte[] { 1, 2, 3, 20, 30, 40}, new byte[] {20, 30, 40}, 3)]
    [TestCase(new byte[] { 1, 2}, new byte[] {1, 2, 3}, -1)]
    [TestCase(new byte[] { }, new byte[] {1, 2, 3}, -1)]
    [TestCase(new byte[] { }, new byte[] {}, -1)]
    public void TestIndexOf(byte[] haystack, byte[] needle, int expectedIndex)
    {
        Assert.That(haystack.IndexOf(needle), Is.EqualTo(expectedIndex));
    }

- stoj

0

byte[] any = { 0xff, 0x14, 0x1f, 0x13, 0x12, 0x2f, 0x3f, 0x4f, 0x5f, 0x6f, 0x11, 0x22, 0x23 };
byte[] pattern = { 0x4f, 0x5f, 0x6f };
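// Keep the "-" separators so a match can only begin on a byte boundary.
string anyHexString = BitConverter.ToString(any);          // "FF-14-1F-..."
string patternHexString = BitConverter.ToString(pattern);  // "4F-5F-6F"
int charIndex = anyHexString.IndexOf(patternHexString, StringComparison.Ordinal);
// Each byte occupies three characters ("XX-"); -1 means not found.
int findIndex = charIndex < 0 ? -1 : charIndex / 3;
Console.WriteLine(findIndex);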

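If you don't care about performance, you can use this approach; it is about as concise and clear as it gets.

Convert the byte arrays to hex strings and search.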

- 拉拉姬

-2

You could probably have come up with this yourself, but sometimes I like to do the simple thing.

bool found = false;
int i = 0;
for(; i < byteArray.Length && !found; i++)
{
  if(byteArray[i] == lookingFor)
  {
    found = true;
  }
}

- Aaron Anodide

2

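I think you misunderstood the question. Think of it as finding a word within a string, except that the word is a byte[] and the string is another byte[]. - jason

Yes, I read it as finding a byte in a byte array. My mistake. If you have ASCII, you can use ASCIIEncoding.ASCII.GetString to convert the byte[] to a string. - Aaron Anodide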
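
A sketch of the string-conversion idea from the comment above. One caveat: ASCII decoding replaces every byte above 0x7F with '?', so distinct bytes can collide. The version below swaps in ISO-8859-1 (Latin-1) instead, which maps each byte value 0-255 to exactly one char, and uses an ordinal comparison so that no culture rules apply. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class StringSearch
{
    // ISO-8859-1 maps every byte value to the char with the same code point.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("ISO-8859-1");

    // Returns the index of the first occurrence of needle in haystack, or -1.
    public static int IndexOf(byte[] haystack, byte[] needle)
    {
        string h = Latin1.GetString(haystack);
        string n = Latin1.GetString(needle);
        // One char per byte, so the char index equals the byte index.
        return h.IndexOf(n, StringComparison.Ordinal);
    }
}
```

This allocates two strings the size of the inputs, so it trades memory and speed for brevity.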

Page content provided by Stack Overflow.

- Michael Geary · Accepted Answer

Here is a faster version of Ergwun's excellent answer:

static int SearchBytes( byte[] haystack, byte[] needle ) {
    var len = needle.Length;
    var limit = haystack.Length - len;
    for( var i = 0;  i <= limit;  i++ ) {
        var k = 0;
        for( ;  k < len;  k++ ) {
            if( needle[k] != haystack[i+k] ) break;
        }
        if( k == len ) return i;
    }
    return -1;
}

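In a brief test with an 11 MB haystack and a 9-byte needle, this was about three times faster.

The optimizations are:

No function call inside the outer loop.
The needle length and the search limit are cached.
The redundant length test at the start of match() is removed.

Of course, for long byte arrays you would probably want an algorithm such as Boyer-Moore search, but for many purposes a simple algorithm like this one is good enough, and it has the virtue of being short, easy to understand, and easy to verify.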
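
For reference, here is a minimal sketch of Boyer-Moore-Horspool, a simplified variant of Boyer-Moore that uses only a bad-character shift table. It is not the full Boyer-Moore algorithm, and the class and method names are chosen for illustration. For an empty needle it returns 0, the same result SearchBytes gives.

```csharp
using System;

static class HorspoolSearch
{
    // Returns the index of the first occurrence of needle in haystack, or -1.
    public static int IndexOf(byte[] haystack, byte[] needle)
    {
        int m = needle.Length;
        if (m == 0) return 0;               // same result as SearchBytes above
        if (m > haystack.Length) return -1;

        // shift[b]: how far the window may move when its last byte is b.
        var shift = new int[256];
        for (int b = 0; b < 256; b++) shift[b] = m;
        for (int k = 0; k < m - 1; k++) shift[needle[k]] = m - 1 - k;

        int i = 0;
        while (i <= haystack.Length - m)
        {
            // Compare the window against the needle from right to left.
            int k = m - 1;
            while (k >= 0 && haystack[i + k] == needle[k]) k--;
            if (k < 0) return i;

            // Skip ahead based on the byte under the window's last position.
            i += shift[haystack[i + m - 1]];
        }
        return -1;
    }
}
```

When the byte under the end of the window does not occur in the needle, the window jumps a full needle length, which is where the speedup on long inputs comes from. The table costs a fixed 256 entries, so for very short haystacks the simple loop may still be the better choice.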