unsigned char * - Equivalent C#

Question

3

I am porting a library from C++ to C#, but I have run into a case I am not sure how to handle: it involves casting an unsigned char * to an unsigned int *.

C++

unsigned int c4;
unsigned int c2;
unsigned int h4;

int pos(unsigned char *p)
{
    c4 = *(reinterpret_cast<unsigned int *>(p - 4));
    c2 = *(reinterpret_cast<unsigned short *>(p - 2));
    h4 = ((c4 >> 11) ^ c4) & (N4 - 1);

    if ((tab4[h4][0] != 0) && (tab4[h4][1] == c4))
    {
        c = 256;
        return (tab4[h4][0]);
    }

    c = 257;
    return (tab2[c2]);
}

C# (this is wrong):

 public uint pos(byte p) 
 {
        c4 = (uint)(p - 4);
        c2 = (ushort)(p - 2);
        h4 = ((c4 >> 11) ^ c4) & (1 << 20 - 1);
        if ((tab4[h4, 0] != 0) && (tab4[h4, 1] == c4)) {
            c = 256;
            return (tab4[h4, 0]);
        }
        c = 257;
        return (tab2[c2]);
 }

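I think that in the C# example I could change byte p to byte[], but I get confused when it comes to converting a byte[] into a single uint value.

Also, could someone explain why you would cast an unsigned char * to an unsigned int *? What purpose does it serve?

Any help or pointers would be much appreciated.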

- Steve_B19

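Because of the pointer arithmetic in p-4 and p-2, we need to see the calling context before we try to define a suitable signature for the C# equivalent. What is the memory layout of p in the source? - Cee McSharpface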

2

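You can use the BitConverter class. - bitbonk

@dlatikay Thanks for your comment. The context of p is an unsigned char [1 << 26]; the whole file is read into the char[] with fread. pos is then called in a loop: x = pos(&buf[i]); - Steve_B19

In that loop, does i skip the first four bytes and then advance in steps of 6? Could you show part of it? I would suggest rewriting the file parser to use a struct that reflects the layout of an Int32 followed by a ushort, or perhaps just giving pos() two parameters. I don't think the controversy over the repeated use of reinterpret_cast is what we are after here. - Cee McSharpface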

1

Let it flow, @πάντα ῥεῖ. We have a way to help the OP without any of the reinterpret_cast mess. It looks like a clean port, without any pointer arithmetic or unsafe operations, is feasible. - Cee McSharpface


3 Answers

1

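Implementing the same functionality in C# does not require replicating the C version statement by statement, especially when the original uses pointers.
If we assume an architecture where int is 32 bits wide, you can simplify the C# version as follows: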

uint[] tab2;
uint[,] tab4;
ushort c;

public uint pos(uint c4)
{
    // Parenthesized: without it, 1 << 20 - 1 parses as 1 << 19, which is not the N4 - 1 mask.
    var h4 = ((c4 >> 11) ^ c4) & ((1u << 20) - 1);
    if ((tab4[h4, 0] != 0) && (tab4[h4, 1] == c4))
    {
        c = 256;
        return (tab4[h4, 0]);
    }
    else
    {
        c = 257;
        var c2 = (c4 >> 16) & 0xffff; // HIWORD
        return (tab2[c2]);
    }
}

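This simplification is possible because c4 and c2 overlap: on a little-endian machine, c2 is the high word of c4, and it is only needed when the lookup in tab4 does not match.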
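
To illustrate the overlap, here is a small C++ sketch. It assumes the file bytes are meant to be read in little-endian order (which is what the original cast does on x86); the helper names load_u32_le, load_u16_le and c2_is_high_word_of_c4 are hypothetical and do not come from the library being ported.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only.
// Assemble a 32-bit value from four bytes, least significant byte first.
inline std::uint32_t load_u32_le(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Assemble a 16-bit value from two bytes, least significant byte first.
inline std::uint16_t load_u16_le(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// True when the two bytes before p equal the high word of the four bytes before p.
inline bool c2_is_high_word_of_c4(const unsigned char* p) {
    const std::uint32_t c4 = load_u32_le(p - 4);
    const std::uint16_t c2 = load_u16_le(p - 2);
    return c2 == ((c4 >> 16) & 0xffffu);
}
```

Since the bytes at p - 2 and p - 1 are the two most significant bytes of the little-endian value that starts at p - 4, the check holds for any input, which is why the C# version can derive c2 from c4 with a shift.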

(The identifier N4 exists in the original code, but in your translation it was replaced by the expression 1<<20.)
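
One thing to watch when inlining N4: in both C++ and C#, binary minus binds tighter than the shift operator, so 1 << 20 - 1 written without parentheses is not the same mask as N4 - 1. The sketch below assumes that N4 is 1 << 20, as the note above implies; hash4 is a hypothetical name for the hashing step of pos().

```cpp
#include <cstdint>

// Assumption: N4 is 1 << 20 in the original source.
constexpr std::uint32_t N4 = 1u << 20;

// How the compiler parses "1 << 20 - 1": the subtraction happens first.
constexpr std::uint32_t as_parsed = 1u << (20 - 1);
// The mask that the C++ expression N4 - 1 actually produces.
constexpr std::uint32_t intended_mask = (1u << 20) - 1;

static_assert(as_parsed == 0x80000u, "a single bit, not a mask of the low 20 bits");
static_assert(intended_mask == 0xFFFFFu, "the low 20 bits set");
static_assert(intended_mask == N4 - 1, "matches the original expression");

// Hypothetical name for the hashing step in pos(): fold c4 and keep the low 20 bits.
constexpr std::uint32_t hash4(std::uint32_t c4) {
    return ((c4 >> 11) ^ c4) & (N4 - 1);
}
```

With the unparenthesized form, h4 could only ever be 0 or 0x80000, so nearly every lookup in tab4 would collide.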

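The calling code will have to loop over an array of int, which is possible according to the comments. While the original C++ code starts at offset 4 and looks backwards, the C# equivalent would start at offset 0, which seems more natural.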

- Cee McSharpface

0

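In the C++ code you are passing a pointer to char, but C# does not normally work with memory that way; you need to use an array instead of a pointer. You can, however, use the "unsafe" keyword to work with memory directly. https://msdn.microsoft.com/en-us/library/chfa2zb8.aspx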

- Uladzimir Palekh


- Ben Voigt · Accepted Answer

Here are the problem lines, translated:

int pos(byte[] a, int offset)
{
    // Read the four bytes immediately preceding offset
    c4 = BitConverter.ToUInt32(a, offset - 4);
    // Read the two bytes immediately preceding offset
    c2 = BitConverter.ToUInt16(a, offset - 2);

and change the call from x = pos(&buf[i]) (which, even in C++, is the same as x = pos(buf + i)) to

x = pos(buf, i);

One important note: the existing C++ code is incorrect, because it violates the strict aliasing rule.
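
For reference, a common way to make the C++ side well-defined is to copy the bytes out with std::memcpy instead of dereferencing a cast pointer. The following is only a sketch: load_from_bytes, Context and read_context are hypothetical names, and the values are read in the host's byte order, just as the original cast would have done.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a T from raw bytes in host byte order.
// memcpy avoids both the aliasing violation and any alignment requirement.
template <typename T>
T load_from_bytes(const unsigned char* p) {
    T value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Hypothetical aggregate holding the two values that pos() reads.
struct Context {
    std::uint32_t c4;  // the four bytes immediately preceding p
    std::uint16_t c2;  // the two bytes immediately preceding p
};

inline Context read_context(const unsigned char* p) {
    return Context{ load_from_bytes<std::uint32_t>(p - 4),
                    load_from_bytes<std::uint16_t>(p - 2) };
}
```

This plays the same role as the BitConverter calls in the C# version: both copy the bytes into a value of the target type rather than reinterpreting the buffer in place.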