C# - Binary reader in big endian?

40

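I'm trying to improve my understanding of the STFS file format by using a program to read all the different pieces of information. Using a website with a reference of which offsets contain what information, I wrote some code that has a binary reader go through the file and place the values in the correct variables.

The problem is that all the data is supposed to be big endian, and everything the binary reader reads is little endian. So, what's the best way to fix this?

Can I create a mimic class of BinaryReader that returns a reversed array of bytes? Or is there something I can change on the class instance to make it read in big endian, so I don't have to rewrite everything?

Any help is appreciated.

Edit: I tried adding Encoding.BigEndianUnicode as a parameter, but it still reads little endian.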


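@HansPassant, is this one of those DLLs that would require me to open-source my code? Why do some DLLs require that? - mowwwalker
Skeet sells books; the code comes with few strings attached. See the license section on that page. The Apache terms are here: http://www.apache.org/licenses/LICENSE-2.0.html - Hans Passant
@MikeNakis, right, you were correct about the byte array. I'm still learning :D - mowwwalker
@HansPassant, thanks, but I've already answered my own question :D - mowwwalker
Okay, that's it. Good programmers accomplish miracles in 21 minutes :) - Hans Passant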
8 Answers

42

I'm not usually one to answer my own questions, but I've accomplished exactly what I wanted with some simple code:

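using System;
using System.IO;

// Reads multi-byte integers as big endian by reversing the bytes before conversion.
// Assumes a little-endian host, because BitConverter uses the host's byte order.
class BinaryReader2 : BinaryReader
{
    public BinaryReader2(Stream stream) : base(stream) { }

    public override int ReadInt32()
    {
        var data = base.ReadBytes(4);
        Array.Reverse(data);
        return BitConverter.ToInt32(data, 0);
    }

    // "override" matters: without it the method only hides the virtual base method,
    // so calls made through a BinaryReader reference would still read little endian.
    public override Int16 ReadInt16()
    {
        var data = base.ReadBytes(2);
        Array.Reverse(data);
        return BitConverter.ToInt16(data, 0);
    }

    public override Int64 ReadInt64()
    {
        var data = base.ReadBytes(8);
        Array.Reverse(data);
        return BitConverter.ToInt64(data, 0);
    }

    public override UInt32 ReadUInt32()
    {
        var data = base.ReadBytes(4);
        Array.Reverse(data);
        return BitConverter.ToUInt32(data, 0);
    }
}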

I knew that's what I wanted, but I didn't know how to write it. I found this page and it helped a lot: http://www.codekeep.net/snippets/870c4ab3-419b-4dd2-a950-6d45beaf1295.aspx


12
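Off-topic, but your class's fields (a16 etc.) are unnecessary. You assign an array to them in the constructor, but inside each method you replace that array with a new one returned by the Read function. You could just put var a32 = base.ReadBytes... in each method and drop the fields. - Daniel Earwicker
16
They're not just unnecessary, they're harmful. They turn code that could otherwise be thread-safe (ignoring the shared underlying stream) into a shared-state situation. - skolima
9
You may want to check BitConverter.IsLittleEndian before reversing. If it is false, you don't need to reverse. - João Portela
1
@JoãoPortela That depends on the expected endianness of the source data! :D - Gusdor
4
Yes, you need to know both the endianness of the source data and the endianness of BitConverter; if they don't match, reverse. - João Portela
5
Allocating an array for every read creates extra work for the garbage collector. You could just call ReadByte as many times as needed, then shift/OR the bytes into place. - Drew Noakes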
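
The shift/OR idea from the last comment could look roughly like the following. This is a minimal sketch rather than code from the thread: the class name BigEndianShiftReader and its method names are hypothetical. It avoids the per-read array and does not depend on the host's byte order, because no BitConverter call is involved.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the thread): big-endian reads without allocating arrays.
public static class BigEndianShiftReader
{
    // Reads a big-endian 32-bit signed integer one byte at a time.
    // BinaryReader.ReadByte throws EndOfStreamException if the stream runs out.
    public static int ReadInt32BigEndian(BinaryReader reader)
    {
        int b0 = reader.ReadByte(); // most significant byte arrives first
        int b1 = reader.ReadByte();
        int b2 = reader.ReadByte();
        int b3 = reader.ReadByte(); // least significant byte arrives last
        return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
    }

    // Reads a big-endian 16-bit signed integer.
    public static short ReadInt16BigEndian(BinaryReader reader)
    {
        int b0 = reader.ReadByte();
        int b1 = reader.ReadByte();
        // unchecked keeps the truncating cast safe even when compiled with overflow checks.
        return unchecked((short)((b0 << 8) | b1));
    }
}
```

Because the bytes are combined arithmetically, the same code gives the same result on little-endian and big-endian hosts.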

17

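IMHO this is a slightly better answer, because it doesn't require a new class, makes the big-endian calls obvious, and allows big- and little-endian calls to be mixed in the same stream.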

public static class Helpers
{
  // Note this MODIFIES THE GIVEN ARRAY then returns a reference to the modified array.
  public static byte[] Reverse(this byte[] b)
  {
    Array.Reverse(b);
    return b;
  }

  public static UInt16 ReadUInt16BE(this BinaryReader binRdr)
  {
    return BitConverter.ToUInt16(binRdr.ReadBytesRequired(sizeof(UInt16)).Reverse(), 0);
  }

  public static Int16 ReadInt16BE(this BinaryReader binRdr)
  {
    return BitConverter.ToInt16(binRdr.ReadBytesRequired(sizeof(Int16)).Reverse(), 0);
  }

  public static UInt32 ReadUInt32BE(this BinaryReader binRdr)
  {
    return BitConverter.ToUInt32(binRdr.ReadBytesRequired(sizeof(UInt32)).Reverse(), 0);
  }

  public static Int32 ReadInt32BE(this BinaryReader binRdr)
  {
    return BitConverter.ToInt32(binRdr.ReadBytesRequired(sizeof(Int32)).Reverse(), 0);
  }

  public static byte[] ReadBytesRequired(this BinaryReader binRdr, int byteCount)
  {
    var result = binRdr.ReadBytes(byteCount);

    if (result.Length != byteCount)
      throw new EndOfStreamException(string.Format("{0} bytes required from stream, but only {1} returned.", byteCount, result.Length));

    return result;
  }
}

13
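Remember to check BitConverter.IsLittleEndian before reversing. - João Portela
It looks like you need to add ".ToArray()" after Reverse, because Reverse returns IEnumerable<byte> rather than byte[] (which is what BitConverter needs). - Joezer
3
Since .NET Core there is also the BinaryPrimitives class, which makes this approach obsolete: https://learn.microsoft.com/en-us/dotnet/api/system.buffers.binary.binaryprimitives - nikeee
@MorganHarris Yes. But the check is for BitConverter. If this code runs on an architecture with a different endianness, the assumption about how BitConverter behaves no longer holds. - João Portela
1
@JoãoPortela You know, I didn't even see the BitConverter in there. Yes, in my opinion it's best not to use it. BinaryPrimitives is the right tool. - Morgan Harris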

8

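Here is a (for my needs) nearly complete drop-in replacement for BinaryReader that handles endianness correctly, unlike most of the other answers. By default it works exactly like BinaryReader, but it can be constructed to read in the required endianness. In addition, the Read<Primitive> methods are overloaded so that you can specify the endianness for a particular value, which is useful in the (unlikely) scenario of a stream with mixed LE/BE data.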

public class EndiannessAwareBinaryReader : BinaryReader
{
    public enum Endianness
    {
        Little,
        Big,
    }

    private readonly Endianness _endianness = Endianness.Little;

    public EndiannessAwareBinaryReader(Stream input) : base(input)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding) : base(input, encoding)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, bool leaveOpen) : base(input, encoding, leaveOpen)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Endianness endianness) : base(input)
    {
        _endianness = endianness;
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, Endianness endianness) : base(input, encoding)
    {
        _endianness = endianness;
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, bool leaveOpen, Endianness endianness) : base(input, encoding, leaveOpen)
    {
        _endianness = endianness;
    }

    public override short ReadInt16() => ReadInt16(_endianness);

    public override int ReadInt32() => ReadInt32(_endianness);

    public override long ReadInt64() => ReadInt64(_endianness);

    public override ushort ReadUInt16() => ReadUInt16(_endianness);

    public override uint ReadUInt32() => ReadUInt32(_endianness);

    public override ulong ReadUInt64() => ReadUInt64(_endianness);

    public short ReadInt16(Endianness endianness) => BitConverter.ToInt16(ReadForEndianness(sizeof(short), endianness));

    public int ReadInt32(Endianness endianness) => BitConverter.ToInt32(ReadForEndianness(sizeof(int), endianness));

    public long ReadInt64(Endianness endianness) => BitConverter.ToInt64(ReadForEndianness(sizeof(long), endianness));

    public ushort ReadUInt16(Endianness endianness) => BitConverter.ToUInt16(ReadForEndianness(sizeof(ushort), endianness));

    public uint ReadUInt32(Endianness endianness) => BitConverter.ToUInt32(ReadForEndianness(sizeof(uint), endianness));

    public ulong ReadUInt64(Endianness endianness) => BitConverter.ToUInt64(ReadForEndianness(sizeof(ulong), endianness));

    private byte[] ReadForEndianness(int bytesToRead, Endianness endianness)
    {
        var bytesRead = ReadBytes(bytesToRead);

        if ((endianness == Endianness.Little && !BitConverter.IsLittleEndian)
            || (endianness == Endianness.Big && BitConverter.IsLittleEndian))
        {
            Array.Reverse(bytesRead);
        }

        return bytesRead;
    }
}

1
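Best solution: it handles both the host system's endianness and the source data's endianness, and only reverses the data when it has to. - Thomas Hilbert
Great solution, but the BitConverter methods need an additional startIndex argument: public short ReadInt16(Endianness endianness) => BitConverter.ToInt16(ReadForEndianness(sizeof(short), endianness), 0); - Peter Wilson
@PeterWilson Are you trying to use this on .NET Framework? - Ian Kemp
This code works, but it isn't efficient: the CPU has a single instruction for endianness conversion (bswap), so array manipulation isn't needed. - Bogdan Mart
1
Very nice solution. If you are on .NET Core 2.1+, you can use this version instead; it uses BinaryPrimitives, so the reversal is handled by the framework, and according to Stephen Toub it performs better. https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-core-2-1/ - Mahmoud Ali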
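
As a rough illustration of the byte-swap point raised in these comments, a value read by the stock BinaryReader can be swapped in one call instead of reversing an array. This is only a sketch: the class name ReverseEndiannessExample is hypothetical, and it relies on BinaryReader being documented to decode integers as little endian, so the swap is applied unconditionally. BinaryPrimitives.ReverseEndianness requires .NET Core 2.1 or later.

```csharp
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper (not from the thread): swap after a normal little-endian read.
public static class ReverseEndiannessExample
{
    // Reads 4 bytes that are stored big endian in the stream.
    public static uint ReadUInt32BigEndian(BinaryReader reader)
    {
        // BinaryReader decodes these 4 bytes as little endian...
        uint littleEndianValue = reader.ReadUInt32();
        // ...so swapping the byte order yields the big-endian interpretation.
        return BinaryPrimitives.ReverseEndianness(littleEndianValue);
    }
}
```

No temporary array is allocated here, and the runtime is free to implement the swap with a dedicated CPU instruction where one is available.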

8
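I'm not familiar with STFS, but changing endianness is relatively easy. "Network order" is big endian, so all you need to do is convert from network order to host order.
This is easy because there is already code that does it. Look at IPAddress.NetworkToHostOrder, as explained here: ntohs() and ntohl() equivalent?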
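
A minimal sketch of that suggestion follows. The class and method names are hypothetical and not part of the answer. The raw bytes are first interpreted in host order with BitConverter, then IPAddress.NetworkToHostOrder swaps them only if the host is little endian.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper (not from the thread): treat big-endian data as "network order".
public static class NetworkOrderExample
{
    public static int ReadInt32NetworkOrder(BinaryReader reader)
    {
        byte[] bytes = reader.ReadBytes(sizeof(int));
        if (bytes.Length != sizeof(int))
            throw new EndOfStreamException("Not enough bytes left for an Int32.");

        // BitConverter interprets the bytes in the host's byte order.
        int hostInterpreted = BitConverter.ToInt32(bytes, 0);

        // NetworkToHostOrder converts big endian (network order) to host order.
        return IPAddress.NetworkToHostOrder(hostInterpreted);
    }
}
```

IPAddress.NetworkToHostOrder only has overloads for short, int and long, so unsigned values would need a cast on the way in and out.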

6

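In my opinion, you need to be careful here. The reason to convert from big endian to little endian is that the bytes being read are big endian while the OS doing the computation on them is little endian.

C# is no longer a Windows-only language. With ports such as Mono, and other Microsoft platforms such as Windows Phone 7/8, Xbox 360/Xbox One, Windows CE, Windows 8 Mobile, Linux with Mono, Apple with Mono, etc., it is quite possible that the operating platform is big endian, in which case converting without any check would introduce a bug.

BitConverter already has a field called "IsLittleEndian" that you can use to determine whether the operating environment is little endian. You can then do the reversal conditionally.

So I actually just wrote some byte[] extensions instead of creating a big class: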

    /// <summary>
    /// Gets a byte array from a point in a source byte array and reverses the bytes. Note: if the current platform is not little endian, the input array is assumed to be big endian and the bytes are returned without being reversed.
    /// </summary>
    /// <param name="byteArray">The source array to get reversed bytes for</param>
    /// <param name="startIndex">The index in the source array at which to begin the reverse</param>
    /// <param name="count">The number of bytes to reverse</param>
    /// <returns>A new array containing the reversed bytes, or a sub set of the array not reversed.</returns>
    public static byte[] ReverseForBigEndian(this byte[] byteArray, int startIndex, int count)
    {
        if (BitConverter.IsLittleEndian)
            return byteArray.Reverse(startIndex, count);
        else
            return byteArray.SubArray(startIndex, count);

    }

    public static byte[] Reverse(this byte[] byteArray, int startIndex, int count)
    {
        byte[] ret = new byte[count];
        for (int i = startIndex + (count - 1); i >= startIndex; --i)
        {
            byte b = byteArray[i];
            ret[(startIndex + (count - 1)) - i] = b;
        }
        return ret;
    }

    public static byte[] SubArray(this byte[] byteArray, int startIndex, int count)
    {
        byte[] ret = new byte[count];
        for (int i = 0; i < count; ++i)
            ret[i] = byteArray[i + startIndex]; // copy byte i of the requested range
        return ret;
    }

So imagine this example code:

byte[] fontBytes = new byte[240000]; // some data loaded in here, e.g. a TTF TrueTypeCollection font file (which is big endian)

int _ttcVersionMajor = BitConverter.ToUInt16(fontBytes.ReverseForBigEndian(4, 2), 0);

// output
// _ttcVersionMajor == 1 (the TTC header is version 1)

2

I suggest using the BinaryPrimitives class.

        public override double ReadDouble()
        {
            return BinaryPrimitives.ReadDoubleBigEndian(ReadBytes(8));
        }

        public override short ReadInt16()
        {
            return BinaryPrimitives.ReadInt16BigEndian(ReadBytes(2));
        }

        public override int ReadInt32()
        {
            return BinaryPrimitives.ReadInt32BigEndian(ReadBytes(4));
        }

        public override long ReadInt64()
        {
            return BinaryPrimitives.ReadInt64BigEndian(ReadBytes(8));
        }

        public override float ReadSingle()
        {
            return BinaryPrimitives.ReadSingleBigEndian(ReadBytes(4));
        }

        public override ushort ReadUInt16()
        {
            return BinaryPrimitives.ReadUInt16BigEndian(ReadBytes(2));
        }

        public override uint ReadUInt32()
        {
            return BinaryPrimitives.ReadUInt32BigEndian(ReadBytes(4));
        }

        public override ulong ReadUInt64()
        {
            return BinaryPrimitives.ReadUInt64BigEndian(ReadBytes(8));
        }
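
The overrides above still allocate a byte[] on every call through ReadBytes. On .NET Core 2.1 or later, a variant that reads into a stack buffer might look like the sketch below. It is an illustration under assumptions, not part of the answer: the class name SpanBigEndianReader and the helper FillExactly are hypothetical.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper (not from the thread): big-endian reads without heap allocations.
public static class SpanBigEndianReader
{
    public static int ReadInt32BigEndian(BinaryReader reader)
    {
        Span<byte> buffer = stackalloc byte[sizeof(int)]; // lives on the stack
        FillExactly(reader, buffer);
        return BinaryPrimitives.ReadInt32BigEndian(buffer);
    }

    // BinaryReader.Read(Span<byte>) may return fewer bytes than requested,
    // so keep reading until the buffer is full or the stream ends.
    private static void FillExactly(BinaryReader reader, Span<byte> buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = reader.Read(buffer.Slice(total));
            if (read == 0)
                throw new EndOfStreamException("Stream ended before the value was complete.");
            total += read;
        }
    }
}
```

The explicit loop also turns a truncated stream into an EndOfStreamException instead of passing a short buffer to BinaryPrimitives.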

1

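I've expanded on Ian Kemp's excellent suggestion. I'm using the new BinaryPrimitives, available in .NET Core 2.1+, which according to Stephen Toub's post are more performant and handle the endianness and reversal internally.

So if you are running .NET Core 2.1+, you should definitely use this version: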

public class EndiannessAwareBinaryReader : BinaryReader
{
    public enum Endianness
    {
        Little,
        Big,
    }

    private readonly Endianness _endianness = Endianness.Little;

    public EndiannessAwareBinaryReader(Stream input) : base(input)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding) : base(input, encoding)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, bool leaveOpen) : base(
        input, encoding, leaveOpen)
    {
    }

    public EndiannessAwareBinaryReader(Stream input, Endianness endianness) : base(input)
    {
        _endianness = endianness;
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, Endianness endianness) :
        base(input, encoding)
    {
        _endianness = endianness;
    }

    public EndiannessAwareBinaryReader(Stream input, Encoding encoding, bool leaveOpen,
        Endianness endianness) : base(input, encoding, leaveOpen)
    {
        _endianness = endianness;
    }

    public override short ReadInt16() => ReadInt16(_endianness);

    public override int ReadInt32() => ReadInt32(_endianness);

    public override long ReadInt64() => ReadInt64(_endianness);

    public override ushort ReadUInt16() => ReadUInt16(_endianness);

    public override uint ReadUInt32() => ReadUInt32(_endianness);

    public override ulong ReadUInt64() => ReadUInt64(_endianness);

    public short ReadInt16(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadInt16LittleEndian(ReadBytes(sizeof(short)))
        : BinaryPrimitives.ReadInt16BigEndian(ReadBytes(sizeof(short)));

    public int ReadInt32(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadInt32LittleEndian(ReadBytes(sizeof(int)))
        : BinaryPrimitives.ReadInt32BigEndian(ReadBytes(sizeof(int)));

    public long ReadInt64(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadInt64LittleEndian(ReadBytes(sizeof(long)))
        : BinaryPrimitives.ReadInt64BigEndian(ReadBytes(sizeof(long)));

    public ushort ReadUInt16(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadUInt16LittleEndian(ReadBytes(sizeof(ushort)))
        : BinaryPrimitives.ReadUInt16BigEndian(ReadBytes(sizeof(ushort)));

    public uint ReadUInt32(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadUInt32LittleEndian(ReadBytes(sizeof(uint)))
        : BinaryPrimitives.ReadUInt32BigEndian(ReadBytes(sizeof(uint)));

    public ulong ReadUInt64(Endianness endianness) => endianness == Endianness.Little
        ? BinaryPrimitives.ReadUInt64LittleEndian(ReadBytes(sizeof(ulong)))
        : BinaryPrimitives.ReadUInt64BigEndian(ReadBytes(sizeof(ulong)));
}

0
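You might like this option. It is inspired by the other answers but written generically, so the code stays concise and can cover all the variants without much extra code.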
public static class BinaryReaderEndianExtensions
{
    public static T ReadNativeEndian<T>(this BinaryReader reader, Func<byte[], T> converter) =>
        converter(reader.ReadBytesRequired(Marshal.SizeOf<T>()));
    public static T ReadForeignEndian<T>(this BinaryReader reader, Func<byte[], T> converter) =>
        converter(reader.ReadBytesRequired(Marshal.SizeOf<T>()).Reverse().ToArray());

    public static T ReadBigEndian<T>(this BinaryReader reader, Func<byte[], T> converter) =>
        BitConverter.IsLittleEndian ? reader.ReadForeignEndian(converter) : reader.ReadNativeEndian(converter);
    public static T ReadLittleEndian<T>(this BinaryReader reader, Func<byte[], T> converter) =>
        BitConverter.IsLittleEndian ? reader.ReadNativeEndian(converter) : reader.ReadForeignEndian(converter);

    public static UInt16 ReadUInt16BE(this BinaryReader reader) => reader.ReadBigEndian(bytes => BitConverter.ToUInt16(bytes));
    public static Int16 ReadInt16BE(this BinaryReader reader) => reader.ReadBigEndian(bytes => BitConverter.ToInt16(bytes));
    public static UInt32 ReadUInt32BE(this BinaryReader reader) => reader.ReadBigEndian(bytes => BitConverter.ToUInt32(bytes));
    public static Int32 ReadInt32BE(this BinaryReader reader) => reader.ReadBigEndian(bytes => BitConverter.ToInt32(bytes));
    public static UInt64 ReadUInt64BE(this BinaryReader reader) => reader.ReadBigEndian(bytes => BitConverter.ToUInt64(bytes));
    public static Int64 ReadInt64BE(this BinaryReader reader) => reader.ReadBigEndian(bytes => BitConverter.ToInt64(bytes));
    public static Double ReadDoubleBE(this BinaryReader reader) => reader.ReadBigEndian(bytes => BitConverter.ToDouble(bytes));
    public static Single ReadSingleBE(this BinaryReader reader) => reader.ReadBigEndian(bytes => BitConverter.ToSingle(bytes));
    public static Half ReadHalfBE(this BinaryReader reader) => reader.ReadBigEndian(bytes => BitConverter.ToHalf(bytes));

    public static UInt16 ReadUInt16LE(this BinaryReader reader) => reader.ReadLittleEndian(bytes => BitConverter.ToUInt16(bytes));
    public static Int16 ReadInt16LE(this BinaryReader reader) => reader.ReadLittleEndian(bytes => BitConverter.ToInt16(bytes));
    public static UInt32 ReadUInt32LE(this BinaryReader reader) => reader.ReadLittleEndian(bytes => BitConverter.ToUInt32(bytes));
    public static Int32 ReadInt32LE(this BinaryReader reader) => reader.ReadLittleEndian(bytes => BitConverter.ToInt32(bytes));
    public static UInt64 ReadUInt64LE(this BinaryReader reader) => reader.ReadLittleEndian(bytes => BitConverter.ToUInt64(bytes));
    public static Int64 ReadInt64LE(this BinaryReader reader) => reader.ReadLittleEndian(bytes => BitConverter.ToInt64(bytes));
    public static Double ReadDoubleLE(this BinaryReader reader) => reader.ReadLittleEndian(bytes => BitConverter.ToDouble(bytes));
    public static Single ReadSingleLE(this BinaryReader reader) => reader.ReadLittleEndian(bytes => BitConverter.ToSingle(bytes));
    public static Half ReadHalfLE(this BinaryReader reader) => reader.ReadLittleEndian(bytes => BitConverter.ToHalf(bytes));

    public static byte[] ReadBytesRequired(this BinaryReader reader, int count)
    {
        byte[] bytes = reader.ReadBytes(count);
        return bytes.Length == count ? bytes
            : throw new EndOfStreamException(string.Format("{0} bytes required from stream, but only {1} returned.", count, bytes.Length));
    }
}
