Reading specific bytes from a file

Question

Reading specific bytes from a file

44

Is there a way to read specific bytes from a file?

For example, I have the following code to read all the bytes of a file:

byte[] test = File.ReadAllBytes(file);

I want to read the bytes from offset 50 to offset 60 and put them in an array.

- Ahmed M. Taher

7 Answers

33

This should do it:

var data = new byte[10];
int actualRead;

using (FileStream fs = new FileStream("c:\\MyFile.bin", FileMode.Open)) {
    fs.Position = 50;
    actualRead = 0;
    do {
        actualRead += fs.Read(data, actualRead, 10-actualRead);
    } while (actualRead != 10 && fs.Position < fs.Length);
}

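After this runs, data contains the 10 bytes at file offsets 50 through 59, and actualRead holds a number from 0 to 10 indicating how many bytes were actually read (this matters when the file has at least 50 but fewer than 60 bytes). If the file is shorter than 50 bytes, setting Position past the end does not throw by itself; Read returns 0 on the first call and actualRead stays 0.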

- Sergey Kalinichenko

2

You should always check the return value of Read and loop as needed. It is legal for Read to return 1 even if another 20,000 bytes are available. - Marc Gravell

1

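From MSDN for FileStream.Read: "An implementation is free to return fewer bytes than requested even if the end of the stream has not been reached." - Marc Gravell

More importantly, the documentation explicitly reserves that right, so if you don't loop, you are not following the published API. - Marc Gravell

You still aren't updating pos correctly; imagine it returns 1 byte each time... you would overwrite offset 1 every time (except the first), and you would ask it to read too much data (10-pos). - Marc Gravell

Unlike Robert Rouhani's solution, this one allows multiple file access. - Soleil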

6

LINQ version:

byte[] test = File.ReadAllBytes(file).Skip(50).Take(10).ToArray();

- user703016

71

This reads the entire contents of the file but only uses 10 bytes of it. That is not very optimal :) - the_joric

1

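@the_joric However, if reading an arbitrary range of bytes from a file is a common need, then a helper that takes a file name and returns a lazy IEnumerable<byte>, used in place of File.ReadAllBytes, would be a valid approach. - Richard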

1

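@Richard -- Not quite. The LINQ Skip method still iterates over those bytes; it just doesn't yield them to the caller. What you really want is to issue the read directly from the offset. Using a custom API call for each operating system would be the fastest solution, although the asker probably wants a pure .NET approach for peace of mind. - Phil Whittington

@PhilWhittington In this case it makes no difference: everything is in the first block of the file, so either way only one read of file data is needed. - Richard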
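
As an illustration of the lazy helper discussed in the comments above, here is a minimal sketch. The class name LazyFileBytes, the method name ReadLazily, and the optional offset parameter are all hypothetical and not taken from any answer in this thread; the offset parameter is an added assumption so that the stream seeks directly to the starting position instead of enumerating and discarding the skipped bytes.

```csharp
using System.Collections.Generic;
using System.IO;

public static class LazyFileBytes
{
    // Hypothetical helper: yields the bytes of a file one at a time,
    // starting at the given file offset. Nothing is read until the
    // caller starts enumerating.
    public static IEnumerable<byte> ReadLazily(string path, long offset = 0)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // Seek instead of Skip, so the skipped bytes are never enumerated.
            fs.Seek(offset, SeekOrigin.Begin);

            int b;
            // ReadByte returns -1 at the end of the stream.
            while ((b = fs.ReadByte()) != -1)
            {
                yield return (byte)b;
            }
        }
    }
}
```

With this sketch, LazyFileBytes.ReadLazily(file, 50).Take(10).ToArray() (with using System.Linq) would produce at most 10 bytes without loading the whole file. Reading byte by byte is simple but likely slower than a single buffered Read for large ranges.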

6

This is terrible and completely defeats the purpose of reading only part of a file. - brthornbury

2

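Answers like this are one reason LINQ has such a bad reputation for performance. 9 upvotes means people have implemented this solution. Somewhere there is now an innocent-looking utility function waiting to be asked to load a small part of a file that is larger than the machine's RAM. Pour one out for the colleagues who will have to debug that crash. - Eli Davis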

3

You need to:

seek to the data you want
call Read repeatedly, checking the return value, until you have all the data you need

For example:

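using System.IO;

public static byte[] ReadBytes(string path, int offset, int count) {
    using(var file = File.OpenRead(path)) {
        file.Position = offset;   // seek to the requested position in the file
        offset = 0;               // from here on, offset is the index into buffer
        byte[] buffer = new byte[count];
        int read;
        // Read may return fewer bytes than requested, so loop until done or end of file
        while(count > 0  &&  (read = file.Read(buffer, offset, count)) > 0 )
        {
            offset += read;
            count -= read;
        }
        // count > 0 here means the file ended before all bytes were read
        if(count > 0) throw new EndOfStreamException();
        return buffer;
    }
}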

- Marc Gravell

0

using System.IO;

public static byte[] ReadFile(string filePath)
{
    byte[] buffer;
    FileStream fileStream = new FileStream(filePath, FileMode.Open, FileAccess.Read);
    try
    {
        buffer = new byte[length];            // create buffer
        fileStream.Read(buffer, 50, 10);
    }
    finally
    {
        fileStream.Close();
    }
    return buffer;
}

- DonCallisto

3

The "offset" in the Read call is the offset into the buffer, not the offset into the stream. - Marc Gravell
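
To make the distinction in this comment concrete, here is a minimal sketch. The class name ReadOffsetDemo and the method name ReadAt are hypothetical and only for illustration. The file position is set on the stream with Seek, while the second argument of Read only says where in the buffer the next bytes are stored.

```csharp
using System;
using System.IO;

public static class ReadOffsetDemo
{
    // Hypothetical helper: returns up to `count` bytes starting at
    // `fileOffset`; the result is shorter if the file ends early.
    public static byte[] ReadAt(string path, long fileOffset, int count)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // Offset into the FILE: set on the stream itself.
            fs.Seek(fileOffset, SeekOrigin.Begin);

            byte[] buffer = new byte[count];
            int total = 0;
            while (total < count)
            {
                // Offset into the BUFFER: where the next bytes are stored.
                int n = fs.Read(buffer, total, count - total);
                if (n == 0) break; // end of file reached
                total += n;
            }

            // Trim the array when fewer bytes were available than requested.
            if (total < count) Array.Resize(ref buffer, total);
            return buffer;
        }
    }
}
```

Because the buffer here is exactly count bytes long, passing 50 as the second argument of Read with a 10-byte buffer would throw rather than read from file offset 50.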

0

You can use a FileStream and then call Read:

string pathSource = @"c:\tests\source.txt";

using (FileStream fsSource = new FileStream(pathSource,
    FileMode.Open, FileAccess.Read))
{

    // Read the source file into a byte array.
    byte[] bytes = new byte[fsSource.Length];
    int numBytesToRead = 10;
    int numBytesRead = 50;
    // Read may return anything from 0 to numBytesToRead.
    int n = fsSource.Read(bytes, numBytesRead, numBytesToRead);
}

See this MSDN example.

- oqx

numBytesRead is the offset; the parameters are (buffer, offset, count). - oqx

2

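On the contrary, the second parameter of FileStream.Read is the offset into the array passed as the first parameter, not an offset into the file. So my understanding was correct! :-) (The code will now throw, because index 50 is past the end of bytes.) - Richard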

-3

byte[] a = new byte[60];
byte[] b = new byte[10];
Array.Copy( a ,50, b , 0 , 10 );

- Mesh

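Only because you edited the question... that was not clear in the original post. - Mesh

I suggest you look at my edit. The file requirement was already there (and I didn't change the title). - Richard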

- Robert Rouhani · Accepted Answer

Create a BinaryReader and read 10 bytes starting at byte 50:

byte[] test = new byte[10];
using (BinaryReader reader = new BinaryReader(new FileStream(file, FileMode.Open)))
{
    reader.BaseStream.Seek(50, SeekOrigin.Begin);
    reader.Read(test, 0, 10);
}
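
A possible variation on the same idea, offered as a sketch rather than as part of the accepted answer: the names BinaryReaderDemo and ReadRange are hypothetical. It opens the file read-only with FileShare.Read, which is an assumption about the desired sharing behavior, and uses BinaryReader.ReadBytes, which returns a shorter array only when the end of the stream is reached.

```csharp
using System.IO;

public static class BinaryReaderDemo
{
    // Hypothetical helper: reads up to `count` bytes starting at `offset`.
    public static byte[] ReadRange(string path, long offset, int count)
    {
        // FileAccess.Read + FileShare.Read: other readers may open the
        // file at the same time (an assumption about what is wanted here).
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (var reader = new BinaryReader(stream))
        {
            reader.BaseStream.Seek(offset, SeekOrigin.Begin);

            // ReadBytes allocates the array itself; the result is shorter
            // than `count` only if the file ends first.
            return reader.ReadBytes(count);
        }
    }
}
```

For the case in the question, BinaryReaderDemo.ReadRange(file, 50, 10) would return the bytes at offsets 50 through 59 when the file is long enough.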