How to read a large file with AsynchronousFileChannel?

Question

5

  Path file = Paths.get("c:/large.log");
  AsynchronousFileChannel channel = AsynchronousFileChannel.open(file);
  final ByteBuffer buffer = ByteBuffer.allocate(1000);
  channel.read(buffer, 0, buffer,
      new CompletionHandler<Integer, ByteBuffer>() {
        public void completed(Integer result, ByteBuffer attachment) {
          System.out.println(new String(buffer.array()));
        }

        // CompletionHandler also requires failed(); without it the snippet does not compile
        public void failed(Throwable exc, ByteBuffer attachment) {
          exc.printStackTrace();
        }
  });

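This way I can read the first 1000 bytes of large.log. How can I read the rest of the log if I don't want to allocate a bigger byte array such as ByteBuffer.allocate(1000*1000)? I think that would lead to an OutOfMemoryError.

Could someone give me sample code? Thanks.

PS: I can read a large file in a loop with classic java.io, because I can check the return value of java.io.BufferedReader.read(). But I don't know how to do the same with NIO2.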

- liam xu

4 Answers

1

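GregHNZ's solution is great. Since I need this kind of code in several different projects, I ended up putting it in a helper library, RxIo, which is published to the Maven Central Repository and is also available in the RxIo github repository. With RxIo you can use the AsyncFiles utility class to read all bytes of a file like this: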

AsyncFiles
    .readAllBytes(Paths.get("input.txt"))
    .thenApply(bytes -> { /*... use bytes... */});

readAllBytes(Path file) allocates a ByteBuffer with a default size of 262144 bytes, but you can specify a different size with readAllBytes(Path file, int bufferSize).

You can see other use cases in the unit tests folder.

- Miguel Gamboa

0

In the completionHandler you can start another read if there is anything left in the file. But I would use a much bigger buffer than 1000, at least 8192.

- user207421

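Works for me. Of course you have to clear the buffer and increment the position parameter in the read, so there is a bit of juggling with final variables, but it can be done. - user207421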
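
The approach described in this answer and its comment (restart the read from inside the completion handler, clear the buffer, advance the position) could be sketched roughly as follows. This is a minimal sketch rather than code from the answer: the class name ChunkedAsyncRead, the method readChunks and its onChunk callback are hypothetical names introduced for illustration. It passes the file position as the attachment, which is one way to avoid juggling final variables.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical helper class, not part of the JDK or of any answer above.
class ChunkedAsyncRead {

    // Reads the file in chunks of at most bufferSize bytes, hands each chunk
    // to onChunk, and completes the future with the total number of bytes read.
    static CompletableFuture<Long> readChunks(Path file, int bufferSize,
                                              Consumer<ByteBuffer> onChunk) throws IOException {
        AsynchronousFileChannel channel =
                AsynchronousFileChannel.open(file, StandardOpenOption.READ);
        ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
        CompletableFuture<Long> done = new CompletableFuture<>();

        // The attachment carries the file position of the read in flight.
        channel.read(buffer, 0L, 0L, new CompletionHandler<Integer, Long>() {
            @Override
            public void completed(Integer result, Long position) {
                try {
                    if (result < 0) {            // -1 signals end of file
                        channel.close();
                        done.complete(position); // position equals total bytes read
                        return;
                    }
                    buffer.flip();               // expose only the bytes just read
                    onChunk.accept(buffer);
                    buffer.clear();              // make the whole buffer writable again
                    long next = position + result;
                    channel.read(buffer, next, next, this); // same handler, new position
                } catch (Exception e) {
                    failed(e, position);
                }
            }

            @Override
            public void failed(Throwable exc, Long position) {
                try { channel.close(); } catch (IOException ignored) { }
                done.completeExceptionally(exc);
            }
        });
        return done;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: ChunkedAsyncRead <file>");
            return;
        }
        long total = readChunks(Paths.get(args[0]), 8192,
                chunk -> System.out.println("chunk of " + chunk.remaining() + " bytes")).join();
        System.out.println("read " + total + " bytes");
    }
}
```

Only one buffer of bufferSize bytes is ever allocated, so memory use stays constant regardless of the file size. The callback must consume the chunk before it returns, because the buffer is reused for the next read.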

0

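Using the position in the file and the file size, the asynchronous read has to be issued repeatedly from within the completion handler to read the whole file. Each time a read completes, the position must be increased by the number of bytes read.

Below is the completed method of a completion handler that reads the whole file asynchronously. For a complete example see http://www.zoftino.com/java-asynchronous-io-nio2.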

public void completed(Integer result, ByteBuffer attachment) {
    try {
        bb.flip();
        System.out.println("bytes read " + bb.limit());

        // advance past the bytes just read before checking whether more remain
        position = position + bb.limit();
        if (afc.size() > position) {
            bb.clear();
            // pass the same completion handler
            afc.read(bb, position, bb, this);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

- Arnav Rao

- GregHNZ · Accepted Answer

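Here's a hack that works.

A few things you'll want to note:

1. I've just used your buffer.array() for the output. I had to call buffer.clear() to reset the position so that the asynchronous read sees 1000 spare bytes, but this doesn't clear the existing data out of the array. As a result, when you reach the end of the file and a read returns fewer than 1000 bytes, it prints the whole buffer: whatever you just read, plus whatever was left in the rest of the 1000 bytes from the previous read. In real life you'd want to do something about that (perhaps with result or with the position of the buffer).
2. For reasons I couldn't figure out, buffer, which is a class variable, is fine within the completed method, but channel, which is also a class variable, is null. I haven't yet figured out why that would be. So I changed it to pass channel as the attachment rather than buffer. It still makes no sense to me.
3. The asynchronous read thread isn't important enough to keep the JVM running, so I've simply put a read at the end of the main method. Press Enter to exit.
4. The class variable pos maintains the position in the file that you're reading from.
5. The magic happens when you initiate another asynchronous read from within the completed method. This is why I discarded the anonymous class and implemented the interface itself.
6. You'll want to switch the path back to your own.

Have fun.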

import java.nio.*;
import java.nio.channels.*;
import java.nio.file.*;
import java.io.IOException;

public class TryNio implements CompletionHandler<Integer, AsynchronousFileChannel> {

        // need to keep track of the next position; long, because a large
        // file can exceed the range of int.
        long pos = 0;
        AsynchronousFileChannel channel =  null;
        ByteBuffer buffer = null;

        public void completed(Integer result, AsynchronousFileChannel attachment) {
                // a result of -1 means end of file: nothing was read, so stop.
                if (result != -1) {
                        pos += result;  // don't read the same text again.
                                        // your output command.
                        System.out.println(new String(buffer.array()));

                        buffer.clear();  // reset the buffer so you can read more.
                        // initiate another asynchronous read, with this.
                        attachment.read(buffer, pos , attachment, this );
                }
        }
        public void failed(Throwable exc,
                        AsynchronousFileChannel attachment) {
                System.err.println ("Error!");
                exc.printStackTrace();
        }

        public void doit() {
                Path file = Paths.get("/var/log/syslog");
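                // note: this local variable shadows the field of the same name, so the
                // field stays null; the handler receives the channel as the attachment.
                AsynchronousFileChannel channel =  null;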
                try {
                        channel = AsynchronousFileChannel.open(file);
                } catch (IOException e) {
                        System.err.println ("Could not open file: " + file.toString());
                        System.exit(1); // yeah.  heh.
                }
                buffer = ByteBuffer.allocate(1000);

                 // start off the asynch read. 
                channel.read(buffer, pos , channel, this );
                // this method now exits, thread returns to main and waits for user input.
        }

        public static void main (String [] args) {
                TryNio tn = new TryNio();
                tn.doit();
             // wait for the user to press a key, otherwise java exits because the
             // asynch thread isn't important enough to keep it running.
                try { System.in.read(); } catch (IOException e) { }
        }
}
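
Two of the notes above (stale bytes printed at the end of the file, and the JVM exiting before the reads finish) could be addressed along the following lines. This is a sketch under stated assumptions, not the answer's own code: the class name LatchedReader, its sink callback and the readFully method are hypothetical, and ISO-8859-1 is an assumed charset chosen because it maps each byte to exactly one character.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.CountDownLatch;
import java.util.function.Consumer;

// Hypothetical variant of the handler above; the names are illustrative only.
class LatchedReader implements CompletionHandler<Integer, AsynchronousFileChannel> {
    private final ByteBuffer buffer = ByteBuffer.allocate(1000);
    private final CountDownLatch finished = new CountDownLatch(1);
    private final Consumer<String> sink;
    private long pos = 0;

    LatchedReader(Consumer<String> sink) {
        this.sink = sink;
    }

    @Override
    public void completed(Integer result, AsynchronousFileChannel channel) {
        if (result == -1) {          // end of file: release the waiting thread
            finished.countDown();
            return;
        }
        // Decode only the bytes just read, so stale data left in the array
        // by an earlier read is never emitted. A multi-byte charset such as
        // UTF-8 would need a CharsetDecoder to handle characters split
        // across two chunks.
        sink.accept(new String(buffer.array(), 0, result, StandardCharsets.ISO_8859_1));
        pos += result;
        buffer.clear();
        channel.read(buffer, pos, channel, this);
    }

    @Override
    public void failed(Throwable exc, AsynchronousFileChannel channel) {
        exc.printStackTrace();
        finished.countDown();        // do not leave the caller blocked forever
    }

    // Blocks the calling thread until the whole file has been passed to the sink.
    void readFully(Path file) throws IOException, InterruptedException {
        try (AsynchronousFileChannel channel = AsynchronousFileChannel.open(file)) {
            channel.read(buffer, pos, channel, this);
            finished.await();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: LatchedReader <file>");
            return;
        }
        new LatchedReader(s -> System.out.print(s)).readFully(Paths.get(args[0]));
    }
}
```

Waiting on the latch replaces the System.in.read() call, so the program ends on its own once the last chunk has been handled, and the channel is closed by the try-with-resources block.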