Why is reading a file byte by byte faster with BufferedInputStream than with FileInputStream?

Question

75

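I tried to read a file of roughly 800 KB into an array using FileInputStream, and it took about 3 seconds to read it into memory. I then ran the same code with the FileInputStream wrapped in a BufferedInputStream, and it took only about 76 milliseconds. Why is reading a file byte by byte so much faster with a BufferedInputStream, even though I am still reading it one byte at a time? Here is the code (the rest of the code is irrelevant). Note that this is the "fast" version. To get the "slow" version, just remove the BufferedInputStream: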

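InputStream is = null;

try {
    // Remove the BufferedInputStream wrapper here to get the "slow" version.
    is = new BufferedInputStream(new FileInputStream(file));

    // One int slot per byte of the file.
    int[] fileArr = new int[(int) file.length()];

    // read() returns the next byte as 0-255, or -1 at end of stream.
    for (int i = 0, temp = 0; (temp = is.read()) != -1; i++) {
        fileArr[i] = temp;
    }
} finally {
    // Release the file handle even if reading fails.
    if (is != null) {
        is.close();
    }
}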

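The BufferedInputStream is more than 30 times faster than the raw input stream (about 3 s versus 76 ms). Why is that, and is it possible to make this code more efficient (without using any external libraries)?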

- ZimZim

3 Answers

3

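A BufferedInputStream wrapped around a FileInputStream requests data from the FileInputStream in large chunks (around 512 bytes by default). So if you read 1000 characters one at a time, the FileInputStream only has to go to the disk twice. That is much faster!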

- usha

3

This may vary by platform, but on current Android the default is 8192. - pevik

Likewise, it is 8K on most platforms. - Hovercraft Full Of Eels
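Since the default buffer size is in dispute above, one option is to not rely on it and pass the size explicitly through the two-argument constructor BufferedInputStream(InputStream, int). The following is a minimal sketch under that assumption; the class name ExplicitBufferSize, the helper countBytes, and the temporary file are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original thread.
class ExplicitBufferSize {

    // Counts the bytes in a file, reading one byte at a time
    // through a buffer of the requested size.
    static long countBytes(Path file, int bufferSize) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file), bufferSize)) {
            long count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        // Illustrative input: a temporary file filled with zero bytes.
        Path tmp = Files.createTempFile("buffer-demo", ".bin");
        try {
            Files.write(tmp, new byte[100_000]);
            System.out.println("512-byte buffer:  " + countBytes(tmp, 512) + " bytes");
            System.out.println("8192-byte buffer: " + countBytes(tmp, 8192) + " bytes");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both calls return the same byte count; only the number of trips to the underlying stream differs.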

1

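Buffered input streams exist because of the cost of disk access. Suppose you have an 8 KB file: without a BufferedInputStream, reading it one byte at a time takes 8 * 1024 separate accesses to the file.

This is where BufferedInputStream comes in: it acts as a middleman between your code and the FileInputStream.

It reads a block of bytes (8 KB by default) into memory in one go, and each subsequent read() is then served from that in-memory block instead of going back to the file. This reduces the time the operation takes.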
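To make the middleman idea concrete, here is a rough hand-written sketch of the mechanism. It is not the JDK's BufferedInputStream source, just a simplified illustration; the class TinyBufferedInput and its refills counter are hypothetical, and a ByteArrayInputStream stands in for the file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical teaching class; a simplification, not the JDK implementation.
class TinyBufferedInput {
    private final InputStream source;
    private final byte[] buffer;
    private int position = 0; // index of the next byte to hand out
    private int limit = 0;    // number of valid bytes currently in the buffer
    int refills = 0;          // how many times we went back to the source

    TinyBufferedInput(InputStream source, int bufferSize) {
        this.source = source;
        this.buffer = new byte[bufferSize];
    }

    // Returns the next byte as 0-255, or -1 at end of stream.
    int read() throws IOException {
        if (position >= limit) {
            // Buffer exhausted: one bulk read replaces many single-byte reads.
            limit = source.read(buffer, 0, buffer.length);
            position = 0;
            refills++;
            if (limit <= 0) {
                limit = 0;
                return -1;
            }
        }
        return buffer[position++] & 0xFF;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20_000];
        TinyBufferedInput in = new TinyBufferedInput(new ByteArrayInputStream(data), 8192);
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        System.out.println(count + " bytes read with " + in.refills + " refills");
    }
}
```

Most read() calls only touch the in-memory array; the source is consulted only when the buffer runs dry, plus one final time to discover the end of the stream.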

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Reads anyFile.txt one byte at a time through a BufferedInputStream
// and prints the elapsed time in milliseconds.
private void exercise1WithBufferedStream() {
    long start = System.currentTimeMillis();
    // Closing the BufferedInputStream also closes the wrapped FileInputStream.
    try (BufferedInputStream bufferedInputStream =
             new BufferedInputStream(new FileInputStream("anyFile.txt"))) {
        // read() returns -1 at end of stream; most calls are served from the buffer.
        while (bufferedInputStream.read() != -1) {
            // The byte is discarded; only the elapsed time matters here.
        }
    } catch (IOException e) {
        System.out.println("Could not read the stream...");
        e.printStackTrace();
    }
    System.out.println("time passed with buffered:" + (System.currentTimeMillis() - start));
}

// Same loop without buffering: every read() goes to the FileInputStream itself.
private void exercise1() {
    long start = System.currentTimeMillis();
    try (FileInputStream myFile = new FileInputStream("anyFile.txt")) {
        while (myFile.read() != -1) {
            // The byte is discarded; only the elapsed time matters here.
        }
    } catch (IOException e) {
        System.out.println("Could not read the stream...");
        e.printStackTrace();
    }
    System.out.println("time passed without buffered:" + (System.currentTimeMillis() - start));
}

- huseyin

This is a nice example. However, measuring execution time this way is not a valid benchmark. Use something like JMH for a proper measurement. - Kirill Ch


- Sotirios Delimanolis · Accepted Answer

In FileInputStream, the read() method reads a single byte. From the source code:


/**
 * Reads a byte of data from this input stream. This method blocks
 * if no input is yet available.
 *
 * @return     the next byte of data, or <code>-1</code> if the end of the
 *             file is reached.
 * @exception  IOException  if an I/O error occurs.
 */
public native int read() throws IOException;

This is a native call to the OS, which goes to the disk to read that single byte. It is a heavy operation.

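With a BufferedInputStream, the method delegates to an overloaded read() method that reads up to 8192 bytes and buffers them until they are needed. It still returns only a single byte (but keeps the others in reserve). This way the BufferedInputStream makes fewer native calls to the OS to read from the file.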

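For example, suppose your file is 32768 bytes long. To get all of its bytes into memory with a FileInputStream, you need 32768 native calls to the OS. With a BufferedInputStream you need only 4 (32768 / 8192), regardless of how many read() calls you make (still 32768).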
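The arithmetic above can be checked with a small experiment. This is only a sketch: it counts read requests that reach the wrapped stream, not real OS calls, and it uses an in-memory ByteArrayInputStream as a stand-in for the file. The names UnderlyingReadCounter and CountingStream are made up for this illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of the original answer.
class UnderlyingReadCounter {

    // Counts every read request that reaches the wrapped stream.
    static class CountingStream extends FilterInputStream {
        int calls = 0;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Issues exactly n single-byte reads against the given stream.
    static void readSingleBytes(InputStream in, int n) throws IOException {
        for (int i = 0; i < n; i++) {
            if (in.read() == -1) {
                throw new IOException("unexpected end of stream");
            }
        }
    }

    // Requests reaching the source when it is read directly.
    static int callsWithoutBuffer(int size) throws IOException {
        CountingStream source = new CountingStream(new ByteArrayInputStream(new byte[size]));
        readSingleBytes(source, size);
        return source.calls;
    }

    // Requests reaching the source when a BufferedInputStream sits in front of it.
    static int callsWithBuffer(int size, int bufferSize) throws IOException {
        CountingStream source = new CountingStream(new ByteArrayInputStream(new byte[size]));
        readSingleBytes(new BufferedInputStream(source, bufferSize), size);
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered: " + callsWithoutBuffer(32768));
        System.out.println("buffered:   " + callsWithBuffer(32768, 8192));
    }
}
```

Note that a loop that reads until read() returns -1 triggers one extra request in both cases, because the end of the stream is only discovered by asking the source again.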

As for how to make it faster, you might want to consider Java 7's NIO FileChannel class, but I have no evidence to support this.
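For reference, reading a whole file through a FileChannel could look roughly like the sketch below. It makes no claim about being faster than the stream-based approaches; the class name ChannelRead and the method readAll are hypothetical, and the code assumes the file is small enough to fit in a single array.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the original answer.
class ChannelRead {

    // Reads the whole file into a byte array through a FileChannel.
    // Assumes the file size fits in an int and does not change while reading.
    static byte[] readAll(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate((int) channel.size());
            // A single read may return fewer bytes than requested, so loop.
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // Keep reading until the buffer is full or the file ends.
            }
            return buffer.array();
        }
    }

    public static void main(String[] args) throws IOException {
        // Illustrative input: a small temporary file.
        Path tmp = Files.createTempFile("channel-demo", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4, 5});
            System.out.println(readAll(tmp).length + " bytes read");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```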

Note: if you call FileInputStream's read(byte[], int, int) method directly with a byte[] longer than 8192 elements, you do not need a BufferedInputStream wrapping it.
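A minimal sketch of that bulk-read approach follows, using only the standard library. The class name BulkRead and the method readFully are invented for this example, and it assumes the file fits in a single byte array and is not modified while being read.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical demo class, not part of the original answer.
class BulkRead {

    // Reads the whole file with read(byte[], int, int) instead of one byte per call.
    static byte[] readFully(File file) throws IOException {
        long length = file.length();
        if (length > Integer.MAX_VALUE - 8) {
            throw new IOException("file too large for a single array");
        }
        byte[] data = new byte[(int) length];
        try (FileInputStream in = new FileInputStream(file)) {
            int offset = 0;
            // A single call may return fewer bytes than requested, so loop.
            while (offset < data.length) {
                int n = in.read(data, offset, data.length - offset);
                if (n == -1) {
                    throw new EOFException("file ended earlier than expected");
                }
                offset += n;
            }
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative input: a temporary file of about 800 KB, as in the question.
        File tmp = File.createTempFile("bulk-demo", ".bin");
        try {
            Files.write(tmp.toPath(), new byte[800 * 1024]);
            System.out.println(readFully(tmp).length + " bytes read");
        } finally {
            tmp.delete();
        }
    }
}
```

If a byte[] is acceptable instead of the int[] used in the question, Files.readAllBytes(Path) from java.nio.file does the same job in one call.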