缓存文件(用于更快的磁盘访问)

46
我正在处理大文件,直接写入磁盘速度很慢。由于文件太大,我无法将其加载到TMemoryStream中。
TFileStream没有缓冲,因此我想知道是否有自定义库可以提供带缓冲的流,或者我应该仅依赖操作系统提供的缓冲。操作系统的缓存可靠吗?我的意思是,如果缓存已满,则可能会从缓存中刷新旧文件(我的文件),以便为新文件腾出空间。
我的文件在GB级别。它包含数百万条记录。不幸的是,记录大小不固定。因此,我必须进行数百万次读取(介于4和500个字节之间)。读取(和写入)是连续的。我不会上下跳转文件(我认为这对缓冲来说是理想的)。
最后,我必须将这样的文件写回磁盘(再次进行数百万次小写操作)。
David提供了他的个人库,提供缓冲磁盘访问。
   Speed tests:
     Input file: 317MB.SFF
     Delphi stream: 9.84sec
     David's stream: 2.05sec
     ______________________________________

   More tests:
     Input file: input2_700MB.txt
     Lines: 19 millions
     Compiler optimization: ON
     I/O check: On
     FastMM: release mode
     **HDD**   

     Reading: **linear** (ReadLine) (PS: multiply time with 10)      
      We see clear performance drop at 8KB. Recommended 16 or 32KB
        Time: 618 ms  Cache size: 64KB.
        Time: 622 ms  Cache size: 128KB.
        Time: 622 ms  Cache size: 24KB.
        Time: 622 ms  Cache size: 32KB.
        Time: 622 ms  Cache size: 64KB.
        Time: 624 ms  Cache size: 256KB.
        Time: 625 ms  Cache size: 18KB.
        Time: 626 ms  Cache size: 26KB.
        Time: 626 ms  Cache size: 1024KB.
        Time: 626 ms  Cache size: 16KB.
        Time: 628 ms  Cache size: 42KB.
        Time: 644 ms  Cache size: 8KB.      <--- no difference until 8K
        Time: 664 ms  Cache size: 4KB.
        Time: 705 ms  Cache size: 2KB.
        Time: 791 ms  Cache size: 1KB.
        Time: 795 ms  Cache size: 1KB.

      **SSD**
      We see a small improvement as we go towards higher buffers. Recommended 16 or 32KB
        Time: 610 ms  Cache size: 128KB.
        Time: 611 ms  Cache size: 256KB.
        Time: 614 ms  Cache size: 32KB.
        Time: 623 ms  Cache size: 16KB.
        Time: 625 ms  Cache size: 66KB.
        Time: 639 ms  Cache size: 8KB.       <--- definitively not good with 8K
        Time: 660 ms  Cache size: 4KB.
     ______

     Reading: **Random** (ReadInteger) (100000 reads)
     SSD
       Time: 064 ms. Cache size: 1KB.   Count: 100000.  RAM: 13.27 MB         <-- probably the best buffer size for ReadInteger is 4bytes!
       Time: 067 ms. Cache size: 2KB.   Count: 100000.  RAM: 13.27 MB
       Time: 080 ms. Cache size: 4KB.   Count: 100000.  RAM: 13.27 MB
       Time: 098 ms. Cache size: 8KB.   Count: 100000.  RAM: 13.27 MB
       Time: 140 ms. Cache size: 16KB.  Count: 100000.  RAM: 13.27 MB
       Time: 213 ms. Cache size: 32KB.  Count: 100000.  RAM: 13.27 MB
       Time: 360 ms. Cache size: 64KB.  Count: 100000.  RAM: 13.27 MB
       Conclusion: don't use it for "random" reading   

2020年更新:
当顺序读取时,新的System.Classes.TBufferedFileStream似乎比上面介绍的库快70%。


6
内存映射文件? - Andreas Rejbrand
如果该文件仅由您的应用程序使用,则可以考虑将记录存储在数据库中。 - Najem
我不明白为什么任何缓冲流实现在性能上会有那么大的差异。它应该受到原始IO速度的限制。我怀疑你的基准测试是错误的。 - David Heffernan
嗨,David。我会再次测试并将代码放在线上。 - Gabriel
@server 很有趣。我猜在一个紧密循环中进行非常小的读取时,可能会出现一些低效率的问题。 - David Heffernan
显示剩余2条评论
5个回答

94

Windows文件缓存非常有效,尤其是在使用Vista或更高版本时。 TFileStream 是围绕Windows的ReadFile()WriteFile() API函数的一个松散包装器,对于许多用例来说,唯一更快的是内存映射文件。

然而,在一个常见的情况下,TFileStream 会成为性能瓶颈。那就是如果你每次调用流读写函数时只读写少量数据。例如,如果你逐个读取整数数组中的一项,则通过对ReadFile()的每次读取4个字节,会产生显着的开销。

同样,内存映射文件是解决这种瓶颈的绝佳方法,但另一个常用的方法是读取一个更大的缓冲区,比如几千字节,然后从这个内存缓存中解析流的未来读取,而不是进一步调用ReadFile()。这种方法只适用于顺序访问。


根据你更新的问题所描述的使用模式,我认为以下类可以提高你的性能:

unit BufferedFileStream;

interface

uses
  SysUtils, Math, Classes, Windows;

type
  TBaseCachedFileStream = class(TStream)
  private
    function QueryInterface(const IID: TGUID; out Obj): HResult; stdcall;
    function _AddRef: Integer; stdcall;
    function _Release: Integer; stdcall;
  protected
    FHandle: THandle;
    FOwnsHandle: Boolean;
    FCache: PByte;
    FCacheSize: Integer;
    FPosition: Int64;//the current position in the file (relative to the beginning of the file)
    FCacheStart: Int64;//the postion in the file of the start of the cache (relative to the beginning of the file)
    FCacheEnd: Int64;//the postion in the file of the end of the cache (relative to the beginning of the file)
    FFileName: string;
    FLastError: DWORD;
    procedure HandleError(const Msg: string);
    procedure RaiseSystemError(const Msg: string; LastError: DWORD); overload;
    procedure RaiseSystemError(const Msg: string); overload;
    procedure RaiseSystemErrorFmt(const Msg: string; const Args: array of const);
    function CreateHandle(FlagsAndAttributes: DWORD): THandle; virtual; abstract;
    function GetFileSize: Int64; virtual;
    procedure SetSize(NewSize: Longint); override;
    procedure SetSize(const NewSize: Int64); override;
    function FileRead(var Buffer; Count: Longword): Integer;
    function FileWrite(const Buffer; Count: Longword): Integer;
    function FileSeek(const Offset: Int64; Origin: TSeekOrigin): Int64;
  public
    constructor Create(const FileName: string); overload;
    constructor Create(const FileName: string; CacheSize: Integer); overload;
    constructor Create(const FileName: string; CacheSize: Integer; Handle: THandle); overload; virtual;
    destructor Destroy; override;
    property CacheSize: Integer read FCacheSize;
    function Read(var Buffer; Count: Longint): Longint; override;
    function Write(const Buffer; Count: Longint): Longint; override;
    function Seek(const Offset: Int64; Origin: TSeekOrigin): Int64; override;
  end;
  TBaseCachedFileStreamClass = class of TBaseCachedFileStream;

  IDisableStreamReadCache = interface
    ['{0B6D0004-88D1-42D5-BC0F-447911C0FC21}']
    procedure DisableStreamReadCache;
    procedure EnableStreamReadCache;
  end;

  TReadOnlyCachedFileStream = class(TBaseCachedFileStream, IDisableStreamReadCache)
  (* This class works by filling the cache each time a call to Read is made and
     FPosition is outside the existing cache.  By filling the cache we mean
     reading from the file into the temporary cache.  Calls to Read when
     FPosition is in the existing cache are then dealt with by filling the
     buffer with bytes from the cache.
  *)
  private
    FUseAlignedCache: Boolean;
    FViewStart: Int64;
    FViewLength: Int64;
    FDisableStreamReadCacheRefCount: Integer;
    procedure DisableStreamReadCache;
    procedure EnableStreamReadCache;
    procedure FlushCache;
  protected
    function CreateHandle(FlagsAndAttributes: DWORD): THandle; override;
    function GetFileSize: Int64; override;
  public
    constructor Create(const FileName: string; CacheSize: Integer; Handle: THandle); overload; override;
    property UseAlignedCache: Boolean read FUseAlignedCache write FUseAlignedCache;
    function Read(var Buffer; Count: Longint): Longint; override;
    procedure SetViewWindow(const ViewStart, ViewLength: Int64);
  end;

  TWriteCachedFileStream = class(TBaseCachedFileStream, IDisableStreamReadCache)
  (* This class works by caching calls to Write.  By this we mean temporarily
     storing the bytes to be written in the cache.  As each call to Write is
     processed the cache grows.  The cache is written to file when:
       1.  A call to Write is made when the cache is full.
       2.  A call to Write is made and FPosition is outside the cache (this
           must be as a result of a call to Seek).
       3.  The class is destroyed.

     Note that data can be read from these streams but the reading is not
     cached and in fact a read operation will flush the cache before
     attempting to read the data.
  *)
  private
    FFileSize: Int64;
    FReadStream: TReadOnlyCachedFileStream;
    FReadStreamCacheSize: Integer;
    FReadStreamUseAlignedCache: Boolean;
    procedure DisableStreamReadCache;
    procedure EnableStreamReadCache;
    procedure CreateReadStream;
    procedure FlushCache;
  protected
    function CreateHandle(FlagsAndAttributes: DWORD): THandle; override;
    function GetFileSize: Int64; override;
  public
    constructor Create(const FileName: string; CacheSize, ReadStreamCacheSize: Integer; ReadStreamUseAlignedCache: Boolean); overload;
    destructor Destroy; override;
    function Read(var Buffer; Count: Longint): Longint; override;
    function Write(const Buffer; Count: Longint): Longint; override;
  end;

implementation

function GetFileSizeEx(hFile: THandle; var FileSize: Int64): BOOL; stdcall; external kernel32;
function SetFilePointerEx(hFile: THandle; DistanceToMove: Int64; lpNewFilePointer: PInt64; dwMoveMethod: DWORD): BOOL; stdcall; external kernel32;

{ TBaseCachedFileStream }

constructor TBaseCachedFileStream.Create(const FileName: string);
begin
  Create(FileName, 0);
end;

constructor TBaseCachedFileStream.Create(const FileName: string; CacheSize: Integer);
begin
  Create(FileName, CacheSize, 0);
end;

constructor TBaseCachedFileStream.Create(const FileName: string; CacheSize: Integer; Handle: THandle);
const
  DefaultCacheSize = 16*1024;
  //16kb - this was chosen empirically - don't make it too large otherwise the progress report is 'jerky'
begin
  inherited Create;
  FFileName := FileName;
  FOwnsHandle := Handle=0;
  if FOwnsHandle then begin
    FHandle := CreateHandle(FILE_ATTRIBUTE_NORMAL);
  end else begin
    FHandle := Handle;
  end;
  FCacheSize := CacheSize;
  if FCacheSize<=0 then begin
    FCacheSize := DefaultCacheSize;
  end;
  GetMem(FCache, FCacheSize);
end;

destructor TBaseCachedFileStream.Destroy;
begin
  FreeMem(FCache);
  if FOwnsHandle and (FHandle<>0) then begin
    CloseHandle(FHandle);
  end;
  inherited;
end;

function TBaseCachedFileStream.QueryInterface(const IID: TGUID; out Obj): HResult;
begin
  if GetInterface(IID, Obj) then begin
    Result := S_OK;
  end else begin
    Result := E_NOINTERFACE;
  end;
end;

function TBaseCachedFileStream._AddRef: Integer;
begin
  Result := -1;
end;

function TBaseCachedFileStream._Release: Integer;
begin
  Result := -1;
end;

procedure TBaseCachedFileStream.HandleError(const Msg: string);
begin
  if FLastError<>0 then begin
    RaiseSystemError(Msg, FLastError);
  end;
end;

procedure TBaseCachedFileStream.RaiseSystemError(const Msg: string; LastError: DWORD);
begin
  raise EStreamError.Create(Trim(Msg+'  ')+SysErrorMessage(LastError));
end;

procedure TBaseCachedFileStream.RaiseSystemError(const Msg: string);
begin
  RaiseSystemError(Msg, GetLastError);
end;

procedure TBaseCachedFileStream.RaiseSystemErrorFmt(const Msg: string; const Args: array of const);
var
  LastError: DWORD;
begin
  LastError := GetLastError; // must call GetLastError before Format
  RaiseSystemError(Format(Msg, Args), LastError);
end;

function TBaseCachedFileStream.GetFileSize: Int64;
begin
  if not GetFileSizeEx(FHandle, Result) then begin
    RaiseSystemErrorFmt('GetFileSizeEx failed for %s.', [FFileName]);
  end;
end;

procedure TBaseCachedFileStream.SetSize(NewSize: Longint);
begin
  SetSize(Int64(NewSize));
end;

procedure TBaseCachedFileStream.SetSize(const NewSize: Int64);
begin
  Seek(NewSize, soBeginning);
  if not Windows.SetEndOfFile(FHandle) then begin
    RaiseSystemErrorFmt('SetEndOfFile for %s.', [FFileName]);
  end;
end;

function TBaseCachedFileStream.FileRead(var Buffer; Count: Longword): Integer;
begin
  if Windows.ReadFile(FHandle, Buffer, Count, LongWord(Result), nil) then begin
    FLastError := 0;
  end else begin
    FLastError := GetLastError;
    Result := -1;
  end;
end;

function TBaseCachedFileStream.FileWrite(const Buffer; Count: Longword): Integer;
begin
  if Windows.WriteFile(FHandle, Buffer, Count, LongWord(Result), nil) then begin
    FLastError := 0;
  end else begin
    FLastError := GetLastError;
    Result := -1;
  end;
end;

function TBaseCachedFileStream.FileSeek(const Offset: Int64; Origin: TSeekOrigin): Int64;
begin
  if not SetFilePointerEx(FHandle, Offset, @Result, ord(Origin)) then begin
    RaiseSystemErrorFmt('SetFilePointerEx failed for %s.', [FFileName]);
  end;
end;

function TBaseCachedFileStream.Read(var Buffer; Count: Integer): Longint;
begin
  raise EAssertionFailed.Create('Cannot read from this stream');
end;

function TBaseCachedFileStream.Write(const Buffer; Count: Integer): Longint;
begin
  raise EAssertionFailed.Create('Cannot write to this stream');
end;

function TBaseCachedFileStream.Seek(const Offset: Int64; Origin: TSeekOrigin): Int64;
//Set FPosition to the value specified - if this has implications for the
//cache then overriden Write and Read methods must deal with those.
begin
  case Origin of
  soBeginning:
    FPosition := Offset;
  soEnd:
    FPosition := GetFileSize+Offset;
  soCurrent:
    inc(FPosition, Offset);
  end;
  Result := FPosition;
end;

{ TReadOnlyCachedFileStream }

constructor TReadOnlyCachedFileStream.Create(const FileName: string; CacheSize: Integer; Handle: THandle);
begin
  inherited;
  SetViewWindow(0, inherited GetFileSize);
end;

function TReadOnlyCachedFileStream.CreateHandle(FlagsAndAttributes: DWORD): THandle;
begin
  Result := Windows.CreateFile(
    PChar(FFileName),
    GENERIC_READ,
    FILE_SHARE_READ,
    nil,
    OPEN_EXISTING,
    FlagsAndAttributes,
    0
  );
  if Result=INVALID_HANDLE_VALUE then begin
    RaiseSystemErrorFmt('Cannot open %s.', [FFileName]);
  end;
end;

procedure TReadOnlyCachedFileStream.DisableStreamReadCache;
begin
  inc(FDisableStreamReadCacheRefCount);
end;

procedure TReadOnlyCachedFileStream.EnableStreamReadCache;
begin
  dec(FDisableStreamReadCacheRefCount);
end;

procedure TReadOnlyCachedFileStream.FlushCache;
begin
  FCacheStart := 0;
  FCacheEnd := 0;
end;

function TReadOnlyCachedFileStream.GetFileSize: Int64;
begin
  Result := FViewLength;
end;

procedure TReadOnlyCachedFileStream.SetViewWindow(const ViewStart, ViewLength: Int64);
begin
  if ViewStart<0 then begin
    raise EAssertionFailed.Create('Invalid view window');
  end;
  if (ViewStart+ViewLength)>inherited GetFileSize then begin
    raise EAssertionFailed.Create('Invalid view window');
  end;
  FViewStart := ViewStart;
  FViewLength := ViewLength;
  FPosition := 0;
  FCacheStart := 0;
  FCacheEnd := 0;
end;

function TReadOnlyCachedFileStream.Read(var Buffer; Count: Longint): Longint;
var
  NumOfBytesToCopy, NumOfBytesLeft, NumOfBytesRead: Longint;
  CachePtr, BufferPtr: PByte;
begin
  if FDisableStreamReadCacheRefCount>0 then begin
    FileSeek(FPosition+FViewStart, soBeginning);
    Result := FileRead(Buffer, Count);
    if Result=-1 then begin
      Result := 0;//contract is to return number of bytes that were read
    end;
    inc(FPosition, Result);
  end else begin
    Result := 0;
    NumOfBytesLeft := Count;
    BufferPtr := @Buffer;
    while NumOfBytesLeft>0 do begin
      if (FPosition<FCacheStart) or (FPosition>=FCacheEnd) then begin
        //the current position is not available in the cache so we need to re-fill the cache
        FCacheStart := FPosition;
        if UseAlignedCache then begin
          FCacheStart := FCacheStart - (FCacheStart mod CacheSize);
        end;
        FileSeek(FCacheStart+FViewStart, soBeginning);
        NumOfBytesRead := FileRead(FCache^, CacheSize);
        if NumOfBytesRead=-1 then begin
          exit;
        end;
        Assert(NumOfBytesRead>=0);
        FCacheEnd := FCacheStart+NumOfBytesRead;
        if NumOfBytesRead=0 then begin
          FLastError := ERROR_HANDLE_EOF;//must be at the end of the file
          break;
        end;
      end;

      //read from cache to Buffer
      NumOfBytesToCopy := Min(FCacheEnd-FPosition, NumOfBytesLeft);
      CachePtr := FCache;
      inc(CachePtr, FPosition-FCacheStart);
      Move(CachePtr^, BufferPtr^, NumOfBytesToCopy);
      inc(Result, NumOfBytesToCopy);
      inc(FPosition, NumOfBytesToCopy);
      inc(BufferPtr, NumOfBytesToCopy);
      dec(NumOfBytesLeft, NumOfBytesToCopy);
    end;
  end;
end;

{ TWriteCachedFileStream }

constructor TWriteCachedFileStream.Create(const FileName: string; CacheSize, ReadStreamCacheSize: Integer; ReadStreamUseAlignedCache: Boolean);
begin
  inherited Create(FileName, CacheSize);
  FReadStreamCacheSize := ReadStreamCacheSize;
  FReadStreamUseAlignedCache := ReadStreamUseAlignedCache;
end;

destructor TWriteCachedFileStream.Destroy;
begin
  FlushCache;//make sure that the final calls to Write get recorded in the file
  FreeAndNil(FReadStream);
  inherited;
end;

function TWriteCachedFileStream.CreateHandle(FlagsAndAttributes: DWORD): THandle;
begin
  Result := Windows.CreateFile(
    PChar(FFileName),
    GENERIC_READ or GENERIC_WRITE,
    0,
    nil,
    CREATE_ALWAYS,
    FlagsAndAttributes,
    0
  );
  if Result=INVALID_HANDLE_VALUE then begin
    RaiseSystemErrorFmt('Cannot create %s.', [FFileName]);
  end;
end;

procedure TWriteCachedFileStream.DisableStreamReadCache;
begin
  CreateReadStream;
  FReadStream.DisableStreamReadCache;
end;

procedure TWriteCachedFileStream.EnableStreamReadCache;
begin
  Assert(Assigned(FReadStream));
  FReadStream.EnableStreamReadCache;
end;

function TWriteCachedFileStream.GetFileSize: Int64;
begin
  Result := FFileSize;
end;

procedure TWriteCachedFileStream.CreateReadStream;
begin
  if not Assigned(FReadStream) then begin
    FReadStream := TReadOnlyCachedFileStream.Create(FFileName, FReadStreamCacheSize, FHandle);
    FReadStream.UseAlignedCache := FReadStreamUseAlignedCache;
  end;
end;

procedure TWriteCachedFileStream.FlushCache;
var
  NumOfBytesToWrite: Longint;
begin
  if Assigned(FCache) then begin
    NumOfBytesToWrite := FCacheEnd-FCacheStart;
    if NumOfBytesToWrite>0 then begin
      FileSeek(FCacheStart, soBeginning);
      if FileWrite(FCache^, NumOfBytesToWrite)<>NumOfBytesToWrite then begin
        RaiseSystemErrorFmt('FileWrite failed for %s.', [FFileName]);
      end;
      if Assigned(FReadStream) then begin
        FReadStream.FlushCache;
      end;
    end;
    FCacheStart := FPosition;
    FCacheEnd := FPosition;
  end;
end;

function TWriteCachedFileStream.Read(var Buffer; Count: Integer): Longint;
begin
  FlushCache;
  CreateReadStream;
  Assert(FReadStream.FViewStart=0);
  if FReadStream.FViewLength<>FFileSize then begin
    FReadStream.SetViewWindow(0, FFileSize);
  end;
  FReadStream.Position := FPosition;
  Result := FReadStream.Read(Buffer, Count);
  inc(FPosition, Result);
end;

function TWriteCachedFileStream.Write(const Buffer; Count: Longint): Longint;
var
  NumOfBytesToCopy, NumOfBytesLeft: Longint;
  CachePtr, BufferPtr: PByte;
begin
  Result := 0;
  NumOfBytesLeft := Count;
  BufferPtr := @Buffer;
  while NumOfBytesLeft>0 do begin
    if ((FPosition<FCacheStart) or (FPosition>FCacheEnd))//the current position is outside the cache
    or (FPosition-FCacheStart=FCacheSize)//the cache is full
    then begin
      FlushCache;
      Assert(FCacheStart=FPosition);
    end;

    //write from Buffer to the cache
    NumOfBytesToCopy := Min(FCacheSize-(FPosition-FCacheStart), NumOfBytesLeft);
    CachePtr := FCache;
    inc(CachePtr, FPosition-FCacheStart);
    Move(BufferPtr^, CachePtr^, NumOfBytesToCopy);
    inc(Result, NumOfBytesToCopy);
    inc(FPosition, NumOfBytesToCopy);
    FCacheEnd := Max(FCacheEnd, FPosition);
    inc(BufferPtr, NumOfBytesToCopy);
    dec(NumOfBytesLeft, NumOfBytesToCopy);
  end;
  FFileSize := Max(FFileSize, FPosition);
end;

end.

14
主耶稣。 Delphi课程:11.2秒。 您的课程:1.6秒。 - Gabriel
2
@AndriyM 这里没有任何火箭科学。文件缓冲优化早在我出生之前就存在了,而且看起来几乎就像这样。实际上,我怀疑这段代码是你能得到的最简单的代码。 - David Heffernan
5
谢谢您提醒我修改这段代码,以使其作为独立单元编译。 - David Heffernan
2
它可以工作,而且并不是很微妙。从3秒降到了0.6秒,因子为5,非常感谢 :-) - Marco van de Voort
2
我将这个加入了我的代码中,我很高兴我这样做了。有没有机会把它放到Github上,这样它就能得到妥善的处理了? - Leonardo Herrera
显示剩余30条评论

14

TFileStream 类在内部使用 CreateFile 函数来管理文件,该函数始终使用缓冲区来操作文件,除非您指定了 FILE_FLAG_NO_BUFFERING 标志(请注意,您无法直接使用 TFileStream 指定此标志)。有关更多信息,请查看以下链接:

另外,您可以尝试使用 TGpHugeFileStream,它是 Primoz Gabrijelcic 的 GpHugeFile 单元的一部分。


3
这是RRUZ的意思,但在许多情况下,它并不完全正确。如果你每次只读取少量数据,调用ReadFile会有一些开销,并且这种开销会变得很重要。 - David Heffernan
1
@Altar,不,我并不是在说那个。我是说TFileStream使用缓冲区来保存数据,在大多数情况下都可以正常工作。如果你想提高性能,你可以从头开始编写一个对象(类)来访问和写入文件,使用更大的缓冲区或者使用像TGpHugeFileStream这样的类。 - RRUZ
2
TFileStream不使用缓冲区。它只是ReadFile/WriteFile的轻量级包装器。Windows有文件缓存,这些API例程受益于它们。 - David Heffernan
5
@RRUZ 我认为在这个讨论中那不算是缓冲! - David Heffernan
1
@Altar,在一条已删除的评论中,您提到需要读写访问权限。这是正确的吗? - David Heffernan
显示剩余4条评论

11

对大家有兴趣的是:Embarcadero在最新版的Delphi 10.1 Berlin中新增了TBufferedFileStream请参阅文档)。

不幸的是,我不能说它与此处给出的解决方案相比如何,因为我还没有购买更新。我也知道这个问题是在Delphi 7上提出的,但我相信引用Delphi自己的实现将来可能会很有用。


7
如果你有这样的代码 很多
while Stream.Position < Stream.Size do

您可以通过将FileStream.Size缓存到变量中来进行优化,这样可以加快速度。Stream.Size使用三个虚函数调用来查找实际大小。


9
调用Windows文件API的速度较慢,并非使用虚拟方法的缘故。实现缓存是解决方案。 - Arnaud Bouchez
2
@ArnaudBouchez,Size不是简单的虚拟方法。它调用kernel32.FileSeek三次!添加Position(也调用FileSeek)-您将在一个代码行中获得4个内核调用。这比Read/Write内部的内核调用多四倍。 - Alex
2
@Alex 确实如此。这正是我的观点 - 请再次阅读我的评论。这就是为什么我建议缓存大小,甚至在编写内容时计算循环内的当前位置。尽可能避免API调用总是一个好主意! - Arnaud Bouchez

0

所以TFileStream速度太慢了,它会从磁盘读取所有内容。 而TMemoryStream可能不够大。(如果你这么说的话)

那么为什么不使用TFileStream将一个块加载到TMemoryStream中进行处理呢?这可以通过一个简单的预解析器来完成,该解析器只查看数据中的大小标头,但这会重新引入您的问题。

让代码意识到其大文件可能会出现问题并避免这种情况并不是坏事:允许它处理(不完整的)TMemoryStream块,这也带来了线程增强功能(如果需要,硬盘访问不是瓶颈)。


2
顺便提一下,Delphi的Tmemorystream也不是最优的,因为它总是以8k递增。FPC的Tmemorystream则呈指数增长,直到增量达到一定大小。通过使用经过改进的FPC Tmemorystream覆盖Delphi的tmemorystream.realloc,可以加快大型写入的速度,因为减少了重新分配的次数。 - Marco van de Voort
1
David的BufferedFileStream已经实现了缓冲磁盘访问。 - Gabriel

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接