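I've noticed that when I use paramiko's sftp for a get or put, I can't reach the same transfer speeds as rsync / sftp / scp / Finder. On our gigabit network, file transfers through a Mac mini server (running macOS 10.12.6) sustain 95-100 MB/sec. With paramiko's sftp.get, the speed tops out at 25 MB/sec. I was on paramiko 1.17 and upgraded to 2.3.1, but the speed is basically the same. Any ideas what might be causing this limit? Thanks! Adam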
I ran into the same problem and implemented some of the suggestions others have made. There are three things you can do:
Increase the buffer size in your transport.
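import paramiko
# Pass the address as a (host, port) tuple; a second positional argument
# would be interpreted as default_window_size, not as the port.
transport = paramiko.Transport((ftp_host, ftp_port))
transport.default_window_size = 4294967294  # 2**32 - 2 (alternative: 2147483647)
transport.packetizer.REKEY_BYTES = pow(2, 40)  # raise the rekey thresholds
transport.packetizer.REKEY_PACKETS = pow(2, 40)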
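The transport settings above could be wrapped in a small helper so they are applied consistently before the connection is used. This is only a sketch: `tune_transport`, `MAX_WINDOW_SIZE`, and the default arguments are hypothetical names, not part of paramiko, and the demo uses a stand-in object so it runs without a server. The 32-bit upper bound reflects the fact that SSH channel window sizes are unsigned 32-bit values.

```python
from types import SimpleNamespace

# SSH channel window sizes are unsigned 32-bit integers.
MAX_WINDOW_SIZE = 2**32 - 1


def tune_transport(transport, window_size=2**32 - 2, rekey_threshold=2**40):
    """Apply the window-size and rekey settings to a transport-like object.

    `transport` is expected to look like a paramiko.Transport, i.e. to
    expose `default_window_size` and a `packetizer` attribute
    (hypothetical helper, not a paramiko API).
    """
    if not 0 < window_size <= MAX_WINDOW_SIZE:
        raise ValueError("window_size must fit in an unsigned 32-bit integer")
    transport.default_window_size = window_size
    transport.packetizer.REKEY_BYTES = rekey_threshold
    transport.packetizer.REKEY_PACKETS = rekey_threshold
    return transport


# Demo with a stand-in object instead of a real paramiko.Transport.
fake_transport = tune_transport(SimpleNamespace(packetizer=SimpleNamespace()))
print(fake_transport.default_window_size)
```

With a real connection, the call would presumably look like `tune_transport(paramiko.Transport((ftp_host, ftp_port)))`, made before the transport is connected.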
Perform a read ahead prior to getting the file.
# ftp_conn is an open paramiko SFTPClient
ftp_file = ftp_conn.file(file_name, "r")
ftp_file_size = ftp_file.stat().st_size
ftp_file.prefetch(ftp_file_size)  # start requesting the whole file in the background
ftp_file.set_pipelined()
ftp_file_data = ftp_file.read(ftp_file_size)
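Reading the whole file with a single `read()` keeps it all in memory. A variation, sketched here under assumptions, is to prefetch once and then copy to a local file in fixed-size blocks. `download_with_prefetch` and `FakeRemoteFile` are hypothetical names; the stand-in class only mimics the `prefetch()` and `read()` methods used in the snippet above so the example runs without a server.

```python
import io


def download_with_prefetch(remote_file, local_file, file_size, block_size=1024 * 1024):
    """Prefetch the remote file, then copy it to local_file block by block.

    remote_file needs prefetch(size) and read(n), like the SFTP file
    object used above; local_file needs write(). Returns bytes copied.
    """
    remote_file.prefetch(file_size)
    copied = 0
    while copied < file_size:
        data = remote_file.read(min(block_size, file_size - copied))
        if not data:  # remote file ended earlier than expected
            break
        local_file.write(data)
        copied += len(data)
    return copied


class FakeRemoteFile(io.BytesIO):
    """In-memory stand-in for a remote file (illustration only)."""

    def prefetch(self, file_size=None):
        # A real SFTP file would start background reads here; the stand-in
        # only records what was requested.
        self.prefetched = file_size


source = FakeRemoteFile(b"x" * 2500)
target = io.BytesIO()
copied_bytes = download_with_prefetch(source, target, 2500, block_size=1000)
print(copied_bytes)
```

In real use, `remote_file` would be the `ftp_file` object from above and `local_file` a file opened with `open(path, "wb")`.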
The other thing you can do when transferring larger files is to implement "chunks": split the file into smaller pieces that are transferred individually. I have only tested this with a transfer to S3.
import math
import time

chunk_size = 6000000  # 6 MB
chunk_count = int(math.ceil(ftp_file_size / float(chunk_size)))
multipart_upload = s3_conn.create_multipart_upload(Bucket=bucket_name, Key=s3_key_val)
parts = []
for i in range(chunk_count):
    part_number = i + 1  # S3 part numbers start at 1
    print("Transferring chunk {} of {}...".format(part_number, chunk_count))
    start_time = time.time()
    # This statement is where the magic was to keep speeds high: prefetch up to
    # the end of the current chunk (capped at the file size).
    ftp_file.prefetch(min(chunk_size * part_number, ftp_file_size))
    chunk = ftp_file.read(chunk_size)
    part = s3_conn.upload_part(
        Bucket=bucket_name,
        Key=s3_key_val,  # must be the same key the multipart upload was created with
        PartNumber=part_number,
        UploadId=multipart_upload["UploadId"],
        Body=chunk
    )
    end_time = time.time()
    total_seconds = end_time - start_time
    print("speed is {} kb/s total seconds taken {}".format(math.ceil((len(chunk) / 1024) / total_seconds), total_seconds))
    # Record the part number and ETag; S3 needs both to complete the upload.
    part_output = {"PartNumber": part_number, "ETag": part["ETag"]}
    parts.append(part_output)
    print("Chunk {} Transferred Successfully!".format(part_number))
part_info = {"Parts": parts}
s3_conn.complete_multipart_upload(
    Bucket=bucket_name,
    Key=s3_key_val,
    UploadId=multipart_upload["UploadId"],
    MultipartUpload=part_info
)
ftp_file.set_pipelined()
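The chunk bookkeeping in the loop above (part numbers starting at 1, a shorter final chunk) is easy to get wrong, so it may help to isolate it. The following is a minimal sketch; `chunk_ranges` is a hypothetical helper, not something from paramiko or boto3.

```python
def chunk_ranges(file_size, chunk_size):
    """Yield (part_number, offset, length) for each chunk of a file.

    Part numbers start at 1; the final chunk may be shorter than chunk_size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = 0
    part_number = 1
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        yield part_number, offset, length
        offset += length
        part_number += 1


# A 13-byte file split into 6-byte chunks: two full chunks and a 1-byte tail.
for part_number, offset, length in chunk_ranges(13, 6):
    print(part_number, offset, length)
```

In the transfer loop, `offset + length` would be the value passed to `prefetch()` and `length` the value passed to `read()`.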