Amazon S3 upload via the Java API: InputStream source


I am testing different ways of uploading small objects to S3 with "aws-java-sdk-s3". Since the objects are small, I am using the default API (not the Transfer API meant for large and huge objects...).

  1. Uploading a File as the source: perfect!

         File file = ...
         s3Client.putObject(new PutObjectRequest(bucket, key, file));

  2. Uploading a ByteArrayInputStream: perfect!

         InputStream stream = new ByteArrayInputStream("How are you?".getBytes());
         s3Client.putObject(new PutObjectRequest(bucket, key, stream));

  3. Uploading a resource as a stream: problems!

         InputStream stream = this.getClass().getResourceAsStream("myFile.data");
         s3Client.putObject(new PutObjectRequest(bucket, key, stream));


The exception:

com.amazonaws.ResetException: The request to the service failed with a retryable reason, but resetting the request input stream has failed.
 See exception.getExtraInfo or debug-level logging for the original failure that caused this retry.;  
If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int)

Caused by: java.io.IOException: Resetting to invalid mark
    at java.io.BufferedInputStream.reset(BufferedInputStream.java:448)
    at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112)
    at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112)
    at com.amazonaws.util.LengthCheckInputStream.reset(LengthCheckInputStream.java:126)
    at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:112)

I could convert the classpath resource into a File object with Apache FileUtils, but that feels rather ugly...

  1. Do I have to configure the ReadLimit depending on the type of stream? (See the sketch below.)
  2. What value is recommended?

API version: "aws-java-sdk-s3" rev="1.11.442"
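
The exception message itself points at request.getRequestClientOptions().setReadLimit(int). As a point of reference only, a minimal sketch of wiring that into case 3 might look like the following; bucket, key and contentLength are placeholders, and passing the stream length plus one is just one common choice, not an official recommendation:

    // Sketch only: bucket, key and contentLength are assumed to be defined elsewhere.
    InputStream stream = this.getClass().getResourceAsStream("myFile.data");
    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(contentLength);
    PutObjectRequest request = new PutObjectRequest(bucket, key, stream, metadata);
    // Raise the mark/reset limit so a retry can rewind past the default 128K buffer.
    request.getRequestClientOptions().setReadLimit((int) contentLength + 1);
    s3Client.putObject(request);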

2 Answers

Your

    this.getClass().getResourceAsStream("myFile.data");

returns a BufferedInputStream (as you can see in the exception). When using a BufferedInputStream, you have to set its buffer size to at least 128K (131072), as mentioned in the AWS S3 documentation:

When using a BufferedInputStream as the data source, remember to use a buffer of a size no less than RequestClientOptions.DEFAULT_STREAM_BUFFER_SIZE when initializing the BufferedInputStream. This is to ensure the SDK can correctly mark and reset the stream, with enough memory buffering, during signing and retries.

https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html#putObject-java.lang.String-java.lang.String-java.io.InputStream-com.amazonaws.services.s3.model.ObjectMetadata-
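
Following that documentation, a minimal sketch for case 3 could re-buffer the resource stream; bucket, key and contentLength are assumed to be defined by the caller, and the constant comes from com.amazonaws.RequestClientOptions:

    InputStream resource = this.getClass().getResourceAsStream("myFile.data");
    // Wrap the resource stream in a BufferedInputStream whose buffer meets the documented
    // minimum, so the SDK can mark and reset the stream during signing and retries.
    InputStream stream = new BufferedInputStream(resource, RequestClientOptions.DEFAULT_STREAM_BUFFER_SIZE);
    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(contentLength); // assumed to be known up front
    s3Client.putObject(new PutObjectRequest(bucket, key, stream, metadata));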


I have implemented something very similar (though not identical) to your use case. I needed to write some data to a JSON file (compressed) and store it in S3. The data was available in a hash map, so the contents of the hash map are copied into the JSON file. Feel free to ignore this if it does not help. Also, I never set any kind of limit anywhere.

public void serializeResults(AmazonS3Client s3, Map<String, Object> dm, String environment)
        throws IOException {
    logger.info("start writeZipToS3");
    Gson gson = new GsonBuilder().create();
    try {
        // Build the zip entirely in memory: one entry holding the JSON dump of the map.
        ByteArrayOutputStream byteOut = new ByteArrayOutputStream();
        ZipOutputStream zout = new ZipOutputStream(byteOut);

        ZipEntry ze = new ZipEntry(String.format("results-%s.json", environment));
        zout.putNextEntry(ze);
        String json = gson.toJson(dm);
        zout.write(json.getBytes());
        zout.closeEntry();
        zout.close();

        // Upload from a byte array, declaring the content length up front.
        byte[] bites = byteOut.toByteArray();
        ObjectMetadata om = new ObjectMetadata();
        om.setContentLength(bites.length);
        PutObjectRequest por = new PutObjectRequest("home",
                String.format("zc-service/results-%s.zip", environment),
                new ByteArrayInputStream(bites), om);
        s3.putObject(por);

    } catch (IOException e) {
        e.printStackTrace();
    }
    logger.info("stop writeZipToS3");
}

I hope this helps.

Best regards


Thanks @Sinchan. It works perfectly when the InputStream is a ByteArrayInputStream. The problem is with certain subtypes of InputStream, such as the BufferedInputStream you get for a resource stream. - Azimuts
There are a number of tricks one could use... it seems the problem shows up with mark-supported InputStreams. - Azimuts
I also tried wrapping the mark-based InputStream so that it does not report mark support, but then I get another error (com.amazonaws.SdkClientException: More data read than expected: dataLength=8192; expectedLength=0; includeSkipped=false; in.getClass()=class com.amazonaws.internal.ReleasableInputStream; markedSupported=false; marked=0; resetSinceLastMarked=false; markCount=0; resetCount=0). The problem is that if you expose a generic interface, it is hard to control what type of InputStream gets used... - Azimuts
