我是一名有用的助手,可以为您翻译以下内容:
然而,我没有看到任何数据被写入DynamoDB,也没有收到任何错误消息。
我有一个应用程序,其中: 1. 我使用SqlContext.read.json从S3读取JSON文件到Dataframe中 2. 然后对DataFrame进行一些转换 3. 最后,我想使用一个记录值作为键和其余JSON参数作为值/列将记录持久化到DynamoDB中。
我正在尝试这样做:
JobConf jobConf = new JobConf(sc.hadoopConfiguration());
jobConf.set("dynamodb.servicename", "dynamodb");
jobConf.set("dynamodb.input.tableName", "my-dynamo-table"); // Pointing to DynamoDB table
jobConf.set("dynamodb.endpoint", "dynamodb.us-east-1.amazonaws.com");
jobConf.set("dynamodb.regionid", "us-east-1");
jobConf.set("dynamodb.throughput.read", "1");
jobConf.set("dynamodb.throughput.read.percent", "1");
jobConf.set("dynamodb.version", "2011-12-05");
jobConf.set("mapred.output.format.class", "org.apache.hadoop.dynamodb.write.DynamoDBOutputFormat");
jobConf.set("mapred.input.format.class", "org.apache.hadoop.dynamodb.read.DynamoDBInputFormat");
DataFrame df = sqlContext.read().json("s3n://mybucket/abc.json");
RDD<String> jsonRDD = df.toJSON();
JavaRDD<String> jsonJavaRDD = jsonRDD.toJavaRDD();
PairFunction<String, Text, DynamoDBItemWritable> keyData = new PairFunction<String, Text, DynamoDBItemWritable>() {
public Tuple2<Text, DynamoDBItemWritable> call(String row) {
DynamoDBItemWritable writeable = new DynamoDBItemWritable();
try {
System.out.println("JSON : " + row);
JSONObject jsonObject = new JSONObject(row);
System.out.println("JSON Object: " + jsonObject);
Map<String, AttributeValue> attributes = new HashMap<String, AttributeValue>();
AttributeValue attributeValue = new AttributeValue();
attributeValue.setS(row);
attributes.put("values", attributeValue);
AttributeValue attributeKeyValue = new AttributeValue();
attributeValue.setS(jsonObject.getString("external_id"));
attributes.put("primary_key", attributeKeyValue);
AttributeValue attributeSecValue = new AttributeValue();
attributeValue.setS(jsonObject.getString("123434335"));
attributes.put("creation_date", attributeSecValue);
writeable.setItem(attributes);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return new Tuple2(new Text(row), writeable);
}
};
JavaPairRDD<Text, DynamoDBItemWritable> pairs = jsonJavaRDD
.mapToPair(keyData);
Map<Text, DynamoDBItemWritable> map = pairs.collectAsMap();
System.out.println("Results : " + map);
pairs.saveAsHadoopDataset(jobConf);
然而,我没有看到任何数据被写入DynamoDB,也没有收到任何错误消息。