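I am trying to use lambda expressions in a Spark job, but it throws "java.lang.IllegalArgumentException: Invalid lambda deserialization". The exception is thrown when the code looks like "transform(pRDD->pRDD.map(t->t._2))". Both snippets below trigger it; the second differs only in adding "& Serializable" to the casts.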
JavaPairDStream<String,Integer> aggregate = pairRDD.reduceByKey((x,y)->x+y);
JavaDStream<Integer> con = aggregate.transform(
(Function<JavaPairRDD<String,Integer>, JavaRDD<Integer>>)pRDD-> pRDD.map(
(Function<Tuple2<String,Integer>,Integer>)t->t._2));
JavaPairDStream<String,Integer> aggregate = pairRDD.reduceByKey((x,y)->x+y);
JavaDStream<Integer> con = aggregate.transform(
(Function<JavaPairRDD<String,Integer>, JavaRDD<Integer>> & Serializable)pRDD-> pRDD.map(
(Function<Tuple2<String,Integer>,Integer> & Serializable)t->t._2));
Neither of the two options above worked. However, if I pass the object "f" below as the argument instead of the lambda expression "t->t._2", it works fine.
Function f = new Function<Tuple2<String,Integer>,Integer>(){
@Override
public Integer call(Tuple2<String,Integer> paramT1) throws Exception {
return paramT1._2;
}
};
What is the correct format for the lambda expression? The full program follows.
public static void main(String[] args) {
Function f = new Function<Tuple2<String,Integer>,Integer>(){
@Override
public Integer call(Tuple2<String,Integer> paramT1) throws Exception {
return paramT1._2;
}
};
JavaStreamingContext ssc = JavaStreamingFactory.getInstance();
JavaReceiverInputDStream<String> lines = ssc.socketTextStream("localhost", 9999);
JavaDStream<String> words = lines.flatMap(s->{return Arrays.asList(s.split(" "));});
JavaPairDStream<String,Integer> pairRDD = words.mapToPair(x->new Tuple2<String,Integer>(x,1));
JavaPairDStream<String,Integer> aggregate = pairRDD.reduceByKey((x,y)->x+y);
JavaDStream<Integer> con = aggregate.transform(
(Function<JavaPairRDD<String,Integer>, JavaRDD<Integer>>)pRDD-> pRDD.map(
(Function<Tuple2<String,Integer>,Integer>)t->t._2));
// This variant, which passes the anonymous class "f", works:
// JavaDStream<Integer> con = aggregate.transform(pRDD-> pRDD.map(f));
con.print();
ssc.start();
ssc.awaitTermination();
}
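One way to narrow this down is to check whether a serializable lambda survives a plain Java serialization round trip with no Spark involved. The sketch below is only an illustration: the class `LambdaRoundTrip`, the interface `SerFunction`, and the method `roundTrip` are hypothetical names, not part of Spark or of the code above. As far as I understand, the "Invalid lambda deserialization" message comes from the compiler-generated `$deserializeLambda$` method of the class that declared the lambda, so if a round trip like this succeeds locally, the lambda syntax itself is probably fine and the class that defines the lambda may differ from, or be missing on, the JVM that deserializes it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class LambdaRoundTrip {

    // Hypothetical helper interface: a Function that is also Serializable,
    // standing in for Spark's own serializable Function type.
    public interface SerFunction<T, R> extends Function<T, R>, Serializable {}

    // Serializes obj to a byte array and reads it back, all in the same JVM.
    @SuppressWarnings("unchecked")
    public static <T> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Variant 1: the target type itself is Serializable.
        SerFunction<String, Integer> length = s -> s.length();
        System.out.println(roundTrip(length).apply("spark"));

        // Variant 2: intersection cast, as in the second snippet of the question.
        Function<String, Integer> doubled =
                (Function<String, Integer> & Serializable) s -> s.length() * 2;
        System.out.println(roundTrip(doubled).apply("spark"));
    }
}
```

Both variants deserialize here because the class that declared the lambdas is on the classpath of the JVM that reads them back. This only rules out the lambda form as the cause; it does not reproduce how Spark ships closures to executors.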