I've been using Apache Spark for a while, but now that I've upgraded to Spark 2.1.1, I get an error I had never seen before when running the following example:
/opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/bin/run-example SparkPi
Here is the actual stack trace:
17/07/05 10:50:54 ERROR SparkContext: Failed to add file:/opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/examples/jars/spark-warehouse/ to Spark environment
java.lang.IllegalArgumentException: Directory /opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/examples/jars/spark-warehouse is not allowed for addJar
at org.apache.spark.SparkContext.liftedTree1$1(SparkContext.scala:1735)
at org.apache.spark.SparkContext.addJar(SparkContext.scala:1729)
at org.apache.spark.SparkContext$$anonfun$11.apply(SparkContext.scala:466)
at org.apache.spark.SparkContext$$anonfun$11.apply(SparkContext.scala:466)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:466)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2320)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Pi is roughly 3.1433757168785843
I'm not sure whether this is a bug or I'm missing something, because the example still completes; you can see the "Pi is roughly..." result at the end.
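Looking at the path in the error message, my guess is that run-example passes everything under examples/jars/ to the driver's jar list, and a stray spark-warehouse/ directory sitting in there (presumably left behind by an earlier run) is what addJar rejects. A minimal, Spark-free sketch of that glob behavior, using a temp directory in place of the real examples/jars:

```shell
# Sketch only: a temp dir stands in for examples/jars. Any directory that the
# jars glob matches would end up being handed to addJar, which rejects it.
jars_dir=$(mktemp -d)
touch "$jars_dir/spark-examples_2.11-2.1.1.jar"   # a normal jar file
mkdir "$jars_dir/spark-warehouse"                  # stray directory from a prior run

# Collect every glob entry that is a directory, as addJar would see them:
bad_entries=""
for f in "$jars_dir"/*; do
  if [ -d "$f" ]; then
    bad_entries="$bad_entries$(basename "$f")"
    echo "directory in jars glob: $(basename "$f")"
  fi
done
rm -rf "$jars_dir"
```

If that guess is right, only the directory entry (not the jar file) shows up as a problem, which would match the single addJar failure in the stack trace above.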
These are the relevant lines from spark-env.sh:
export SPARK_MASTER_IP=X.X.X.X
export SPARK_MASTER_WEBUI_PORT=YYYY
export SPARK_WORKER_CORES=4
export SPARK_WORKER_MEMORY=7g
And these are the lines from spark-defaults.conf:
spark.master local[*]
spark.driver.cores 4
spark.driver.memory 2g
spark.executor.cores 4
spark.executor.memory 4g
spark.ui.showConsoleProgress false
spark.driver.extraClassPath /opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/lib/postgresql-9.4.1207.jar
spark.eventLog.enabled true
spark.eventLog.dir file:///opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/logs
spark.history.fs.logDirectory file:///opt/sparkFiles/spark-2.1.1-bin-hadoop2.7/logs
Apache Spark version: 2.1.1
Java version: 1.8.0_91
Python version: 2.7.5
I tried adding this configuration, without success:
spark.sql.warehouse.dir file:///c:/tmp/spark-warehouse
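(In hindsight, that value is a Windows-style path, presumably copied from another answer; on a Linux install like this one I assume the equivalent would look something like the following, though I haven't confirmed it changes anything:)

```
spark.sql.warehouse.dir file:///tmp/spark-warehouse
```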
Strangely, when I compile my own script and launch it with spark-submit, I don't get this error. I couldn't find any JIRA ticket or anything similar about this.