主线程中的异常: java.lang.NoClassDefFoundError: org/apache/spark/Logging

6
我刚接触Spark。我尝试在Oracle VirtualBox 5.1.4r110228上运行一个Spark应用程序(.jar),该应用程序利用Spark Steaming对Twitter进行情感分析。我已经创建了我的Twitter帐户并生成了所有必需的(4)令牌。但是我遇到了NoClassDefFoundError异常而被阻止。
我已经搜索了几天。到目前为止,我找到的最好建议是在下面的URL中,但显然我的环境仍然缺少一些东西。

http://javarevisited.blogspot.com/2011/06/noclassdeffounderror-exception-in.html#ixzz4Ia99dsp0

什么是在编译时出现的库,但在运行时缺失?我们怎样修复这个问题?

日志库是什么?我看到一篇文章说这个日志库可能会被弃用。除此之外,在我的环境中我也看到了log4j。

在我的CDH 5.8中,我正在运行以下软件版本:Spark-2.0.0-bin-hadoop2.7 / spark-core_2.10-2.0.0 jdk-8u101-linux-x64 / jre-bu101-linux-x64

我在结尾处附加了异常的详细信息。这是我执行应用程序并在遇到异常后进行验证的步骤:

  1. 解压twitter-streaming.zip(Spark应用程序)
  2. cd twitter-streaming
  3. 运行./sbt/sbt assembly
  4. 使用您的Twitter帐户更新env.sh

$ cat env.sh

export SPARK_HOME=/home/cloudera/spark-2.0.0-bin-hadoop2.7
export CONSUMER_KEY=<my_consumer_key>
export CONSUMER_SECRET=<my_consumer_secret>
export ACCESS_TOKEN=<my_twitterapp_access_token>
export ACCESS_TOKEN_SECRET=<my_twitterapp_access_token>

提交脚本submit.sh封装了带有所需凭证信息的env.sh中的spark-submit命令: $ cat submit.sh
source ./env.sh
$SPARK_HOME/bin/spark-submit --class "TwitterStreamingApp" --master local[*] ./target/scala-2.10/twitter-streaming-assembly-1.0.jar $CONSUMER_KEY $CONSUMER_SECRET $ACCESS_TOKEN $ACCESS_TOKEN_SECRET

组装过程的日志: [cloudera@quickstart twitter-streaming]$ ./sbt/sbt assembly

Launching sbt from sbt/sbt-launch-0.13.7.jar
[info] Loading project definition from /home/cloudera/workspace/twitter-streaming/project
[info] Set current project to twitter-streaming (in build file:/home/cloudera/workspace/twitter-streaming/)
[info] Including: twitter4j-stream-3.0.3.jar
[info] Including: twitter4j-core-3.0.3.jar
[info] Including: scala-library.jar
[info] Including: unused-1.0.0.jar
[info] Including: spark-streaming-twitter_2.10-1.4.1.jar
[info] Checking every *.class/*.jar file's SHA-1.
[info] Merging files...
[warn] Merging 'META-INF/LICENSE.txt' with strategy 'first'
[warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.spark-project.spark/unused/pom.properties' with strategy 'first'
[warn] Merging 'META-INF/maven/org.spark-project.spark/unused/pom.xml' with strategy 'first'
[warn] Merging 'log4j.properties' with strategy 'discard'
[warn] Merging 'org/apache/spark/unused/UnusedStubClass.class' with strategy 'first'
[warn] Strategy 'discard' was applied to 2 files
[warn] Strategy 'first' was applied to 4 files
[info] SHA-1: 69146d6fdecc2a97e346d36fafc86c2819d5bd8f
[info] Packaging /home/cloudera/workspace/twitter-streaming/target/scala-2.10/twitter-streaming-assembly-1.0.jar ...
[info] Done packaging.
[success] Total time: 6 s, completed Aug 27, 2016 11:58:03 AM

我不确定它的确切含义,但当我运行Hadoop NativeCheck时,所有内容看起来都很好:

$ hadoop checknative -a

16/08/27 13:27:22 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
16/08/27 13:27:22 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
Native library checking:
hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
lz4:     true revision:10301
bzip2:   true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so

这是我的异常控制台日志: $ ./submit.sh
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/08/28 20:13:23 INFO SparkContext: Running Spark version 2.0.0
16/08/28 20:13:24 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/28 20:13:24 WARN Utils: Your hostname, quickstart.cloudera resolves to a loopback address: 127.0.0.1; using 10.0.2.15 instead (on interface eth0)
16/08/28 20:13:24 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/08/28 20:13:24 INFO SecurityManager: Changing view acls to: cloudera
16/08/28 20:13:24 INFO SecurityManager: Changing modify acls to: cloudera
16/08/28 20:13:24 INFO SecurityManager: Changing view acls groups to: 
16/08/28 20:13:24 INFO SecurityManager: Changing modify acls groups to: 
16/08/28 20:13:24 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(cloudera); groups with view permissions: Set(); users  with modify permissions: Set(cloudera); groups with modify permissions: Set()
16/08/28 20:13:25 INFO Utils: Successfully started service 'sparkDriver' on port 37550.
16/08/28 20:13:25 INFO SparkEnv: Registering MapOutputTracker
16/08/28 20:13:25 INFO SparkEnv: Registering BlockManagerMaster
16/08/28 20:13:25 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-37a0492e-67e3-4ad5-ac38-40448c25d523
16/08/28 20:13:25 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
16/08/28 20:13:25 INFO SparkEnv: Registering OutputCommitCoordinator
16/08/28 20:13:25 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/08/28 20:13:25 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.2.15:4040
16/08/28 20:13:25 INFO SparkContext: Added JAR file:/home/cloudera/workspace/twitter-streaming/target/scala-2.10/twitter-streaming-assembly-1.1.jar at spark://10.0.2.15:37550/jars/twitter-streaming-assembly-1.1.jar with timestamp 1472440405882
16/08/28 20:13:26 INFO Executor: Starting executor ID driver on host localhost
16/08/28 20:13:26 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 41264.
16/08/28 20:13:26 INFO NettyBlockTransferService: Server created on 10.0.2.15:41264
16/08/28 20:13:26 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.2.15, 41264)
16/08/28 20:13:26 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.2.15:41264 with 366.3 MB RAM, BlockManagerId(driver, 10.0.2.15, 41264)
16/08/28 20:13:26 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.2.15, 41264)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/Logging
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)I
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at org.apache.spark.streaming.twitter.TwitterUtils$.createStream(TwitterUtils.scala:44)
    at TwitterStreamingApp$.main(TwitterStreamingApp.scala:42)
    at TwitterStreamingApp.main(TwitterStreamingApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:729)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.Logging
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 23 more
16/08/28 20:13:26 INFO SparkContext: Invoking stop() from shutdown hook
16/08/28 20:13:26 INFO SparkUI: Stopped Spark web UI at http://10.0.2.15:4040
16/08/28 20:13:26 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/08/28 20:13:26 INFO MemoryStore: MemoryStore cleared
16/08/28 20:13:26 INFO BlockManager: BlockManager stopped
16/08/28 20:13:26 INFO BlockManagerMaster: BlockManagerMaster stopped
16/08/28 20:13:26 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/08/28 20:13:26 INFO SparkContext: Successfully stopped SparkContext
16/08/28 20:13:26 INFO ShutdownHookManager: Shutdown hook called
16/08/28 20:13:26 INFO ShutdownHookManager: Deleting directory /tmp/spark-5e29c3b2-74c2-4d89-970f-5be89d176b26

我明白我的帖子很冗长。非常感谢您的建议或见解!!
-jsung8

1
错误提示说,在运行时找不到org.apache.spark包中的Logging类进行加载。这个类在twitter-streaming-assembly-1.0.jar中简单地丢失了。请确保在构建时将其放入jar文件中。在构建此示例时是否创建了其他jar文件? - Amit Kumar
2
这可能会对您有所帮助:https://dev59.com/6lkT5IYBdhLWcg3wPdCU#39194820 - Ajeet Shah
1
@amit_kumar和Ajeets,这看起来确实很有前途!非常感谢你们提供的方向和链接!!虽然“287358”帖子上的建议非常明确,但对我来说挑战在于识别正确的pom.xml文件进行编辑,以及下载/配置Spark 1.6.2。我找到了与twitter-stream目录名称相关的半打pom.xml文件……我会尝试一下并分享我所发现的。 - jsung8
@jsung8,答案2873538是我最近写的,你应该注意到org.apache.spark.Logging存在于Spark 1.5.2,而不是你似乎正在使用的1.6.2 - Ajeet Shah
@jsung8,你能发布你的pom.xml或.sbt文件吗? - Amit Kumar
5个回答

1

用法: spark-core_2.11-1.5.2.jar

我遇到了与@jsung8描述的相同问题,并试图找到@youngstephen建议的.jar文件,但未能找到。然而,按照@youngstephen的建议,将spark-core_2.11-1.5.2.logging.jar替换为spark-core_2.11-1.5.2.jar解决了异常。


在执行spark-submit命令时需要core_2.11-1.5.2.logging.jar文件。 - R Pidugu

0

自spark 1.5.2版本后,org/apache/spark/Logging被移除。

由于您的spark-core版本是2.0,因此最简单的解决方案是:

下载一个单独的spark-core_2.11-1.5.2.logging.jar并将其放置在您的spark根目录下的jars目录中。

无论如何,这解决了我的问题,希望对您有所帮助。


你在哪里找到它的?你有链接吗? - Romain Jouin
@romainjouin:抱歉,没有相关链接。我只是和我的合作伙伴讨论了一下。 - Young Stephen

0
可能导致这个问题的一个原因是库和类冲突。我曾经遇到过这个问题,并通过一些Maven排除来解决它:
   <dependency>
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-core_2.11</artifactId>
       <version>2.0.0</version>
       <scope>provided</scope>
       <exclusions>
           <exclusion>
               <groupId>log4j</groupId>
               <artifactId>log4j</artifactId>
           </exclusion>
       </exclusions>
   </dependency>

   <dependency>
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-streaming_2.11</artifactId>
       <version>2.0.0</version>
       <scope>provided</scope>
   </dependency>

   <dependency>
       <groupId>org.apache.spark</groupId>
       <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
       <version>2.0.0</version>
       <exclusions>
           <exclusion>
               <groupId>org.slf4j</groupId>
               <artifactId>slf4j-log4j12</artifactId>
           </exclusion>
           <exclusion>
               <groupId>log4j</groupId>
               <artifactId>log4j</artifactId>
           </exclusion>
       </exclusions>
   </dependency>

0

由于以下错误,Spark核心版本应降级至1.5。

java.lang.NoClassDefFoundError: org/apache/spark/Logging

http://bahir.apache.org/docs/spark/2.0.0/spark-streaming-twitter/ 提供了更好的解决方案。通过添加以下依赖项,我的问题得到了解决。

 <dependency>
<groupId>org.apache.bahir</groupId>
<artifactId>spark-streaming-twitter_2.11</artifactId>
<version>2.0.0</version>
</dependency>

0

您正在使用旧版本的Spark Twitter连接器。您堆栈跟踪中的这个类提示了这一点:

org.apache.spark.streaming.twitter.TwitterUtils

Spark在2.0版本中删除了该集成。您正在使用旧版Spark的一个,它引用了旧的Logging类,该类已经在Spark 2.0中移动到了不同的包中。

如果您想使用Spark 2.0,则需要使用Bahir项目中的Twitter连接器。


网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接