Oozie and Job History Server configuration issue

Question

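I am trying to install CDH in pseudo-distributed mode without using CDM. Everything "works" from the console. However, as soon as I start using Hue, I get an error when I try to run Pig.

The error shown in Hue is:

JA017: Could not lookup launched hadoop Job ID [job_local2125047777_0001] which was associated with action [0000000-160112011607704-oozie-oozi-W@pig]. Failing this action!

I believe this error comes from a miscommunication between the Oozie workflow that hands off to Pig and the job history server.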
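A side note on the job ID in that message: the job_local prefix is the form Hadoop's LocalJobRunner assigns, which would suggest the Pig job ran in local mode rather than being submitted to YARN. That is an inference from the ID format, not something confirmed in this thread. A minimal shell sketch of the check follows; the function name is_local_job_id is made up for illustration.

```shell
# Return success (0) when a Hadoop job ID looks like one issued by the
# LocalJobRunner, i.e. it starts with "job_local".
# Hypothetical helper, not part of Hadoop or Oozie.
is_local_job_id() {
  case "$1" in
    job_local*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with the ID from the Hue error message.
if is_local_job_id "job_local2125047777_0001"; then
  echo "job ran in local mode"
fi
```

If the ID does turn out to be local, the value of mapreduce.framework.name in mapred-site.xml (its default is local) is a reasonable place to look next.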

Before this, I could not use Hive from Hue because Oozie had trouble installing its sharelib on HDFS. I resolved that by creating a symlink between /etc/hadoop/conf/core-site.xml and /etc/oozie/conf/hadoop-conf/core-site.xml, as suggested in: Apache Oozie failed loading ShareLib
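For anyone reproducing the symlink step, here is a rough sketch. The direction is an assumption on my part (the Oozie path becomes a link pointing at the Hadoop file), and link_core_site is a hypothetical helper name, not something from the linked answer.

```shell
# Make LINK a symlink to TARGET, keeping a .bak copy of any regular
# file that already sits at LINK. Hypothetical helper for illustration.
link_core_site() {
  target=$1   # existing Hadoop config file
  link=$2     # path where Oozie looks for its Hadoop config
  if [ ! -f "$target" ]; then
    echo "missing target: $target" >&2
    return 1
  fi
  # Preserve a pre-existing regular file instead of overwriting it.
  if [ -e "$link" ] && [ ! -L "$link" ]; then
    mv "$link" "$link.bak"
  fi
  ln -sfn "$target" "$link"
}

# Assumed usage on the machine from the question (run as root):
#   link_core_site /etc/hadoop/conf/core-site.xml \
#                  /etc/oozie/conf/hadoop-conf/core-site.xml
```

Oozie would most likely need a restart afterwards to pick up the linked file.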

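Script information

The configuration script I wrote to install CDH on Scientific Linux 7 is available here: https://github.com/coatless/stat490uiuc/blob/master/install_scripts/cdh_build.sh

Specifically, I am trying to get results from this Pig script: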

data = LOAD '/user/hue/pig/examples/data/midsummer.txt' as (text:CHARARRAY);

upper_case = FOREACH data GENERATE org.apache.pig.piggybank.evaluation.string.UPPER(text);

STORE upper_case INTO '$output' ;

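Attempted solutions

Searching on Google, I found the following solutions, but none of them worked once applied.

One suggestion was to run the following commands: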

sudo -u hdfs hadoop fs -mkdir -p /user/history
sudo -u hdfs hadoop fs -chmod -R 1777 /user/history
sudo -u hdfs hadoop fs -chown mapred:hadoop /user/history

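I tried restarting the ResourceManager, the NodeManager, HDFS, and the history server, all to no avail.

In one discussion, another user suggested setting a property in job.properties that specifies user.name=mapred. However, I could not find any reference to job.properties in the Hue job.

This post suggests declaring fixed paths for the history server in mapred-site.xml: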

<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>/user/history/done</value>
</property>
<property>
   <name>mapreduce.jobhistory.intermediate-done-dir</name>
   <value>/user/history/done_intermediate</value>
</property>

That did not work either.
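When a config change seems to have no effect, it can help to confirm what the file on disk actually contains. Below is a minimal sketch that pulls one property value out of a Hadoop *-site.xml file with awk. The function name hadoop_site_property is hypothetical, and the parsing assumes each name and value element sits on a single line, as in the snippet above; it is not a real XML parser.

```shell
# Print the <value> of the property whose <name> equals $2 in the
# Hadoop site file $1. Prints nothing if the property is absent.
# Assumes <name>...</name> and <value>...</value> each fit on one line.
hadoop_site_property() {
  awk -v want="$2" '
    /<name>/ {
      line = $0
      sub(/.*<name>[ \t]*/, "", line)
      sub(/[ \t]*<\/name>.*/, "", line)
      current = line
    }
    /<value>/ && current == want {
      line = $0
      sub(/.*<value>[ \t]*/, "", line)
      sub(/[ \t]*<\/value>.*/, "", line)
      print line
      exit
    }
  ' "$1"
}

# Assumed usage (the file location depends on the install):
#   hadoop_site_property /etc/hadoop/conf/mapred-site.xml mapreduce.jobhistory.done-dir
#   hadoop_site_property /etc/hadoop/conf/mapred-site.xml mapreduce.framework.name
```

An empty result means the property is not set in that file, so Hadoop falls back to its built-in default.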

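Another thread indicated that the problem may be related to permissions, but the user there did not give a concrete way to fix it.

Any help would be greatly appreciated.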

Full Oozie log

The full error text from the Oozie log file:

2016-01-11 23:51:59,195  WARN ParameterVerifier:523 - SERVER[server-name] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
2016-01-11 23:51:59,275  WARN LiteWorkflowAppService:523 - SERVER[server-name] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath [hdfs://localhost:8020/user/hue/oozie/workspaces/_cloudera_-oozie-1-1452577913.73/lib] does not exist
2016-01-11 23:51:59,572  INFO ActionStartXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@:start:] Start action [0000000-160111235108256-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-01-11 23:51:59,595  INFO ActionStartXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@:start:] [***0000000-160111235108256-oozie-oozi-W@:start:***]Action status=DONE
2016-01-11 23:51:59,596  INFO ActionStartXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@:start:] [***0000000-160111235108256-oozie-oozi-W@:start:***]Action updated in DB!
2016-01-11 23:52:00,052  INFO ActionStartXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Start action [0000000-160111235108256-oozie-oozi-W@pig] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-01-11 23:52:03,487  WARN Credentials:96 - SERVER[server-name] Null token ignored for oozie mr token
2016-01-11 23:52:03,506  WARN Credentials:96 - SERVER[server-name] Null token ignored for oozie mr token
2016-01-11 23:52:03,562  WARN JobResourceUploader:64 - SERVER[server-name] Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2016-01-11 23:52:03,563  WARN JobResourceUploader:171 - SERVER[server-name] No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2016-01-11 23:52:04,169  WARN MRApps:582 - SERVER[server-name] cache file (mapreduce.job.cache.files) hdfs://localhost:8020/user/oozie/share/lib/lib_20160111222734/pig/json-simple-1.1.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://localhost:8020/user/oozie/share/lib/lib_20160111222734/oozie/json-simple-1.1.jar This will be an error in Hadoop 2.0
2016-01-11 23:52:08,611  WARN Credentials:96 - SERVER[server-name] Null token ignored for oozie mr token
2016-01-11 23:52:08,618  WARN PigActionExecutor:523 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Exception in check(). Message[JA017: Could not lookup launched hadoop Job ID [job_local1961106749_0001] which was associated with  action [0000000-160111235108256-oozie-oozi-W@pig].  Failing this action!]
org.apache.oozie.action.ActionExecutorException: JA017: Could not lookup launched hadoop Job ID [job_local1961106749_0001] which was associated with  action [0000000-160111235108256-oozie-oozi-W@pig].  Failing this action!
       at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1274)
       at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
       at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
       at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
       at org.apache.oozie.command.XCommand.call(XCommand.java:286)
       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
       at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)
2016-01-11 23:52:08,620  WARN ActionStartXCommand:523 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Error starting action [pig]. ErrorType [FAILED], ErrorCode [JA017], Message [JA017: Could not lookup launched hadoop Job ID [job_local1961106749_0001] which was associated with  action [0000000-160111235108256-oozie-oozi-W@pig].  Failing this action!]
org.apache.oozie.action.ActionExecutorException: JA017: Could not lookup launched hadoop Job ID [job_local1961106749_0001] which was associated with  action [0000000-160111235108256-oozie-oozi-W@pig].  Failing this action!
       at org.apache.oozie.action.hadoop.JavaActionExecutor.check(JavaActionExecutor.java:1274)
       at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
       at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
       at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
       at org.apache.oozie.command.XCommand.call(XCommand.java:286)
       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
       at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
       at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)
2016-01-11 23:52:08,621  WARN ActionStartXCommand:523 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Failing Job due to failed action [pig]
2016-01-11 23:52:08,623  WARN LiteWorkflowInstance:523 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] Workflow Failed. Failing node [pig]
2016-01-11 23:52:08,768  INFO KillXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[] STARTED WorkflowKillXCommand for jobId=0000000-160111235108256-oozie-oozi-W
2016-01-11 23:52:08,806  INFO KillXCommand:520 - SERVER[server-name] USER[cloudera] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[] ENDED WorkflowKillXCommand for jobId=0000000-160111235108256-oozie-oozi-W
2016-01-11 23:52:09,038  INFO CallbackServlet:520 - SERVER[server-name] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] callback for action [0000000-160111235108256-oozie-oozi-W@pig]
2016-01-11 23:52:09,072 ERROR CompletedActionXCommand:517 - SERVER[server-name] USER[-] GROUP[-] TOKEN[] APP[-] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] XException,
org.apache.oozie.command.CommandException: E0800: Action it is not running its in [FAILED] state, action [0000000-160111235108256-oozie-oozi-W@pig]
       at org.apache.oozie.command.wf.CompletedActionXCommand.eagerVerifyPrecondition(CompletedActionXCommand.java:92)
       at org.apache.oozie.command.XCommand.call(XCommand.java:257)
       at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)
2016-01-11 23:52:09,082  WARN CallableQueueService$CallableWrapper:523 - SERVER[server-name] USER[-] GROUP[-] TOKEN[] APP[-] JOB[0000000-160111235108256-oozie-oozi-W] ACTION[0000000-160111235108256-oozie-oozi-W@pig] exception callable [callback], E0800: Action it is not running its in [FAILED] state, action [0000000-160111235108256-oozie-oozi-W@pig]
org.apache.oozie.command.CommandException: E0800: Action it is not running its in [FAILED] state, action [0000000-160111235108256-oozie-oozi-W@pig]
       at org.apache.oozie.command.wf.CompletedActionXCommand.eagerVerifyPrecondition(CompletedActionXCommand.java:92)
       at org.apache.oozie.command.XCommand.call(XCommand.java:257)
       at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)

Hey, how did you solve this? I'm hitting the same error, please help! - ChikuMiku

1 Answer

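You should use the HUE file browser to double-check that /user/history and all of its subdirectories have the correct permissions.
In my case, all users had permissions on every subfolder of /user/history, but the HUE file browser showed that the '/user/history' directory itself had the following permissions: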
Name        User     Group     Permissions
history     mapred   hadoop    drwxrwx--- 

This causes errors when running as any user other than mapred. The following command can help resolve it:
sudo -u hdfs hadoop fs -chmod 777 /user/history
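To make the permission reasoning concrete: in HDFS, a user needs the execute bit on a directory to reach anything beneath it, so drwxrwx--- shuts out everyone who is neither mapred nor in the hadoop group. Here is a small sketch that reads a permission string such as the one HUE (or hadoop fs -ls -d /user/history) shows and reports whether "other" users can enter. The function name others_can_enter is hypothetical.

```shell
# Return success (0) if the permission string grants directory
# traversal (execute) to "other" users. The 10th character is the
# execute slot for "other": x = execute, t = sticky bit plus execute.
others_can_enter() {
  other_exec=$(printf '%s' "$1" | cut -c10)
  case "$other_exec" in
    x|t) return 0 ;;
    *) return 1 ;;
  esac
}

# The state reported in this answer, before the chmod:
if ! others_can_enter "drwxrwx---"; then
  echo "users outside mapred/hadoop cannot reach /user/history"
fi
```

With the mode 1777 from the question (drwxrwxrwt) the check also passes, so either mode should open the directory, provided the change is actually applied to /user/history itself.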

Please follow this link: https://stackoverflow.com/questions/43426691/oozie-workflow-failed-due-to-error-ja017/44695231#44695231 - Mayur Maheshwari

I don't know whether this will solve the problem, but checking certainly can't hurt, so I'll upvote :) - Dennis Jaheruddin
