Hadoop unable to set permissions of path \tmp\

I'm running Hadoop 1.2.1 on a Windows machine through Cygwin (or at least trying to), but unfortunately something is seriously wrong with my Hadoop setup. When I try to execute a simple Pig script in local mode, I get the following error message:
Backend error message during job submission
-------------------------------------------
java.io.IOException: Failed to set permissions of path: \tmp\hadoop-antonbelev\mapred\staging\antonbelev1696923409\.staging to 0700
    at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:691)
    at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:664)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:514)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:349)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:193)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:126)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:942)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
    at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
    at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.pig.backend.hadoop20.PigJobControl.mainLoopAction(PigJobControl.java:157)
    at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:134)
    at java.lang.Thread.run(Thread.java:722)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:270)

Pig Stack Trace
---------------
ERROR 2244: Job failed, hadoop does not return any error message

org.apache.pig.backend.executionengine.ExecException: ERROR 2244: Job failed, hadoop does not return any error message
    at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:148)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:607)
    at org.apache.pig.Main.main(Main.java:156)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

I suspect something is wrong with my Hadoop installation or configuration files, but I'm new to Hadoop, so this is only a guess. Can anyone help me figure out what's going on? Thanks! :)
Also, why does the path in "\tmp\hadoop-antonbelev\mapred\staging\antonbelev1696923409\.staging to 0700" use Windows backslashes? I tried to find this file, but it doesn't exist.
UPDATE:
Here are my configuration files:
core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>localhost:9100</value>
    </property>
</configuration>

hdfs-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

mapred-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9101</value>
    </property>
</configuration>

hadoop-env.sh:

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.  Required.
 export JAVA_HOME="C:/Program Files/Java/jdk1.7.0_07"

# Extra Java CLASSPATH elements.  Optional.
# export HADOOP_CLASSPATH=

# The maximum amount of heap to use, in MB. Default is 1000.
# export HADOOP_HEAPSIZE=2000

# Extra Java runtime options.  Empty by default.
# export HADOOP_OPTS=-server

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
# export HADOOP_TASKTRACKER_OPTS=
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
# export HADOOP_CLIENT_OPTS

# Extra ssh options.  Empty by default.
# export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"

# Where log files are stored.  $HADOOP_HOME/logs by default.
# export HADOOP_LOG_DIR=${HADOOP_HOME}/logs

# File naming remote slave hosts.  $HADOOP_HOME/conf/slaves by default.
# export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves

# host:path where hadoop code should be rsync'd from.  Unset by default.
# export HADOOP_MASTER=master:/home/$USER/src/hadoop

# Seconds to sleep between slave commands.  Unset by default.  This
# can be useful in large clusters, where, e.g., slave rsyncs can
# otherwise arrive faster than the master can service them.
# export HADOOP_SLAVE_SLEEP=0.1

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by 
#       the users that are going to run the hadoop daemons.  Otherwise there is
#       the potential for a symlink attack.
# export HADOOP_PID_DIR=/var/hadoop/pids

# A string representing this instance of hadoop. $USER by default.
# export HADOOP_IDENT_STRING=$USER

# The scheduling priority for daemon processes.  See 'man nice'.
# export HADOOP_NICENESS=10

I'm not sure whether any other configuration files are relevant.

Could you post your configuration files? - SSaikia_JtheRocker
@SSaikia_JtheRocker, I've updated my question. Am I missing any other relevant configuration files? - Anton Belev
The best I can suggest right now is to configure the hadoop.tmp.dir property manually in core-site.xml: <property> <name>hadoop.tmp.dir</name> <value>/tmp/${user.name}</value> </property> - SSaikia_JtheRocker
There is a third-party patch for this bug, see this answer: https://dev59.com/GnPYa4cB1Zd3GeqPj4i-#18381218 (a sketch of what the patch does follows these comments) - Nathan Fig
May I ask what your solution was? - SMKS
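
Regarding the patch Nathan Fig links to: the usual workaround for this Hadoop 1.x-on-Windows failure is to modify org.apache.hadoop.fs.FileUtil so that a failed permission call is logged rather than treated as fatal, and then rebuild hadoop-core-1.2.1.jar. Below is a minimal sketch of the patched method; the signature matches stock Hadoop 1.2.1 (compare FileUtil.java:691 in the stack trace), while the log-instead-of-throw body is the patch, not stock behavior:

// Method-level excerpt from org.apache.hadoop.fs.FileUtil (Hadoop 1.2.1),
// assuming the class's existing commons-logging LOG field.
private static void checkReturnValue(boolean rv, File p,
                                     FsPermission permission
                                     ) throws IOException {
  if (!rv) {
    // Stock code throws an IOException here. On Windows, the
    // java.io.File.setReadable/setWritable/setExecutable calls that
    // produce rv routinely return false, which aborts job submission,
    // so the patch downgrades the failure to a warning.
    LOG.warn("Failed to set permissions of path: " + p +
             " to " + String.format("%04o", permission.toShort()));
  }
}

With the rebuilt jar on the classpath in place of the stock one, local-mode jobs get past the staging-directory check. Prebuilt patched jars circulate as well; see the linked answer.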
2 Answers

Add the following entry to core-site.xml:

<property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
</property>

There is a problem with your configuration: the tmp folder Hadoop is reading from is located under root, i.e. /tmp.
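
For context, here is roughly what the full core-site.xml from the question would look like with this property added (fs.default.name and its value are taken verbatim from the question):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>localhost:9100</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/tmp/hadoop-${user.name}</value>
        <description>A base for other temporary directories.</description>
    </property>
</configuration>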


Try changing the file permissions on the folder you are using as Hadoop's tmp folder. Something like this:
sudo chmod a+w /app/hadoop/tmp -R

It's a Windows machine. - szedjani
winutils.exe chmod 777 pathTodhadoopBin worked for me. - Mukul Badhan
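
For reference, the winutils route from the comment above is typically set up like this. It is a sketch with placeholder paths, and it assumes a winutils.exe built for your Hadoop version (winutils ships with the Hadoop 2.x+ Windows binaries, not with 1.2.1 under Cygwin):

:: Windows cmd; C:\hadoop is a placeholder install location
set HADOOP_HOME=C:\hadoop
set PATH=%PATH%;%HADOOP_HOME%\bin

:: Open up permissions recursively (-R) on the local staging
:: directory named in the error message
winutils.exe chmod -R 777 C:\tmp\hadoop-antonbelev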
