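I'm new to MapReduce, and a job I'm running is taking a very long time even though it is a relatively small task, so I suspect something is wrong. I'm using Hadoop 2.6, and below is some information that I think might help. The MapReduce programs themselves are simple, so I won't include them here unless someone wants them for more insight. The Python code is identical to the code in this tutorial: http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/. If anyone can give me a clue about what is going wrong or why, that would be great. Thanks.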
Name: streamjob1669011192523346656.jar
Application Type: MAPREDUCE
Application Tags:
State: ACCEPTED
FinalStatus: UNDEFINED
Started: 3-Jul-2015 00:17:10
Elapsed: 20mins, 57sec
Tracking URL: UNASSIGNED
Diagnostics:
This is the command I ran and the output I got:
bin/hadoop jar share/hadoop/tools/lib/hadoop-streaming-2.6.0.jar -file python-files/mapper.py -mapper python-files/mapper.py -file python-files/reducer.py -reducer python-files/reducer.py -input /user/input/* -output /user/output
15/07/03 00:16:41 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
2015-07-03 00:16:43.510 java[3708:1b03] Unable to load realm info from SCDynamicStore
15/07/03 00:16:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
packageJobJar: [python-files/mapper.py, python-files/reducer.py, /var/folders/4x/v16lrvy91ld4t8rqvnzbr83m0000gn/T/hadoop-unjar8212926403009053963/] [] /var/folders/4x/v16lrvy91ld4t8rqvnzbr83m0000gn/T/streamjob1669011192523346656.jar tmpDir=null
15/07/03 00:16:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/07/03 00:16:55 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/07/03 00:17:05 INFO mapred.FileInputFormat: Total input paths to process : 1
15/07/03 00:17:06 INFO mapreduce.JobSubmitter: number of splits:2
15/07/03 00:17:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1435852353333_0003
15/07/03 00:17:11 INFO impl.YarnClientImpl: Submitted application application_1435852353333_0003
15/07/03 00:17:11 INFO mapreduce.Job: The url to track the job: http://mymacbook.home:8088/proxy/application_1435852353333_0003/
15/07/03 00:17:11 INFO mapreduce.Job: Running job: job_1435852353333_0003
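
For anyone who doesn't want to open the link, here is a minimal sketch of the kind of streaming word count the job runs. It is not the exact tutorial code: the function names (`map_lines`, `reduce_pairs`, `run_local`) are my own, and `run_local` only imitates Hadoop's sort-and-shuffle step in memory, so the logic can be checked without a cluster.

```python
from itertools import groupby


def map_lines(lines):
    """Mapper logic: emit (word, 1) for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reduce_pairs(pairs):
    """Reducer logic: sum the counts per word.

    Assumes the pairs arrive sorted by word, which is what Hadoop
    Streaming guarantees for reducer input.
    """
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Hypothetical helper: stand in for the shuffle with an in-memory sort."""
    mapped = sorted(map_lines(lines))
    return list(reduce_pairs(mapped))


if __name__ == "__main__":
    for word, count in run_local(["foo bar foo", "bar baz"]):
        # Streaming scripts write tab-separated key/value pairs to stdout.
        print("%s\t%d" % (word, count))
```

In the real job, the mapper and reducer are separate scripts that read `sys.stdin` and print tab-separated lines. A quick check like this one finishes instantly on the same input, which is why the 20 minutes spent in the ACCEPTED state seems unrelated to the Python code.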