AWS Glue - can't set spark.yarn.executor.memoryOverhead

Question


When running a Python job in AWS Glue, I get the error:

Reason: Container killed by YARN for exceeding memory limits. 5.6 GB of 5.5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
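
For context, the 5.5 GB limit in that message is consistent with Spark's default sizing on YARN, where the container limit is the executor memory plus an overhead of max(384 MB, 10% of executor memory). Below is a minimal sketch of that arithmetic. It assumes a 5 GB executor, which the error message itself does not state, and the function name is purely illustrative.

```python
# Sketch of how the YARN container limit is derived under Spark's defaults.
MIN_OVERHEAD_MB = 384    # documented minimum for memoryOverhead
OVERHEAD_FACTOR = 0.10   # documented default fraction of executor memory


def container_limit_mb(executor_memory_mb, overhead_mb=None):
    """Return executor memory plus overhead, in MB (hypothetical helper)."""
    if overhead_mb is None:
        # No explicit overhead: fall back to max(384 MB, 10% of executor memory).
        overhead_mb = max(MIN_OVERHEAD_MB, int(executor_memory_mb * OVERHEAD_FACTOR))
    return executor_memory_mb + overhead_mb


# Assumption: the executor was given 5 GB (5120 MB).
limit_mb = container_limit_mb(5 * 1024)
print(limit_mb / 1024.0)  # limit in GB
```

If the overhead setting were actually applied, the limit reported in the error would change, so an unchanged 5.5 GB figure suggests the new value never reached the executors.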

When I run this at the beginning of my script:

print '--- Before Conf --'
print 'spark.yarn.driver.memory', sc._conf.get('spark.yarn.driver.memory')
print 'spark.yarn.driver.cores', sc._conf.get('spark.yarn.driver.cores')
print 'spark.yarn.executor.memory', sc._conf.get('spark.yarn.executor.memory')
print 'spark.yarn.executor.cores', sc._conf.get('spark.yarn.executor.cores')
print "spark.yarn.executor.memoryOverhead", sc._conf.get("spark.yarn.executor.memoryOverhead")

print '--- Conf --'
sc._conf.setAll([('spark.yarn.executor.memory', '15G'),('spark.yarn.executor.memoryOverhead', '10G'),('spark.yarn.driver.cores','5'),('spark.yarn.executor.cores', '5'), ('spark.yarn.cores.max', '5'), ('spark.yarn.driver.memory','15G')])

print '--- After Conf ---'
print 'spark.yarn.driver.memory', sc._conf.get('spark.yarn.driver.memory')
print 'spark.yarn.driver.cores', sc._conf.get('spark.yarn.driver.cores')
print 'spark.yarn.executor.memory', sc._conf.get('spark.yarn.executor.memory')
print 'spark.yarn.executor.cores', sc._conf.get('spark.yarn.executor.cores')
print "spark.yarn.executor.memoryOverhead", sc._conf.get("spark.yarn.executor.memoryOverhead")

I get the following output:

--- Before Conf --
spark.yarn.driver.memory None
spark.yarn.driver.cores None
spark.yarn.executor.memory None
spark.yarn.executor.cores None
spark.yarn.executor.memoryOverhead None
--- Conf --
--- After Conf ---
spark.yarn.driver.memory 15G
spark.yarn.driver.cores 5
spark.yarn.executor.memory 15G
spark.yarn.executor.cores 5
spark.yarn.executor.memoryOverhead 10G

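It looks like spark.yarn.executor.memoryOverhead is being set, so why isn't it recognized? I still get the same error.

I have seen other posts about problems setting spark.yarn.executor.memoryOverhead, but none that explain why it appears to be set here and still has no effect.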

- KDilla

2 Answers