Running applications in Apache Spark can be checked from the web UI at the following URL:
http://<master>:8080
My question is how to check running applications from the terminal. Is there a command that returns the status of an application?
$ yarn application -help
usage: application
-appStates <States> Works with -list to filter applications
based on input comma-separated list of
application states. The valid application
state can be one of the following:
ALL,NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING,FINISHED,FAILED,KILLED
-appTypes <Types> Works with -list to filter applications
based on input comma-separated list of
application types.
-help Displays help for all commands.
-kill <Application ID> Kills the application.
-list List applications. Supports optional use
of -appTypes to filter applications based
on application type, and -appStates to
filter applications based on application
state.
-movetoqueue <Application ID> Moves the application to a different
queue.
-queue <Queue Name> Works with the movetoqueue command to
specify which queue to move an
application to.
-status <Application ID> Prints the status of the application.
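The same check can be scripted against `yarn application -list`. Below is a minimal Python sketch that shells out to that command and parses its tab-separated rows; the column layout is an assumption based on typical YARN output, and the `raw_output` parameter is a hypothetical hook that lets the parser be exercised without a cluster:

```python
import subprocess

def list_running_yarn_apps(raw_output=None):
    """Return (application_id, name, state) tuples from `yarn application -list`.

    If raw_output is None, the yarn CLI is invoked; otherwise the given text
    is parsed directly (useful when no cluster is available).
    """
    if raw_output is None:
        raw_output = subprocess.check_output(
            ["yarn", "application", "-list", "-appStates", "RUNNING"],
            text=True)
    apps = []
    for line in raw_output.splitlines():
        # Data rows start with an application id; header/summary lines do not.
        if line.startswith("application_"):
            fields = [f.strip() for f in line.split("\t")]
            # Assumed columns: Id, Name, Type, User, Queue, State, Final-State,
            # Progress, Tracking-URL
            apps.append((fields[0], fields[1], fields[5]))
    return apps
```

This only covers YARN; for standalone mode see the REST API discussed below.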
Use spark-submit --status (as described in Mastering Apache Spark 2.0):
spark-submit --status [submission ID]
See the source of spark-submit:
if (!master.startsWith("spark://") && !master.startsWith("mesos://")) {
SparkSubmit.printErrorAndExit(
"Requesting submission statuses is only supported in standalone or Mesos mode!")
}
The --status option works only in Spark standalone mode or in Mesos with cluster deploy mode; it does not work with YARN. - DNA
When using the --master parameter, is the listener's port the same as the port used when the application was submitted? (e.g. 7077) - Cavaz
To create the job, use the following curl command:
curl -X POST http://spark-cluster-ip:6066/v1/submissions/create \
  --header "Content-Type:application/json;charset=UTF-8" \
  --data '{
"action" : "CreateSubmissionRequest",
"appArgs" : [ "blah" ],
"appResource" : "path-to-jar-file",
"clientSparkVersion" : "2.2.0",
"environmentVariables" : { "SPARK_ENV_LOADED" : "1" },
"mainClass" : "app-class",
"sparkProperties" : {
"spark.jars" : "path-to-jar-file",
"spark.driver.supervise" : "false",
"spark.app.name" : "app-name",
"spark.submit.deployMode" : "cluster",
"spark.master" : "spark://spark-master-ip:6066"
}
}'
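The same request can be issued from a script. The following Python sketch uses only the standard library; the endpoint path and JSON fields are taken from the curl example above, while the host names, jar path, and helper names are placeholders you would replace:

```python
import json
import urllib.request

def build_create_request(app_resource, main_class, master_rest_url,
                         app_args=(), spark_version="2.2.0"):
    """Build the CreateSubmissionRequest body shown in the curl example."""
    return {
        "action": "CreateSubmissionRequest",
        "appArgs": list(app_args),
        "appResource": app_resource,
        "clientSparkVersion": spark_version,
        "environmentVariables": {"SPARK_ENV_LOADED": "1"},
        "mainClass": main_class,
        "sparkProperties": {
            "spark.jars": app_resource,
            "spark.driver.supervise": "false",
            "spark.app.name": main_class,
            "spark.submit.deployMode": "cluster",
            "spark.master": master_rest_url,
        },
    }

def submit(app_resource, main_class, master_rest_url, app_args=()):
    """POST the request to /v1/submissions/create and return the parsed reply."""
    body = json.dumps(build_create_request(
        app_resource, main_class, master_rest_url, app_args)).encode("utf-8")
    req = urllib.request.Request(
        master_rest_url.replace("spark://", "http://") + "/v1/submissions/create",
        data=body,
        headers={"Content-Type": "application/json;charset=UTF-8"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```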
The response indicates success or failure of the operation and includes the submissionId:
{
  "submissionId" : "driver-20170829014216-0001",
  "serverSparkVersion" : "2.2.0",
  "success" : true,
  "message" : "Driver successfully submitted as driver-20170829014216-0001",
  "action" : "CreateSubmissionResponse"
}
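When scripting the next steps, you need the submissionId out of that create response. A small helper (the function name is hypothetical) that returns it only when the submission succeeded:

```python
import json

def submission_id(create_response_text):
    """Extract submissionId from a CreateSubmissionResponse, or None on failure."""
    resp = json.loads(create_response_text)
    return resp["submissionId"] if resp.get("success") else None
```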
To delete the job, use the submissionId obtained above:
curl -X POST http://spark-cluster-ip:6066/v1/submissions/kill/driver-20170829014216-0001
The response again contains success/failure status:
{
  "success" : true,
  "message" : "Kill request for driver-20170829014216-0001 submitted",
  "action" : "KillSubmissionResponse",
  "serverSparkVersion" : "2.2.0",
  "submissionId" : "driver-20170829014216-0001"
}
To get the status, use the following command:
curl http://spark-cluster-ip:6066/v1/submissions/status/driver-20170829014216-0001
The response includes driver state -- current status of the app:
{
"action" : "SubmissionStatusResponse",
"driverState" : "RUNNING",
"serverSparkVersion" : "2.2.0",
"submissionId" : "driver-20170829014216-0001",
"success" : true,
"workerHostPort" : "10.32.1.18:38317",
"workerId" : "worker-20170829013941-10.32.1.18-38317"
}
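The status endpoint can also be polled until the driver finishes. A sketch using the standard library; the set of terminal driver states is an assumption, and the poll interval and function names are placeholders:

```python
import json
import time
import urllib.request

# Assumed terminal driver states; RUNNING/SUBMITTED mean "keep polling".
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED", "ERROR"}

def driver_state(status_response_text):
    """Extract driverState from a SubmissionStatusResponse."""
    return json.loads(status_response_text).get("driverState")

def wait_for_driver(rest_url, submission_id, poll_seconds=5):
    """Poll /v1/submissions/status/<id> until a terminal state is reached."""
    while True:
        with urllib.request.urlopen(
                f"{rest_url}/v1/submissions/status/{submission_id}") as resp:
            state = driver_state(resp.read().decode("utf-8"))
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
```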
I learned about the REST API here.
In my case, my Spark application runs remotely on Amazon AWS EMR, so I use the Lynx command-line browser to check the status of the Spark application. While you have a Spark job submitted from one terminal, open another terminal and run the following command from it:
lynx http://localhost:<4043 or other spark job port>