Running a Docker container in Docker, Mesos, and Marathon

I am running a Mesos cluster on my home computer using Mesosphere's Docker images. Within this cluster, I want to use Marathon to run Docker containers.

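I can run the container manually on my Mesos slave node (with "docker run"). However, when I submit the same application to Marathon, the task fails, and the Marathon container logs keep showing the following: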

 marathon_1 | [2015-08-18 01:21:23,453] INFO Received status update for task neo4j.6cb4f068-4547-11e5-a85f-0242ac110004: TASK_FAILED (Docker container run error: Container exited on error: terminated with signal Aborted) (mesosphere.marathon.MarathonScheduler:96)
 marathon_1 | [2015-08-18 01:21:23,461] INFO Task neo4j.6cb4f068-4547-11e5-a85f-0242ac110004 expunged and removed from TaskTracker (mesosphere.marathon.tasks.TaskTracker:106)

I can see a Docker container start on the slave node (and die shortly afterwards). Its logs contain the following:
root@default:/# docker logs b65
--container="mesos-20150818-004556-1684252864-5050-1-S0.59f8925a-fa0a-4363-8723-610f648690c4" --docker="docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/var/work/slaves/20150818-004556-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.53429d8e-4546-11e5-a85f-0242ac110004/runs/59f8925a-fa0a-4363-8723-610f648690c4" --stop_timeout="0ns"
--container="mesos-20150818-004556-1684252864-5050-1-S0.59f8925a-fa0a-4363-8723-610f648690c4" --docker="docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/var/work/slaves/20150818-004556-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.53429d8e-4546-11e5-a85f-0242ac110004/runs/59f8925a-fa0a-4363-8723-610f648690c4" --stop_timeout="0ns"
I0818 01:13:25.838296     6 exec.cpp:132] Version: 0.23.0
I0818 01:13:25.842473     9 exec.cpp:206] Executor registered on slave 20150818-004556-1684252864-5050-1-S0
Registered docker executor on 192.168.99.100
Starting task neo4j.53429d8e-4546-11e5-a85f-0242ac110004
W0818 01:13:25.842473     6 logging.cpp:81] RAW: Received signal SIGTERM from process 0 of user 0; exiting

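I have already worked past a few of the common gotchas, including:

1) I use --net=host for all of my containers
2) The mesos_slave container runs with --privileged=true
3) My containers are named with a mesos_ prefix (apparently mesos-* is reserved by Mesos)
4) I deployed mesos_slave with MESOS_CONTAINERIZERS=docker,mesos and MESOS_EXECUTOR_REGISTRATION_TIMEOUT=5mins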
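
For anyone comparing setups, the four points above could be assembled into a single "docker run" command line roughly like this. It is only a sketch: the image name is a placeholder, and anything the list does not mention (the master address, Docker socket mounts, and so on) is deliberately left out.

```python
import shlex

# Assumption: placeholder image reference; the question does not give the exact one.
SLAVE_IMAGE = "mesosphere/mesos-slave"


def slave_run_command(image=SLAVE_IMAGE):
    """Assemble a docker run command line that reflects the four points above."""
    env = {
        "MESOS_CONTAINERIZERS": "docker,mesos",          # point 4
        "MESOS_EXECUTOR_REGISTRATION_TIMEOUT": "5mins",  # point 4
    }
    args = [
        "docker", "run",
        "--net=host",             # point 1
        "--privileged=true",      # point 2
        "--name", "mesos_slave",  # point 3: stay clear of the mesos-* prefix
    ]
    for key, value in env.items():
        args += ["-e", f"{key}={value}"]
    args.append(image)
    # shlex.join quotes arguments where needed (Python 3.8+).
    return shlex.join(args)


print(slave_run_command())
```

Printing the command instead of executing it makes it easy to diff against whatever your compose file or script actually runs.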

For completeness, here is the app.json I am posting to Marathon:

{
   "id": "neo4j",
   "cpus": 0.3,
   "mem": 512.0,
   "disk": 2048.0,
   "container": {
     "type": "DOCKER",
     "docker": {
       "image": "tpires/neo4j",
       "network": "HOST",
       "portMappings": [
         { "containerPort": 7474, "hostPort": 0 }
       ]
     },
     "volumes": [
        {
           "containerPath": "/var/lib/neo4j",
           "hostPath": "/var/lib/neo4j",
           "mode": "RW"
        }
      ]
   }
 }
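
For reference, posting a definition like the one above to Marathon goes through its REST endpoint /v2/apps. The sketch below only builds the request so it can be inspected before sending. The base URL is an assumption (Marathon's default port 8080 on the VM address that appears in the logs), and build_app_request is a helper name made up for this example.

```python
import json
import urllib.request

# Assumption: Marathon on its default port 8080 at the VM address from the logs.
MARATHON_URL = "http://192.168.99.100:8080"

# Same content as the app.json above.
app = {
    "id": "neo4j",
    "cpus": 0.3,
    "mem": 512.0,
    "disk": 2048.0,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "tpires/neo4j",
            "network": "HOST",
            "portMappings": [{"containerPort": 7474, "hostPort": 0}],
        },
        "volumes": [
            {
                "containerPath": "/var/lib/neo4j",
                "hostPath": "/var/lib/neo4j",
                "mode": "RW",
            }
        ],
    },
}


def build_app_request(app_definition, base_url=MARATHON_URL):
    """Build (but do not send) the POST request that creates an app in Marathon."""
    body = json.dumps(app_definition).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v2/apps",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_app_request(app)
# To actually submit the app: urllib.request.urlopen(request)
```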

I am not sure what else I am missing. Plenty of people seem to be running similar setups.
Update:
Here is the stdout/stderr from Mesos.
stdout
--container="mesos-20150903-010158-1684252864-5050-1-S0.74c8376b-e89c-4260-b00e-a76266fd0f87" --docker="docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/var/work/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.d8a8124f-51d7-11e5-98ce-0242f92f6b1a/runs/74c8376b-e89c-4260-b00e-a76266fd0f87" --stop_timeout="0ns"
--container="mesos-20150903-010158-1684252864-5050-1-S0.74c8376b-e89c-4260-b00e-a76266fd0f87" --docker="docker" --help="false" --initialize_driver_logging="true" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/var/work/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.d8a8124f-51d7-11e5-98ce-0242f92f6b1a/runs/74c8376b-e89c-4260-b00e-a76266fd0f87" --stop_timeout="0ns"
Registered docker executor on 192.168.99.100
Starting task neo4j.d8a8124f-51d7-11e5-98ce-0242f92f6b1a

stderr
I0903 01:05:20.462013     6 exec.cpp:132] Version: 0.23.0
I0903 01:05:20.492404     9 exec.cpp:206] Executor registered on slave 20150903-010158-1684252864-5050-1-S0
W0903 01:05:20.492404     6 logging.cpp:81] RAW: Received signal SIGTERM from process 0 of user 0; exiting

Here is an excerpt from my slave log, taken right after launching my Docker app.
slave1_1 | I0903 01:18:44.524652  1573 slave.cpp:1244] Got assigned task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a for framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:44.527017  1577 gc.cpp:84] Unscheduling '/var/work/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000' from gc
slave1_1 | I0903 01:18:44.527429  1572 gc.cpp:84] Unscheduling '/var/work/meta/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000' from gc
slave1_1 | I0903 01:18:44.527667  1573 slave.cpp:1355] Launching task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a for framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:44.539386  1573 slave.cpp:4733] Launching executor neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a of framework 20150816-214616-1731963072-5050-1-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/var/work/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a/runs/e7622a60-f973-437f-9869-dafb4326ff59'
slave1_1 | I0903 01:18:44.541831  1573 slave.cpp:1573] Queuing task 'neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a' for executor neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a of framework '20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:44.547209  1574 docker.cpp:766] Starting container 'e7622a60-f973-437f-9869-dafb4326ff59' for task 'neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a' (and executor 'neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a') of framework '20150816-214616-1731963072-5050-1-0000'
slave1_1 | I0903 01:18:44.906893  1576 slave.cpp:2333] Got registration for executor 'neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a' of framework 20150816-214616-1731963072-5050-1-0000 from executor(1)@127.0.0.1:57299
slave1_1 | I0903 01:18:44.908016  1576 docker.cpp:1008] Ignoring updating container 'e7622a60-f973-437f-9869-dafb4326ff59' with resources passed to update is identical to existing resources
slave1_1 | I0903 01:18:44.908555  1576 slave.cpp:1729] Sending queued task 'neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a' to executor 'neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a' of framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:45.903827  1571 docker.cpp:390] Checkpointing pid 3404 to '/var/work/meta/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a/runs/e7622a60-f973-437f-9869-dafb4326ff59/pids/forked.pid'
slave1_1 | I0903 01:18:50.074597  1570 slave.cpp:2671] Handling status update TASK_FAILED (UUID: 3e19edd3-1867-4bfb-8eb8-4f40032e1c6e) for task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a of framework 20150816-214616-1731963072-5050-1-0000 from executor(1)@127.0.0.1:57299
slave1_1 | E0903 01:18:50.184105  1572 slave.cpp:2821] Failed to update resources for container e7622a60-f973-437f-9869-dafb4326ff59 of executor neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a running task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a on status update for terminal task, destroying container: Failed to 'docker inspect mesos-20150903-010158-1684252864-5050-1-S0.e7622a60-f973-437f-9869-dafb4326ff59': exit status = exited with status 1 stderr = Error: No such image or container: mesos-20150903-010158-1684252864-5050-1-S0.e7622a60-f973-437f-9869-dafb4326ff59
slave1_1 | I0903 01:18:50.184566  1577 docker.cpp:1318] Destroying container 'e7622a60-f973-437f-9869-dafb4326ff59'
slave1_1 | I0903 01:18:50.184692  1577 docker.cpp:1380] Sending SIGTERM to executor with pid: 3404
slave1_1 | I0903 01:18:50.184561  1572 status_update_manager.cpp:322] Received status update TASK_FAILED (UUID: 3e19edd3-1867-4bfb-8eb8-4f40032e1c6e) for task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a of framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:50.185951  1572 status_update_manager.cpp:826] Checkpointing UPDATE for status update TASK_FAILED (UUID: 3e19edd3-1867-4bfb-8eb8-4f40032e1c6e) for task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a of framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:50.192136  1572 slave.cpp:2926] Forwarding the update TASK_FAILED (UUID: 3e19edd3-1867-4bfb-8eb8-4f40032e1c6e) for task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a of framework 20150816-214616-1731963072-5050-1-0000 to master@192.168.99.100:5050
slave1_1 | I0903 01:18:50.192317  1572 slave.cpp:2856] Sending acknowledgement for status update TASK_FAILED (UUID: 3e19edd3-1867-4bfb-8eb8-4f40032e1c6e) for task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a of framework 20150816-214616-1731963072-5050-1-0000 to executor(1)@127.0.0.1:57299
slave1_1 | I0903 01:18:50.207334  1570 status_update_manager.cpp:394] Received status update acknowledgement (UUID: 3e19edd3-1867-4bfb-8eb8-4f40032e1c6e) for task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a of framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:50.207512  1570 status_update_manager.cpp:826] Checkpointing ACK for status update TASK_FAILED (UUID: 3e19edd3-1867-4bfb-8eb8-4f40032e1c6e) for task neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a of framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:50.215162  1577 docker.cpp:1422] Running docker stop on container 'e7622a60-f973-437f-9869-dafb4326ff59'
slave1_1 | I0903 01:18:50.285275  1575 docker.cpp:1520] Executor for container 'e7622a60-f973-437f-9869-dafb4326ff59' has exited
slave1_1 | I0903 01:18:50.285773  1570 slave.cpp:3349] Executor 'neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a' of framework 20150816-214616-1731963072-5050-1-0000 has terminated with unknown status
slave1_1 | I0903 01:18:50.285828  1570 slave.cpp:3460] Cleaning up executor 'neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a' of framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:50.286135  1577 gc.cpp:56] Scheduling '/var/work/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a/runs/e7622a60-f973-437f-9869-dafb4326ff59' for gc 6.99999668920296days in the future
slave1_1 | I0903 01:18:50.286226  1570 slave.cpp:3549] Cleaning up framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:50.286789  1570 status_update_manager.cpp:284] Closing status update streams for framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:50.286638  1577 gc.cpp:56] Scheduling '/var/work/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a' for gc 6.9999966881363days in the future
slave1_1 | I0903 01:18:50.286897  1577 gc.cpp:56] Scheduling '/var/work/meta/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a/runs/e7622a60-f973-437f-9869-dafb4326ff59' for gc 6.99999668767407days in the future
slave1_1 | I0903 01:18:50.286917  1577 gc.cpp:56] Scheduling '/var/work/meta/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.b838a050-51d9-11e5-98ce-0242f92f6b1a' for gc 6.9999966873363days in the future
slave1_1 | I0903 01:18:50.287289  1577 gc.cpp:56] Scheduling '/var/work/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000' for gc 6.99999668119111days in the future
slave1_1 | I0903 01:18:50.287345  1577 gc.cpp:56] Scheduling '/var/work/meta/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000' for gc 6.9999966808days in the future
slave1_1 | I0903 01:18:55.549118  1571 slave.cpp:1244] Got assigned task neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a for framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:55.552353  1574 gc.cpp:84] Unscheduling '/var/work/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000' from gc
slave1_1 | I0903 01:18:55.552784  1577 gc.cpp:84] Unscheduling '/var/work/meta/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000' from gc
slave1_1 | I0903 01:18:55.552979  1571 slave.cpp:1355] Launching task neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a for framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:55.559772  1571 slave.cpp:4733] Launching executor neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a of framework 20150816-214616-1731963072-5050-1-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/var/work/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a/runs/1ddbafcf-ff1b-40e6-8892-957545559025'
slave1_1 | I0903 01:18:55.561199  1571 slave.cpp:1573] Queuing task 'neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a' for executor neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a of framework '20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:55.566067  1573 docker.cpp:766] Starting container '1ddbafcf-ff1b-40e6-8892-957545559025' for task 'neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a' (and executor 'neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a') of framework '20150816-214616-1731963072-5050-1-0000'
slave1_1 | I0903 01:18:55.868223  1570 slave.cpp:2333] Got registration for executor 'neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a' of framework 20150816-214616-1731963072-5050-1-0000 from executor(1)@127.0.0.1:56971
slave1_1 | I0903 01:18:55.869742  1570 docker.cpp:1008] Ignoring updating container '1ddbafcf-ff1b-40e6-8892-957545559025' with resources passed to update is identical to existing resources
slave1_1 | I0903 01:18:55.870088  1570 slave.cpp:1729] Sending queued task 'neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a' to executor 'neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a' of framework 20150816-214616-1731963072-5050-1-0000
slave1_1 | I0903 01:18:56.828328  1572 docker.cpp:390] Checkpointing pid 3474 to '/var/work/meta/slaves/20150903-010158-1684252864-5050-1-S0/frameworks/20150816-214616-1731963072-5050-1-0000/executors/neo4j.becac151-51d9-11e5-98ce-0242f92f6b1a/runs/1ddbafcf-ff1b-40e6-8892-957545559025/pids/forked.pid'
slave1_1 | I0903 01:18:58.996363  1571 slave.cpp:3842] Current disk usage 20.25%. Max allowed age: 4.882317422062026days
slave1_1 | I0903 01:18:59.170140  1570 slave.cpp:4179] Querying resource estimator for oversubscribable resources
slave1_1 | I0903 01:18:59.170402  1570 slave.cpp:4193] Received oversubscribable resources  from the resource estimator
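
Most of the excerpt above is routine bookkeeping. A small filter such as the sketch below can cut a long slave log down to the warning and error lines plus the TASK_FAILED status updates. It assumes the glog line layout seen above (severity letter, MMDD date, time, thread id, source location) and an optional docker-compose prefix such as "slave1_1 | ". The function name is made up for this example.

```python
import re

# glog layout: severity letter, MMDD, time, thread id, "file:line]", message.
# The optional leading group matches a docker-compose tag like "slave1_1 | ".
GLOG_LINE = re.compile(
    r"^(?:\S+\s+\|\s+)?"
    r"(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>\d+) (?P<source>[^\]]+)\] (?P<message>.*)$"
)


def interesting_lines(lines):
    """Yield (time, source, message) for W/E/F lines and TASK_FAILED updates."""
    for line in lines:
        match = GLOG_LINE.match(line.strip())
        if match is None:
            continue  # not a glog line, e.g. the executor's flag dump
        if match["severity"] in "WEF" or "TASK_FAILED" in match["message"]:
            yield match["time"], match["source"], match["message"]


# Usage, assuming the log was saved to a file named slave.log:
#     with open("slave.log") as handle:
#         for time, source, message in interesting_lines(handle):
#             print(time, source, message)
```

Run against the excerpt above, the only error-level line it surfaces is the failed "docker inspect" call at slave.cpp:2821, which comes right before the container is destroyed.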

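A setup like this should generally work. Could you check two logs: the Mesos slave log (I assume you have only one slave, right?) and the task's stdout/stderr in the sandbox (available through the Mesos UI)? Thanks! - js84
@js84 Sorry for the late reply. I am back after a vacation and an upgrade to Windows 10. I have updated the post with the requested information. Thanks. - jeff
To understand how the DockerContainerizer interfaces with Docker, it may help to enable more verbose debug logging: GLOG_v=1 mesos-slave.sh --master=[...] - Till
Are you still facing this problem? If so, check whether the "tpires/neo4j" image exists on the mesos-slave machine. If it does not, you need to add "forcePullImage": true below "image": "tpires/neo4j". - V.G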
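
The forcePullImage suggestion in the last comment amounts to one extra key in the "docker" section of the app definition. As a rough sketch (the helper name is made up for this example, and the input is assumed to have the same shape as the app.json in the question):

```python
import copy


def with_force_pull(app_definition):
    """Return a copy of a Marathon app definition with forcePullImage enabled."""
    updated = copy.deepcopy(app_definition)  # leave the caller's dict untouched
    updated["container"]["docker"]["forcePullImage"] = True
    return updated


# Trimmed-down version of the app.json from the question.
app = {
    "id": "neo4j",
    "container": {
        "type": "DOCKER",
        "docker": {"image": "tpires/neo4j", "network": "HOST"},
    },
}

patched = with_force_pull(app)
```

With this flag set, Marathon asks Docker to pull the image before each launch rather than relying on a copy that may or may not exist on the slave.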
1 Answer

I am not sure what exactly the problem was, but I managed to get everything working on my laptop. In the end I upgraded to the mesoscloud Docker images: mesoscloud/mesos-master(slave):0.24.1-ubuntu-14.04.

