Spark shell command line

I am new to Spark and am trying to figure out how to use the Spark shell.

I have looked at the documentation on the Spark website, but it does not explain how to create a directory or list all files from within the Spark shell. I would really appreciate any help.

2 Answers

In this case you can assume that the Spark shell is just a normal Scala REPL, so the same rules apply. You can get a list of the available commands with :help.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.0
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_151)
Type in expressions to have them evaluated.
Type :help for more information.

scala> :help
All commands can be abbreviated, e.g., :he instead of :help.
:edit <id>|<line>        edit history
:help [command]          print this summary or command-specific help
:history [num]           show the history (optional num is commands to show)
:h? <string>             search the history
:imports [name name ...] show import history, identifying sources of names
:implicits [-v]          show the implicits in scope
:javap <path|class>      disassemble a file or class name
:line <id>|<line>        place line(s) at the end of history
:load <path>             interpret lines in a file
:paste [-raw] [path]     enter paste mode or paste a file
:power                   enable power user mode
:quit                    exit the interpreter
:replay [options]        reset the repl and replay all previous commands
:require <path>          add a jar to the classpath
:reset [options]         reset the repl to its initial state, forgetting all session entries
:save <path>             save replayable session to a file
:sh <command line>       run a shell command (result is implicitly => List[String])
:settings <options>      update compiler options, if possible; see reset
:silent                  disable/enable automatic printing of results
:type [-v] <expr>        display the type of an expression without evaluating it
:kind [-v] <expr>        display the kind of expression's type
:warnings                show the suppressed warnings from the most recent line which had any
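
Because the shell is an ordinary Scala REPL (in Spark 2.x with a SparkSession pre-bound as spark and a SparkContext as sc), regular Scala expressions can be evaluated directly. A minimal illustrative sketch of a fresh session (the names and values are only examples):

scala> val xs = List(1, 2, 3)
xs: List[Int] = List(1, 2, 3)

scala> val total = xs.map(_ * 2).sum
total: Int = 12

scala> val n = spark.range(5).count()
n: Long = 5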

As you can see above, you can invoke shell commands using :sh. For example:

scala> :sh mkdir foobar
res0: scala.tools.nsc.interpreter.ProcessResult = `mkdir foobar` (0 lines, exit 0)

scala> :sh touch foobar/foo
res1: scala.tools.nsc.interpreter.ProcessResult = `touch foobar/foo` (0 lines, exit 0)

scala> :sh touch foobar/bar
res2: scala.tools.nsc.interpreter.ProcessResult = `touch foobar/bar` (0 lines, exit 0)

scala> :sh ls foobar
res3: scala.tools.nsc.interpreter.ProcessResult = `ls foobar` (2 lines, exit 0)

scala> res3.lines foreach println
bar
foo

I'm seeing a strange error - any ideas? scala> :sh ls res5: scala.tools.nsc.interpreter.ProcessResult = `ls` (2 lines, exit 0) scala> res5 foreach println <console>:12: error: value foreach is not a member of scala.tools.nsc.interpreter.ProcessResult res5 foreach println - WoodChopper
@WoodChopper just run res5.lines foreach println - Ryan Hartman
res3 foreach println does not work; it should be res3.lines foreach println - Holger Brandl
@HolgerBrandl Thanks. This is a pretty old answer (1.x, using Scala 2.10), where calling foreach directly was possible. Updated. - zero323
But that doesn't work: :sh cd /home/dean/bin/spark-2.3.0-bin-hadoop2.7 java.io.IOException: Cannot run program "cd": error=2, No such file or directory - Dean Schulze
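
A note on the cd failure above: cd is a shell built-in rather than an executable, so it cannot be spawned as a separate process, which is why :sh cd fails. One possible workaround, sketched here with the standard scala.sys.process API instead of :sh (the directory is the one from the comment; the echoed output is elided because it depends on your filesystem):

scala> import scala.sys.process._
import scala.sys.process._

scala> import java.io.File
import java.io.File

scala> // run the command with an explicit working directory instead of cd-ing first
scala> val out = Process(Seq("ls"), new File("/home/dean/bin/spark-2.3.0-bin-hadoop2.7")).!!
out: String = ...

scala> println(out)
...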

[Screenshot: a Scala REPL session, on a red terminal background, exiting with :quit]

The :q or :quit command exits the Scala REPL.
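
For example (pressing Ctrl+D, i.e. sending end-of-file, has the same effect):

scala> :quit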


+1 for using a red background, that's a bold choice. - Merlin
