Using Python to parse complex arguments to a shell script

Question

8

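When I'm writing shell scripts, I often find myself spending most of my time (especially when debugging) on argument processing. Many scripts I write or maintain are easily more than 80% input parsing and sanitization. Compare that to my Python scripts, where argparse handles most of the grunt work for me and lets me easily construct complex option structures and sanitization / string-parsing behavior.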

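I'd love, therefore, to have Python do this heavy lifting and then get the simplified, sanitized values in my shell script, without needing to worry any further about the arguments the user specified.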

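To give a specific example, many of the shell scripts where I work have been defined to accept their arguments in a specific order. You can call start_server.sh --server myserver --port 80, but start_server.sh --port 80 --server myserver fails with You must specify a server to start. - this makes the parsing code a lot simpler, but it's hardly intuitive.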

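So a first-pass solution could be as simple as having Python take in the arguments, sort them (keeping each option's parameter next to it), and return the sorted arguments. The shell script still does some parsing and sanitization, but the user can enter much more arbitrary input than the shell script natively accepts, for example: # script.sh -o -aR --dir /tmp/test --verbose.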

#!/bin/bash

args=$(order.py "$@")
# args is set to "-a --dir /tmp/test -o -R --verbose"

# simpler processing now that we can guarantee the order of parameters

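There are some obvious limitations here, notably that the sorting script (order.py above) can't distinguish between a final option that takes an argument and the start of the positional arguments, but that doesn't seem too terrible.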
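A minimal sketch of what the hypothetical order.py could look like. The function name order_args, the case-insensitive sort key, and the rule "a bare word right after an option is that option's value" are all assumptions made for illustration; that last rule is exactly the ambiguity described above, so this is a guess-based helper and not a real parser.

```python
#!/usr/bin/env python3
"""Hypothetical order.py: sort options while keeping each value beside its option."""
import shlex
import sys


def order_args(argv):
    groups = []        # [option, value or None] pairs
    positionals = []   # everything that is not an option or an option's value
    rest = list(argv)
    while rest:
        tok = rest.pop(0)
        if tok == "--":
            # conventional end-of-options marker: keep it and the remainder as-is
            positionals.append(tok)
            positionals.extend(rest)
            break
        if tok.startswith("--"):
            groups.append([tok, None])
        elif tok.startswith("-") and len(tok) > 1:
            # expand bundled short flags: -aR becomes -a -R
            groups.extend(["-" + ch, None] for ch in tok[1:])
        elif groups and groups[-1][1] is None and not positionals:
            # assumption: a bare word right after an option is that option's value
            groups[-1][1] = tok
        else:
            positionals.append(tok)
    # sort by option name, ignoring leading dashes and letter case
    groups.sort(key=lambda g: g[0].lstrip("-").lower())
    ordered = []
    for opt, val in groups:
        ordered.append(opt)
        if val is not None:
            ordered.append(val)
    return ordered + positionals


if __name__ == "__main__":
    # shell-quote each token so the caller can re-split the output safely
    print(" ".join(shlex.quote(a) for a in order_args(sys.argv[1:])))
```

Given -o -aR --dir /tmp/test --verbose, order_args returns the tokens -a --dir /tmp/test -o -R --verbose. Values that begin with a dash (such as -10) would be misread as flags, which is one more reason to prefer a parser that knows which options take arguments.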

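So my question is: 1) Is there an existing (preferably Python) utility that does CLI parsing with something more powerful than bash, whose sanitized results can then be accessed by the rest of my bash script, or 2) Has anyone done this before? Are there issues, pitfalls, or better solutions I'm not aware of? Care to share your implementation?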

One (very half-baked) idea:

#!/bin/bash

# Some sort of simple syntax to describe to Python what arguments to accept
opts='
"a", "append", boolean, help="Append to existing file"
"dir", str, help="Directory to run from"
"o", "overwrite", boolean, help="Overwrite duplicates"
"R", "recurse", boolean, help="Recurse into subdirectories"
"v", "verbose", boolean, help="Print additional information"
'

# Takes in CLI arguments and outputs a sanitized structure (JSON?) or fails
p=$(parse.py "Runs complex_function with nice argument parsing" "$opts" "$@")
if [ $? -ne 0 ]; then exit 1; fi # parse.py is expected to print usage to stderr

# Takes the sanitized structure and an argument to get
append=$(arg.py "$p" append)
overwrite=$(arg.py "$p" overwrite)
recurse=$(arg.py "$p" recurse)
verbose=$(arg.py "$p" verbose)

cd "$(arg.py "$p" dir)"

complex_function $append $overwrite $recurse $verbose

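Two lines of code plus a concise description of the expected arguments, and we're on to the actual script behavior. Maybe I'm crazy, but that seems far nicer than what I feel I have to do now.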
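To make the idea a bit more concrete, here is a rough sketch of the Python side. Everything in it is hypothetical: the spec syntax is the made-up one from the snippet above, and build_parser, parse_to_json, and get_arg are illustrative names standing in for parse.py and arg.py. It assumes single-letter names become short flags, longer names become long flags, and the longest name is used as the key in the JSON output.

```python
import argparse
import json
import re


def build_parser(description, spec):
    """Build an argparse parser from the illustrative one-option-per-line spec."""
    parser = argparse.ArgumentParser(description=description)
    for line in spec.strip().splitlines():
        head, _, _ = line.partition("help=")
        names = re.findall(r'"([^"]+)"', head)           # quoted option names
        if not names:
            continue                                     # skip lines without options
        help_match = re.search(r'help="([^"]*)"', line)  # optional help text
        flags = [("-" if len(n) == 1 else "--") + n for n in names]
        kwargs = {"dest": max(names, key=len),
                  "help": help_match.group(1) if help_match else None}
        if re.search(r"\bboolean\b", head):
            kwargs["action"] = "store_true"              # flag without a value
        parser.add_argument(*flags, **kwargs)
    return parser


def parse_to_json(description, spec, argv):
    """Role of the hypothetical parse.py: argv in, JSON text out.

    On invalid input argparse prints usage to stderr and exits with status 2.
    """
    namespace = build_parser(description, spec).parse_args(argv)
    return json.dumps(vars(namespace))


def get_arg(parsed_json, name):
    """Role of the hypothetical arg.py: look up one value in the JSON text."""
    return json.loads(parsed_json)[name]
```

A real parse.py would call parse_to_json(sys.argv[1], sys.argv[2], sys.argv[3:]) and print the result; arg.py would print get_arg(sys.argv[1], sys.argv[2]). Spawning one Python process per lookup is slow, which is a reason to prefer emitting shell assignments in a single call.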

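I've seen Parsing shell script arguments and things like this wiki page on easy CLI argument parsing, but many of these patterns feel clunky and error-prone, and I dislike having to re-implement them every time I write a shell script, especially when Python, Java, etc. have such nice argument-processing libraries.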

- dimo414

2

Have you tried getopt? - tuxuday

@tuxuday you beat me to it... getopt should help, dimo414. - Michael Ballent

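I've used getopt and getopts before (see the wiki page linked at the bottom of the question), but they're still limited. Quoting the link: "getopt cannot handle empty argument strings, or arguments with embedded whitespace." and "[getopts] can only handle short options (-h) without trickery." I realize there are workable solutions in bash, but it seems to me that what Python offers is superior and easier to work with, and I'm curious whether a Python utility exists for this task. "You're being silly, use bash." may well end up being the acceptable answer to this question. - dimo414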

Just treat the arguments as strings between dashes "-": use split("-") and then sort. - Luka Rahne

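@ralu, good start, but it doesn't work consistently for long arguments (--dir) and has edge cases that fail. For instance, script.sh -dir /tmp/my-dashed-file -a -b would return -a -b -dashed -dir /tmp/my -file. Splitting on ' -' (a space followed by a dash) is probably what you want and would be slightly better, but it still couldn't handle script.sh -t "This string -10+4/3 shouldn't be parsed". In general, it's better to let the shell split the input string and have the script do only the argument parsing. - dimo414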
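The failure described in this comment is easy to reproduce. The sketch below is only an illustration: naive_sort is a hypothetical name for the split-and-sort suggestion, and shlex.split stands in for the word splitting the shell would normally do before a script ever sees its arguments.

```python
import shlex


def naive_sort(cmdline):
    """Split the raw command line on '-' and sort the pieces."""
    parts = [p.strip() for p in cmdline.split("-") if p.strip()]
    return " ".join("-" + p for p in sorted(parts))


# The dashes inside the file name are treated as option boundaries
print(naive_sort("-dir /tmp/my-dashed-file -a -b"))
# prints: -a -b -dashed -dir /tmp/my -file

# Shell-style splitting first keeps the quoted string as a single argument
print(shlex.split("-t \"This string -10+4/3 shouldn't be parsed\""))
# prints: ['-t', "This string -10+4/3 shouldn't be parsed"]
```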

4 Answers

2

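Having the very same needs, I ended up writing an optparse-inspired parser for bash (which actually uses python internally). You can find it here: https://github.com/carlobaldassi/bash_optparse. See the README at the bottom for a quick explanation. You may want to check out a simple example at: https://github.com/carlobaldassi/bash_optparse/blob/master/doc/example_script_simple. In my experience it's quite robust (I'm super-paranoid), feature-rich, etc., and I use it heavily in my own scripts. I hope it is useful to others. Feedback and contributions are welcome.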

- Carlo Baldassi

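You have created a sophisticated tool with a beautiful declarative syntax. Why not provide ordinary installation instructions? I honestly couldn't figure out how to install it. I tried "./configure; make; make install", but it failed. - snowindy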

2

You could take advantage of associative arrays in bash to help reach your goal.

declare -A opts=($(getopts.py "$@"))
cd "${opts[dir]}"
complex_function ${opts[append]}  ${opts[overwrite]} ${opts[recurse]} \
                 ${opts[verbose]} ${opts[args]}

For this to work, getopts.py should be a Python script that parses and sanitizes your arguments. It should print a string like the following:

[dir]=/tmp
[append]=foo
[overwrite]=bar
[recurse]=baz
[verbose]=fizzbuzz
[args]="a b c d"
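As a rough illustration of the printing half of such a getopts.py (the parsing half is left out), the sketch below renders a dictionary in the format shown above. The name format_assoc is hypothetical, and quoting each value with shlex.quote is an assumption about how one might keep values with spaces intact.

```python
import shlex


def format_assoc(values):
    """Render a dict as [key]=value lines, shell-quoting each value."""
    return "\n".join("[%s]=%s" % (key, shlex.quote(str(val)))
                     for key, val in values.items())


print(format_assoc({"dir": "/tmp", "append": "foo", "args": "a b c d"}))
# prints:
# [dir]=/tmp
# [append]=foo
# [args]='a b c d'
```

The quotes are only honored when the shell re-parses the text, as in the eval variant further down in this answer.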

You could set a special value to signal whether the options were parsed and sanitized successfully.

Returned from getopts.py:

[__error__]=true

Added to the bash script:

if [[ ${opts[__error__]} == true ]]; then  # explicit comparison, so an unset key is not treated as an error
    exit 1
fi

If you would rather work with the exit code from getopts.py, you could play with eval:

getopts=$(getopts.py "$@") || exit 1
eval "declare -A opts=($getopts)"

Or:

getopts=$(getopts.py "$@")
if [[ $? -ne 0 ]]; then
    exit 1
fi
eval "declare -A opts=($getopts)"

- Swiss

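Clever idea using associative arrays! That would be really nice. Unfortunately, in my work environment I can't guarantee that every machine supports them. That said, it wouldn't be too hard to roll my own and have the Python script return a bunch of variable assignments, e.g. $in_dir=/tmp; $in_append=0; and so on. - dimo414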

0

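The original premise of my question assumed that delegating to Python is the right way to simplify argument parsing. If we drop the language requirement, we can actually do a decent job* in Bash using getopts and a little eval magic: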

main() {
  local _usage='foo [-a] [-b] [-f val] [-v val] [args ...]'
  eval "$(parse_opts 'f:v:ab')"
  echo "f=$f v=$v a=$a b=$b -- $#: $*"
}

main "$@"

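The implementation of parse_opts is in this gist, but the basic approach is to convert the options into local variables, which can then be handled like any other variable. All the standard getopts boilerplate is hidden away, and error handling works as expected.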

Because it uses local variables inside a function, parse_opts isn't limited to command-line arguments; it can be used with any function in your script.

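* I say "a decent job" because Bash's getopts is a fairly limited parser that only supports single-letter options. Elegant, expressive CLIs are still better implemented in other languages such as Python. But for reasonably small functions or scripts, this offers a nice middle ground without adding much complexity or bloat.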

- dimo414

- dimo414 · Accepted Answer

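Edit: I haven't used it yet, but if I were posting this answer today I would probably recommend https://github.com/docopt/docopts rather than a custom approach like the one described below.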

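I've put together a short Python script that does most of what I want. I'm not convinced it's production quality yet (error handling in particular is lacking), but it's better than nothing. I'd welcome any feedback.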

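It takes advantage of the set builtin to reassign the positional parameters, so the rest of the script can still handle them however it likes.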

bashparse.py

#!/usr/bin/env python

import optparse, sys
from shlex import quote  # shlex.quote (Python 3.3+); the pipes module was removed in Python 3.13

'''
Uses Python's optparse library to simplify command argument parsing.

Takes in a set of optparse arguments, separated by newlines, followed by command line arguments, as argv[2] and argv[3:]
and outputs a series of bash commands to populate associated variables.
'''

class _ThrowParser(optparse.OptionParser):
    def error(self, msg):
        """Overrides optparse's default error handling
        and instead raises an exception which will be caught upstream
        """
        raise optparse.OptParseError(msg)

def gen_parser(usage, opts_ls):
    '''Takes a list of strings which can be used as the parameters to optparse's add_option function.
    Returns a parser object able to parse those options
    '''
    parser = _ThrowParser(usage=usage)
    for opts in opts_ls:
        if opts:
            # yes, I know it's evil, but it's easy
            eval('parser.add_option(%s)' % opts)
    return parser

def print_bash(opts, args):
    '''Takes the result of optparse and outputs commands to update a shell'''
    for opt, val in opts.items():
        if val:
            print('%s=%s' % (opt, quote(str(val))))  # str() so store_true values (True) can be quoted
    print("set -- %s" % " ".join(quote(a) for a in args))

if __name__ == "__main__":
    if len(sys.argv) < 3:  # argv[1] (usage) and argv[2] (options) are both required
        sys.stderr.write("Needs at least a usage string and a set of options to parse\n")
        sys.exit(2)
    parser = gen_parser(sys.argv[1], sys.argv[2].split('\n'))

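    try:
        (opts, args) = parser.parse_args(sys.argv[3:])
    except optparse.OptParseError as err:
        # _ThrowParser.error() raises instead of exiting; report and fail here
        sys.stderr.write("%s\n" % err)
        sys.exit(2)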
    print_bash(opts.__dict__, args)

Example usage:

#!/bin/bash

usage="[-f FILENAME] [-t|--truncate] [ARGS...]"
opts='
"-f"
"-t", "--truncate",action="store_true"
'

echo "$(./bashparse.py "$usage" "$opts" "$@")"
eval "$(./bashparse.py "$usage" "$opts" "$@")"

echo
echo OUTPUT

echo $f
echo $@
echo $0 $2

Which, if run as ./run.sh one -f 'a_filename.txt' "two' still two" three, outputs the following (notice that the positional parameters are still correct):

f=a_filename.txt
set -- one 'two'"'"' still two' three

OUTPUT
a_filename.txt
one two' still two three
./run.sh two' still two

Barring the debugging output, you're looking at roughly four lines to construct a powerful argument parser. Thoughts?
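The quoting that keeps the set -- line safe can be checked in isolation. This is a small sketch, not part of bashparse.py itself: set_command is a hypothetical helper name, and shlex.split is used here only as a stand-in for how the shell would re-read the line when it is passed to eval.

```python
import shlex


def set_command(args):
    """Build a `set --` line that restores args as positional parameters when eval'd."""
    return "set -- %s" % " ".join(shlex.quote(a) for a in args)


original = ["one", "two' still two", "three"]
line = set_command(original)
print(line)
# prints: set -- one 'two'"'"' still two' three

# Re-splitting the line the way a POSIX shell would gives back the same arguments
assert shlex.split(line)[2:] == original
```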