yaml.dump adds unnecessary newlines to multi-line strings

19

I have a multi-line string:

>>> import credstash
>>> d = credstash.getSecret('alex_test_key', region='ap-southeast-2')

Looking at the raw data (first 162 characters):

>>> credstash.getSecret('alex_test_key', region='ap-southeast-2')[0:162]
u'-----BEGIN RSA PRIVATE KEY-----\nMIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx\nxk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45\n'

And:

>>> print d[0:162]                                                                                                                                                                                          
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45

I wrote the data to a YAML file:

>>> import yaml
>>> with open('foo.yaml', 'w') as f:                                                                                                                                                                        
...     yaml.dump(d, f, default_flow_style=False, explicit_start=True)
... 
Now it looks like this:
$ head -5 foo.yaml 
--- !!python/unicode '-----BEGIN RSA PRIVATE KEY-----

  MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx

  xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45

That is, every line is followed by two newlines.

Now, if I read it back into a string, I find that the round trip leaves everything intact:

>>> with open('foo.yaml', 'r') as f:
...     d = yaml.load(f)
... 
>>> print d[0:162]
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45

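(I don't understand why.)

My real problem is that a human reading this YAML file will probably assume, as I did, that my program has broken the formatting of the private key file.

Is there a way to make yaml.dump write the string without the extra newlines?

4 Answers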

27
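If that string is the only thing going into your YAML file, you can dump it with the option default_style='|', which gives you block style literals for all of your scalars (probably not what you want).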
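As a rough illustration of why that first option is rarely what you want for a mixed document, the sketch below dumps a small mapping with default_style='|'. The `secrets` dict and its values are made up for the example, and it assumes Python 3 with PyYAML installed. Every scalar is forced into an explicit style, so the keys end up quoted and the integer typically picks up an explicit !!int tag.

```python
import yaml

# Hypothetical sample data; the key material is shortened on purpose.
secrets = {
    "port": 22,
    "key": "-----BEGIN RSA PRIVATE KEY-----\nMIIEogIBAAKCAQEA6oySC+8/N9VNpk0g\n",
}

# default_style applies to every scalar in the document, not just the key.
dumped = yaml.safe_dump(secrets, default_style="|")
print(dumped)

# Loading the text again should give back the original values.
restored = yaml.safe_load(dumped)
print(restored == secrets)
```

If only one value in the document is multi-line, a per-string representer is usually the less intrusive choice.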
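Your string contains no special characters (which would need \ escaping and double quotes), but because of the newlines PyYAML decides to represent it as a single-quoted scalar. In single-quoted style, a double newline is how a single newline in the original string is represented. This gets "undone" on loading, but it is indeed not very readable.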
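The folding rule for single-quoted scalars can be checked directly. This is a minimal sketch, assuming Python 3 with PyYAML installed and using a made-up two-line string in place of the real key: the dumped text contains blank lines, yet the loaded value is unchanged.

```python
import yaml

# Made-up stand-in for the first lines of a PEM file.
text = "-----BEGIN RSA PRIVATE KEY-----\nMIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn\n"

# With no style requested, PyYAML picks the single-quoted style here.
dumped = yaml.safe_dump(text)
print(dumped)  # each newline of the string appears as a blank line

# On load, a lone line break inside the quotes folds to a space and a
# blank line becomes one newline, so the original string comes back.
restored = yaml.safe_load(dumped)
print(restored == text)
```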
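If you want block style literals on an individual basis, you can do several things:
  • Adapt the Representer to output all strings with embedded newlines using the literal block scalar style (assuming they don't need \ escaping of special characters, which would force double quotes):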

    import sys
    import yaml
    
    x = u"""\
    -----BEGIN RSA PRIVATE KEY-----
    MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
    xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
    ...
    """
    
    yaml.SafeDumper.org_represent_str = yaml.SafeDumper.represent_str
    
    def repr_str(dumper, data):
        if '\n' in data:
            return dumper.represent_scalar(u'tag:yaml.org,2002:str', data, style='|')
        return dumper.org_represent_str(data)
    
    yaml.add_representer(str, repr_str, Dumper=yaml.SafeDumper)
    
    yaml.safe_dump(dict(a=1, b='hello world', c=x), sys.stdout)
    
  • Create a subclass of str that has its own special representer, for example:

    import sys
    import yaml
    
    class PSS(str):
        pass
    
    x = PSS("""\
    -----BEGIN RSA PRIVATE KEY-----
    MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
    xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
    ...
    """)
    
    def pss_representer(dumper, data):
        style = '|'
        # Only needed on Python 2, where a byte string must be decoded first:
        # if sys.version_info < (3,) and not isinstance(data, unicode):
        #     data = unicode(data, 'ascii')
        tag = u'tag:yaml.org,2002:str'
        return dumper.represent_scalar(tag, data, style=style)
    
    yaml.add_representer(PSS, pss_representer, Dumper=yaml.SafeDumper)
    
    yaml.safe_dump(dict(a=1, b='hello world', c=x), sys.stdout)
    
  • Use ruamel.yaml:

    import sys
    from ruamel.yaml import YAML
    from ruamel.yaml.scalarstring import PreservedScalarString as pss
    
    x = pss("""\
    -----BEGIN RSA PRIVATE KEY-----
    MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
    xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
    ...
    """)
    
    yaml = YAML()
    
    yaml.dump(dict(a=1, b='hello world', c=x), sys.stdout)
    

All of these give:

a: 1
b: hello world
c: |
  -----BEGIN RSA PRIVATE KEY-----
  MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
  xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
  ...
Note that it is not necessary to specify default_flow_style=False, because literal scalars can only appear in block style.

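This doesn't work for x = '\"key:value\",\n \n' - ahuigo
You should post a complete program and ask what you are doing wrong. - Anthon
Adapting the Representer really saved my day, this is exactly what I was looking for. - scravy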
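The first comment most likely runs into a PyYAML restriction rather than a bug in the representer. As far as I can tell, the emitter only honours style='|' when the string can actually be written as a literal block, and a line that ends in a space (as in ' \n') rules that out, so it falls back to double quotes. A small sketch of that behaviour, using a hypothetical LiteralDumper subclass so the global SafeDumper is left untouched:

```python
import yaml

class LiteralDumper(yaml.SafeDumper):
    """Hypothetical subclass; keeps the custom representer local."""

def repr_str(dumper, data):
    # Ask for the literal block style only when the string has newlines.
    style = '|' if '\n' in data else None
    return dumper.represent_scalar('tag:yaml.org,2002:str', data, style=style)

LiteralDumper.add_representer(str, repr_str)

clean = "line one\nline two\n"
trailing = '"key:value",\n \n'  # the second line is a single space

clean_out = yaml.dump({'c': clean}, Dumper=LiteralDumper)
trailing_out = yaml.dump({'c': trailing}, Dumper=LiteralDumper)

print(clean_out)     # literal block, as requested
print(trailing_out)  # double-quoted fallback despite style='|'
```

Both forms still load back to the original strings; only the on-disk readability differs.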

3

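Building on Anthon's answer, I found some more documentation on the other options that can be passed to default_style.

The best compromise for representing all of my data was:

with open('foo.yaml', 'w') as f:
    yaml.safe_dump(secrets, f, explicit_start=True, default_style='"', width=4096)

This produced a YAML file like this: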

---
"alex_test": "yyyyyyyy"
"alex_test_key": "-----BEGIN RSA PRIVATE KEY-----\nMIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx\nxk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45\ni6LLqO6aixyhvabSy7r1bP8QBrHWUIEZRerrw0TlhuKHDoFpmRAjAAIZ/5q9PSxg\n1yCwTVMlMvBiRksPsKi0fcA/v8G+yqFBL7IeaNCPSoa/3ZdgPWbh9P69DyOlB97a\nh1+0Jmh1gtAhyiz1/hmiN7LAclKHyOOnTEEyIMJioZqJURshKdF85RKILgw2X8Lp\n78mO5VyvvGxo3BNjVr0BOrSJ3t17ugijROx3HwIBIwKCAQAUGq1uvLxGnUErgvQg\ncbk/3kcVJutAJNVXM45eNd05ygpg30JwFUWJMMwBnMch8rjz+NtMvDTpcMT2oRDM\nYn9K4u/VxfXj55kLRsuhgesYJ1vFfu79VxjFVkfCx/CbOi9TSooQqCXx8fxtTOTo\nvF1Z4VWAlxLj/HbD+hGg6jy+Iwq/8HWsHN/VFPqhNqdKvzXGOtyynSZBOUf7upMX\nPh4REE4hYMZwdDnl+NRNmm8XA9TOE+Uf8WLDooKcXjp70CES0ehiC+VD0wG5JEVQ\nbZmDTdBxPcQsO31sNwRwUIX0J4K4Z9npa3dJdRqXJuof48RLzSGwM42eJzmTRNSw\n6I77AoGBAP9LO2A2ZAD7LJBKe48GE8wzkgaQd9vc3RImwrMAXPMVP9wdKR4m/X73\ngWxQ1QbueTtBRaNwkF8l9+Iham3H3kAbBONsbvJIO9Co0n1k+S9mutO1ZWfTMWZp\nIfMz2lncLonxXCXnDndzXtTjcqHeZFmSmDZZZugPXYWtC5N2ic3pAoGBAOsypk5z\na9FG3H46TIjYKyV0Z/R0Hvrp8w+AXdogKyHh0nj9Sevr+JMgOR4ayqYUKGG3sRtM\nzyoWCJ+Wb7Rd0olc2SeouQYSzk2wFKvnnq5o0Q8YZIYkiQN82FXoN2jcELdcVdW6\n1VJuUk9K3nDe+Gz6dkHZnthFC6usL15pHs/HAoGBAIqWjfJmq1D9YVWkxrtbEg/E\nOVQFSGFpRM9W3rjxkYtGDLlRqJtW/qQCs/j4rihVkkS9CIvspiUF+5gDgusjW2Sg\n9AZuEFejji9xltZbYrNVBlWrnXLgXKVPA80qxv2UyM6KVpg7miOWZq4VElffIIhl\nhdRcaxBC2v9skUFsPC3zAoGBAN3CCoR7dEj5q1Jxe1x0C2x1EY62oN3yhhXuD1ih\n/MgssIC0TQMDDvEeYb1Mde0LsQutMfUrKbn3hHk2EYzNfVzxJIR6gpCypUHvKW7h\nst71HOJY087vP1sPT6F0jAPILQSnhCFJwdFgtAGeXLOQZpKjAckPA3t0TNUQD2ek\n8SpNAoGAfQrNfepCTbc/9BCv/sJLLMEdlB/PyzenucBeXKfsfSU6+hYM14+gLp7+\nmOgoaM7F4UkqzJTRDQJnYo1NowRHjs0xHJoQoXzlV43ZkCmTwKtZ/9APLi060Md1\n+fDJX+yvxnZsY5hw6cYwC3C/axS9jq63oQ7i8FXwG/a0breCGu8=\n-----END RSA PRIVATE KEY-----"

I would have liked to use ruamel.yaml, but this code has to run in an environment where I can only use the default Python packages.
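For anyone who wants to try this compromise without a real key, here is a minimal sketch with a made-up `secrets` mapping (the key body is shortened and purely illustrative). It checks that each entry stays on a single line, with the newlines written as \n escapes, and that the data survives a round trip.

```python
import yaml

# Hypothetical stand-in for the real secrets mapping.
secrets = {
    "alex_test": "yyyyyyyy",
    "alex_test_key": (
        "-----BEGIN RSA PRIVATE KEY-----\n"
        "MIIEogIBAAKCAQEA6oySC+8/N9VNpk0g\n"
        "-----END RSA PRIVATE KEY-----"
    ),
}

# A large width keeps PyYAML from wrapping the long double-quoted scalar.
dumped = yaml.safe_dump(secrets, explicit_start=True, default_style='"', width=4096)
lines = dumped.splitlines()
print(dumped)

restored = yaml.safe_load(dumped)
print(restored == secrets)
```

The trade-off is that the key is no longer readable as a PEM block in the file, but nobody will mistake it for a corrupted one either.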


2
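I also added code showing how to do this in PyYAML (the first option). Update done, back to work. - Anthon
1
OK, thanks again -- this will be a really good reference for the next person who needs to solve this. - Alex Harvey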

2

A better option for making ruamel.yaml output multi-line strings as blocks:

from ruamel.yaml.representer import RoundTripRepresenter
from ruamel.yaml import YAML

multiline_string = """\
-----BEGIN RSA PRIVATE KEY-----
MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
...
"""


def repr_str(dumper: RoundTripRepresenter, data: str):
    if '\n' in data:
        return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|')
    return dumper.represent_scalar('tag:yaml.org,2002:str', data)


yaml = YAML()
yaml.representer.add_representer(str, repr_str)

with open('file.yaml', 'w') as fp:
    yaml.dump({'a': 1, 'b': 'hello world', 'c': multiline_string}, fp)

Output:

a: 1
b: hello world
c: |
  -----BEGIN RSA PRIVATE KEY-----
  MIIEogIBAAKCAQEA6oySC+8/N9VNpk0gJS7Gk8vn9sYN7FhjpAQnoHRqTN/Oaiyx
  xk2AleP2vXpojA/DHldT1JO+o3j56AHD+yfNFFeYvgWKDY35g49HsZZhbyCEAB45
  ...

0
Assuming you have a file named "privateKey.pem" that contains your key, you can turn it into a YAML multi-line block using '|' and a 2-space indent:
# Convert To yaml multiline block
cat privateKey.pem | sed -e '1 i PRIVATE_KEY: |' -e 's#^#  #g' > parameters.yaml

cat <<_EOF_ >>parameters.yaml
SSH_HOST: 'host.name.or.ip'
SSH_USER: 'cloud-user'
_EOF_

You will get something like this:
PRIVATE_KEY: |
  -----BEGIN OPENSSH PRIVATE KEY-----
  b3BlbnNzaC1rZXkt....
  ...
SSH_HOST: 'host.name.or.ip'
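The same conversion can be sketched in Python, which also makes it easy to confirm that the block loads back to the exact key text. This is only an illustration: the PEM contents below are made up, and a real script would read privateKey.pem and write parameters.yaml instead of working on strings.

```python
import textwrap
import yaml

# Hypothetical PEM contents standing in for privateKey.pem.
pem = (
    "-----BEGIN OPENSSH PRIVATE KEY-----\n"
    "b3BlbnNzaC1rZXkt\n"
    "-----END OPENSSH PRIVATE KEY-----\n"
)

# Same idea as the sed command: a "PRIVATE_KEY: |" header line,
# followed by every line of the key indented by two spaces.
document = "PRIVATE_KEY: |\n" + textwrap.indent(pem, "  ")
document += "SSH_HOST: 'host.name.or.ip'\nSSH_USER: 'cloud-user'\n"

params = yaml.safe_load(document)
print(params["PRIVATE_KEY"] == pem)
```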

Then read the parameters.yaml file in your shell script:

# Read values From Yaml using “PyYAML”
FILE=parameters.yaml

KEY=PRIVATE_KEY
python3 -c "from yaml import safe_load; f = open('$FILE'); y = safe_load(f); print(y['$KEY'])" > priv.key
chmod 600 priv.key

KEY=SSH_HOST
ssh_host=$(python3 -c "from yaml import safe_load; f = open('$FILE'); y = safe_load(f); print(y['$KEY'])")

KEY=SSH_USER
ssh_user=$(python3 -c "from yaml import safe_load; f = open('$FILE'); y = safe_load(f); print(y['$KEY'])")

# test
ssh -i priv.key ${ssh_user}@${ssh_host} "hostname"

For more details, see https://yaml-multiline.info/

