Python - Parse a file into separate outputs based on a magic number/length

Question

Tags: python, string

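I'm a complete programming newbie - I started learning three weeks ago and so far have only done Codecademy's Python course - so simple explanations are appreciated!

I'm trying to write a Python script that reads a file as a hex string and then splits it into separate output files based on a "magic number" found in the hex string.

For example: if my hex string is "0011AABB00BBAACC00223344", I might want to split it into new output files based on the magic number "00", and tell Python that each output should be 8 characters long. The output for the example string above should be 3 files containing the hex values:

"0011AABB" "00BBAACC" "00223344"

Here is my code so far (assume the string above is contained in the file "hextests"):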

import os
import binascii

filename = "hextests"

# read file as a binary string
with open(filename, 'rb') as f:
    content = f.read()

# convert binary string to hex string
hexString = binascii.hexlify(content)

# define magic number as "00"
magic_N = "00"

# attempting to create a new substring called newFile that is equal to each instance magic_N repeats throughout the file for a length of 8 characters
for chars in hexString:
    newFile = ""
    if chars == magic_N:
        newFile += chars.len(9)

# attempting to create a series of new output files for each instance of newFile - while incrementing the output file name
    if newFile != "":
        i = 0
        while os.path.exists("output_file%s.xyz" % i):
          i += 1
        fh = with open("output_file%s.xyz" % i, "wb"):
            newFile

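I'm sure there are many errors to fix here, and it's probably more complicated than I imagine... but my main question is about the right way to define the chars and newFile variables. I'm fairly sure Python treats chars as a single character of the string, so the comparison fails because I'm searching for a magic_N that is longer than one character. Am I right that this is part of the problem?

Also, if you understand the main goal of this script, are there any other changes you would suggest?

Thanks very much for your help!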

- occvtech

For matching characters, try find() and slicing: iterate over your string and append the results to a list or dictionary. - Gahan
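A minimal sketch of the find()-and-slice approach suggested in this comment, applied to the example string from the question. The function name `split_on_magic` and its parameters are hypothetical, chosen only for illustration.

```python
def split_on_magic(hex_string, magic="00", length=8):
    """Return the chunks of `length` characters that start with `magic`."""
    chunks = []
    pos = hex_string.find(magic)  # find() returns -1 when there is no match
    while pos != -1:
        chunks.append(hex_string[pos:pos + length])
        # Continue searching after the chunk that was just taken
        pos = hex_string.find(magic, pos + length)
    return chunks


print(split_on_magic("0011AABB00BBAACC00223344"))
# ['0011AABB', '00BBAACC', '00223344']
```

Note that if the last match sits closer than `length` characters to the end of the string, the final chunk will simply be shorter.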

1 Answer

- Nurjan · Accepted Answer

You can try something like this:

filename = "hextests"

# Read the file in binary mode; content is a bytes object
with open(filename, "rb") as f:
    content = f.read()

# You don't need binascii.hexlify() if the file already contains
# the hex digits as text and you want to parse them as given:
# hexString = binascii.hexlify(content)

# Remove the newline at the end of the data
hexString = content.strip()

# Define the magic number as "00" (bytes, to match the binary read)
magic_N = b"00"

i = 0
j = 0
while i < len(hexString) - 1:
    # Position of the next magic number at or after i (-1 if none)
    index = hexString.find(magic_N, i)
    if index == -1:
        break

    # This is the part which was incorrect in your code.
    with open("output_file_%s.xyz" % j, "wb") as output:
        output.write(hexString[index:index+8])

    i = index + 8
    j += 1

Note that you need to call the write method explicitly to write data to the output file.
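As an illustration of that point, here is a small sketch that writes a few chunks to numbered files. The chunk values are taken from the question's example; writing into a temporary directory is an assumption made only to keep the example self-contained.

```python
import os
import tempfile

chunks = [b"0011AABB", b"00BBAACC", b"00223344"]

# A temporary directory keeps the example out of the working directory
out_dir = tempfile.mkdtemp()

written = []
for n, chunk in enumerate(chunks):
    path = os.path.join(out_dir, "output_file_%s.xyz" % n)
    # "with" is a statement, so its result cannot be assigned to a variable;
    # the file object is bound by the "as" clause instead.
    with open(path, "wb") as fh:
        fh.write(chunk)  # nothing is written unless write() is called
    written.append(path)

print(sorted(os.listdir(out_dir)))
```

Using enumerate() to number the files avoids the os.path.exists() loop from the question, at the cost of overwriting files left over from an earlier run.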

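This assumes that each chunk is exactly 8 hex characters long and always starts with 00. So it is not a flexible solution, but it should give you an idea of how to approach the problem.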
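One possible way to relax these assumptions, sketched under the assumption that the input is raw binary data (so binascii.hexlify() is needed) rather than hex digits stored as text. The function name `hex_chunks` and its parameters are hypothetical. It skips matches at odd offsets, because in a hex string those straddle two bytes and are not a real 0x00 byte.

```python
import binascii


def hex_chunks(raw, magic=b"00", length=8):
    """Split the hex form of `raw` into chunks that begin with `magic`.

    `magic` must be lowercase hex digits, because hexlify() emits lowercase.
    """
    hex_string = binascii.hexlify(raw)
    chunks = []
    pos = hex_string.find(magic)
    while pos != -1:
        if pos % 2 == 0:
            # Match starts on a byte boundary: take the chunk and jump past it
            chunks.append(hex_string[pos:pos + length])
            pos = hex_string.find(magic, pos + length)
        else:
            # e.g. the "00" inside "1001": skip it and keep searching
            pos = hex_string.find(magic, pos + 1)
    return chunks


raw = binascii.unhexlify("0011AABB100100223344")
print(hex_chunks(raw))
# [b'0011aabb', b'00223344']  (Python 3)
```

The two bytes 0x10 0x01 between the chunks are ignored here; whether to drop or keep data that does not follow a magic number depends on the file format being parsed.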