Notepad++: converting multiple files to UTF-8

Question

18

Notepad++'s "Convert to UTF-8 without BOM" feature works really well. But I have 200 files, and all of them need to be converted. So I found this small Python script:

import os;
import sys;
filePathSrc="C:\\Temp\\UTF8"
for root, dirs, files in os.walk(filePathSrc):
    for fn in files:
      if fn[-4:] != '.jar' and fn[-5:] != '.ear' and fn[-4:] != '.gif' and fn[-4:] != '.jpg' and fn[-5:] != '.jpeg' and fn[-4:] != '.xls' and fn[-4:] != '.GIF' and fn[-4:] != '.JPG' and fn[-5:] != '.JPEG' and fn[-4:] != '.XLS' and fn[-4:] != '.PNG' and fn[-4:] != '.png' and fn[-4:] != '.cab' and fn[-4:] != '.CAB' and fn[-4:] != '.ico':
        notepad.open(root + "\\" + fn)
        console.write(root + "\\" + fn + "\r\n")
        notepad.runMenuCommand("Encoding", "Convert to UTF-8 without BOM")
        notepad.save()
        notepad.close()

It walks through every file -> I can see that happening. But once it has finished, the encoding is still ANSI in my case :/

Can anyone help me?

- Phil

1

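Are there any error messages? Are you running it in the Notepad++ Python Script plugin? It may be worth checking whether your Encoding menu really has "Convert to UTF-8 without BOM". My Notepad++ only has "Convert to UTF-8". Changing the string might be worth a try. - Lars Fischer

Right, I am using that plugin. My Notepad++ has both "Convert to UTF-8 without BOM" and "Convert to UTF-8". - Phil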

4 Answers

9

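My mistake. My Notepad++ is in German. So "Encoding" is called "Kodierung" in my case, and "Convert to UTF-8 without BOM" has to be "Konvertiere zu UTF-8 ohne BOM". The strings passed to notepad.runMenuCommand must match the menu labels of the installed UI language.

That did it for me!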
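As a rough sketch of the fix above, the menu and command labels can be passed in as parameters, so the same loop works whatever the UI language is. The function name `convert_tree`, its parameters and the `SKIP_EXTS` set are hypothetical; `notepad` is assumed to be the object the Python Script plugin provides, and the German labels are the ones quoted above.

```python
import os

# Extensions to leave alone (binary formats); compared case-insensitively.
SKIP_EXTS = {'.jar', '.ear', '.gif', '.jpg', '.jpeg', '.xls', '.png', '.cab', '.ico'}

def convert_tree(notepad, top, menu, command, skip_exts=SKIP_EXTS):
    """Open each file under `top`, run the given menu command, save and close.

    `notepad` is assumed to be the Python Script plugin's notepad object.
    `menu` and `command` must match the labels of the installed UI language.
    Returns the list of paths that were processed.
    """
    processed = []
    for root, dirs, files in os.walk(top):
        for fn in files:
            # Skip binary files by extension, ignoring case.
            if os.path.splitext(fn)[1].lower() in skip_exts:
                continue
            path = os.path.join(root, fn)
            notepad.open(path)
            notepad.runMenuCommand(menu, command)
            notepad.save()
            notepad.close()
            processed.append(path)
    return processed

# From inside the plugin, on a German Notepad++ (labels as reported above):
# convert_tree(notepad, "C:\\Temp\\UTF8", "Kodierung", "Konvertiere zu UTF-8 ohne BOM")
```

Passing `notepad` in as an argument is only a design choice to make the loop easy to try outside the editor with a stand-in object.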

- Phil

5

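You can also record and play back a macro for this. That was useful for me because the Plugin Manager seemed to be broken and I couldn't use Python.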

Drag a set of files (or all of them; I think there is a limit on how many it can handle) into Notepad++
Macro -> Start recording
Do the conversion
Save the file
Close the file
Macro -> Stop recording

You can then play back the macro by choosing:

Macro -> Run a Macro Multiple Times
Enter a count high enough that all files get processed

Since each file is closed after it has been processed, you can tell which files have not been processed yet.
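To double-check which files really changed on disk, a small standalone script can classify each file. This is only a sketch and not part of the original answer; the names `utf8_status` and `report` are made up for illustration, and the script runs in a regular Python interpreter rather than inside Notepad++.

```python
import codecs
import os

def utf8_status(path):
    """Classify a file as 'utf-8-bom', 'utf-8', or 'not-utf-8'.

    Pure-ASCII files report as 'utf-8', because ASCII bytes are
    identical in ANSI code pages and in UTF-8.
    """
    with open(path, 'rb') as f:
        data = f.read()
    try:
        data.decode('utf-8')
    except UnicodeDecodeError:
        return 'not-utf-8'
    if data.startswith(codecs.BOM_UTF8):
        return 'utf-8-bom'
    return 'utf-8'

def report(top):
    """Yield (path, status) for every file under `top`."""
    for root, dirs, files in os.walk(top):
        for fn in files:
            path = os.path.join(root, fn)
            yield path, utf8_status(path)

# Example (folder name taken from the question):
# for path, status in report("C:\\Temp\\UTF8"):
#     print(status, path)
```

Any file still listed as 'not-utf-8' after the macro run was not converted.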

- klaus

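That's the one! Much simpler than messing around with Python scripts. - Armin Bu

I don't think this works; it didn't change the files. If I record only the Encoding -> UTF-8 step and stop recording, it doesn't offer me the option to play it back. - user433342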

0

Use the Notepad++ Python Script plugin. Copy this code into a new script:

# -*- coding: utf-8 -*-
from __future__ import print_function

from Npp import notepad
import os

utf8_bom = b'\xEF\xBB\xBF'
# Ask for the folder to process.
top_level_dir = notepad.prompt('Paste path to top-level folder to process:', '', '')
if top_level_dir is not None and len(top_level_dir) > 0:
    if not os.path.isdir(top_level_dir):
        print('bad input for top-level folder')
    else:
        for (root, dirs, files) in os.walk(top_level_dir):
            for file in files:
                full_path = os.path.join(root, file)
                print(full_path)
                with open(full_path, 'rb') as f:
                    data = f.read()
                if len(data) > 0:
                    # Compare all three BOM bytes, not just the first one.
                    if not data.startswith(utf8_bom):
                        try:
                            # Prepend the BOM; the existing bytes are written back unchanged.
                            with open(full_path, 'wb') as f:
                                f.write(utf8_bom + data)
                            print('added BOM:', full_path)
                        except IOError:
                            print("can't change - probably read-only?:", full_path)
                    else:
                        print('already has BOM:', full_path)

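A second solution is a regular-expression find and replace:

Find in Files:
Find what: \A
Replace with: \x{FEFF}
Filters: *.html (you have to confirm at the first prompt; do not cancel)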
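The replacement inserts the character U+FEFF at the start of each file; assuming the file is then saved as UTF-8, that character becomes the three bytes EF BB BF, i.e. the BOM. A quick check in plain Python (not part of the original answer):

```python
# -*- coding: utf-8 -*-
import codecs

# U+FEFF encoded as UTF-8 yields the three BOM bytes.
bom_bytes = u'\ufeff'.encode('utf-8')
print(bom_bytes == b'\xef\xbb\xbf')      # True
print(bom_bytes == codecs.BOM_UTF8)      # True

# Decoding with 'utf-8-sig' drops a leading BOM if one is present.
text = (codecs.BOM_UTF8 + b'<html>').decode('utf-8-sig')
print(text == u'<html>')                 # True
```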

- Just Me

- Hrvoje · Accepted Answer

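Here is the approach I used:

Go to Notepad++ -> Plugins -> Plugin Manager.

Find and install the Python Script plugin.

Create a new Python script via Plugins -> Python Script -> New script.

Insert the following code into your script: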

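import os
from Npp import notepad, console  # objects provided by the Python Script plugin

filePathSrc = "C:\\Users\\YourUsername\\Desktop\\txtFolder"
for root, dirs, files in os.walk(filePathSrc):
    for fn in files:
        # Only process .txt and .csv files; skip everything else.
        if fn.endswith(('.txt', '.csv')):
            path = os.path.join(root, fn)
            notepad.open(path)
            console.write(path + "\r\n")
            # The menu and command labels must match your Notepad++ UI language.
            notepad.runMenuCommand("Encoding", "Convert to UTF-8")
            notepad.save()
            notepad.close()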

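Replace C:\\Users\\YourUsername\\Desktop\\txtFolder with the path to the Windows folder that contains your files.

The script processes .txt and .csv files and ignores every other file in the folder.

Run the script via Plugins -> Python Script -> Scripts -> (name of your script).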
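If the plugin is not available, roughly the same result can be had with a standalone Python script that never touches Notepad++. This is a sketch under a loud assumption: the source files are taken to be cp1252, which is what "ANSI" means on Western-European Windows installs; other locales need a different `source_encoding`. The function names are made up for illustration, and it is wise to try it on a copy of the folder first.

```python
import os

def reencode_to_utf8(path, source_encoding='cp1252'):
    """Rewrite `path` as UTF-8 without BOM.

    Files that already decode as UTF-8 (including plain ASCII) are left
    untouched. `source_encoding` is an assumption about the input.
    Returns True if the file was rewritten.
    """
    with open(path, 'rb') as f:
        data = f.read()
    try:
        data.decode('utf-8')
        return False                      # already valid UTF-8
    except UnicodeDecodeError:
        pass
    # Raises UnicodeDecodeError if a byte is undefined in source_encoding.
    text = data.decode(source_encoding)
    with open(path, 'wb') as f:
        f.write(text.encode('utf-8'))     # line endings are preserved as-is
    return True

def convert_folder(top, extensions=('.txt', '.csv'), source_encoding='cp1252'):
    """Re-encode matching files under `top`; return the paths that changed."""
    changed = []
    for root, dirs, files in os.walk(top):
        for fn in files:
            if fn.lower().endswith(extensions):
                path = os.path.join(root, fn)
                if reencode_to_utf8(path, source_encoding):
                    changed.append(path)
    return changed

# Example (same placeholder folder as above):
# convert_folder("C:\\Users\\YourUsername\\Desktop\\txtFolder")
```

Running it a second time changes nothing, because files that already decode as UTF-8 are skipped.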