Extracting part of a file name in a batch file

Question


3

I need help writing a batch script (if that is possible) that extracts substrings from file names. My file names look like this (their length varies):

7_D_D1_012345678-2015-07-07.pdf
8_A_087654321-2015-07-07.pdf
10_D_D1_011122558-2015-07-07.pdf
100_C_CCC1_C2_C3_C4_055555555-2015-07-07.pdf

File number - from the left up to the first _

id1 - a string of 1 to n parts separated by underscores; e.g. C_C1_C2_C3_C4

id2 - always 9 digits; e.g. 011122558

Date - e.g. 2015-07-07

Extension - .pdf

How can I loop over all files in a folder, extract the substrings (file number, id1, id2, date), and put them into my command:

convert - "file number" -annotate "id1" -annotate2 "id2" -annotate "date"

For example:

convert - "01" -annotate "C_C1" -annotate2 "012345678" -annotate "2015-07-07"

Thanks for your help.

- Artec

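Honestly, I wouldn't do this in a batch file. I would run dir folder >tmp.cmd and then use a text editor with regular-expression search and replace (e.g. vim, emacs, Notepad++) to turn the file names into the commands you need. - Ross Presser

I need a tool that generates the commands automatically and runs them. Maybe there is another way that avoids a text editor. - Artec

I suppose sed (or awk, or perl) could do the job. What tools do you have available? Are you willing to download some, or does it have to be plain batch? What about PowerShell? - Ross Presser

I don't have any tools, but I can download some if you recommend them. I need something that runs on WIN 7 without administrator rights. - Artec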

3 Answers

2

Since you say you are on Windows 7, I know PowerShell is available to you. Here is a PowerShell script:

$re = '^(\d+)_((?:(?:[a-zA-Z0-9]+)_?)+)_(\d{9})-(\d{4}-\d\d-\d\d)\.(\w+)$'
dir | ForEach-Object {$_.Name -replace $re, 'convert "$1" -annotate "$2" -annotate2 "$3" -annotate4 "$4"'}

Given the file names from your question

7_D_D1_012345678-2015-07-07.pdf
8_A_087654321-2015-07-07.pdf
10_D_D1_011122558-2015-07-07.pdf
100_C_CCC1_C2_C3_C4_055555555-2015-07-07.pdf

it produces the following text output:

convert "100" -annotate "C_CCC1_C2_C3_C4" -annotate2 "055555555" -annotate4 "2015-07-07"
convert "10" -annotate "D_D1" -annotate2 "011122558" -annotate4 "2015-07-07"
convert "7" -annotate "D_D1" -annotate2 "012345678" -annotate4 "2015-07-07"
convert "8" -annotate "A" -annotate2 "087654321" -annotate4 "2015-07-07"

The output is sorted by file name as text, which is why the file starting with 100 comes first and the one starting with 8 comes last.

By redirecting this text output to a .cmd file, you can then run the convert commands as needed.
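If the commands should come out in numeric order of the file number rather than in text order, the names can be sorted on the integer value of the leading number first. The following is only a minimal sketch in Python, which is not part of the original answer; the helper name `file_number` is made up for illustration, and it assumes every name starts with digits followed by an underscore.

```python
# Sample names from the question, listed in the text order shown above.
names = [
    "100_C_CCC1_C2_C3_C4_055555555-2015-07-07.pdf",
    "10_D_D1_011122558-2015-07-07.pdf",
    "7_D_D1_012345678-2015-07-07.pdf",
    "8_A_087654321-2015-07-07.pdf",
]


def file_number(name: str) -> int:
    """Hypothetical helper: integer value of the part before the first underscore."""
    return int(name.split("_", 1)[0])


# Sort on the numeric value instead of comparing the names as text.
numeric_order = sorted(names, key=file_number)

for name in numeric_order:
    print(name)
```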

Here is a detailed breakdown of the regular expression:

Beginning of line or string
[1]: A numbered capture group. [\d+]
    Any digit, one or more repetitions
_
[2]: A numbered capture group. [(?:(?:[a-zA-Z0-9]+)_?)+]
    Match expression but don't capture it. [(?:[a-zA-Z0-9]+)_?], one or more repetitions
        (?:[a-zA-Z0-9]+)_?
            Match expression but don't capture it. [[a-zA-Z0-9]+]
                Any character in this class: [a-zA-Z0-9], one or more repetitions
            _, zero or one repetitions
_
[3]: A numbered capture group. [\d{9}]
    Any digit, exactly 9 repetitions
-
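[4]: A numbered capture group. [\d{4}-\d\d-\d\d]
    \d{4}-\d\d-\d\d
        Any digit, exactly 4 repetitions
        -
        Any digit
        Any digit
        -
        Any digit
        Any digit
Literal .
[5]: A numbered capture group. [\w+]
    Alphanumeric, one or more repetitions
End of line or string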
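
To see which part of a name each capture group picks up, the same pattern can be tried against the sample file names. This is a small Python sketch rather than the PowerShell from the answer; the function name `split_name` and the labels in the comments are assumptions made for this example.

```python
import re

# Same pattern as in the PowerShell script, split across lines for readability.
PATTERN = re.compile(
    r"^(\d+)_"                     # group 1: file number
    r"((?:(?:[a-zA-Z0-9]+)_?)+)_"  # group 2: id1 (one or more parts)
    r"(\d{9})-"                    # group 3: id2 (exactly 9 digits)
    r"(\d{4}-\d\d-\d\d)"           # group 4: date
    r"\.(\w+)$"                    # group 5: extension
)


def split_name(name):
    """Hypothetical helper: return the five groups, or None if the name does not match."""
    match = PATTERN.match(name)
    return match.groups() if match else None


samples = [
    "7_D_D1_012345678-2015-07-07.pdf",
    "8_A_087654321-2015-07-07.pdf",
    "10_D_D1_011122558-2015-07-07.pdf",
    "100_C_CCC1_C2_C3_C4_055555555-2015-07-07.pdf",
]

for sample in samples:
    print(sample, "->", split_name(sample))
```

Group 2 first grabs as much as it can and then backs off until an underscore followed by exactly nine digits and a hyphen can still match, which is how the variable-length id1 gets separated from id2.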

- Ross Presser

I can't believe this took me so long, and now it isn't even the simplest answer. - Ross Presser

Thanks for your work! I'll try it on my WIN 7 machine later. - Artec

1

@echo off
    setlocal enableextensions disabledelayedexpansion

    rem For each file
    for /r "x:\starting\folder" %%z in (*.pdf) do (
        rem Separate number part
        for /f "tokens=1,* delims=_" %%a in ("%%~nz") do (
            set "_number=%%~a"
            set "_file=%%~fz"

            rem Separate date and ids 
            for /f "tokens=1,* delims=-" %%c in ("%%~b") do (
                set "_date=%%~d"
                set "_ids=%%~c\."
            )
        )   

        rem Separate id1 from id2 handling the string as a path
        rem This way id2 is the last element and the path to it 
        rem is id1
        setlocal enabledelayedexpansion
        for /f "delims=" %%e in ("::!_ids:_=\!") do (
            endlocal
            set "_id2=%%~nxe"
            set "_id1=%%~pe"
        )

        rem Correct id1 contents (it is a path) changing backslashes 
        rem to underscores. As there are initial and ending backslashes,
        rem later we will remove the initial and ending underscores
        setlocal enabledelayedexpansion
        for /f "delims=" %%e in ("!_id1:\=_!") do (
            endlocal
            set "_id1=%%~e"
        )

        rem Execute final command 
        setlocal enabledelayedexpansion
        echo(
        echo file[!_file!] 
        echo convert - "!_number!" -annotate "!_id1:~1,-1!" -annotate2 "!_id2!" -annotate "!_date!"
        endlocal

    )
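
As the comments in the script describe it, the ids are treated as a path so that id2 becomes the last element and id1 is the path leading to it. The same idea can be sketched in Python with `ntpath.split`, which applies Windows path rules on any platform. This is an illustration of the idea only, not a translation of the batch code, and the function name `split_ids` is made up for this example.

```python
import ntpath  # Windows path handling, importable on every platform


def split_ids(ids: str):
    """Hypothetical helper: split 'id1parts_id2' into (id1, id2) via a path split."""
    # Turn the underscore-separated ids into a backslash-separated "path".
    as_path = ids.replace("_", "\\")
    # The last path element is id2; everything before it is id1.
    head, tail = ntpath.split(as_path)
    # Turn the separators in id1 back into underscores.
    return head.replace("\\", "_"), tail


for ids in ["D_D1_012345678", "A_087654321", "C_CCC1_C2_C3_C4_055555555"]:
    print(ids, "->", split_ids(ids))
```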

- MC ND

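This code works too! Thanks for the great work. Do you think it would be possible to read the .pdf files from subfolders? The subfolders are named after ID2. - Artec

@Artec, answer updated. I only added /r "path" to the for loop to search for files recursively, and changed the line that retrieves the file name to use %%~fz so it gets the full path of the file. - MC ND

Thanks again. Works very well. - Artec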


- Stephan · Accepted Answer

Pure batch. Simple string manipulation mixed with tokenizing. No additional tools needed.

(g.txt holds your sample file names; it can be replaced with 'dir /b /a-d')

@echo off
for /f %%i in (g.txt) do call :process %%i
goto :eof

:process
set x=%1
set ext=%x:*.=%
for /f "delims=_" %%i in ("%x%") do set fileno=%%i
for /f "tokens=1,*delims=-" %%i in ("%x%") do (
  set x1=%%i
  set x2=%%j
)
for /f "tokens=1,* delims=." %%i in ("%x2%") do (
  set dat=%%i
  set ext=%%j
)
set id2=%x1:~-9%
for /f "tokens=1,* delims=_" %%i in ("%x1:~0,-10%") do set id1=%%j
echo filename   %x%
echo ------------------------
echo    Nr. %fileno%
echo    ID1 %id1%
echo    ID2 %id2%
echo    Date    %dat%
echo    Ext.    %ext%
echo ------------------------
echo convert - "%fileno%" -annotate "%id1%" -annotate2 "%id2%" -annotate "%dat%"
echo(
echo(
goto :eof
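
The steps in :process rely on fixed positions: everything before the first hyphen ends in an underscore plus the nine-digit id2, so id2 is the last 9 characters and id1 is whatever remains after the file number. A rough Python equivalent of those steps, for comparison only (the function name `split_by_position` and the dictionary keys are assumptions for this sketch):

```python
def split_by_position(name: str) -> dict:
    """Hypothetical helper mirroring the slicing steps of the batch routine."""
    fileno = name.split("_", 1)[0]     # up to the first underscore
    x1, x2 = name.split("-", 1)        # split at the first hyphen
    date, ext = x2.split(".", 1)       # date and extension
    id2 = x1[-9:]                      # last 9 characters before the hyphen
    # Drop "_" plus the 9 digits, then drop the file number and its underscore.
    id1 = x1[:-10].split("_", 1)[1]
    return {"fileno": fileno, "id1": id1, "id2": id2, "date": date, "ext": ext}


parts = split_by_position("100_C_CCC1_C2_C3_C4_055555555-2015-07-07.pdf")
print(parts)
```

Like the batch version, this assumes id2 is always exactly 9 digits and that the first hyphen in the name is the one before the date.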