Removing CRLF (0D 0A) from a string in Perl

Question

9

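I have a Perl script on Linux that consumes an XML file, and sometimes a node value contains a CRLF (hex 0D0A, the DOS newline). The system that produces the XML file writes it all as a single line, but occasionally it decides the line is too long and writes a CRLF into one of the data elements. Unfortunately, I cannot make any changes to the providing system. I just need to remove these characters from the string before processing it. I have tried all sorts of regex substitutions using Perl character classes, hex values, and so on, but nothing seems to work. I even ran the input file through dos2unix before processing, and I still can't get rid of the offending characters. Does anyone have any ideas? Many thanks.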

- HeHasMoments

3 Answers

8

$output =~ tr/\x{d}\x{a}//d;

Both of these are whitespace characters, so if the terminator is always at the end of the string, you can right-trim instead:

$output =~ s/\s+\z//;

- Greg Bacon
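
As a minimal sketch of how the two approaches in the answer above differ, the following wraps each one in a helper. The sub names `strip_crlf` and `rtrim` and the sample string are assumptions made for illustration; they do not appear in the original answer.

```perl
use strict;
use warnings;

# Delete every CR (0x0D) and LF (0x0A), wherever they occur in the string.
sub strip_crlf {
    my ($s) = @_;
    $s =~ tr/\x0D\x0A//d;
    return $s;
}

# Remove trailing whitespace only; an embedded CRLF is left in place.
sub rtrim {
    my ($s) = @_;
    $s =~ s/\s+\z//;
    return $s;
}

my $value = "first half\r\nsecond half\r\n";
print strip_crlf($value), "\n";   # prints: first halfsecond half
print length(rtrim($value)), "\n"; # length still counts the embedded CR and LF
```

Since the question describes a CRLF written into the middle of a data element, the `tr///d` form is probably the one that matches that case; the right-trim only helps when the terminator sits at the very end.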

1

A few options:
1. Replace every occurrence of CR/LF with LF: $output =~ s/\r\n/\n/g; # you may need \015\012 instead of \r\n
2. Remove all trailing whitespace: $output =~ s/\s+$//;
3. Slurp and split:

#!/usr/bin/perl

use strict;
use warnings;

   sub main{
      createfile();
      outputfile();
   }

   main();

   # Write a sample file that mixes Unix (LF) and DOS (CRLF) line endings.
   sub createfile{
      (my $file = $0)=~ s/\.pl$/.txt/;

      open my $fh, ">", $file or die "Cannot write $file: $!";
         print $fh "1\n2\r\n3\n4\r\n5";
      close $fh;
   }

   # Read the sample file back and write its lines out comma-separated.
   sub outputfile{
      (my $filei = $0)=~ s/\.pl$/.txt/;
      (my $fileo = $0)=~ s/\.pl$/out.txt/;

      open my $fin, "<", $filei or die "Cannot read $filei: $!";
         local $/;                                # slurp the file
         my $text = <$fin>;                       # store the text
         my @text = split(/(?:\r\n|\n)/, $text);  # split on dos or unix newlines
      close $fin;

      local $" = ", ";                            # change the separator used when an array is interpolated
      open my $fout, ">", $fileo or die "Cannot write $fileo: $!";
         print $fout "@text";                     # writes the numbers separated by comma and space
      close $fout;
   }

- vol7ron
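
Option 1 in the answer above can be sketched in isolation as follows. The octal escapes `\015` (CR) and `\012` (LF) spell out the bytes explicitly rather than relying on what `\r` and `\n` mean on a given platform. The sub name `dos_to_unix` and the sample data are assumptions for illustration only.

```perl
use strict;
use warnings;

# Convert DOS line endings (CR LF) to Unix line endings (LF).
sub dos_to_unix {
    my ($s) = @_;
    $s =~ s/\015\012/\012/g;
    return $s;
}

my $text  = "1\n2\r\n3\n4\r\n5";
my @lines = split /\012/, dos_to_unix($text);
print join(", ", @lines), "\n";   # prints: 1, 2, 3, 4, 5
```

Note that this keeps the line breaks as LF. If the goal is to join a data element that was split in the middle, the line breaks need to be deleted rather than converted.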


- HeHasMoments · Accepted Answer

Typical: after struggling with this for about two hours, I solved it less than 5 minutes after posting the question.

$output =~ s/[\x0A\x0D]//g;

That finally did it.
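
When substitutions like this appear to have no effect, it can help to look at the actual bytes before and after. The sketch below applies the same character-class substitution and prints a hex dump to confirm that 0D and 0A are gone. The helpers `remove_line_breaks` and `hex_dump` and the sample string are hypothetical and not part of the original post.

```perl
use strict;
use warnings;

# Remove every LF (0x0A) and CR (0x0D) from the string.
sub remove_line_breaks {
    my ($s) = @_;
    $s =~ s/[\x0A\x0D]//g;
    return $s;
}

# Render each character of the string as a two-digit hex value.
sub hex_dump {
    my ($s) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $s;
}

my $raw = "AB\r\nCD";
print hex_dump($raw), "\n";                      # prints: 41 42 0D 0A 43 44
print hex_dump(remove_line_breaks($raw)), "\n";  # prints: 41 42 43 44
```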