Extracting Morse code from a text file using sed

Question

3

I have a task that requires using 'sed' to extract Morse code (dashes and dots) from a text file containing the following:

A test to see if the morse code can be removed from a file. .--- -. ..
This is a test --. -.- .-- .. -.. --- .- .. of sorts and so on. Let's see if the code snippets can be found.
Also can they be .- . -.- removed and yet leave the periods at the end
of sentences alone. ---- -. There are also hyphenated words like the
following: Edgar-Jones. -.

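I can use sed to delete all [a-z] and [A-Z] characters, but the problem is that the periods at the ends of sentences and the hyphen in Edgar-Jones get picked up as well. I can't find a way to exclude those too...

Any help is greatly appreciated.

Thanks for all the answers; every one of them helped. I went with this solution: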

sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z: ']*//g" file

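It first looks for a letter immediately followed by a dash or a period and removes that pair, which was exactly the problem I was having. It then cleans out the remaining letters, spaces, colons and apostrophes. Thanks!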

- i_am_so_stupid

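Is using sed a must? - Kent

Yes, unfortunately it has to be done with sed. - i_am_so_stupid

OK, I'll post a sed answer; other tools might be easier. - Kent

@EdMorton, it seems pretty clear that he wants to remove the non-Morse words but doesn't know how to remove words that may contain dots or dashes. - glenn jackman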

4 Answers

1

sed 's/\(^\|[[:blank:]]\)[^[:blank:]]*[^-.[:blank:]][^[:blank:]]*/ /g' file

               .--- -. ..
     --. -.- .-- .. -.. --- .- ..              
     .- . -.-         
    ---- -.       
   -.

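The regular expression is:

the start of the line or a blank
zero or more non-blank characters
followed by one character that is neither a blank nor a Morse character (dot or dash)
followed by zero or more non-blank characters

This matches every word that contains at least one non-Morse character and replaces it with a single space.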
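The alternation \| inside a basic regular expression is a GNU extension. As a minimal sketch, assuming your sed accepts -E (GNU and BSD sed both do), the same pattern can be written with extended syntax. The function name strip_non_morse_words is made up for illustration:

```shell
# Hypothetical helper (name is an assumption): blank out every word that
# contains at least one character other than a dot or a dash.
# Assumes a sed that supports -E (extended regular expressions).
strip_non_morse_words() {
  sed -E 's/(^|[[:blank:]])[^[:blank:]]*[^-.[:blank:]][^[:blank:]]*/ /g'
}

# "following:" and " Edgar-Jones." each become one space; " -." is kept,
# so this prints three spaces followed by -.
printf '%s\n' 'following: Edgar-Jones. -.' | strip_non_morse_words
```

Each removed word leaves one space behind, which is why the output above is indented.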

It is simpler with GNU grep; too bad you can't use it:

grep -oP '(?<=^|\s)[.-]+(?=\s|$)' file

- glenn jackman

0

This sed one-liner should do the job, to "extract the morse code (dashes and dots)" from your example file:

sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z: ']*//g" file

Tested with your file:

kent$  cat f1
A test to see if the morse code can be removed from a file. .--- -. ..
This is a test --. -.- .-- .. -.. --- .- .. of sorts and so on. Let's see if the code snippets can be found.
Also can they be .- . -.- removed and yet leave the periods at the end
of sentences alone. ---- -. There are also hyphenated words like the
following: Edgar-Jones. -.

kent$  sed "s/[a-zA-Z][-.]//g;s/[a-zA-Z: ']*//g" f1
.----...
--.-.-.--..-..---.-..
.-.-.-
-----.
-.

- Kent

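@kent, how would one extract the individual Morse letters from that? Is -. an "N" or "TE"? I suggest you remove the space from the second regex. - glenn jackman

@glennjackman I thought the spaces should be removed: "extract the morse code (dashes and dots)" - Kent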
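For readers who want the variant suggested in the comment above, here is a rough sketch. It drops the space from the second bracket expression and then tidies the leftover whitespace. The cleanup steps and the function name morse_keep_gaps are assumptions added for illustration, not part of Kent's answer:

```shell
# Hypothetical helper (name and whitespace cleanup are assumptions):
# the two substitutions from the answer, minus the space in the second
# bracket expression, so the gaps between Morse letters survive.
morse_keep_gaps() {
  sed -e "s/[a-zA-Z][-.]//g" \
      -e "s/[a-zA-Z:']//g" \
      -e 's/  */ /g' \
      -e 's/^ //' \
      -e 's/ $//'
}

# Prints: .- . -.-
printf '%s\n' 'Also can they be .- . -.- removed' | morse_keep_gaps
```

Like the original command, this only handles the characters that appear in the sample file; digits or other punctuation in the text would pass through untouched.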

0

sed 's/\.$//
     s/\([^-[:space:].]\{1,\}[-.]\{0,1\}\)*//g
     s/\([[:space:]]\)\{2,\}/\1/g
     ' YourFile

The last command collapses runs of whitespace into a single one.
POSIX version.

- NeronLeVelu

- Jotne · Accepted Answer

Here is an awk that solves this problem.

awk '{for (i=1;i<=NF;i++) if ($i!~/[a-zA-Z0-9]/) printf "%s ",$i;print ""}' file
.--- -. ..
--. -.- .-- .. -.. --- .- ..
.- . -.-
---- -.
-.

This tests each field and does not print it if it contains a letter or a digit ([a-zA-Z0-9]).

Or, as Glenn commented:

awk '{for (i=1;i<=NF;i++) if ($i~/^[.-]+$/) printf "%s ",$i;print ""}' file
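
Both awk commands print a space after every kept field, so each output line ends with a trailing space. If that matters, a minimal sketch along the same lines builds the line first and prints it once. The function name morse_fields is hypothetical:

```shell
# Hypothetical helper (name is an assumption): keep only the fields made
# up entirely of dots and dashes, joined by single spaces, with no
# trailing space. Reads the files given as arguments, or standard input.
morse_fields() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++)
      if ($i ~ /^[.-]+$/)
        out = out (out == "" ? "" : " ") $i
    print out
  }' "$@"
}

# Prints: .- . -.-
printf '%s\n' 'Also can they be .- . -.- removed' | morse_fields
```

Lines that contain no Morse words still produce an empty output line, matching the behaviour of the commands above.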