Lisp - Splitting input into separate strings

23

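I'm trying to take user input and store it in a list, but instead of a list containing one single string, I want each word that is scanned in to be its own string.

Example: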
> (input)
This is my input. Hopefully this works

would return:

("this" "is" "my" "input" "hopefully" "this" "works")

Note that I don't want any spaces or punctuation in my final list.

Any input would be greatly appreciated.


1
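Take a look at http://cl-cookbook.sourceforge.net/strings.html. It has functions for many common use cases, one of which is a simple split on spaces that you could modify to also strip punctuation, etc. - Daniel Williams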
1
The cookbook continues here: https://lispcookbook.github.io/cl-cookbook/strings.html - Ehvince
5 Answers

23
split-sequence is the off-the-shelf solution.

You can also roll your own:

(defun my-split (string &key (delimiterp #'delimiterp))
  (loop :for beg = (position-if-not delimiterp string)
    :then (position-if-not delimiterp string :start (1+ end))
    :for end = (and beg (position-if delimiterp string :start beg))
    :when beg :collect (subseq string beg end)
    :while end))

where delimiterp checks whether you want to split on this character, e.g.:

(defun delimiterp (c) (or (char= c #\Space) (char= c #\,)))

or
(defun delimiterp (c) (position c " ,.;/"))

PS. Judging by your expected return value, you will want to call string-downcase before calling my-split.

PPS. You can easily modify my-split to accept :start, :end, :delimiterp, and similar keyword arguments.

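PPPS. Sorry about the bugs in the first two versions of my-split. Please take that as an indicator that you should not roll your own version of this function, but use the off-the-shelf solution instead.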
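
To tie the pieces above together, here is a minimal sketch that uses only standard Common Lisp. The names `word-delimiter-p` and `split-words` are hypothetical, not part of the answer. The sketch assumes every non-alphanumeric character counts as a delimiter, and it downcases the input first, as the PS suggests.

```lisp
;; Hypothetical helper: treat anything that is not a letter or digit
;; as a delimiter (an assumption, adjust to taste).
(defun word-delimiter-p (c)
  (not (alphanumericp c)))

;; Same loop shape as my-split, applied to a lowercased copy of STRING.
(defun split-words (string &key (delimiterp #'word-delimiter-p))
  "Return the lowercased words of STRING, dropping delimiter characters."
  (loop :with s = (string-downcase string)
        :for beg = (position-if-not delimiterp s)
          :then (position-if-not delimiterp s :start (1+ end))
        :for end = (and beg (position-if delimiterp s :start beg))
        :when beg :collect (subseq s beg end)
        :while end))
```

To split a line typed by the user, something like `(split-words (read-line))` should do.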


I found a lot of material on split-sequence, but apparently I need to import the cl-utilities package, and I just can't figure out how to do that =/ #imanewb - Sean Evans
2
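@SeanEvans: careful! import is a CL function, and it is not what you want here! What you need is to install the package using, e.g., quicklisp: (ql:quickload "split-sequence") - sds
@sds: your edit broke your code (test it with "" and "a", for example). - MicroVirus
To clarify, the first snippet cannot handle strings that end with a delimiter (e.g. "abc "), and the second snippet fails to return the last token most of the time (e.g. "ab cd" -> ("ab")). - MicroVirus
I think I have fixed the code now. Sorry about those bugs. - sds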
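
For readers who go the off-the-shelf route discussed in these comments, a small sketch follows. It assumes Quicklisp is already installed and loaded in the image. The wrapper name `words-via-split-sequence` and the delimiter set " ,.;/" are illustrative choices, not something the thread prescribes.

```lisp
;; Assumes Quicklisp is available; eval-when makes the library loadable
;; whether this file is loaded or compiled.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload "split-sequence"))

;; Hypothetical wrapper: split on a fixed set of delimiter characters
;; and drop the empty strings that adjacent delimiters would produce.
(defun words-via-split-sequence (string)
  (split-sequence:split-sequence-if
   (lambda (c) (find c " ,.;/"))
   (string-downcase string)
   :remove-empty-subseqs t))
```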

10

uiop:split-string works well, but unfortunately it does not split on newlines by default (its default separators are space and tab).
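
A possible workaround, sketched under two assumptions: that UIOP becomes available through `(require "asdf")` on your implementation, and that the `:separator` keyword accepts a sequence of characters. The function name `split-on-whitespace` is hypothetical. Since uiop:split-string keeps the empty strings between adjacent separators, the sketch removes them afterwards.

```lisp
;; UIOP ships with ASDF on most implementations (assumption).
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require "asdf"))

;; Hypothetical wrapper: add #\Newline to the separators, then drop
;; the empty strings left by runs of whitespace.
(defun split-on-whitespace (string)
  (remove ""
          (uiop:split-string string
                             :separator '(#\Space #\Tab #\Newline))
          :test #'string=))
```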

5

There is the cl-ppcre:split function:

* (split "\\s+" "foo   bar baz
frob")
("foo" "bar" "baz" "frob")

* (split "\\s*" "foo bar   baz")
("f" "o" "o" "b" "a" "r" "b" "a" "z")

* (split "(\\s+)" "foo bar   baz")
("foo" "bar" "baz")

* (split "(\\s+)" "foo bar   baz" :with-registers-p t)
("foo" " " "bar" "   " "baz")

* (split "(\\s)(\\s*)" "foo bar   baz" :with-registers-p t)
("foo" " " "" "bar" " " "  " "baz")

* (split "(,)|(;)" "foo,bar;baz" :with-registers-p t)
("foo" "," NIL "bar" NIL ";" "baz")

* (split "(,)|(;)" "foo,bar;baz" :with-registers-p t :omit-unmatched-p t)
("foo" "," "bar" ";" "baz")

* (split ":" "a:b:c:d:e:f:g::")
("a" "b" "c" "d" "e" "f" "g")

* (split ":" "a:b:c:d:e:f:g::" :limit 1)
("a:b:c:d:e:f:g::")

* (split ":" "a:b:c:d:e:f:g::" :limit 2)
("a" "b:c:d:e:f:g::")

* (split ":" "a:b:c:d:e:f:g::" :limit 3)
("a" "b" "c:d:e:f:g::")

* (split ":" "a:b:c:d:e:f:g::" :limit 1000)
("a" "b" "c" "d" "e" "f" "g" "" "")

http://weitz.de/cl-ppcre/#split
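
For the question's specific goal (lowercase words, no punctuation), matching the words may be simpler than splitting on the separators. A minimal sketch, assuming Quicklisp is available and using cl-ppcre:all-matches-as-strings; the wrapper name `words-via-regex` is hypothetical. Note that `\w` also matches digits and underscores.

```lisp
;; Assumes Quicklisp is available.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload "cl-ppcre"))

;; Hypothetical wrapper: collect every run of word characters
;; from the lowercased input.
(defun words-via-regex (string)
  (cl-ppcre:all-matches-as-strings "\\w+" (string-downcase string)))
```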

For common cases, there is the (new, "modern and consistent") cl-str string manipulation library:

(str:words "a sentence    with   spaces") ; cut with spaces, returns words
(str:replace-all "," "" "a, sentence") ; arguments are (old new string); replaces characters literally instead of treating them as regexps (cl-ppcre treats them as regexps)

You can use cl-slug to strip non-ASCII characters as well as punctuation:

 (asciify "Eu André!") ; => "Eu Andre!"

There is also str:remove-punctuation (which uses cl-change-case:no-case).


0
; AutoLISP usage: (splitStr "get off of my cloud" " ") returns ("get" "off" "of" "my" "cloud")

(defun splitStr (src delim / wordlist cnt word letter) ; all working variables are local

  (setq wordlist (list))
  (setq cnt 1)
  (while (<= cnt (strlen src))

    (setq word "")

    (setq letter (substr src cnt 1))
    (while (and (/= letter delim) (<= cnt (strlen src)) ) ; the bound check prevents an endless loop at the end of src
      (setq word (strcat word letter))
      (setq cnt (+ cnt 1))
      (setq letter (substr src cnt 1))
    ) ; while

    (setq cnt (+ cnt 1)) ; step over the delimiter
    (setq wordlist (append wordlist (list word)))

  )

  wordlist ; return the list of words instead of only printing it

) ;defun

-1
(defun splitStr (src pat / wordlist len cnt letter word) ; keep working variables local
    (setq wordlist (list))
    (setq len (strlen pat))
    (setq cnt 0)
    (setq letter cnt)
    (while (setq cnt (vl-string-search pat src letter))
        (setq word (substr src (1+ letter) (- cnt letter)))
        (setq letter (+ cnt len))
        (setq wordlist (append wordlist (list word)))
    )
    (setq wordlist (append wordlist (list (substr src (1+ letter)))))
)

3
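While this may answer the question, it is better to include an explanation of your code and any useful references. See [answer] for details on how to answer questions. - Tim Hutchison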
