Removing characters from a string in Haskell

4
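I'm writing a program that reads a text file, splits it into words, and stores them in a list. I've been trying to write a function that takes a String containing the entire text of the file and removes punctuation such as ";", "," and ".", but so far without success. As soon as I include it in (toWords fileContents), the program stops working. Could someone look at what I've done and tell me what I'm doing wrong?
This is my code so far: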
import Data.Char (toLower)
import Data.List (group, sort, sortBy)
import Data.Ord (comparing)

main = do
       contents <- readFile "LargeTextFile.txt"
       let lowContents = map toLower contents
       let outStr = countWords (lowContents)
       let finalStr = sortOccurrences (outStr)
       let reversedStr = reverse finalStr
       putStrLn "Word | Occurrence "
       mapM_ (printList) reversedStr

-- Counts all the words.
countWords :: String -> [(String, Int)]
countWords fileContents = countOccurrences (toWords (removePunc fileContents))

-- Splits words and removes linking words.
toWords :: String -> [String]
toWords s = filter (\w -> w `notElem` ["an","the","for"]) (words s)

-- Remove punctuation from text String.
removePunc :: String -> String
removePunc xs = x | x <- xs, not (x `elem` ",.?!-:;\"\'")

-- Counts, how often each string in the given list appears.
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = map (\xs -> (head xs, length xs)) . group . sort $ xs

-- Sort list in order of occurrences.
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences sort = sortBy (comparing snd) sort

-- Prints the list in a format.
printList a = putStrLn((fst a) ++ " | " ++ (show $ snd a))
1 Answer

9

You probably want:

removePunc xs = [ x | x <- xs, not (x `elem` ",.?!-:;\"\'") ]

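with the expression enclosed in square brackets. The brackets are what make it a list comprehension; without them, the right-hand side does not parse.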
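To see the fix working together with the rest of the pipeline, here is a minimal sketch that combines the corrected removePunc with toWords from the question. The sample sentence in the usage note is made up for illustration; everything else is taken from the code above.

```haskell
-- Remove punctuation from a String.
-- The square brackets turn the body into a list comprehension.
removePunc :: String -> String
removePunc xs = [ x | x <- xs, not (x `elem` ",.?!-:;\"\'") ]

-- Split into words and drop the linking words, as in the question.
toWords :: String -> [String]
toWords s = filter (\w -> w `notElem` ["an", "the", "for"]) (words s)

-- Strip punctuation first, then split into words.
cleanWords :: String -> [String]   -- hypothetical helper name, not in the original
cleanWords = toWords . removePunc
```

For example, cleanWords "the cat, an owl; the end." evaluates to ["cat","owl","end"]: the comma, semicolon and full stop are removed before words splits the text, so "cat," and "cat" are no longer counted as different words.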

13
Another option is to just use filter (not . (`elem` ",.?!-:;\"\'")) xs - bheklilr
1
That worked, thanks! I had just left out the square brackets. - James Meade
2
Another option is to modify @bheklilr's answer: filter (`notElem` ",.?!-:;\"\'") xs - Edwin Pratt
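The three variants from the answer and the comments should all behave the same way. The sketch below puts them side by side so they can be compared in GHCi. The function names and the shared punctuation constant are hypothetical, introduced here only for illustration.

```haskell
-- Characters to strip, copied from the question.
punctuation :: String
punctuation = ",.?!-:;\"\'"

-- List comprehension, as in the answer.
removePuncComp :: String -> String
removePuncComp xs = [ x | x <- xs, not (x `elem` punctuation) ]

-- filter with a composed predicate, as in bheklilr's comment.
removePuncFilter :: String -> String
removePuncFilter xs = filter (not . (`elem` punctuation)) xs

-- filter with notElem, as in Edwin Pratt's comment.
removePuncNotElem :: String -> String
removePuncNotElem xs = filter (`notElem` punctuation) xs
```

All three return "dont stopnow" for the input "don't stop-now.". Note that because the character set includes '-' and the apostrophe, hyphenated words are joined and contractions lose their apostrophe, which may or may not be what a word count wants.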
