I want to extract keywords related to insurance services from text in R. I created a keyword list and used the `common()` function from the qdap package:
bag <- bag_o_words(corpus)
b <- common(bag, keywords, overlap = "all")
But the result only contains the common words that occur more than once.
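If the goal is simply to count how often each listed keyword appears in the tweets, a minimal base-R sketch (with hypothetical `tweets` data; multi-word keywords such as "customer service" are matched as fixed substrings) avoids `common()` entirely:

```r
# Hypothetical sample tweets standing in for the real corpus
tweets <- c("the OCBC customer service agent handled my insurance claim",
            "my premium went up but the policy covers more")

keywords <- c("insurance", "claim", "agent", "premium", "policy",
              "customer service")

# For each keyword, count the number of tweets containing it
# (case-insensitive via tolower; fixed = TRUE disables regex)
counts <- sapply(keywords, function(k)
  sum(grepl(k, tolower(tweets), fixed = TRUE)))

counts[counts > 0]  # keep only keywords that actually occur
```

This counts tweets per keyword; counting total occurrences instead would need `gregexpr()` or similar.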
I also tried the RKEA package:
keywords <- c("directasia", "directasia.com", "Frank", "frank", "OCBC", "NTUC",
              "NTUC Income", "Frank by OCBC", "customer service", "atm",
              "insurance", "claim", "agent", "premium", "policy", "customer care",
              "customer", "draft", "account", "credit", "savings", "debit", "ivr",
              "offer", "transaction", "banking", "website", "mobile", "i-safe",
              "demat", "network", "phone", "interest", "loan",
              "transfer", "deposit", "otp", "rewards", "redemption")
tmpdir <- tempfile()
dir.create(tmpdir)
model <- file.path(tmpdir, "crudeModel")
createModel(corpus, keywords, model)
extractKeywords(corpus, model)
However, I ran into the following errors:

Error in createModel(corpus, keywords, model) :
  number of documents and keywords does not match

Error in .jcall(ke, "V", "extractKeyphrases", .jcall(ke, "Ljava/util/Hashtable;", :
  java.io.FileNotFoundException: C:\Users\Bitanshu\AppData\Local\Temp\RtmpEHu9uA\file14c4160f41c2\crudeModel (The system cannot find the file specified)

I think the second error occurs because createModel did not succeed. Can anyone suggest how to fix this, or an alternative approach? The text data was extracted from Twitter.
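One likely cause of the first error: RKEA's `createModel()` expects `keywords` to be a *list* holding one character vector of known keywords per training document, parallel to the corpus, not a single flat vector. A hedged sketch with a made-up two-tweet corpus (the RKEA calls are left commented, since they require Java and the KEA jar, and this is an untested sketch, not the package's documented example):

```r
# Hypothetical tweets standing in for the real corpus
docs <- c("my insurance claim was rejected by the agent",
          "OCBC customer service answered the phone quickly")

# One keyword vector per document, in the same order as docs
doc_keywords <- list(c("insurance", "claim", "agent"),
                     c("OCBC", "customer service", "phone"))

# The lengths must match, or createModel() reports
# "number of documents and keywords does not match"
stopifnot(length(docs) == length(doc_keywords))

# With RKEA and Java installed, the calls would then be:
# library(RKEA)
# model <- file.path(tempdir(), "twitterModel")
# createModel(docs, doc_keywords, model)
# extractKeywords(docs, model)
```

The second error (FileNotFoundException for the model file) should disappear once `createModel()` succeeds in writing the model.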
devtools::install_github("kbenoit/quanteda")

See https://github.com/kbenoit/quanteda. - Ken Benoit
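Following that suggestion, a sketch of the quanteda route (assuming a recent quanteda API and the same hypothetical tweets): wrap the keyword list in a `dictionary()` and count matches with `tokens_lookup()`, which handles multi-word entries like "customer service" as phrases.

```r
library(quanteda)

# Hypothetical tweets standing in for the real corpus
tweets <- c("the OCBC agent handled my insurance claim",
            "my premium went up but the policy covers more")

# Abridged keyword list wrapped in a dictionary under one key
dict <- dictionary(list(insurance = c("insurance", "claim", "premium",
                                      "policy", "agent", "customer service")))

toks <- tokens(tolower(tweets))
counts <- dfm(tokens_lookup(toks, dict))  # per-tweet match counts
counts
```

This gives a document-feature matrix of keyword hits per tweet, which is usually more useful for Twitter data than a flat frequency list.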