测试数据:
curl -XPUT 'localhost:9200/customer/external/1?pretty' -d '{ "body": "this is a test" }'
curl -XPUT 'localhost:9200/customer/external/2?pretty' -d '{ "body": "and this is another test" }'
curl -XPUT 'localhost:9200/customer/external/2?pretty' -d '{ "body": "this thing is a test" }'
我的目标是获得文档中一个短语的频率。
我知道如何获得文档中术语的频率:
curl -g "http://localhost:9200/customer/external/1/_termvectors?pretty" -d'
{
"fields": ["body"],
"term_statistics" : true
}'
我知道如何使用match_phrase或span_near查询来计算包含给定短语的文档数量:
curl -g "http://localhost:9200/customer/_count?pretty" -d'
{
"query": {
"match_phrase": {
"body" : "this is"
}
}
}'
我如何访问短语的频率?