You need two different fields to get what you are looking for. In short, use multi-fields on the buyer field, as in the use case below.
Mapping:
PUT my_exact_match_exclude
{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "buyer": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "normalizer": "my_normalizer"
          }
        }
      }
    }
  }
}
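Note that the mapping for buyer has a sibling field, buyer.keyword, of the keyword data type, defined through multi-fields.
Also, read up on normalizers: I apply one (lowercase filter only) to the keyword sub-field so that exact matching ignores case.
Sample documents: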
POST my_exact_match_exclude/_doc/1
{
  "buyer": "Greater London Authority (GLA)"
}
POST my_exact_match_exclude/_doc/2
{
  "buyer": "Greater London Authority"
}
POST my_exact_match_exclude/_doc/3
{
  "buyer": "Greater London"
}
POST my_exact_match_exclude/_doc/4
{
  "buyer": "London Authority"
}
POST my_exact_match_exclude/_doc/5
{
  "buyer": "greater london authority (GLA)"
}
Note that the first and the last documents are identical if case is ignored.
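To make that concrete, here is a minimal Python sketch of what the lowercase normalizer does to the two values before they are stored in buyer.keyword. The function name my_normalizer is only borrowed from the mapping for readability; this is plain Python string handling, not Elasticsearch itself, and it assumes ASCII input where str.lower() and the lowercase filter agree.

```python
def my_normalizer(value: str) -> str:
    # Stand-in for the custom normalizer: no char filters, lowercase only.
    return value.lower()

first = "Greater London Authority (GLA)"   # document 1
last = "greater london authority (GLA)"    # document 5

# Without normalization the two keyword values differ.
print(first == last)
# After normalization they collapse to the same term.
print(my_normalizer(first) == my_normalizer(last))
```

The first comparison is false and the second is true, which is why a term query on buyer.keyword treats both documents as the same exact value.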
Sample query:
POST my_exact_match_exclude/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "buyer": "Greater London Authority (GLA)"
          }
        }
      ],
      "must_not": [
        {
          "term": {
            "buyer.keyword": "Greater London Authority (GLA)"
          }
        }
      ]
    }
  }
}
Note that I apply must_not to the buyer.keyword field so that every document whose value matches the query string exactly (ignoring case) is excluded.
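The must / must_not combination can be read as "any document sharing at least one word with the query, minus the ones whose whole value equals it". Below is a rough Python simulation of that logic over the sample documents. It is only a sketch: analyze() approximates the default analyzer with a lowercase word split, normalize() approximates the lowercase normalizer, and scoring is ignored entirely. The helper names are mine, not part of any Elasticsearch API.

```python
import re

def analyze(text):
    # Rough stand-in for the analyzer on the text field: lowercase word tokens.
    return set(re.findall(r"\w+", text.lower()))

def normalize(text):
    # Rough stand-in for the normalizer on buyer.keyword: lowercase only.
    return text.lower()

docs = {
    1: "Greater London Authority (GLA)",
    2: "Greater London Authority",
    3: "Greater London",
    4: "London Authority",
    5: "greater london authority (GLA)",
}

def search(query_text):
    query_tokens = analyze(query_text)
    exact_term = normalize(query_text)
    hits = []
    for doc_id, buyer in docs.items():
        matches = bool(query_tokens & analyze(buyer))  # must: match (any term)
        is_exact = normalize(buyer) == exact_term      # must_not: term on keyword
        if matches and not is_exact:
            hits.append(doc_id)
    return sorted(hits)

print(search("Greater London Authority (GLA)"))  # [2, 3, 4]
```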
Sample response:
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.66237557,
    "hits" : [
      {
        "_index" : "my_exact_match_exclude",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.66237557,
        "_source" : {
          "buyer" : "Greater London Authority"
        }
      },
      {
        "_index" : "my_exact_match_exclude",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.4338556,
        "_source" : {
          "buyer" : "Greater London"
        }
      },
      {
        "_index" : "my_exact_match_exclude",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.4338556,
        "_source" : {
          "buyer" : "London Authority"
        }
      }
    ]
  }
}
As expected, documents 1 and 5 are not returned, because they are exact matches.
You can use the above query in your code in a similar way.
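For instance, if your code is in Python, the request body could be built for any search string with a small helper like the one below. The function name build_exclude_exact_query and its field parameter are hypothetical, and the sketch assumes the sub-field is called keyword as in the mapping above. It only builds the JSON body; sending it is left to whichever client you already use.

```python
import json

def build_exclude_exact_query(text, field="buyer"):
    # must: full-text match on the analyzed field.
    # must_not: exact (normalized) match on the keyword sub-field.
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: text}}],
                "must_not": [{"term": {f"{field}.keyword": text}}],
            }
        }
    }

body = build_exclude_exact_query("Greater London Authority (GLA)")
print(json.dumps(body, indent=2))
```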
Hope this helps!