How to implement highlighting with Spring Data Elasticsearch

5

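It seems Spring Data ES does not provide a class for retrieving the highlights returned by ES. Spring Data can return a list of objects, but the highlights in the JSON returned by ES sit in a separate section, and the ElasticsearchTemplate class does not process that section.

Code example:

QueryBuilder query = QueryBuilders.matchQuery("name", "tom");
SearchQuery searchQuery = new NativeSearchQueryBuilder()
        .withQuery(query)
        .withHighlightFields(new HighlightBuilder.Field("name"))
        .build();
List<ESDocument> publications = elasticsearchTemplate.queryForList(searchQuery, ESDocument.class);

I may be wrong, but I can't figure out how to do this with Spring Data ES alone. Could someone post an example showing how to get highlights using Spring Data ES?

Thanks in advance!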

4 Answers

6

I found a solution in the test cases of Spring Data Elasticsearch:

This may help.

@Test
public void shouldReturnHighlightedFieldsForGivenQueryAndFields() {

    //given
    String documentId = randomNumeric(5);
    String actualMessage = "some test message";
    String highlightedMessage = "some <em>test</em> message";

    SampleEntity sampleEntity = SampleEntity.builder().id(documentId)
            .message(actualMessage)
            .version(System.currentTimeMillis()).build();

    IndexQuery indexQuery = getIndexQuery(sampleEntity);

    elasticsearchTemplate.index(indexQuery);
    elasticsearchTemplate.refresh(SampleEntity.class);

    SearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withQuery(termQuery("message", "test"))
            .withHighlightFields(new HighlightBuilder.Field("message"))
            .build();

    Page<SampleEntity> sampleEntities = elasticsearchTemplate.queryForPage(searchQuery, SampleEntity.class, new SearchResultMapper() {
        @Override
        public <T> Page<T> mapResults(SearchResponse response, Class<T> clazz, Pageable pageable) {
            List<SampleEntity> chunk = new ArrayList<SampleEntity>();
            for (SearchHit searchHit : response.getHits()) {
                if (response.getHits().getHits().length <= 0) {
                    return null;
                }
                SampleEntity user = new SampleEntity();
                user.setId(searchHit.getId());
                user.setMessage((String) searchHit.getSource().get("message"));
                user.setHighlightedMessage(searchHit.getHighlightFields().get("message").fragments()[0].toString());
                chunk.add(user);
            }
            if (chunk.size() > 0) {
                return new PageImpl<T>((List<T>) chunk);
            }
            return null;
        }
    });

    assertThat(sampleEntities.getContent().get(0).getHighlightedMessage(), is(highlightedMessage));
}

4

Spring Data Elasticsearch 4.0 now has the SearchPage result type, which makes things easier when we need to return highlighted results:

Here is a working example:

    String query = "(id:123 OR id:456) AND (database:UCLF) AND (services:(sealer?), services:electronic*)";
    
    NativeSearchQuery searchQuery = new NativeSearchQueryBuilder()
            .withPageable(pageable)
            .withQuery(queryStringQuery(query))
            .withSourceFilter(sourceFilter)
            .withHighlightFields(new HighlightBuilder.Field("goodsAndServices"))
            .build();
    
    
    SearchHits<Trademark> searchHits = template.search(searchQuery, Trademark.class, IndexCoordinates.of("trademark"));
    SearchPage<Trademark> page = SearchHitSupport.searchPageFor(searchHits, searchQuery.getPageable());
    return (Page<Trademark>) SearchHitSupport.unwrapSearchHits(page);

This is the JSON response from the page object:
{
    "content": [
        {
            "id": "123",
            "score": 12.10748,
            "sortValues": [],
            "content": {
                "_id": "1P0XzXIBdRyrchmFplEA",
                "trademarkIdentifier": "abc234",
                "goodsAndServices": null,
                "language": "EN",
                "niceClass": "2",
                "sequence": null,
                "database": "UCLF",
                "taggedResult": null
            },
            "highlightFields": {
                "goodsAndServices": [
                    "VARNISHES, <em>SEALERS</em>, AND NATURAL WOOD FINISHES"
                ]
            }
        }
    ],
    "pageable": {
        "sort": {
            "unsorted": true,
            "sorted": false,
            "empty": true
        },
        "offset": 0,
        "pageNumber": 0,
        "pageSize": 20,
        "unpaged": false,
        "paged": true
    },
    "searchHits": {
        "totalHits": 1,
        "totalHitsRelation": "EQUAL_TO",
        "maxScore": 12.10748,
        "scrollId": null,
        "searchHits": [
            {
                "id": "123",
                "score": 12.10748,
                "sortValues": [],
                "content": {
                    "_id": "1P0XzXIBdRyrchmFplEA",
                    "trademarkIdentifier": "abc234",
                    "goodsAndServices": null,
                    "language": "EN",
                    "niceClass": "2",
                    "sequence": null,
                    "database": "UCLF",
                    "taggedResult": null
                },
                "highlightFields": {
                    "goodsAndServices": [
                        "VARNISHES, <em>SEALERS</em>, AND NATURAL WOOD FINISHES"
                    ]
                }
            }
        ],
        "aggregations": null,
        "empty": false
    },
    "totalPages": 1,
    "totalElements": 1,
    "size": 20,
    "number": 0,
    "numberOfElements": 1,
    "last": true,
    "first": true,
    "sort": {
        "unsorted": true,
        "sorted": false,
        "empty": true
    },
    "empty": false
}

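Why unwrap anything? The SearchHit class has methods for accessing the highlight information directly. - P.J.Meisch
The API surface looks ugly... you can see that the returned JSON contains the page data twice. - Eric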
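
Following up on the comment about reading highlights straight from the hit: in the JSON above, each hit carries a highlightFields map from field name to a list of fragments, and a field that did not match has no entry at all. The first answer's mapper, by contrast, calls get("message").fragments()[0] without a check, which would fail for a hit that has no highlight on that field. Below is a minimal plain-Java sketch of picking the first fragment with a fallback. It assumes only the map shape shown in the JSON; the class HighlightPicker and the method firstHighlightOrDefault are hypothetical names and not part of Spring Data Elasticsearch.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Spring Data class): models highlightFields as
// a plain Map<String, List<String>>, the shape seen in the JSON response.
public class HighlightPicker {

    // Returns the first highlighted fragment for the field, or the fallback
    // when the map is null, the field is absent, or it has no fragments.
    static String firstHighlightOrDefault(Map<String, List<String>> highlightFields,
                                          String field,
                                          String fallback) {
        if (highlightFields == null) {
            return fallback;
        }
        List<String> fragments = highlightFields.getOrDefault(field, Collections.emptyList());
        return fragments.isEmpty() ? fallback : fragments.get(0);
    }

    public static void main(String[] args) {
        Map<String, List<String>> highlightFields = Map.of(
                "goodsAndServices",
                List.of("VARNISHES, <em>SEALERS</em>, AND NATURAL WOOD FINISHES"));

        // Field with a highlight: the fragment is used.
        System.out.println(firstHighlightOrDefault(highlightFields, "goodsAndServices", "(no highlight)"));
        // Field without a highlight: the fallback (e.g. the stored value) is used.
        System.out.println(firstHighlightOrDefault(highlightFields, "language", "EN"));
    }
}
```

In Spring Data Elasticsearch 4.x the map would presumably come from each SearchHit (for example via its getHighlightFields() accessor); that wiring is left out so the sketch runs without the library on the classpath.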

2

Actually, you can do this with a custom ResultsExtractor:

QueryBuilder query = QueryBuilders.matchQuery("name", "tom");
SearchQuery searchQuery = new NativeSearchQueryBuilder()
        .withQuery(query)
        .withHighlightFields(new HighlightBuilder.Field("name"))
        .build();
// searchQuery is already built, so it is passed as-is
return elasticsearchTemplate.query(searchQuery, new CustomResultExtractor());

Then:

public class CustomResultExtractor implements ResultsExtractor<List<MyClass>> {

private final DefaultEntityMapper defaultEntityMapper;

public CustomResultExtractor() {
    defaultEntityMapper = new DefaultEntityMapper();
}


@Override
public List<MyClass> extract(SearchResponse response) {
    return StreamSupport.stream(response.getHits().spliterator(), false) 
        .map(this::searchHitToMyClass) 
        .collect(Collectors.toList());
}

private MyClass searchHitToMyClass(SearchHit searchHit) {
    MyElasticSearchObject myObject;
    try {
        myObject = defaultEntityMapper.mapToObject(searchHit.getSourceAsString(), MyElasticSearchObject.class);
    } catch (IOException e) {
        throw new ElasticsearchException("failed to map source [ " + searchHit.getSourceAsString() + "] to class " + MyElasticSearchObject.class.getSimpleName(), e);
    }
    List<String> highlights = searchHit.getHighlightFields().values()
        .stream() 
        .flatMap(highlightField -> Arrays.stream(highlightField.fragments())) 
        .map(Text::string) 
        .collect(Collectors.toList());
    // Or whatever you want to do with the highlights
    return new MyClass(myObject, highlights);
}
}

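Note that I used a List, but you can use any other iterable data structure. You can also do something else with the highlights; here I simply collect them into a list.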


1

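The first answer (https://dev59.com/BZbfa4cB1Zd3GeqPycVQ#37163711) does work, but I found pagination problems in the results it returns: the total element count and the total page count are wrong. After looking at the implementation of DefaultResultMapper, the mapper should instead return new AggregatedPageImpl((List<T>) chunk, pageable, totalHits, response.getAggregations(), response.getScrollId(), maxScore); after that, paging works. Hope this helps!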
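
To make the pagination problem concrete: when a page is built from the current chunk alone, the only total it can report is the chunk size, so a search with more hits than one page still looks like a single page. The following plain-Java sketch only reproduces that arithmetic. The class PageTotals and the sample numbers (45 hits, page size 20) are made up for illustration, and nothing here calls Spring Data itself.

```java
// Hypothetical helper (not a Spring Data class): mirrors the page-count
// arithmetic so the effect of a missing total is easy to see.
public class PageTotals {

    // Number of pages needed for totalElements items at pageSize per page
    // (ceiling division).
    static int totalPages(long totalElements, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (int) ((totalElements + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        int pageSize = 20;     // assumed page size
        int chunkSize = 20;    // hits mapped for the current page
        long totalHits = 45;   // assumed total reported by the search response

        // Total taken from the chunk alone: looks like one page.
        System.out.println("pages from chunk size: " + totalPages(chunkSize, pageSize));
        // Total taken from the search response: the real page count.
        System.out.println("pages from total hits: " + totalPages(totalHits, pageSize));
    }
}
```

This is why the mapper needs to be given both the pageable and the total hit count from the response rather than just the list of mapped hits.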

