Elasticsearch Deep Pagination: search_after

Anyone who uses Elasticsearch for storing or searching data will sooner or later need pagination. It works like a Google search: the first page of results is shown, and below it are numbered links to the remaining pages, which we can click to move back and forth.
Anyone who has implemented pagination in Elasticsearch the traditional way, with the from and size parameters, knows that by default a query cannot reach beyond the first 10,000 records. For an overview of the different types of Elasticsearch pagination, see the article "Elasticsearch, how we paginated over 10 000 items". I will not repeat all of it here and will only give a brief summary, as the title suggests.
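
To make the 10,000-record limit concrete, here is a minimal sketch of the arithmetic behind it. It assumes the default index.max_result_window of 10,000 and that a from/size request is accepted only while from + size stays within that window; the class and method names are hypothetical and exist only for this illustration.

```java
// Sketch: why from/size pagination stops working for deep pages.
class FromSizeWindow {
    // Default value of the index.max_result_window setting (assumed unchanged).
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // Returns true when a from/size request stays inside the result window.
    static boolean fitsInWindow(int from, int size, int maxResultWindow) {
        return (long) from + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        int size = 20;
        // pageNumber is 1-based; from is the offset of the first hit on that page.
        for (int pageNumber : new int[] {1, 500, 501}) {
            int from = (pageNumber - 1) * size;
            boolean ok = fitsInWindow(from, size, DEFAULT_MAX_RESULT_WINDOW);
            System.out.println("page " + pageNumber + " (from=" + from + "): "
                    + (ok ? "ok" : "rejected"));
        }
    }
}
```

With 20 records per page, page 500 is the last one that fits (9980 + 20 = 10000), and page 501 already falls outside the window. This is the point where a cursor-style method such as search_after becomes necessary.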

Elasticsearch provides several mechanisms for paginating beyond 10,000 records. One of them is the search_after method, which in some contexts is a much better fit than the others (again, I will not go into why here).
But this method has a drawback: unlike from/size pagination, it cannot go back to previous pages or jump to an arbitrary page. Today we are going to solve moving to the previous page and the next page, in a word back-and-forth pagination, without from and size. It takes just a little tweak, and working code follows.


  1. With the search_after method, we cannot provide this parameter for the first query, because there is no previous hit to continue from.
  2. For each subsequent query, we take getSortValues() from the last hit of the previous response; it holds the search_after value for the next call.
  3. For forward pagination, we sort the data in descending order and pass that search_after value each time.
  4. For backward pagination, we simply reverse the sort order in the query, and our purpose is served.
Here is the code snippet:



        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                       // new HttpHost("es1", 9200, "http"), //if you have es cluster
                      //  new HttpHost("es2", 9200, "http"),
                        new HttpHost("127.0.0.1", 9200, "http")));

        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        if (value > 0) { //after the first call: continue from the sort value of the previous page
            searchSourceBuilder.searchAfter(new Object[]{value});
        } else { //first query: there is no search_after value yet, so start from the top
            searchSourceBuilder.from(0);
        }

        searchSourceBuilder.size(3); //size per request/page
        if (direction == 1) //previous page requested: reverse the sort order (hits come back oldest first)
            searchSourceBuilder.sort("createTimeInMs", SortOrder.ASC);
        else //next page requested, one after another: newest first
            searchSourceBuilder.sort("createTimeInMs", SortOrder.DESC);
        searchSourceBuilder.query(QueryBuilders.termQuery("search_field", "28")); //field we are querying with

        SearchRequest searchRequest = new SearchRequest("my_index");
        searchRequest.types("my_type"); //mapping types are deprecated since Elasticsearch 7.0; omit this line on 7.x and later
        searchRequest.source(searchSourceBuilder);

        Object pos = 0; //initial position of ES record pointer
        SearchAfterResponse response = new SearchAfterResponse(); //declared before try so the final return can see it

        try
        {
            SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
            SearchHits hits = searchResponse.getHits();

            if (hits.getHits().length > 0)
                pos = hits.getAt(hits.getHits().length - 1).getSortValues(); //sort values of the last hit (an Object[], one entry per sort field)

            List<MyBean> dataList = new ArrayList<MyBean>();
            hits.forEach(hit -> dataList.add(JSON.parseObject(hit.getSourceAsString(), (Type) MyBean.class)));

            response.setData(dataList);
            response.setSearchNext(pos);
            response.setTotalHits(hits.getTotalHits());
        }
        catch (Exception e)
        {
            logger.error("after err:" + e.getMessage());
        }
        finally
        {
            try { client.close(); } //release the client even when the search fails
            catch (Exception e) { logger.error("close err:" + e.getMessage()); }
        }
        return ResponseEntity.ok(response); //on error this is an empty response

I have omitted some boilerplate code for brevity and explained each step in the comments inside the snippet, so the code should be self-explanatory. Hope this helps.
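
To see how the cursor moves, the idea can be modelled without a cluster. The following is a minimal in-memory sketch, not the Elasticsearch client API: a list of long values stands in for the createTimeInMs field, and the class and method names (SearchAfterDemo, search, nextPage, previousPage) are hypothetical. It assumes the sort values are unique; with duplicate timestamps a second, unique sort field would be needed as a tiebreaker. Note one deliberate difference from the snippet above: when going backward, this sketch takes the cursor from the first hit of the current page and reverses the returned hits, which is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// In-memory model of search_after over a single numeric sort field.
class SearchAfterDemo {
    // Mimics one request: sort, skip everything up to and including the cursor,
    // and return at most `size` values. A null cursor means "first query".
    static List<Long> search(List<Long> index, Long after, boolean ascending, int size) {
        List<Long> sorted = new ArrayList<>(index);
        sorted.sort(ascending ? Comparator.<Long>naturalOrder() : Comparator.<Long>reverseOrder());
        List<Long> page = new ArrayList<>();
        for (Long v : sorted) {
            boolean isAfterCursor = after == null || (ascending ? v > after : v < after);
            if (isAfterCursor && page.size() < size) {
                page.add(v);
            }
        }
        return page;
    }

    // Next page: descending sort, cursor = sort value of the LAST hit on the current page.
    static List<Long> nextPage(List<Long> index, List<Long> current, int size) {
        Long cursor = current.isEmpty() ? null : current.get(current.size() - 1);
        return search(index, cursor, false, size);
    }

    // Previous page: ascending sort, cursor = sort value of the FIRST hit on the current page.
    // The hits are reversed afterwards so they display newest first like every other page.
    static List<Long> previousPage(List<Long> index, List<Long> current, int size) {
        if (current.isEmpty()) {
            return search(index, null, false, size); // nothing to go back from: show the first page
        }
        List<Long> page = search(index, current.get(0), true, size);
        Collections.reverse(page);
        return page;
    }

    public static void main(String[] args) {
        List<Long> index = Arrays.asList(10L, 20L, 30L, 40L, 50L, 60L, 70L);
        List<Long> page1 = search(index, null, false, 3);
        List<Long> page2 = nextPage(index, page1, 3);
        List<Long> back = previousPage(index, page2, 3);
        System.out.println("page 1: " + page1); // [70, 60, 50]
        System.out.println("page 2: " + page2); // [40, 30, 20]
        System.out.println("back:   " + back);  // [70, 60, 50]
    }
}
```

In this model, going back from page 2 with the last hit (20) as the cursor and an ascending sort would return 30, 40, 50, which overlaps page 2. That is why the sketch keeps track of the first hit's sort value as well. In a real response, both values are available through getSortValues() on the first and last hits.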
Recap:

  • for the previous page (backward), direction is 1; otherwise (forward), direction is 0
  • for the first call, set value=0; for subsequent calls, use the pos value from the previous response
This is how I solved my problem!
Happy coding!

Comments

  1. You can optimize the code; what I wrote here is just the conceptual code, which is tested and working.
