DynamoDB pagination - last evaluated key is not null on last page

KevKosDev :

I'm trying to implement pagination in DynamoDB with the Java SDK.

I have a simple data model with a HashKey id and a date as RangeKey. I want to query for all dates after a given one. This works so far, but the problem is the pagination part using the last evaluated key.

When querying the last page, the lastEvaluatedKey is not null; it still points to the last item of the last page queried. Another query with this key set as exclusiveStartKey then returns 0 results with a null lastEvaluatedKey.

My code looks like the following:

var query = new DynamoDBQueryExpression<DynamoModel>();
var keyCondition = ImmutableMap.<String, AttributeValue>builder()
        .put(":v_userid", new AttributeValue().withS(userId))
        .put(":v_date", new AttributeValue().withS(date.toString()))
        .build();

if (!StringUtils.isEmpty(lastKey)) {
    query.setExclusiveStartKey(ImmutableMap.<String, AttributeValue>builder()
            .put("userId", new AttributeValue().withS(userId))
            .put("date", new AttributeValue().withS(lastKey)).build());
}

query.withKeyConditionExpression("userId = :v_userid AND date >= :v_date");
query.withExpressionAttributeValues(keyCondition);
query.setLimit(2);

QueryResultPage<DynamoModel> resultPage = mapper.queryPage(DynamoModel.class, query);

Does anybody know why the lastEvaluatedKey is not null when reaching the last item matching the KeyCondition? When I only save items that match the condition, the LastEvaluatedKey is null as expected.

Matthew Pope :

This is the expected behavior of DynamoDB.

If LastEvaluatedKey is not empty, it does not necessarily mean that there is more data in the result set. The only way to know when you have reached the end of the result set is when LastEvaluatedKey is empty. (source)

This is a design decision by AWS. The most likely explanation I can think of is that in order to return a LastEvaluatedKey iff there are more items, they would have to keep scanning to find more items, and if you're using a filter expression, they might have to scan the rest of the partition to determine whether or not there are more items. It's a choice that helps to minimize the latency of the query (and scan) operation.
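
To make the behavior concrete, here is a minimal sketch in plain Java that imitates it without touching the AWS SDK. Everything in it (the PaginationSketch class, the Page type, queryPage and fetchAll) is hypothetical and only models the rule described above: the query stops as soon as the limit is reached and reports the last key it evaluated, without looking ahead. It is an illustration of the paging loop, not DynamoDB's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

class PaginationSketch {

    /** Hypothetical stand-in for a result page: items plus the key to resume from. */
    static final class Page {
        final List<String> items;
        final String lastEvaluatedKey; // null only when the end of the data was reached

        Page(List<String> items, String lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /**
     * Toy model of a query with a limit over sorted range keys.
     * It stops as soon as `limit` items are collected and does NOT peek
     * at what follows, so the returned key can be non-null on the last page.
     */
    static Page queryPage(List<String> sortedKeys, String exclusiveStartKey, int limit) {
        List<String> items = new ArrayList<>();
        for (String key : sortedKeys) {
            if (exclusiveStartKey != null && key.compareTo(exclusiveStartKey) <= 0) {
                continue; // already returned on an earlier page
            }
            items.add(key);
            if (items.size() == limit) {
                return new Page(items, key); // limit hit: report where we stopped
            }
        }
        return new Page(items, null); // ran out of data before hitting the limit
    }

    /** Keeps requesting pages until the key is null; empty pages are harmless. */
    static List<String> fetchAll(List<String> sortedKeys, int limit) {
        List<String> all = new ArrayList<>();
        String startKey = null;
        do {
            Page page = queryPage(sortedKeys, startKey, limit);
            all.addAll(page.items);
            startKey = page.lastEvaluatedKey;
        } while (startKey != null);
        return all;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04");
        String startKey = null;
        do {
            Page page = queryPage(keys, startKey, 2);
            System.out.println(page.items + " lastEvaluatedKey=" + page.lastEvaluatedKey);
            startKey = page.lastEvaluatedKey;
        } while (startKey != null);
    }
}
```

With four matching items and a limit of 2, the second page is full and still carries a key, and only the third, empty page returns null, which is the situation described in the question. The same loop shape should carry over to the mapper: read getLastEvaluatedKey() from each QueryResultPage, pass it to setExclusiveStartKey, and stop only when it is null.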
