This is a fully managed NoSQL database from AWS. I once used DynamoDB to store IoT data from electric motorcycles. It turned out to be quite expensive, because in that project we often pulled data from DynamoDB to build analytics for the motorcycles.
DynamoDB has a scan operation to retrieve data, but it is slow and costly on large tables because it reads every item in the table, even when only a few are needed. Instead of scan, AWS recommends using the query operation with a KeyConditionExpression, which reads only the items that match the key condition.
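To make the scan versus query difference concrete, here is a rough sketch of the read capacity arithmetic. It assumes the documented sizing rule: one read capacity unit covers up to 4 KB, and eventually consistent reads cost half. The table size and per-vin size are hypothetical numbers, not figures from my project.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # one read capacity unit covers up to 4 KB


def read_capacity_units(bytes_read: int, strongly_consistent: bool = False) -> float:
    """Estimate the RCUs consumed when a Scan or Query reads `bytes_read` bytes."""
    units = math.ceil(bytes_read / RCU_BLOCK_BYTES)
    # Eventually consistent reads (the default) cost half as much.
    return float(units) if strongly_consistent else units / 2


# Hypothetical sizes, for illustration only.
table_bytes = 500 * 1024 * 1024   # whole table: 500 MB
one_vin_bytes = 2 * 1024 * 1024   # items for one vin in the time range: 2 MB

scan_rcu = read_capacity_units(table_bytes)     # scan reads the whole table
query_rcu = read_capacity_units(one_vin_bytes)  # query reads only matching items
print(f"scan:  {scan_rcu:.0f} RCU")
print(f"query: {query_rcu:.0f} RCU")
```

A scan is billed for everything it reads, not for what survives a filter, so the gap grows with the table size.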
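However, KeyConditionExpression only works on key attributes: the partition key (and optional sort key) of the table itself or of an index. To query by attributes that are not part of the table's primary key, a Global Secondary Index (GSI) has to be set up first; without it, that query cannot run. A GSI works like a second copy of the table with a different key, and it also increases the cost because it has its own storage and write capacity.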
from boto3.dynamodb.conditions import Key
# `table` is a boto3 DynamoDB Table resource.
# Fetch one motorcycle's records in a time range, newest first.
response = table.query(
    KeyConditionExpression=Key('vin').eq(vin) &
        Key('timestamp').between(int(startTime), int(endTime)),
    ScanIndexForward=False)
The query operation also has a limit: it cannot return all the data in a single response. Each response holds at most 1 MB of data, so we need to implement pagination logic using LastEvaluatedKey.
# First page of results.
response = table.query(
    KeyConditionExpression=Key('vin').eq(vin) &
        Key('timestamp').between(int(startTime), int(endTime)),
    ScanIndexForward=False)
result = response['Items']
# A LastEvaluatedKey in the response means more pages remain.
while 'LastEvaluatedKey' in response:
    response = table.query(
        KeyConditionExpression=Key('vin').eq(vin) &
            Key('timestamp').between(int(startTime), int(endTime)),
        ScanIndexForward=False,
        # Resume right after the last item of the previous page.
        ExclusiveStartKey=response['LastEvaluatedKey'])
    result.extend(response['Items'])
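The loop above repeats the same query arguments for every page. One way to avoid that is a small generator that follows LastEvaluatedKey. This is only a sketch: `query_all_items` is a hypothetical helper name, and `table` is assumed to behave like a boto3 DynamoDB Table resource.

```python
from typing import Any, Dict, Iterator


def query_all_items(table: Any, **query_kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item matching a query, following LastEvaluatedKey across pages.

    Assumes `table.query(**kwargs)` returns a dict with 'Items' and,
    when more pages remain, 'LastEvaluatedKey' (as boto3's Table does).
    """
    kwargs = dict(query_kwargs)  # copy so the caller's arguments are not mutated
    while True:
        response = table.query(**kwargs)
        yield from response.get('Items', [])
        last_key = response.get('LastEvaluatedKey')
        if last_key is None:
            break
        # Resume the next page right after the last item DynamoDB evaluated.
        kwargs['ExclusiveStartKey'] = last_key
```

Called with the same KeyConditionExpression and ScanIndexForward arguments as the query above and wrapped in list(...), it should produce the same result list, and the caller can also stop early without fetching the remaining pages.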
In my case, I built an API using API Gateway integrated with a Lambda function, and the Lambda function used the query operation to retrieve the data.
We only used this service for about 3–4 months, because the cost went over our budget. In the end, we moved the incoming data to TimescaleDB instead.