I am trying to design a solution that avoids out-of-date writes to the database in a system that consumes events in any order.
I have Elasticsearch as the final database. Its documents contain a lot of properties, something like this:
{"property01": "value01","property02": "value02","property03": "value03","property04": "value04", ..."lastUpdate": "datetime"}
The lastUpdate field is always updated when the document changes.
However, I do not consume all the data at once. Some properties arrive in separate events, like this:
event01
{"key": "myKey","property01": "value01","property02": "value02","eventDate": "datetime"}
event02
{"key": "myKey","property03": "value03","property04": "value04","eventDate": "datetime"}
- We fetch a batch of those JSON events at once, usually 10.
- We are using AWS Lambda to process them, so multiple Lambdas can be fetching data at the same time.
- The queue can contain two (or more) events for the same event+key, in a different order from the one in which they were produced.
What I thought
Use Redis to store the key of each event together with its eventDate. Pseudocode:
var newEvent = {"key": "myKey","property03": "valuex","property04": "valuex","eventDate": "2022-08","eventName": "event01"}
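    // Pseudocode: "redis" stands for any Redis client; Event is a simple holder for the parsed JSON.
    class Event { String key; String eventName; String eventDate; }

    class ConcurrencyOperation {
        List<Event> toCommit = new ArrayList<>();
        List<Event> toDiscard = new ArrayList<>();
    }

    public ConcurrencyOperation check(List<Event> events) {
        var concurrencyOperation = new ConcurrencyOperation();
        for (Event newEvent : events) {
            var currentEventDate = newEvent.eventDate;
            var currentEventKey = newEvent.eventName + "-" + newEvent.key;
            // get the date of the last event seen for this eventName + key
            var lastEventKeyDate = redis.get(currentEventKey);
            // assumes ISO-8601 date strings, which sort chronologically as text
            if (lastEventKeyDate == null || currentEventDate.compareTo(lastEventKeyDate) > 0) {
                redis.set(currentEventKey, currentEventDate);
                concurrencyOperation.toCommit.add(newEvent);
            } else {
                concurrencyOperation.toDiscard.add(newEvent);
            }
        }
        return concurrencyOperation;
    }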
I know that doing a get, comparing, and then doing a set is open to race conditions: two Lambdas can read the same old date and both decide to write.
I really need to "lock" using the pattern eventKey, so I can check by event+key.
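To make clear what I mean by an atomic check per event+key, here is a minimal sketch that uses an in-memory ConcurrentHashMap as a stand-in for Redis. The class LastEventDateStore and the method tryAccept are hypothetical names I made up for illustration, and the sketch assumes ISO-8601 date strings. With Redis, the same compare-and-set would still have to be made atomic somehow (a lock on the key, or doing the comparison on the server side), which is exactly the part I am unsure about.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class LastEventDateStore {
    // In-memory stand-in for Redis: maps "eventName-key" to the newest eventDate seen.
    private final ConcurrentMap<String, String> lastDates = new ConcurrentHashMap<>();

    /**
     * Atomically records eventDate if it is newer than the stored one.
     * Returns true if the event should be committed, false if it should be discarded.
     * Assumes ISO-8601 date strings, which sort chronologically as text.
     */
    public boolean tryAccept(String eventName, String key, String eventDate) {
        String storeKey = eventName + "-" + key;
        boolean[] accepted = {false};
        // compute() runs the comparison and the update as one atomic step per key.
        lastDates.compute(storeKey, (k, stored) -> {
            if (stored == null || eventDate.compareTo(stored) > 0) {
                accepted[0] = true;
                return eventDate;
            }
            return stored; // keep the newer date already stored
        });
        return accepted[0];
    }
}
```

With this, each incoming event would go to toCommit when tryAccept returns true and to toDiscard otherwise. An event with the same date as the stored one is discarded, matching the strict comparison in my pseudocode.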
Question
Is it better to use Redis with a lock? Is it possible to do something like this to discard out-of-order events while still processing in parallel?
Note: I would like to use Java because it is the most common language for us, and we can reuse the solution. Java + Redisson.