Tuesday, 29 April 2025

Restricting concurrent updates

Introduction

Just jotting down some thoughts about what might be involved in addressing an issue I faced in my last project.

The problematic situation

Updates from multiple sources arrive concurrently and are picked up for inclusion in an aggregated representation of the data.
There are multiple worker processes, each with multiple worker threads, that pick up changes and apply them without any awareness of what other work is underway.

Potential solution approaches

Debouncing of updates

Redis has already been used successfully as a store for tracking the identifiers of records that are currently being processed, which reduced race conditions for updates of some attributes, so the same pattern could be applied more broadly.
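
As a rough sketch of what that pattern might look like with a redis-py client: the key prefix, the 30-second expiry and the process_update function below are illustrative assumptions, not the actual implementation from the project.

import redis

r = redis.Redis(host="localhost", port=6379)

def try_process(record_id: str) -> bool:
    # SET NX acts as a lightweight claim: only one worker can set the key.
    # The expiry guards against a crashed worker holding the claim forever.
    claimed = r.set(f"processing:{record_id}", "1", nx=True, ex=30)
    if not claimed:
        # Another worker is already handling this record; skip or requeue.
        return False
    try:
        process_update(record_id)  # placeholder for the real update logic
        return True
    finally:
        # Release the claim so later updates for this identifier can proceed.
        r.delete(f"processing:{record_id}")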

Partitioning workload processing

This would most likely mean switching to a different message processing technology so that workers can be isolated and only one update per identifier is in flight at a time.
If Kafka were used here, we would need to pay more attention to how keys are balanced across partitions to preserve throughput scalability.
To benefit from the change, processing would probably also need to move to a single thread per partition so that records sharing an identifier are never updated concurrently.
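
To illustrate the idea, here is a minimal sketch using kafka-python, keying each update by its record identifier so that all updates for the same record land in the same partition and are consumed in order. The topic name, consumer group and handle_update function are assumptions made purely for illustration.

from kafka import KafkaProducer, KafkaConsumer

# Producer side: keying by record identifier routes every update for the same
# record to the same partition, so they arrive in order at a single consumer.
producer = KafkaProducer(bootstrap_servers="localhost:9092")

def publish_update(record_id: str, payload: bytes) -> None:
    producer.send("record-updates", key=record_id.encode(), value=payload)

# Consumer side: one consumer (with a single processing thread) per partition,
# so updates that share an identifier are never applied concurrently.
def consume_updates() -> None:
    consumer = KafkaConsumer(
        "record-updates",
        bootstrap_servers="localhost:9092",
        group_id="aggregator",
    )
    for message in consumer:
        handle_update(message.key, message.value)  # placeholder for real logic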
In my opinion, this would be more trouble than it is worth.
