With steady traffic (even at rates as low as 20 r/s), we see exponentially growing synchronization delays for inline updates. The updates are not lost, but they can take minutes to reach all clients.
Our infrastructure runs on Azure App Service, which hosts both the OPAL Server and the OPAL Client/OPA; Azure Managed Redis provides the broadcast channel.
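For reference, the broadcast channel is plain Redis pub/sub under the hood. A minimal round-trip check against the same instance looks like this (a sketch: the URI and channel name are placeholders, not values from our deployment):

```python
# Minimal pub/sub round-trip check against the broadcast Redis instance.
# Placeholder URI and channel name; OPAL's real channel naming is internal
# to its broadcaster library.
import time
import redis

r = redis.Redis.from_url("rediss://:<access-key>@<host>:<port>")

sub = r.pubsub()
sub.subscribe("opal-latency-test")
sub.get_message(timeout=1)  # consume the subscribe confirmation

start = time.monotonic()
r.publish("opal-latency-test", b"ping")
for msg in sub.listen():
    if msg["type"] == "message":
        print(f"pub/sub round-trip: {(time.monotonic() - start) * 1000:.1f} ms")
        break
```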
Observations
- Increase the update frequency above 20 r/s (a simplified load-generator sketch follows this list).
- Synchronization on all clients slows down to ~5 to 15 updates per second, even though updates arrive faster than that.
- This means the queue grows faster than updates are processed.
- After the traffic drops significantly (e.g., below 1 r/s), the consumers keep synchronizing at this steady slow rate for some time.
- Then, at a seemingly random point (often after about a minute), they speed up to >300 updates/second and drain the queue within seconds.
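The load generator, simplified to its essence (the URL, topic, and dst_path are placeholders, not our real payloads; it publishes inline data updates through the OPAL server's /data/config endpoint at a fixed rate):

```python
# Simplified load generator: publish inline data updates to the OPAL
# server at RATE updates per second. All payload values are placeholders.
import time
import requests

OPAL_SERVER = "https://<opal-server-app>.azurewebsites.net"  # placeholder
RATE = 20  # updates per second

update = {
    "entries": [{
        "url": "https://example.com/users.json",  # placeholder data source
        "topics": ["policy_data"],                # OPAL's default data topic
        "dst_path": "/users",
    }]
}

while True:
    start = time.monotonic()
    # Add an Authorization: Bearer header here if the server enforces
    # OPAL's token auth.
    requests.post(f"{OPAL_SERVER}/data/config", json=update, timeout=5)
    # Sleep off whatever remains of the 1/RATE interval.
    time.sleep(max(0.0, 1.0 / RATE - (time.monotonic() - start)))
```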
We've tried many configuration changes, such as varying the number of workers, but nothing makes a noticeable difference. The behavior is 100% reproducible and always unfolds exactly as described.
There's no spike in CPU or memory at any point.

Any hints about what the issue could be?
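For context, this is roughly how we observe the delays (simplified; the document path matches the placeholder dst_path in the load-generator sketch above): push an update, then poll the OPA instance next to a client until the document changes.

```python
# Simplified delay check: poll the OPA instance that sits next to the
# OPAL client and log every time the document changes. The /users path
# matches the placeholder dst_path used by the load generator above.
import time
import requests

OPA_URL = "http://localhost:8181"  # OPA next to each OPAL client

last_seen = None
while True:
    resp = requests.get(f"{OPA_URL}/v1/data/users", timeout=2)
    value = resp.json().get("result")
    if value != last_seen:
        print(f"{time.time():.3f} document changed")
        last_seen = value
    time.sleep(0.05)
```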