Description
The state manager addresses the challenges of using db-migrate in an automated and integrated fashion. It solves concurrency problems and makes it possible to continue from an aborted state, or to roll it back, without the usual pain of working with a database whose DDL is exclusively non-transactional, such as MySQL.
Tasks
- State management basis
- Read and rollback from state
- Tick as a check-in for ongoing activity
- Authenticator for node selection on concurrency
Implementation details
The state manager is essentially a JSON document exchanged over the most obvious communication channel that already exists: the database being migrated. More options could be added in the future, but that is it for now. The document will be stored in a simple blob column, or more precisely a text column, and will contain either a JSON or a jsonic payload.
The lock acquisition will happen through one of the following scenarios:
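As a rough illustration of what such a document could look like, here is a minimal sketch of a state type with helpers to read and write it as text. The field names (`lockHolder`, `lastTick`, `applied`, `inProgress`) and both function names are assumptions made for this example; they are not taken from db-migrate.

```typescript
// Hypothetical shape of the state document. None of these field names
// come from db-migrate; they only illustrate what the text column could hold.
interface MigrationState {
  lockHolder: string | null; // id of the node currently holding the lock
  lastTick: string | null;   // ISO timestamp of the last check-in
  applied: string[];         // migrations fully applied in this run
  inProgress: string | null; // migration that was running at the last write
}

// Serialize the state for storage in the text column.
function serializeState(state: MigrationState): string {
  return JSON.stringify(state);
}

// Parse the column content; an empty or missing value yields a fresh state.
function parseState(text: string | null): MigrationState {
  if (text === null || text.trim() === "") {
    return { lockHolder: null, lastTick: null, applied: [], inProgress: null };
  }
  const raw = JSON.parse(text) as Partial<MigrationState>;
  return {
    lockHolder: raw.lockHolder ?? null,
    lastTick: raw.lastTick ?? null,
    applied: raw.applied ?? [],
    inProgress: raw.inProgress ?? null,
  };
}
```

Recording `inProgress` separately from `applied` is what would let a later run decide whether to continue from an aborted state or roll it back.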
Scenario 1 - Update WHERE or CAS
Update exclusively on an entry that has no current lock, or whose current lock has explicitly expired.
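The conditional update can be sketched with an in-memory stand-in for the row. The SQL in the comment, the table and column names, and the `casAcquire` helper are all hypothetical; the point is only that the update reports zero affected rows when another node holds a live lock.

```typescript
// In-memory stand-in for the lock row (column names are assumptions).
interface LockRow {
  holder: string | null;
  expiresAt: number; // epoch milliseconds
}

// Mimics a conditional update such as:
//   UPDATE state SET holder = :node, expires_at = :now + :ttl
//   WHERE holder IS NULL OR expires_at <= :now
// Returns the number of affected rows (0 or 1), as a driver would report it.
function casAcquire(row: LockRow, node: string, now: number, ttlMs: number): number {
  const free = row.holder === null || row.expiresAt <= now;
  if (!free) {
    return 0; // someone else holds a lock that has not expired
  }
  row.holder = node;
  row.expiresAt = now + ttlMs;
  return 1;
}

// Usage: the second node only succeeds once the first lock has expired.
const demoRow: LockRow = { holder: null, expiresAt: 0 };
const firstTry = casAcquire(demoRow, "node-a", 1000, 30000);   // 1
const secondTry = casAcquire(demoRow, "node-b", 2000, 30000);  // 0
const afterExpiry = casAcquire(demoRow, "node-b", 31000, 30000); // 1
```

In a real database the check and the write happen atomically inside the single UPDATE statement, which is what makes this scenario safe.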
Scenario 2 - Randomized node name, update and check
We generate a randomized node name, or possibly use a user-provided key for the current node. In the latter case the user is responsible for ensuring that the key is not used by any other node.
Next we simply update the column and refetch it after a short timeout to make sure that we, and not some other node, got the lock. This strategy is not entirely safe and should only be used where neither CAS nor an UPDATE with a WHERE clause is available to guarantee that only a single update is made.
An alternative would be for the user to specify which node may be given the lock.
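The write-then-verify idea from this scenario can be sketched against a tiny in-memory store. `BlobStore`, `makeMemoryStore`, `randomNodeName`, `claimLock` and `holdsLock` are hypothetical names for this example, and the short timeout between claiming and verifying is left to the caller.

```typescript
// Minimal store abstraction standing in for the single text column.
interface BlobStore {
  read(): string | null;
  write(value: string): void;
}

function makeMemoryStore(): BlobStore {
  let value: string | null = null;
  return {
    read: () => value,
    write: (v: string) => { value = v; },
  };
}

// Randomized node name; a user-provided key could be used instead.
function randomNodeName(): string {
  return "node-" + Math.random().toString(36).slice(2);
}

// Step 1: blindly write our name into the column.
function claimLock(store: BlobStore, nodeName: string): void {
  store.write(nodeName);
}

// Step 2 (after a short timeout): refetch and check that our name survived.
function holdsLock(store: BlobStore, nodeName: string): boolean {
  return store.read() === nodeName;
}

// Usage: two nodes race; the last writer wins and the other backs off.
const raceStore = makeMemoryStore();
claimLock(raceStore, "node-a");
claimLock(raceStore, "node-b");
const aHolds = holdsLock(raceStore, "node-a"); // false
const bHolds = holdsLock(raceStore, "node-b"); // true
```

The sketch also shows why the strategy is not entirely safe: if a node verifies before a competing write lands, both nodes can briefly believe they hold the lock.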
Decided implementation
To decide which worker node executes the migrations, every node will generate a UUID and check the current lock state. If there has been no activity on the current lock for 30 seconds, its holder is assumed dead. The node then inserts a lock request with its UUID and the execution date (database time). After inserting, it retrieves the lock requests again and checks whether there is more than one. If there are several requests that cannot be differentiated by date, the one with the smallest UUID takes the lock and the others withdraw their requests. The lock holder starts the normal process and begins ticking; the failed requesters go back to the watching state.
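The liveness check and the tie-break described above can be sketched as two small pure functions. The names `LockRequest`, `lockIsDead` and `electWinner` are hypothetical, and the sketch assumes that when dates do differ, the earliest request wins; the description only states the rule for equal dates.

```typescript
// One row per lock request (field names are assumptions).
interface LockRequest {
  nodeId: string;      // UUID generated by the requesting node
  requestedAt: number; // execution date in database time, epoch milliseconds
}

const LOCK_TIMEOUT_MS = 30000; // 30 seconds without a tick means "dead"

// A lock with no tick at all, or none for 30 seconds, is considered dead.
function lockIsDead(lastTick: number | null, dbNow: number): boolean {
  return lastTick === null || dbNow - lastTick >= LOCK_TIMEOUT_MS;
}

// Pick the winner: earliest request first (assumption), and on equal dates
// the lexicographically smallest UUID. Returns null if there are no requests.
function electWinner(requests: LockRequest[]): string | null {
  let winner: LockRequest | null = null;
  for (const r of requests) {
    if (
      winner === null ||
      r.requestedAt < winner.requestedAt ||
      (r.requestedAt === winner.requestedAt && r.nodeId < winner.nodeId)
    ) {
      winner = r;
    }
  }
  return winner === null ? null : winner.nodeId;
}
```

Because every node evaluates the same deterministic rule over the same set of requests, each one can decide locally whether it won or should return to the watching state.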
Relevant Issues
Refers to #464