docs: add ADR for course authoring automatic migration #251
BryanttV wants to merge 1 commit into openedx:main
rodmgwgu left a comment:
Looking good, just some comments, thanks!
- ``authz_rollback_course_authoring`` (rollback migration)

In `ADR 0011`_ and `ADR 0010`_ it was established that migration must occur automatically when
the feature flag ``authz.enable_course_authoring`` changes state, but they deferred the definition of

nit: "but the definition of the specific mechanism was deferred"
.. code:: python

    ENABLE_AUTOMATIC_COURSE_AUTHORING_MIGRATION = False

I like this approach; however, I'm wondering if we should be more specific in the setting name so it's clear that it relates to AuthZ, like ENABLE_AUTOMATIC_AUTHZ_COURSE_AUTHORING_MIGRATION.
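For context, a minimal sketch of how an opt-in setting like this could gate the automation. The helper name `automatic_migration_enabled` is hypothetical, and a plain `SimpleNamespace` stands in for the Django settings object; whichever setting name is finally chosen, the point is that a missing setting is treated as disabled.

```python
from types import SimpleNamespace

# Setting name as proposed in the ADR diff; the suggested rename would only change this string.
SETTING_NAME = "ENABLE_AUTOMATIC_COURSE_AUTHORING_MIGRATION"


def automatic_migration_enabled(settings) -> bool:
    """Return True only when the operator has explicitly opted in (hypothetical helper)."""
    # A missing setting falls back to False, matching the default shown in the diff.
    return bool(getattr(settings, SETTING_NAME, False))


# Stand-ins for a settings object without and with the opt-in.
default_settings = SimpleNamespace()
opted_in_settings = SimpleNamespace(ENABLE_AUTOMATIC_COURSE_AUTHORING_MIGRATION=True)
```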
.. code:: python

    class CourseAuthoringMigrationRun(models.Model):

Same here, should we be more specific like "AuthzCourseAuthoringMigrationRun"?
    lock_key = f"authz_migration:{scope_type}:{scope_key}"

The lock is acquired using ``cache.add()``, which is an atomic operation. The default TTL

What happens when there is a lock and we change the flag again? Would the change enter a queue and execute when the lock is freed?
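To make the question concrete, here is a minimal sketch of add-based locking. `InMemoryCache` is a stand-in that only mimics the "set if absent, return whether it was set" behaviour of a cache `add()` call, and `try_migrate`, the `"locked"` value, and the 3600-second TTL are assumptions for illustration. In this reading, a second attempt made while the lock is held is skipped rather than queued; the excerpt of the ADR quoted here does not confirm which behaviour is intended.

```python
import threading
import time


class InMemoryCache:
    """Stand-in for a cache backend; mimics only add() and delete()."""

    def __init__(self):
        self._data = {}
        self._guard = threading.Lock()

    def add(self, key, value, timeout):
        """Store the key only if it is absent or expired; return True when stored."""
        now = time.monotonic()
        with self._guard:
            entry = self._data.get(key)
            if entry is not None and entry[1] > now:
                return False
            self._data[key] = (value, now + timeout)
            return True

    def delete(self, key):
        with self._guard:
            self._data.pop(key, None)


def try_migrate(cache, scope_type, scope_key, migrate, ttl=3600):
    """Run migrate() under a per-scope lock; skip if the lock is already held."""
    lock_key = f"authz_migration:{scope_type}:{scope_key}"
    if not cache.add(lock_key, "locked", ttl):
        return "skipped"  # assumption: no queueing, the attempt is dropped
    try:
        migrate()
        return "completed"
    finally:
        # Release the lock even when the migration raises.
        cache.delete(lock_key)
```

If queueing is the desired behaviour instead, the skipped branch is where a retry or re-enqueue would have to be added.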
- **No concurrency protection**: Multiple concurrent flag changes can trigger overlapping
  migrations, leading to race conditions and data corruption.

This is a bit contradictory, given that we currently don't have this automation and the ADR proposes it. This concurrency issue arises only with the automation.
the feature flag ``authz.enable_course_authoring`` changes state, but they deferred the definition of
the specific mechanism. This ADR addresses that gap.

The current manual approach presents the following risks:

I want to make sure I understand the need for a mechanism other than a one-time migration or a more controlled migration, such as an on-demand job that operators could use. I'm thinking of a mechanism like the one in forums V2, when the storage backend is changed:
- If the flag is on during initialization (tutor), then the migration is executed
- If not, then not much happens
- If there's a failure during the migration, then it automatically rolls back
Why is this not an acceptable solution, given that it directly impacts operators and that they can manage this kind of controlled migration better than in a live environment?
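The three bullets above can be sketched as a small control flow. This is only an illustration of the alternative being described: `migrate_on_init` and its return strings are hypothetical, and `forward` and `rollback` are placeholders for whatever migration and rollback routines an operator would actually run.

```python
def migrate_on_init(flag_enabled, forward, rollback):
    """Run the forward migration once at initialization; roll back on failure."""
    if not flag_enabled:
        # Flag is off during initialization: nothing to do.
        return "not_run"
    try:
        forward()
    except Exception:
        # Any failure in the forward migration triggers an automatic rollback.
        rollback()
        return "rolled_back"
    return "migrated"
```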
The existing utility functions ``migrate_legacy_course_roles_to_authz`` and
``migrate_authz_to_legacy_course_roles`` will be modified to incorporate the locking strategy

I think these functions should stay agnostic of the locking strategy, no? Their only job is to migrate data like any other migration command. Maybe we can encapsulate the utility functions in another function that ensures these migrations run asynchronously and 1 at a time. What do you think?
nit: also, I don't think these should be utilities anymore, so I wonder whether we should move them elsewhere.

We could have a safe migration function that calls these utility functions and manages the locks and auditability, so these functions keep a single responsibility.
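A rough sketch of that separation, under stated assumptions: `SafeMigrationRunner` is a hypothetical name, the in-process set stands in for the cache lock, and the list stands in for an audit table. The migration callable is passed in untouched, so it never needs to know about locks or audit records.

```python
class SafeMigrationRunner:
    """Hypothetical wrapper that owns locking and audit; migrations stay agnostic."""

    def __init__(self):
        self._active = set()   # stand-in for the cache-based lock
        self.audit_log = []    # stand-in for an audit table

    def run(self, migration, scope_type, scope_key):
        """Call migration(scope_key) at most once at a time per scope."""
        lock_key = f"authz_migration:{scope_type}:{scope_key}"
        if lock_key in self._active:
            self.audit_log.append((lock_key, "skipped"))
            return False
        self._active.add(lock_key)
        self.audit_log.append((lock_key, "running"))
        try:
            migration(scope_key)
            self.audit_log.append((lock_key, "completed"))
            return True
        finally:
            # The lock is released whether or not the migration succeeded.
            self._active.discard(lock_key)
```

With this shape, the existing migration functions could be handed to `run` as the `migration` argument, assuming they accept a scope key; their real signatures are not shown in this thread.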
2. Migration Trigger (Django Signals)
-------------------------------------

``pre_save`` signal handlers are attached to ``WaffleFlagCourseOverrideModel`` and

Why not post_save instead?
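One relevant difference is that a handler running before the write can still read the previously stored value and compare it with the incoming one, while a handler running after the write sees only the new value unless the old one was captured earlier. The sketch below illustrates this with a plain-Python hook mechanism; `FlagStore` and both handlers are hypothetical stand-ins, not the Django signal API.

```python
class FlagStore:
    """Tiny stand-in for a model table with before-save and after-save hooks."""

    def __init__(self):
        self.rows = {}
        self.pre_save = []
        self.post_save = []

    def save(self, key, enabled):
        for handler in self.pre_save:
            handler(self, key, enabled)   # stored value is still the old one
        self.rows[key] = enabled
        for handler in self.post_save:
            handler(self, key, enabled)   # stored value is already the new one


changes_seen_before_save = []
changes_seen_after_save = []


def detect_change_before(store, key, new_value):
    old_value = store.rows.get(key)
    if old_value != new_value:
        changes_seen_before_save.append((key, old_value, new_value))


def detect_change_after(store, key, new_value):
    stored_value = store.rows.get(key)
    if stored_value != new_value:
        # Never reached: after the write, stored and new values are identical.
        changes_seen_after_save.append((key, stored_value, new_value))
```

Detecting the change before the write does not by itself decide when the migration should start; that could still be deferred until the write has committed.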
    migration_type = models.CharField(max_length=20)  # forward / rollback
    scope_type = models.CharField(max_length=20)  # course / org
    scope_key = models.CharField(max_length=255)
    status = models.CharField(max_length=20)  # pending, running, completed, skipped

If it failed, how would I get the exact log / error?
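One way this could be answered is to store the traceback on the run record itself. The sketch below is an assumption, not part of the proposed model: the `failed` status and the `error_detail` field do not appear in the quoted diff, and a dataclass stands in for the Django model.

```python
import traceback
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Stand-in for the migration run model."""
    status: str = "pending"
    error_detail: str = ""  # hypothetical field, not in the quoted diff


def record_outcome(record, migrate):
    """Run migrate() and keep the full traceback on the record if it fails."""
    record.status = "running"
    try:
        migrate()
    except Exception:
        record.status = "failed"  # hypothetical status, not in the quoted diff
        record.error_detail = traceback.format_exc()
    else:
        record.status = "completed"
    return record
```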
unexpectedly on instances where operators have not explicitly accepted the risks.

Negative consequences / risks
==============================

The main risk for me here is still a data migration in a live instance, which is not a controlled environment.
Related issue: #223
Description
This PR adds ADR 0013 - Course Authoring Automatic Migration, proposing an automatic and asynchronous migration mechanism triggered by changes in the
authz.enable_course_authoring feature flag,
Merge checklist
Check off if complete or not applicable: