Active-active domain support - Part 1/N #6799
RegionInformation struct {
// InitialFailoverVersion is the identifier of each region.
// It is used for active-active domains to determine the region of workflows which don't have an external entity mapping (origin stickiness).
InitialFailoverVersion int64 `yaml:"initialFailoverVersion"`
What is its relationship to the initialFailoverVersion in ClusterInformation? If they're completely unrelated, can we use a different name?
You can check an example config in docs/design/active-active/active-active.md. This semantically means the same thing but per region.
Notice that regions also have a failover version now. This will be used to determine the active cluster of a workflow based on the following lookups:
- Workflow maps to an entity (or directly to a region). This is static and cannot change over time.
- Entity maps to a region. This is dynamic and can change over time.
- Region maps to a cluster. This is static and cannot change over time. Note that there can be more than one cluster in a region but an active-active domain can only have one active cluster per region.
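The three lookups above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Resolver`, `Workflow`, `EntityKey`, `ErrNoMapping`, and the cluster names are all hypothetical, and the real mappings would come from the domain record and static config rather than in-memory maps.

```go
package main

import (
	"errors"
	"fmt"
)

// EntityKey identifies an external entity (hypothetical type for illustration).
type EntityKey struct {
	Type string
	Key  string
}

// Workflow carries its static association: an entity, or a region directly.
type Workflow struct {
	Entity *EntityKey // set when the workflow maps to an entity
	Region string     // used when the workflow maps directly to a region
}

// Resolver holds the two mappings consulted after the workflow's own association.
type Resolver struct {
	// entityToRegion is dynamic: an entity may move to another region over time.
	entityToRegion map[EntityKey]string
	// regionToCluster is static: one active cluster per region for the domain.
	regionToCluster map[string]string
}

// ErrNoMapping is returned when any step of the lookup chain has no entry.
var ErrNoMapping = errors.New("no mapping found")

// ActiveCluster walks workflow -> entity -> region -> cluster.
func (r *Resolver) ActiveCluster(wf Workflow) (string, error) {
	region := wf.Region
	if wf.Entity != nil {
		var ok bool
		region, ok = r.entityToRegion[*wf.Entity]
		if !ok {
			return "", fmt.Errorf("entity %v: %w", *wf.Entity, ErrNoMapping)
		}
	}
	cluster, ok := r.regionToCluster[region]
	if !ok {
		return "", fmt.Errorf("region %q: %w", region, ErrNoMapping)
	}
	return cluster, nil
}

func main() {
	r := &Resolver{
		entityToRegion:  map[EntityKey]string{{Type: "user-location", Key: "seattle"}: "us-west"},
		regionToCluster: map[string]string{"us-west": "cluster0", "us-east": "cluster1"},
	}
	cluster, err := r.ActiveCluster(Workflow{Entity: &EntityKey{Type: "user-location", Key: "seattle"}})
	fmt.Println(cluster, err)
}
```

Only the middle map needs to be mutable in this sketch, which mirrors the point that the entity-to-region step is the only dynamic one.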
Can we add this constraint to the list? This should be more explicit.
Where is this information stored?
It's already there. There can be only one active cluster in a region for an active-active domain.
ActiveClusters will be a new field of domain record in DB.
Cluster/region information is available in static config yml.
The workflow start request determines which cluster selection strategy is used.

| Has active-region.lookup-key | Has active-region.origin | Strategy |
I don't understand why the 3rd case is type 2
I mixed up the conditions. Fixed it now.
Type 1 -> No lookup key specified cases
Type 2 -> Lookup key is specified
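A rough sketch of the corrected rule, assuming the start request exposes its headers as a string map. `Strategy`, `StrategyType1`, `StrategyType2`, and `selectStrategy` are hypothetical names; only the two header keys come from the table. Treating a request that has both headers as Type 2 is an assumption based on a literal reading of the rule above.

```go
package main

import "fmt"

// Strategy is a hypothetical enum for the two cluster selection strategies.
type Strategy int

const (
	StrategyType1 Strategy = iota + 1 // no lookup key specified
	StrategyType2                     // lookup key specified
)

// Header keys as they appear in the design table.
const (
	lookupKeyHeader = "active-region.lookup-key"
	originHeader    = "active-region.origin"
)

// selectStrategy picks the strategy from the start request headers.
// Assumption: an empty header value counts as "not specified".
func selectStrategy(headers map[string]string) Strategy {
	if headers[lookupKeyHeader] != "" {
		return StrategyType2
	}
	return StrategyType1
}

func main() {
	requests := []map[string]string{
		{},
		{originHeader: "us-west"},
		{lookupKeyHeader: "seattle"},
		{lookupKeyHeader: "seattle", originHeader: "us-west"},
	}
	for _, h := range requests {
		fmt.Printf("headers=%v strategy=Type %d\n", h, selectStrategy(h))
	}
}
```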
| EntityType | EntityKey | Region | Failover Version | LastUpdated |
|------------|-----------|--------|------------------|-------------|
| user-location | seattle | us-west | 1 | |
When and how are these records inserted?
It's briefly explained in the bullet points above. Each such third-party entity type needs a watcher implementation that populates this table. For example, the user-location entity type would be managed by a new UserLocationWatcher that runs in the primary cluster as a global singleton.
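To make the table-population step concrete, here is a minimal in-memory sketch of what a watcher might write. `EntityMapping`, `EntityTable`, `Upsert`, and `Lookup` are hypothetical names that only mirror the table columns; the actual storage, the UserLocationWatcher interface, and how failover versions are assigned are not shown in this thread, so the version values below are purely illustrative.

```go
package main

import (
	"fmt"
	"time"
)

// EntityMapping mirrors the columns of the entity table (hypothetical type).
type EntityMapping struct {
	EntityType      string
	EntityKey       string
	Region          string
	FailoverVersion int64
	LastUpdated     time.Time
}

// entityID is the composite key (EntityType, EntityKey).
type entityID struct {
	entityType string
	entityKey  string
}

// EntityTable is an in-memory stand-in for the persisted table.
type EntityTable struct {
	rows map[entityID]EntityMapping
	now  func() time.Time // injectable clock; defaults to time.Now
}

// NewEntityTable returns an empty table that stamps rows with the wall clock.
func NewEntityTable() *EntityTable {
	return &EntityTable{rows: make(map[entityID]EntityMapping), now: time.Now}
}

// Upsert inserts or overwrites the row for an entity and stamps LastUpdated.
// A watcher would call this whenever it observes the entity's region change.
func (t *EntityTable) Upsert(entityType, entityKey, region string, failoverVersion int64) {
	t.rows[entityID{entityType, entityKey}] = EntityMapping{
		EntityType:      entityType,
		EntityKey:       entityKey,
		Region:          region,
		FailoverVersion: failoverVersion,
		LastUpdated:     t.now(),
	}
}

// Lookup returns the current row for an entity, if one exists.
func (t *EntityTable) Lookup(entityType, entityKey string) (EntityMapping, bool) {
	m, ok := t.rows[entityID{entityType, entityKey}]
	return m, ok
}

func main() {
	table := NewEntityTable()
	// Initial record, matching the example row in the table.
	table.Upsert("user-location", "seattle", "us-west", 1)
	// Illustrative region change; the new version value is made up.
	table.Upsert("user-location", "seattle", "us-east", 2)
	m, ok := table.Lookup("user-location", "seattle")
	fmt.Println(m.Region, m.FailoverVersion, ok)
}
```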
7a7be06 to eb16946
What changed?
The prototype implementation in #6724 seems to work, so I will be breaking it apart and sending smaller PRs.
Changes:
Misc change: