Fix Task Decorator to Work With and Without Feature Flag (AAP-41775) #15911
…node_heartbeat
Extract common heartbeat logic into helper functions:
- _heartbeat_instance_management: consolidates instance management, health checks, and lost-instance detection.
- _heartbeat_check_versions: compares instance versions and initiates shutdown when necessary.
- _heartbeat_handle_lost_instances: reaps jobs and marks lost instances offline.
Refactor the original cluster_node_heartbeat to use these helpers and retain legacy behavior (using bind_kwargs). Introduce adispatch_cluster_node_heartbeat for dispatcherd: uses the control API to retrieve running tasks and reaps them. Link the two implementations by attaching adispatch_cluster_node_heartbeat as the _new_method on cluster_node_heartbeat.

…implementation
Update apply_async to check at runtime if FEATURE_NEW_DISPATCHER is enabled. When the task is cluster_node_heartbeat and a _new_method is attached, delegate the task submission to the new dispatcherd implementation. Preserve the original behavior for all other tasks and fallback on error.

…per function
Improves readability of adispatch_cluster_node_heartbeat by extracting the complex UUID parsing logic into a dedicated helper function. Adds clearer error handling and follows established code patterns.

…re flag
Implemented a new approach for handling task execution with feature flags by attaching alternative implementations to apply_async._new_method. This allows cluster_node_heartbeat to work correctly with both the legacy and new dispatcher systems without modifying core decorator logic. AAP-41775

…mplementation
- Add error handling when attaching alternative dispatcher implementation
- Fix method self-reference in apply_async to properly use cls.apply_async
- Document limitations of this targeted approach for specific tasks
- Add logging for better debugging of dispatcher selection
- Ensure decorator timing by keeping method attachment after function definitions
This completes the robust implementation for switching between dispatcher implementations based on feature flags. AAP-41775

…ag compatibility
Replaces direct method attribute assignment with a global registry for alternative implementations. The original approach tried to attach new methods directly to apply_async bound methods, which fails because bound methods don't support attribute assignment in Python.
The registry pattern:
- Creates a global ALTERNATIVE_TASK_IMPLEMENTATIONS dict in publish.py
- Registers alternative implementations by task name
- Modifies apply_async to check the registry when feature flag is enabled
- Adds extensive logging throughout the process for debugging
This enables cluster_node_heartbeat to work correctly with both the legacy and new dispatcher implementations based on the FEATURE_NEW_DISPATCHER flag. AAP-41775

…entation
Reduces verbose debugging logs while maintaining essential logging for critical operations.
Preserves:
- Task implementation selection based on feature flag
- Registration success/failure messages
- Critical error reporting
Removed:
- Registry content debugging messages
- Repetitive task diagnostics
- Non-essential information logging
AAP-41775

yeah, switched from using a

EDIT: Oh now I know what the issue is. It was in-place modifications I had done

diff --git a/awx/settings/defaults.py b/awx/settings/defaults.py
index 43cbab180a..d2b242eadf 100644
--- a/awx/settings/defaults.py
+++ b/awx/settings/defaults.py
@@ -456,7 +456,7 @@ CELERYBEAT_SCHEDULE = {
 DISPATCHER_SCHEDULE = {}
 for options in CELERYBEAT_SCHEDULE.values():
     task_name = options['task']
-    DISPATCHER_SCHEDULE[task_name] = options
+    DISPATCHER_SCHEDULE[task_name] = options.copy()
     DISPATCHER_SCHEDULE[task_name]['schedule'] = options['schedule'].total_seconds()

 # Django Caching Configuration
With the one modification for scheduled jobs, I was able to run the Demo Job Template, and schedules run.
This resolves "AttributeError: 'float' object has no attribute 'total_seconds'" errors when the dispatcher is restarted. Refs: AAP-41775
@AlanCoding I've added the .copy() method to create a proper copy in this PR, if you're okay with it
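
To illustrate why the .copy() matters, here is a minimal, self-contained sketch of the aliasing problem. It does not use AWX settings: build_schedule, make_source, and the task name are hypothetical stand-ins, and running the loop twice over the same source dict is just one assumed way the mutated value could be seen again. Without the copy, the loop overwrites the timedelta inside the source dict with a float, so a second pass calls total_seconds() on a float.

```python
from datetime import timedelta


def build_schedule(celerybeat_schedule, copy_options):
    """Mirror the loop from the diff; copy_options toggles the .copy() fix."""
    dispatcher_schedule = {}
    for options in celerybeat_schedule.values():
        task_name = options['task']
        # Without a copy, both dicts are the same object, so the next line
        # also rewrites options['schedule'] in the source schedule.
        dispatcher_schedule[task_name] = options.copy() if copy_options else options
        dispatcher_schedule[task_name]['schedule'] = options['schedule'].total_seconds()
    return dispatcher_schedule


def make_source():
    # Hypothetical schedule entry, shaped like the ones the loop iterates over.
    return {'heartbeat': {'task': 'example.cluster_node_heartbeat', 'schedule': timedelta(seconds=50)}}


# Aliased version: the first pass mutates the source, the second pass fails.
aliased = make_source()
build_schedule(aliased, copy_options=False)
aliased_schedule_type = type(aliased['heartbeat']['schedule']).__name__
try:
    build_schedule(aliased, copy_options=False)
    second_pass_error = None
except AttributeError as exc:
    second_pass_error = str(exc)

# Copied version: the source keeps its timedelta, so repeated passes agree.
copied = make_source()
first_pass = build_schedule(copied, copy_options=True)
second_pass = build_schedule(copied, copy_options=True)
```

Note that dict.copy() is a shallow copy; that is enough here because only the top-level 'schedule' key is reassigned.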
SUMMARY
Implemented a solution that enables AWX's task system to work with both the legacy dispatcher and the new dispatcherd library based on the FEATURE_NEW_DISPATCHER feature flag. Created a registry pattern instead of trying to modify method objects directly.
ISSUE TYPE
COMPONENT NAME
AWX VERSION
ADDITIONAL INFORMATION
Problem:
When I initially tried to modify methods directly with cluster_node_heartbeat.apply_async._new_method = ..., I encountered AttributeError: 'method' object has no attribute '_new_method' because Python bound methods don't support attribute assignment.
With the feature flag disabled, I also encountered this scheduler error (probably known):
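
The bound-method AttributeError described in the Problem paragraph can be reproduced outside AWX. The sketch below uses a hypothetical HeartbeatTask class (not the real decorator output) with a classmethod named apply_async. It also shows, purely as an illustration of Python's behaviour, that the underlying function object does accept attributes; this is not the approach the PR settled on.

```python
class HeartbeatTask:
    """Hypothetical stand-in for a decorated task class."""

    @classmethod
    def apply_async(cls):
        return "legacy"


def new_implementation():
    return "new"


# Accessing a classmethod through the class produces a bound-method object.
bound = HeartbeatTask.apply_async

try:
    # Bound methods have no __dict__ of their own, so this assignment fails.
    bound._new_method = new_implementation
    assignment_error = None
except AttributeError as exc:
    assignment_error = str(exc)

# The wrapped function does accept attributes, and bound methods forward
# attribute reads (but not writes) to it.
HeartbeatTask.apply_async.__func__._new_method = new_implementation
forwarded = HeartbeatTask.apply_async._new_method()
```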
Solution:
In publish.py:
- Created an ALTERNATIVE_TASK_IMPLEMENTATIONS registry at module level
- Modified apply_async to check the registry when the feature flag is enabled
In system.py:
- Split cluster_node_heartbeat into two implementations:
  - a legacy implementation using bind_kwargs
  - a dispatcherd implementation using bind=True
- Added _get_active_task_ids_from_dispatcherd() to retrieve running tasks from the new dispatcher
Other minor adjustments to ensure compatibility with both systems
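
As a rough sketch of the registry approach outlined in the Solution, the following self-contained example keeps alternative implementations in a module-level dict keyed by task name and consults it from apply_async only when the flag is on. Only the name ALTERNATIVE_TASK_IMPLEMENTATIONS and the flag name come from this PR; the Task class, the FLAGS dict, and register_alternative are hypothetical, and the functions are called synchronously here rather than submitted to a dispatcher.

```python
import logging

logger = logging.getLogger(__name__)

# Registry name taken from the PR description; keyed by task name.
ALTERNATIVE_TASK_IMPLEMENTATIONS = {}

# Hypothetical stand-in for the real feature flag lookup.
FLAGS = {"FEATURE_NEW_DISPATCHER": False}


def register_alternative(task_name, implementation):
    """Record an alternative implementation for a task (hypothetical helper)."""
    ALTERNATIVE_TASK_IMPLEMENTATIONS[task_name] = implementation


class Task:
    """Hypothetical task wrapper; the real decorator builds something richer."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def apply_async(self, *args, **kwargs):
        # The flag is checked at call time, so toggling it needs no re-import.
        if FLAGS["FEATURE_NEW_DISPATCHER"]:
            alternative = ALTERNATIVE_TASK_IMPLEMENTATIONS.get(self.name)
            if alternative is not None:
                try:
                    return alternative(*args, **kwargs)
                except Exception:
                    # Fall back to the legacy path if the alternative fails.
                    logger.exception("Alternative for %s failed, using legacy path", self.name)
        return self.fn(*args, **kwargs)


def broken_alternative():
    raise RuntimeError("simulated dispatcherd failure")


heartbeat = Task("cluster_node_heartbeat", lambda: "legacy heartbeat")
other = Task("some_other_task", lambda: "other task")
flaky = Task("flaky_task", lambda: "legacy flaky")

register_alternative("cluster_node_heartbeat", lambda: "dispatcherd heartbeat")
register_alternative("flaky_task", broken_alternative)

flag_off_result = heartbeat.apply_async()

FLAGS["FEATURE_NEW_DISPATCHER"] = True
flag_on_result = heartbeat.apply_async()
unregistered_result = other.apply_async()
fallback_result = flaky.apply_async()
```

Because the registry is a plain dict, registration does not depend on whether the object being looked up supports attribute assignment, which sidesteps the bound-method problem.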
Testing Steps:
For more extensive in-the-code debugging logs, check out the revision (commit): fix(dispatcher): Implement registry pattern for dispatcher feature flag compatibility