Feature/schedule management UI change #566
base: feature/schedule-management
Conversation
Summary of Changes

This pull request implements comprehensive schedule management features for model auto-scaling across the entire application stack. Below is an overview of the major changes:

Backend Implementation
- Schedule Management Core
- New Functions and Modules
- Data Structure Changes
- Validation and Error Handling

Infrastructure Changes
- CDK Stack Updates
- Constants Refactoring

Frontend UI Implementation
- New Components
- Features
- Type Definitions
- API Integration
- Testing

All changes maintain backward compatibility through optional scheduling fields.
| logger.error(f"Failed to check ASG state: {e}") | ||
| raise ValueError(f"Failed to check ASG {asg_name}: {str(e)}") | ||
|
|
||
| # Update model status |
Inconsistent indentation detected. This line has extra leading whitespace:
| # Update model status | |
| # Update model status |
| try: | ||
| response = model_table.scan( | ||
| FilterExpression="autoScalingGroup = :asg_name", | ||
| FilterExpression="auto_scaling_group = :asg_name", |
The field name was changed from autoScalingGroup to auto_scaling_group. Verify that this matches the actual DynamoDB schema. If the schema uses snake_case, this is correct, but if it uses camelCase, this will cause the scan to fail silently and return no results.
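A quick, hedged way to confirm which attribute name the table actually stores is to pull a sample item and list its keys before relying on the filter. This is a minimal sketch; the table name here is an assumption for illustration:

import boto3

# Hypothetical table handle; substitute the real model table name.
model_table = boto3.resource("dynamodb").Table("ModelTable")

# Fetch one item and print any scaling-related attribute names it carries.
sample_items = model_table.scan(Limit=1).get("Items", [])
if sample_items:
    print([key for key in sample_items[0] if "scaling" in key.lower()])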
| def update_model_status(model_id: str, new_status: ModelStatus, reason: str) -> None: | ||
| """Update model status in DynamoDB""" | ||
| try: | ||
| # Convert enum to string value for DynamoDB | ||
| status_str = new_status.value if hasattr(new_status, "value") else str(new_status) | ||
|
|
||
| model_table.update_item( | ||
| Key={"model_id": model_id}, | ||
| UpdateExpression="SET #status = :status, lastStatusUpdate = :timestamp, statusReason = :reason", | ||
| ExpressionAttributeNames={"#status": "status"}, | ||
| UpdateExpression="SET model_status = :status, lastStatusUpdate = :timestamp, statusReason = :reason", | ||
| ExpressionAttributeValues={ | ||
| ":status": new_status, | ||
| ":status": status_str, | ||
| ":timestamp": datetime.now(dt_timezone.utc).isoformat(), | ||
| ":reason": reason, | ||
| }, | ||
| ) |
Inconsistent indentation detected throughout this function. Lines 321-335 have extra leading whitespace:
| def update_model_status(model_id: str, new_status: ModelStatus, reason: str) -> None: | |
| """Update model status in DynamoDB""" | |
| try: | |
| # Convert enum to string value for DynamoDB | |
| status_str = new_status.value if hasattr(new_status, "value") else str(new_status) | |
| model_table.update_item( | |
| Key={"model_id": model_id}, | |
| UpdateExpression="SET #status = :status, lastStatusUpdate = :timestamp, statusReason = :reason", | |
| ExpressionAttributeNames={"#status": "status"}, | |
| UpdateExpression="SET model_status = :status, lastStatusUpdate = :timestamp, statusReason = :reason", | |
| ExpressionAttributeValues={ | |
| ":status": new_status, | |
| ":status": status_str, | |
| ":timestamp": datetime.now(dt_timezone.utc).isoformat(), | |
| ":reason": reason, | |
| }, | |
| ) | |
| def update_model_status(model_id: str, new_status: ModelStatus, reason: str) -> None: | |
| """Update model status in DynamoDB""" | |
| try: | |
| # Convert enum to string value for DynamoDB | |
| status_str = new_status.value if hasattr(new_status, "value") else str(new_status) | |
| model_table.update_item( | |
| Key={"model_id": model_id}, | |
| UpdateExpression="SET model_status = :status, lastStatusUpdate = :timestamp, statusReason = :reason", | |
| ExpressionAttributeValues={ | |
| ":status": status_str, | |
| ":timestamp": datetime.now(dt_timezone.utc).isoformat(), | |
| ":reason": reason, | |
| }, | |
| ) | |
| logger.info(f"Updated model {model_id} model_status to {status_str}: {reason}") |
| Key={"model_id": model_id}, | ||
| UpdateExpression="SET #status = :status, lastStatusUpdate = :timestamp, statusReason = :reason", | ||
| ExpressionAttributeNames={"#status": "status"}, | ||
| UpdateExpression="SET model_status = :status, lastStatusUpdate = :timestamp, statusReason = :reason", |
The DynamoDB field name was changed from status to model_status. Verify that this matches the actual DynamoDB schema. The old code used ExpressionAttributeNames to handle the reserved keyword status, but the new field name model_status is not a reserved word. Ensure this field exists in your DynamoDB table schema.
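For reference, a minimal sketch of the two expression styles, using the names from the diff above: "status" must be aliased through ExpressionAttributeNames because it is a DynamoDB reserved word, while "model_status" is not reserved and can appear directly in the expression:

# Reserved word: alias "#status" -> "status" via ExpressionAttributeNames.
model_table.update_item(
    Key={"model_id": model_id},
    UpdateExpression="SET #status = :status",
    ExpressionAttributeNames={"#status": "status"},
    ExpressionAttributeValues={":status": status_str},
)

# Non-reserved attribute: "model_status" can be referenced directly.
model_table.update_item(
    Key={"model_id": model_id},
    UpdateExpression="SET model_status = :status",
    ExpressionAttributeValues={":status": status_str},
)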
| def register_litellm(model_id: str) -> None: | ||
| """Register model with LiteLLM if missing""" | ||
| try: | ||
| model_key = {"model_id": model_id} | ||
| ddb_item = model_table.get_item(Key=model_key, ConsistentRead=True).get("Item") | ||
|
|
||
| if not ddb_item: | ||
| logger.warning(f"Model {model_id} not found in DynamoDB") | ||
| return | ||
|
|
||
| # Check if already registered | ||
| existing_litellm_id = ddb_item.get("litellm_id") | ||
| if existing_litellm_id: | ||
| logger.info(f"Model {model_id} already registered with LiteLLM") | ||
| return | ||
|
|
||
| model_url = ddb_item.get("model_url") | ||
| if not model_url: | ||
| logger.warning(f"Model {model_id} has no model_url") | ||
| return | ||
|
|
||
| # Initialize LiteLLM client | ||
| secrets_manager = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config) | ||
| iam_client = boto3.client("iam", region_name=os.environ["AWS_REGION"], config=retry_config) | ||
|
|
||
| litellm_client = LiteLLMClient( | ||
| base_uri=get_rest_api_container_endpoint(), | ||
| verify=get_cert_path(iam_client), | ||
| headers={ | ||
| "Authorization": secrets_manager.get_secret_value( | ||
| SecretId=os.environ.get("MANAGEMENT_KEY_NAME"), VersionStage="AWSCURRENT" | ||
| )["SecretString"], | ||
| "Content-Type": "application/json", | ||
| }, | ||
| ) | ||
|
|
||
| litellm_config_str = os.environ.get("LITELLM_CONFIG_OBJ", json.dumps({})) | ||
| try: | ||
| litellm_params = json.loads(litellm_config_str) | ||
| litellm_params = litellm_params.get("litellm_settings", {}) | ||
| except json.JSONDecodeError: | ||
| litellm_params = {} | ||
|
|
||
| model_name = ddb_item["model_config"]["modelName"] | ||
| litellm_params["model"] = f"openai/{model_name}" | ||
| litellm_params["api_base"] = model_url | ||
|
|
||
| # Register with LiteLLM | ||
| litellm_response = litellm_client.add_model( | ||
| model_name=model_id, | ||
| litellm_params=litellm_params, | ||
| ) | ||
|
|
||
| litellm_id = litellm_response["model_info"]["id"] | ||
| logger.info(f"Registered model {model_id} with LiteLLM: {litellm_id}") | ||
|
|
||
| # Update DynamoDB with new litellm_id | ||
| model_table.update_item( | ||
| Key=model_key, | ||
| UpdateExpression="SET litellm_id = :lid", | ||
| ExpressionAttributeValues={":lid": litellm_id}, | ||
| ) | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"Failed to register {model_id} with LiteLLM: {e}", exc_info=True) | ||
|
|
||
|
|
||
| def remove_litellm(model_id: str) -> None: | ||
| """Remove model from LiteLLM if registered""" | ||
| try: | ||
| model_key = {"model_id": model_id} | ||
| ddb_item = model_table.get_item(Key=model_key, ConsistentRead=True).get("Item") | ||
|
|
||
| if not ddb_item: | ||
| logger.warning(f"Model {model_id} not found in DynamoDB") | ||
| return | ||
|
|
||
| litellm_id = ddb_item.get("litellm_id") | ||
| if not litellm_id: | ||
| logger.info(f"Model {model_id} has no LiteLLM registration to remove") | ||
| return | ||
|
|
||
| try: | ||
| # Initialize LiteLLM client | ||
| secrets_manager = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config) | ||
| iam_client = boto3.client("iam", region_name=os.environ["AWS_REGION"], config=retry_config) | ||
|
|
||
| litellm_client = LiteLLMClient( | ||
| base_uri=get_rest_api_container_endpoint(), | ||
| verify=get_cert_path(iam_client), | ||
| headers={ | ||
| "Authorization": secrets_manager.get_secret_value( | ||
| SecretId=os.environ.get("MANAGEMENT_KEY_NAME"), VersionStage="AWSCURRENT" | ||
| )["SecretString"], | ||
| "Content-Type": "application/json", | ||
| }, | ||
| ) | ||
|
|
||
| # Remove from LiteLLM | ||
| litellm_client.delete_model(identifier=litellm_id) | ||
| logger.info(f"Removed model {model_id} from LiteLLM: {litellm_id}") | ||
|
|
||
| # Clear litellm_id from DynamoDB | ||
| model_table.update_item( | ||
| Key=model_key, | ||
| UpdateExpression="SET litellm_id = :li", | ||
| ExpressionAttributeValues={":li": None}, | ||
| ) | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"Failed to remove {model_id} from LiteLLM: {e}", exc_info=True) | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"Error in remove_litellm for {model_id}: {e}", exc_info=True) | ||
|
|
The register_litellm() and remove_litellm() functions are new and contain significant logic for LiteLLM client initialization. Consider extracting the common LiteLLM client initialization code (lines 458-470 and 520-532) into a shared helper function to reduce duplication:
| def register_litellm(model_id: str) -> None: | |
| """Register model with LiteLLM if missing""" | |
| try: | |
| model_key = {"model_id": model_id} | |
| ddb_item = model_table.get_item(Key=model_key, ConsistentRead=True).get("Item") | |
| if not ddb_item: | |
| logger.warning(f"Model {model_id} not found in DynamoDB") | |
| return | |
| # Check if already registered | |
| existing_litellm_id = ddb_item.get("litellm_id") | |
| if existing_litellm_id: | |
| logger.info(f"Model {model_id} already registered with LiteLLM") | |
| return | |
| model_url = ddb_item.get("model_url") | |
| if not model_url: | |
| logger.warning(f"Model {model_id} has no model_url") | |
| return | |
| # Initialize LiteLLM client | |
| secrets_manager = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config) | |
| iam_client = boto3.client("iam", region_name=os.environ["AWS_REGION"], config=retry_config) | |
| litellm_client = LiteLLMClient( | |
| base_uri=get_rest_api_container_endpoint(), | |
| verify=get_cert_path(iam_client), | |
| headers={ | |
| "Authorization": secrets_manager.get_secret_value( | |
| SecretId=os.environ.get("MANAGEMENT_KEY_NAME"), VersionStage="AWSCURRENT" | |
| )["SecretString"], | |
| "Content-Type": "application/json", | |
| }, | |
| ) | |
| litellm_config_str = os.environ.get("LITELLM_CONFIG_OBJ", json.dumps({})) | |
| try: | |
| litellm_params = json.loads(litellm_config_str) | |
| litellm_params = litellm_params.get("litellm_settings", {}) | |
| except json.JSONDecodeError: | |
| litellm_params = {} | |
| model_name = ddb_item["model_config"]["modelName"] | |
| litellm_params["model"] = f"openai/{model_name}" | |
| litellm_params["api_base"] = model_url | |
| # Register with LiteLLM | |
| litellm_response = litellm_client.add_model( | |
| model_name=model_id, | |
| litellm_params=litellm_params, | |
| ) | |
| litellm_id = litellm_response["model_info"]["id"] | |
| logger.info(f"Registered model {model_id} with LiteLLM: {litellm_id}") | |
| # Update DynamoDB with new litellm_id | |
| model_table.update_item( | |
| Key=model_key, | |
| UpdateExpression="SET litellm_id = :lid", | |
| ExpressionAttributeValues={":lid": litellm_id}, | |
| ) | |
| except Exception as e: | |
| logger.error(f"Failed to register {model_id} with LiteLLM: {e}", exc_info=True) | |
| def remove_litellm(model_id: str) -> None: | |
| """Remove model from LiteLLM if registered""" | |
| try: | |
| model_key = {"model_id": model_id} | |
| ddb_item = model_table.get_item(Key=model_key, ConsistentRead=True).get("Item") | |
| if not ddb_item: | |
| logger.warning(f"Model {model_id} not found in DynamoDB") | |
| return | |
| litellm_id = ddb_item.get("litellm_id") | |
| if not litellm_id: | |
| logger.info(f"Model {model_id} has no LiteLLM registration to remove") | |
| return | |
| try: | |
| # Initialize LiteLLM client | |
| secrets_manager = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config) | |
| iam_client = boto3.client("iam", region_name=os.environ["AWS_REGION"], config=retry_config) | |
| litellm_client = LiteLLMClient( | |
| base_uri=get_rest_api_container_endpoint(), | |
| verify=get_cert_path(iam_client), | |
| headers={ | |
| "Authorization": secrets_manager.get_secret_value( | |
| SecretId=os.environ.get("MANAGEMENT_KEY_NAME"), VersionStage="AWSCURRENT" | |
| )["SecretString"], | |
| "Content-Type": "application/json", | |
| }, | |
| ) | |
| # Remove from LiteLLM | |
| litellm_client.delete_model(identifier=litellm_id) | |
| logger.info(f"Removed model {model_id} from LiteLLM: {litellm_id}") | |
| # Clear litellm_id from DynamoDB | |
| model_table.update_item( | |
| Key=model_key, | |
| UpdateExpression="SET litellm_id = :li", | |
| ExpressionAttributeValues={":li": None}, | |
| ) | |
| except Exception as e: | |
| logger.error(f"Failed to remove {model_id} from LiteLLM: {e}", exc_info=True) | |
| except Exception as e: | |
| logger.error(f"Error in remove_litellm for {model_id}: {e}", exc_info=True) | |
| def _get_litellm_client() -> LiteLLMClient: | |
| """Initialize and return a LiteLLM client""" | |
| secrets_manager = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config) | |
| iam_client = boto3.client("iam", region_name=os.environ["AWS_REGION"], config=retry_config) | |
| return LiteLLMClient( | |
| base_uri=get_rest_api_container_endpoint(), | |
| verify=get_cert_path(iam_client), | |
| headers={ | |
| "Authorization": secrets_manager.get_secret_value( | |
| SecretId=os.environ.get("MANAGEMENT_KEY_NAME"), VersionStage="AWSCURRENT" | |
| )["SecretString"], | |
| "Content-Type": "application/json", | |
| }, | |
| ) |
| def adjust_initial_capacity_for_schedule(prepared_event: Dict[str, Any]) -> None: | ||
| """Adjust Auto Scaling Group initial capacity based on schedule configuration""" | ||
| try: | ||
| # Check if scheduling is configured | ||
| auto_scaling_config = prepared_event.get("autoScalingConfig", {}) | ||
| scheduling_config = auto_scaling_config.get("scheduling") | ||
|
|
||
| if not scheduling_config or not scheduling_config.get("scheduleEnabled"): | ||
| logger.info("No scheduling configured - using original capacity settings") | ||
| return | ||
|
|
||
| schedule_type = scheduling_config.get("scheduleType") | ||
| timezone_name = scheduling_config.get("timezone") | ||
|
|
||
| try: | ||
| tz = ZoneInfo(timezone_name) | ||
| now = datetime.now(tz) | ||
| current_time = now.time() | ||
| current_day = now.strftime("%A").lower() | ||
| is_within_schedule = False | ||
|
|
||
| if schedule_type == "RECURRING" and scheduling_config.get("dailySchedule"): | ||
| # Daily recurring schedule | ||
| daily_schedule = scheduling_config["dailySchedule"] | ||
| start_time_str = daily_schedule.get("startTime") | ||
| stop_time_str = daily_schedule.get("stopTime") | ||
|
|
||
| if start_time_str and stop_time_str: | ||
| # Parse times | ||
| start_hour, start_minute = map(int, start_time_str.split(":")) | ||
| stop_hour, stop_minute = map(int, stop_time_str.split(":")) | ||
|
|
||
| start_time_obj = datetime.min.time().replace(hour=start_hour, minute=start_minute) | ||
| stop_time_obj = datetime.min.time().replace(hour=stop_hour, minute=stop_minute) | ||
|
|
||
| # Check if current time is within the schedule | ||
| if start_time_obj <= stop_time_obj: | ||
| # Normal schedule within same day | ||
| is_within_schedule = start_time_obj <= current_time <= stop_time_obj | ||
| else: | ||
| # Schedule crosses midnight | ||
| is_within_schedule = current_time >= start_time_obj or current_time <= stop_time_obj | ||
|
|
||
| elif schedule_type == "DAILY" and scheduling_config.get("weeklySchedule"): | ||
| # Daily schedule | ||
| weekly_schedule = scheduling_config["weeklySchedule"] | ||
| today_schedule = weekly_schedule.get(current_day) | ||
|
|
||
| if today_schedule and today_schedule.get("startTime") and today_schedule.get("stopTime"): | ||
| start_time_str = today_schedule["startTime"] | ||
| stop_time_str = today_schedule["stopTime"] | ||
|
|
||
| # Parse times | ||
| start_hour, start_minute = map(int, start_time_str.split(":")) | ||
| stop_hour, stop_minute = map(int, stop_time_str.split(":")) | ||
|
|
||
| start_time_obj = datetime.min.time().replace(hour=start_hour, minute=start_minute) | ||
| stop_time_obj = datetime.min.time().replace(hour=stop_hour, minute=stop_minute) | ||
|
|
||
| # Check if current time is within the schedule | ||
| if start_time_obj <= stop_time_obj: | ||
| # Normal schedule within same day | ||
| is_within_schedule = start_time_obj <= current_time <= stop_time_obj | ||
| else: | ||
| # Schedule crosses midnight | ||
| is_within_schedule = current_time >= start_time_obj or current_time <= stop_time_obj | ||
|
|
||
| # Adjust capacity based on schedule | ||
| if is_within_schedule: | ||
| logger.info(f"Current time {current_time} ({timezone_name}) is within scheduled hours") | ||
| # Keep original capacity settings | ||
| else: | ||
| logger.info(f"Current time {current_time} ({timezone_name}) is outside scheduled hours") | ||
| # Set desired capacity to 0 for deployment outside scheduled hours | ||
| auto_scaling_config["minCapacity"] = 0 | ||
| # Keep maxCapacity at original value - CloudFormation requires maxCapacity > 0 | ||
| auto_scaling_config["desiredCapacity"] = 0 | ||
|
|
||
| except Exception as time_error: | ||
| logger.error(f"Error processing schedule time logic: {time_error}", exc_info=True) | ||
| # If we can't determine the schedule, use default capacity to be safe | ||
| logger.info("Using original capacity settings due to schedule processing error") | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"Error adjusting initial capacity for schedule: {e}", exc_info=True) | ||
| # If scheduling logic fails, proceed with original capacity settings | ||
| logger.info("Using original capacity settings due to scheduling error") | ||
|
|
The adjust_initial_capacity_for_schedule() function has several issues:
- Time parsing vulnerability: Lines 111-112 and 135-136 use split(":") without validation. If the time format is invalid (e.g., "25:70"), the int() conversion or the subsequent time construction will fail. The PR description mentions "24-hour time format validation" but this function doesn't validate the format before parsing (see the validation sketch after this list).
- Incorrect time object creation: Lines 114-115 and 138-139 use datetime.min.time().replace(), which builds a midnight (00:00:00) time object and then overwrites the hour and minute. This works for time-only objects, but the intent is unclear. Consider using the time() constructor directly:
| def adjust_initial_capacity_for_schedule(prepared_event: Dict[str, Any]) -> None: | |
| """Adjust Auto Scaling Group initial capacity based on schedule configuration""" | |
| try: | |
| # Check if scheduling is configured | |
| auto_scaling_config = prepared_event.get("autoScalingConfig", {}) | |
| scheduling_config = auto_scaling_config.get("scheduling") | |
| if not scheduling_config or not scheduling_config.get("scheduleEnabled"): | |
| logger.info("No scheduling configured - using original capacity settings") | |
| return | |
| schedule_type = scheduling_config.get("scheduleType") | |
| timezone_name = scheduling_config.get("timezone") | |
| try: | |
| tz = ZoneInfo(timezone_name) | |
| now = datetime.now(tz) | |
| current_time = now.time() | |
| current_day = now.strftime("%A").lower() | |
| is_within_schedule = False | |
| if schedule_type == "RECURRING" and scheduling_config.get("dailySchedule"): | |
| # Daily recurring schedule | |
| daily_schedule = scheduling_config["dailySchedule"] | |
| start_time_str = daily_schedule.get("startTime") | |
| stop_time_str = daily_schedule.get("stopTime") | |
| if start_time_str and stop_time_str: | |
| # Parse times | |
| start_hour, start_minute = map(int, start_time_str.split(":")) | |
| stop_hour, stop_minute = map(int, stop_time_str.split(":")) | |
| start_time_obj = datetime.min.time().replace(hour=start_hour, minute=start_minute) | |
| stop_time_obj = datetime.min.time().replace(hour=stop_hour, minute=stop_minute) | |
| # Check if current time is within the schedule | |
| if start_time_obj <= stop_time_obj: | |
| # Normal schedule within same day | |
| is_within_schedule = start_time_obj <= current_time <= stop_time_obj | |
| else: | |
| # Schedule crosses midnight | |
| is_within_schedule = current_time >= start_time_obj or current_time <= stop_time_obj | |
| elif schedule_type == "DAILY" and scheduling_config.get("weeklySchedule"): | |
| # Daily schedule | |
| weekly_schedule = scheduling_config["weeklySchedule"] | |
| today_schedule = weekly_schedule.get(current_day) | |
| if today_schedule and today_schedule.get("startTime") and today_schedule.get("stopTime"): | |
| start_time_str = today_schedule["startTime"] | |
| stop_time_str = today_schedule["stopTime"] | |
| # Parse times | |
| start_hour, start_minute = map(int, start_time_str.split(":")) | |
| stop_hour, stop_minute = map(int, stop_time_str.split(":")) | |
| start_time_obj = datetime.min.time().replace(hour=start_hour, minute=start_minute) | |
| stop_time_obj = datetime.min.time().replace(hour=stop_hour, minute=stop_minute) | |
| # Check if current time is within the schedule | |
| if start_time_obj <= stop_time_obj: | |
| # Normal schedule within same day | |
| is_within_schedule = start_time_obj <= current_time <= stop_time_obj | |
| else: | |
| # Schedule crosses midnight | |
| is_within_schedule = current_time >= start_time_obj or current_time <= stop_time_obj | |
| # Adjust capacity based on schedule | |
| if is_within_schedule: | |
| logger.info(f"Current time {current_time} ({timezone_name}) is within scheduled hours") | |
| # Keep original capacity settings | |
| else: | |
| logger.info(f"Current time {current_time} ({timezone_name}) is outside scheduled hours") | |
| # Set desired capacity to 0 for deployment outside scheduled hours | |
| auto_scaling_config["minCapacity"] = 0 | |
| # Keep maxCapacity at original value - CloudFormation requires maxCapacity > 0 | |
| auto_scaling_config["desiredCapacity"] = 0 | |
| except Exception as time_error: | |
| logger.error(f"Error processing schedule time logic: {time_error}", exc_info=True) | |
| # If we can't determine the schedule, use default capacity to be safe | |
| logger.info("Using original capacity settings due to schedule processing error") | |
| except Exception as e: | |
| logger.error(f"Error adjusting initial capacity for schedule: {e}", exc_info=True) | |
| # If scheduling logic fails, proceed with original capacity settings | |
| logger.info("Using original capacity settings due to scheduling error") | |
| start_time_obj = time(start_hour, start_minute) | |
| stop_time_obj = time(stop_hour, stop_minute) |
- Missing import: the suggested time() constructor isn't imported in this module. Add "from datetime import time" to the imports.
- Inconsistent error handling: the nested try-except blocks (lines 96-163) catch all exceptions broadly. The inner handler at line 160 logs and continues, but the outer handler at line 165 also logs and continues, making it unclear which error path was taken. Consider more specific exception handling.
- Logic issue with schedule type check: Line 103 checks schedule_type == "RECURRING" but line 125 checks schedule_type == "DAILY". According to the PR changes summary, the enum was renamed from RECURRING_DAILY → RECURRING and EACH_DAY → DAILY. Verify these string values match the frontend enum values being sent.
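As referenced in the first bullet above, a small validation helper would make malformed time strings fail fast instead of surfacing as obscure errors deeper in the schedule logic. This is a minimal sketch; the helper name and regex are assumptions, not part of the PR:

import re
from datetime import time

# Strict 24-hour "HH:MM" pattern: hours 00-23, minutes 00-59.
_TIME_RE = re.compile(r"^([01]\d|2[0-3]):([0-5]\d)$")

def parse_schedule_time(value: str) -> time:
    """Validate and parse a 24-hour 'HH:MM' string into a datetime.time."""
    match = _TIME_RE.match(value or "")
    if not match:
        raise ValueError(f"Invalid 24-hour time format: {value!r}")
    return time(int(match.group(1)), int(match.group(2)))

# Usage inside the schedule check, e.g. start_time_obj = parse_schedule_time(start_time_str)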
| # Adjust initial capacity based on schedule if scheduling is configured | ||
| adjust_initial_capacity_for_schedule(prepared_event) |
The call to adjust_initial_capacity_for_schedule() modifies prepared_event in-place. Ensure this function is called at the right point in the workflow—after all necessary configuration is set but before the event is used downstream. Verify that modifying minCapacity and desiredCapacity at this stage doesn't conflict with any subsequent CloudFormation stack creation logic.
| # Remove scheduling configuration from autoScalingConfig before sending to ECS deployer | ||
| if "autoScalingConfig" in prepared_event and "scheduling" in prepared_event["autoScalingConfig"]: | ||
| del prepared_event["autoScalingConfig"]["scheduling"] |
The scheduling configuration is removed from autoScalingConfig before invoking the ECS deployer. This is intentional per the PR description, but ensure the ECS deployer Lambda doesn't expect this field. If it does, this could cause silent failures. Consider adding a log statement to confirm the removal:
| # Remove scheduling configuration from autoScalingConfig before sending to ECS deployer | |
| if "autoScalingConfig" in prepared_event and "scheduling" in prepared_event["autoScalingConfig"]: | |
| del prepared_event["autoScalingConfig"]["scheduling"] | |
| if "autoScalingConfig" in prepared_event and "scheduling" in prepared_event["autoScalingConfig"]: | |
| logger.info("Removing scheduling configuration before ECS deployer invocation") | |
| del prepared_event["autoScalingConfig"]["scheduling"] |
| try: | ||
| response = lambdaClient.invoke( | ||
| FunctionName=os.environ["ECS_MODEL_DEPLOYER_FN_ARN"], | ||
| Payload=json.dumps({"modelConfig": prepared_event}), | ||
| ) | ||
|
|
||
| except Exception as invoke_error: | ||
| raise StackFailedToCreateException( | ||
| json.dumps( | ||
| { | ||
| "error": f"Failed to invoke ECS Model Deployer Lambda: {str(invoke_error)}", | ||
| "event": event, | ||
| "invoke_error": str(invoke_error), | ||
| } | ||
| ) | ||
| ) | ||
|
|
||
| try: | ||
| payload = response["Payload"].read() | ||
| payload = json.loads(payload) | ||
| except Exception as parse_error: | ||
| raise StackFailedToCreateException( | ||
| json.dumps( | ||
| { | ||
| "error": f"Failed to parse ECS Model Deployer response: {str(parse_error)}", | ||
| "event": event, | ||
| "raw_response": str(response["Payload"].read()), | ||
| "parse_error": str(parse_error), | ||
| } | ||
| ) | ||
| ) |
The error handling for the ECS deployer invocation has been improved with try-except blocks, but there's a potential issue: at line 398, response["Payload"].read() is called again after already being read at line 390. The stream is consumed after the first read, so the second read will return an empty result. This will cause the error message to show an empty string. Remove the second read:
| try: | |
| response = lambdaClient.invoke( | |
| FunctionName=os.environ["ECS_MODEL_DEPLOYER_FN_ARN"], | |
| Payload=json.dumps({"modelConfig": prepared_event}), | |
| ) | |
| except Exception as invoke_error: | |
| raise StackFailedToCreateException( | |
| json.dumps( | |
| { | |
| "error": f"Failed to invoke ECS Model Deployer Lambda: {str(invoke_error)}", | |
| "event": event, | |
| "invoke_error": str(invoke_error), | |
| } | |
| ) | |
| ) | |
| try: | |
| payload = response["Payload"].read() | |
| payload = json.loads(payload) | |
| except Exception as parse_error: | |
| raise StackFailedToCreateException( | |
| json.dumps( | |
| { | |
| "error": f"Failed to parse ECS Model Deployer response: {str(parse_error)}", | |
| "event": event, | |
| "raw_response": str(response["Payload"].read()), | |
| "parse_error": str(parse_error), | |
| } | |
| ) | |
| ) | |
| except Exception as parse_error: | |
| raise StackFailedToCreateException( | |
| json.dumps( | |
| { | |
| "error": f"Failed to parse ECS Model Deployer response: {str(parse_error)}", | |
| "event": event, | |
| "raw_response": payload, | |
| "parse_error": str(parse_error), | |
| } | |
| ) | |
| ) |
| stack_name = payload.get("stackName", None) | ||
|
|
||
| if not stack_name: | ||
| # Log the full payload for debugging | ||
| logger.error(f"ECS Model Deployer response: {payload}") | ||
| error_message = payload.get("errorMessage", "Unknown error") | ||
| error_type = payload.get("errorType", "Unknown error type") | ||
|
|
||
| error_message = payload.get("errorMessage") | ||
| error_type = payload.get("errorType") | ||
| trace = payload.get("trace", []) | ||
| raise StackFailedToCreateException( | ||
| json.dumps( | ||
| { | ||
| "error": f"Failed to create Model CloudFormation Stack. {error_type}: {error_message}", | ||
| "event": event, | ||
| "deployer_response": payload, | ||
| "debug_info": { | ||
| "error_type": error_type, | ||
| "error_message": error_message, | ||
| "stack_trace": trace, | ||
| "full_response": payload, | ||
| }, |
Indentation is inconsistent in the new hunk. Lines 404-410 have incorrect indentation (extra spaces). Ensure proper Python indentation alignment with the surrounding code.
| from decimal import Decimal | ||
| from functools import cache | ||
| from typing import Any, Callable, cast, Dict, Optional, TypeVar, Union | ||
| from typing import Any, Callable, cast, Dict, List, Optional, TypeVar, Union |
The import statement adds List to the typing imports. Verify that List is actually used in this module. If it's not used, remove it to keep imports minimal; if it is used, consider the built-in list type hint instead, which is preferred in Python 3.9+.
|
|
||
| response = lambda_client.invoke( | ||
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"), | ||
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"), |
Missing default fallback value for SCHEDULE_MANAGEMENT_FUNCTION_NAME environment variable. If this environment variable is not set, os.environ.get() will return None, causing the Lambda invocation to fail. Consider adding a sensible default value or ensuring this variable is always defined in the deployment configuration.
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"), | |
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"), |
|
|
||
| response = lambda_client.invoke( | ||
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"), | ||
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"), |
Missing default fallback value for SCHEDULE_MANAGEMENT_FUNCTION_NAME environment variable. If this environment variable is not set, os.environ.get() will return None, causing the Lambda invocation to fail. Consider adding a sensible default value or ensuring this variable is always defined in the deployment configuration.
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"), | |
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"), |
| const scheduleMonitoringLambda = new Function(this, 'ScheduleMonitoring', { | ||
| runtime: getDefaultRuntime(), | ||
| handler: 'models.scheduling.schedule_monitoring.lambda_handler', | ||
| code: Code.fromAsset(lambdaPath), | ||
| layers: lambdaLayers, | ||
| environment: { | ||
| MODEL_TABLE_NAME: modelTable.tableName, | ||
| ECS_CLUSTER_NAME: `${config.deploymentPrefix}-ECS-Cluster`, | ||
| LISA_API_URL_PS_NAME: lisaServeEndpointUrlPs.parameterName, | ||
| MANAGEMENT_KEY_NAME: managementKeyName, | ||
| REST_API_VERSION: 'v2', | ||
| }, | ||
| role: stateMachinesLambdaRole, | ||
| vpc: vpc.vpc, | ||
| securityGroups: securityGroups, | ||
| timeout: Duration.minutes(5), | ||
| description: 'Processes Auto Scaling Group CloudWatch events to update model status', | ||
| }); |
The scheduleMonitoringLambda environment variable ECS_CLUSTER_NAME is set to ${config.deploymentPrefix}-ECS-Cluster, but according to the PR description, the monitoring logic has been refactored to use Auto Scaling Group state checking instead of ECS-based status tracking. Verify that this environment variable is still needed or remove it if it is no longer used.
| public addScheduleAwareService ( | ||
| taskName: ECSTasks, | ||
| taskDefinition: TaskDefinition, | ||
| scheduleConfig?: { | ||
| scheduleEnabled: boolean; | ||
| scheduleType: string; | ||
| timezone: string; | ||
| } | ||
| ): { service: Ec2Service; targetGroup?: ApplicationTargetGroup } { | ||
| const result = this.addTask(taskName, taskDefinition); | ||
| const { service } = result; | ||
|
|
||
| // Add schedule-related tags to the service | ||
| if (scheduleConfig?.scheduleEnabled) { | ||
| Tags.of(service).add('ScheduleEnabled', 'true'); | ||
| Tags.of(service).add('ScheduleType', scheduleConfig.scheduleType); | ||
| Tags.of(service).add('Timezone', scheduleConfig.timezone); | ||
| Tags.of(service).add('ScheduleManaged', 'true'); | ||
| } else { | ||
| Tags.of(service).add('ScheduleEnabled', 'false'); | ||
| Tags.of(service).add('RunsAllTime', 'true'); | ||
| } | ||
|
|
||
| // Add schedule-related environment variables | ||
| if (scheduleConfig?.scheduleEnabled) { | ||
| const container = service.taskDefinition.findContainer(createCdkId([taskName, 'Container'])); | ||
| if (container) { | ||
| container.addEnvironment('SCHEDULE_ENABLED', 'true'); | ||
| container.addEnvironment('SCHEDULE_TYPE', scheduleConfig.scheduleType); | ||
| container.addEnvironment('SCHEDULE_TIMEZONE', scheduleConfig.timezone); | ||
| } | ||
| } | ||
|
|
||
| return result; | ||
| } |
The addScheduleAwareService() method adds schedule-related tags and environment variables to services, but there are two potential issues with the implementation:
- The method adds environment variables directly to the container after the service is created via this.addTask(), which may have already started the task. Consider whether these environment variables should be set before service creation.
- The method looks up the container with createCdkId([taskName, 'Container']), but there's no validation that this container exists or that the naming convention matches. If the container isn't found, the environment variables are silently not added. Consider adding explicit error handling or logging.
| public addScheduleAwareService ( | |
| taskName: ECSTasks, | |
| taskDefinition: TaskDefinition, | |
| scheduleConfig?: { | |
| scheduleEnabled: boolean; | |
| scheduleType: string; | |
| timezone: string; | |
| } | |
| ): { service: Ec2Service; targetGroup?: ApplicationTargetGroup } { | |
| const result = this.addTask(taskName, taskDefinition); | |
| const { service } = result; | |
| // Add schedule-related tags to the service | |
| if (scheduleConfig?.scheduleEnabled) { | |
| Tags.of(service).add('ScheduleEnabled', 'true'); | |
| Tags.of(service).add('ScheduleType', scheduleConfig.scheduleType); | |
| Tags.of(service).add('Timezone', scheduleConfig.timezone); | |
| Tags.of(service).add('ScheduleManaged', 'true'); | |
| } else { | |
| Tags.of(service).add('ScheduleEnabled', 'false'); | |
| Tags.of(service).add('RunsAllTime', 'true'); | |
| } | |
| // Add schedule-related environment variables | |
| if (scheduleConfig?.scheduleEnabled) { | |
| const container = service.taskDefinition.findContainer(createCdkId([taskName, 'Container'])); | |
| if (container) { | |
| container.addEnvironment('SCHEDULE_ENABLED', 'true'); | |
| container.addEnvironment('SCHEDULE_TYPE', scheduleConfig.scheduleType); | |
| container.addEnvironment('SCHEDULE_TIMEZONE', scheduleConfig.timezone); | |
| } | |
| } | |
| return result; | |
| } | |
| public addScheduleAwareService ( | |
| taskName: ECSTasks, | |
| taskDefinition: TaskDefinition, | |
| scheduleConfig?: { | |
| scheduleEnabled: boolean; | |
| scheduleType: string; | |
| timezone: string; | |
| } | |
| ): { service: Ec2Service; targetGroup?: ApplicationTargetGroup } { | |
| // Add schedule-related environment variables before service creation | |
| if (scheduleConfig?.scheduleEnabled) { | |
| const container = taskDefinition.findContainer(createCdkId([taskName, 'Container'])); | |
| if (container) { | |
| container.addEnvironment('SCHEDULE_ENABLED', 'true'); | |
| container.addEnvironment('SCHEDULE_TYPE', scheduleConfig.scheduleType); | |
| container.addEnvironment('SCHEDULE_TIMEZONE', scheduleConfig.timezone); | |
| } | |
| } | |
| const result = this.addTask(taskName, taskDefinition); | |
| const { service } = result; | |
| // Add schedule-related tags to the service | |
| if (scheduleConfig?.scheduleEnabled) { | |
| Tags.of(service).add('ScheduleEnabled', 'true'); | |
| Tags.of(service).add('ScheduleType', scheduleConfig.scheduleType); | |
| Tags.of(service).add('Timezone', scheduleConfig.timezone); | |
| Tags.of(service).add('ScheduleManaged', 'true'); | |
| } else { | |
| Tags.of(service).add('ScheduleEnabled', 'false'); | |
| Tags.of(service).add('RunsAllTime', 'true'); | |
| } | |
| return result; | |
| } |
| public createScheduledAction ( | ||
| actionName: string, | ||
| schedule: string, | ||
| minSize?: number, | ||
| maxSize?: number, | ||
| desiredCapacity?: number, | ||
| timezone?: string | ||
| ): CfnScheduledAction { | ||
| const scheduledAction = new CfnScheduledAction(this, createCdkId([this.identifier, actionName, 'ScheduledAction']), { | ||
| autoScalingGroupName: this.autoScalingGroup.autoScalingGroupName, | ||
| recurrence: schedule, | ||
| ...(minSize !== undefined && { minSize }), | ||
| ...(maxSize !== undefined && { maxSize }), | ||
| ...(desiredCapacity !== undefined && { desiredCapacity }), | ||
| ...(timezone && { timeZone: timezone }) | ||
| }); | ||
|
|
||
| // Add tags to track the scheduled action | ||
| Tags.of(scheduledAction).add('ActionType', 'Schedule'); | ||
| Tags.of(scheduledAction).add('LISACluster', this.identifier); | ||
| Tags.of(scheduledAction).add('CreatedBy', 'LISA-ScheduleManagement'); | ||
|
|
||
| return scheduledAction; | ||
| } |
The createScheduledAction() method uses conditional spread operators to optionally include capacity parameters (lines 445-447). However, AWS CloudFormation requires at least one of minSize, maxSize, or desiredCapacity to be specified for a scheduled action to be meaningful. Consider adding validation to ensure at least one capacity parameter is provided:
| public createScheduledAction ( | |
| actionName: string, | |
| schedule: string, | |
| minSize?: number, | |
| maxSize?: number, | |
| desiredCapacity?: number, | |
| timezone?: string | |
| ): CfnScheduledAction { | |
| const scheduledAction = new CfnScheduledAction(this, createCdkId([this.identifier, actionName, 'ScheduledAction']), { | |
| autoScalingGroupName: this.autoScalingGroup.autoScalingGroupName, | |
| recurrence: schedule, | |
| ...(minSize !== undefined && { minSize }), | |
| ...(maxSize !== undefined && { maxSize }), | |
| ...(desiredCapacity !== undefined && { desiredCapacity }), | |
| ...(timezone && { timeZone: timezone }) | |
| }); | |
| // Add tags to track the scheduled action | |
| Tags.of(scheduledAction).add('ActionType', 'Schedule'); | |
| Tags.of(scheduledAction).add('LISACluster', this.identifier); | |
| Tags.of(scheduledAction).add('CreatedBy', 'LISA-ScheduleManagement'); | |
| return scheduledAction; | |
| } | |
| public createScheduledAction ( | |
| actionName: string, | |
| schedule: string, | |
| minSize?: number, | |
| maxSize?: number, | |
| desiredCapacity?: number, | |
| timezone?: string | |
| ): CfnScheduledAction { | |
| if (minSize === undefined && maxSize === undefined && desiredCapacity === undefined) { | |
| throw new Error('At least one of minSize, maxSize, or desiredCapacity must be specified for a scheduled action'); | |
| } | |
| const scheduledAction = new CfnScheduledAction(this, createCdkId([this.identifier, actionName, 'ScheduledAction']), { | |
| autoScalingGroupName: this.autoScalingGroup.autoScalingGroupName, | |
| recurrence: schedule, | |
| ...(minSize !== undefined && { minSize }), | |
| ...(maxSize !== undefined && { maxSize }), | |
| ...(desiredCapacity !== undefined && { desiredCapacity }), | |
| ...(timezone && { timeZone: timezone }) | |
| }); | |
| // Add tags to track the scheduled action | |
| Tags.of(scheduledAction).add('ActionType', 'Schedule'); | |
| Tags.of(scheduledAction).add('LISACluster', this.identifier); | |
| Tags.of(scheduledAction).add('CreatedBy', 'LISA-ScheduleManagement'); | |
| return scheduledAction; | |
| } |
| try: | ||
| result = schedule_management.delete_schedule(payload) | ||
|
|
||
| result = json.loads(response["Payload"].read()) | ||
| if result.get("statusCode") != 200: | ||
| error_message = result.get("body", {}).get("message", "Unknown error") | ||
| raise ValueError(f"Failed to delete schedule: {error_message}") | ||
|
|
||
| if response["StatusCode"] != 200 or result.get("statusCode") != 200: | ||
| error_message = result.get("body", {}).get("message", "Unknown error") | ||
| raise ValueError(f"Failed to delete schedule: {error_message}") | ||
| except Exception as e: | ||
| raise ValueError(f"Failed to delete schedule: {str(e)}") |
Similar to the UpdateScheduleHandler, verify that schedule_management.delete_schedule() returns a dict with a statusCode key. The error handling assumes this structure, but if the function raises exceptions instead, the try-except should be adjusted accordingly.
| import boto3 | ||
| from botocore.config import Config | ||
|
|
||
| from ..scheduling import schedule_management |
The import of schedule_management module is added here, but the old code removed the lambda_client initialization. Ensure that schedule_management module is properly available in the deployment package and that all its functions (update_schedule, delete_schedule) are correctly implemented to handle the payloads being passed.
| # Call schedule management function directly | ||
| payload = { | ||
| "operation": "update", | ||
| "modelId": model_id, | ||
| "scheduleConfig": scheduling_config, | ||
| "autoScalingGroup": auto_scaling_group, | ||
| } | ||
|
|
||
| response = lambda_client.invoke( | ||
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"), | ||
| InvocationType="RequestResponse", | ||
| Payload=json.dumps(payload), | ||
| ) | ||
|
|
||
| result = json.loads(response["Payload"].read()) | ||
| result = schedule_management.update_schedule(payload) | ||
|
|
||
| if result.get("statusCode") == 200: | ||
| result_body = json.loads(result["body"]) | ||
| result_body = json.loads(result["body"]) if isinstance(result["body"], str) else result["body"] |
The refactoring changes from Lambda invocation to direct function calls. However, there's an inconsistency in result handling:
- The code checks if result.get("statusCode") == 200, but a direct function call may not return a dict with a statusCode key.
- The result parsing assumes the result structure matches the Lambda response format.
Verify that schedule_management.update_schedule() returns a dict with statusCode and body keys, or adjust the result handling accordingly.
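One defensive option is a small normalization helper that accepts either a Lambda-style response or a plain dict. This is a sketch under the assumption that both shapes can occur; the helper name is hypothetical:

import json
from typing import Any, Dict, Tuple

def normalize_schedule_result(result: Dict[str, Any]) -> Tuple[bool, Dict[str, Any]]:
    """Return (success, body) for either a Lambda-style response or a plain dict."""
    if "statusCode" in result:
        ok = result.get("statusCode") == 200
        body = result.get("body", {})
        # Lambda responses often serialize the body as a JSON string.
        if isinstance(body, str):
            body = json.loads(body)
        return ok, body
    # A direct function call that returns the body itself.
    return True, result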
| payload = { | ||
| "operation": "update", | ||
| "modelId": model_id, | ||
| "scheduleConfig": new_scheduling_config, | ||
| "autoScalingGroup": auto_scaling_group, | ||
| } | ||
|
|
||
| response = lambda_client.invoke( | ||
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"), | ||
| InvocationType="RequestResponse", | ||
| Payload=json.dumps(payload), | ||
| ) | ||
|
|
||
| result = json.loads(response["Payload"].read()) | ||
| result = schedule_management.update_schedule(payload) | ||
|
|
||
| if result.get("statusCode") == 200: | ||
| result_body = json.loads(result["body"]) | ||
| result_body = json.loads(result["body"]) if isinstance(result["body"], str) else result["body"] |
Same issue as the previous hunk: the result handling assumes a Lambda response format with statusCode and body keys. Ensure schedule_management.update_schedule() returns the expected structure, or update the result parsing logic to match the actual return type.
| payload = {"operation": "delete", "modelId": model_id} | ||
|
|
||
| response = lambda_client.invoke( | ||
| FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"), | ||
| InvocationType="RequestResponse", | ||
| Payload=json.dumps(payload), | ||
| ) | ||
|
|
||
| result = json.loads(response["Payload"].read()) | ||
| result = schedule_management.delete_schedule(payload) |
The result handling on line 164 (not shown but implied) checks result.get("statusCode") == 200. Verify that schedule_management.delete_schedule() returns a dict with a statusCode key, or adjust the result handling to match the actual return type of the direct function call.
| // Add permission for state machine lambdas to invoke the ScheduleManagement lambda | ||
| const scheduleManagementPermission = new Policy(this, 'ScheduleManagementInvokePerms', { | ||
| statements: [ | ||
| new PolicyStatement({ | ||
| effect: Effect.ALLOW, | ||
| actions: [ | ||
| 'lambda:InvokeFunction', | ||
| ], | ||
| resources: [ | ||
| scheduleManagementLambda.functionArn, | ||
| ], | ||
| }) | ||
| ] | ||
| }); | ||
| stateMachinesLambdaRole.attachInlinePolicy(scheduleManagementPermission); |
The inline policy ScheduleManagementInvokePerms grants permission to invoke scheduleManagementLambda, but there's no corresponding permission for the CloudWatch Events rule to invoke scheduleMonitoringLambda. Add a Lambda invoke permission for the CloudWatch Events rule:
| // Add permission for state machine lambdas to invoke the ScheduleManagement lambda | |
| const scheduleManagementPermission = new Policy(this, 'ScheduleManagementInvokePerms', { | |
| statements: [ | |
| new PolicyStatement({ | |
| effect: Effect.ALLOW, | |
| actions: [ | |
| 'lambda:InvokeFunction', | |
| ], | |
| resources: [ | |
| scheduleManagementLambda.functionArn, | |
| ], | |
| }) | |
| ] | |
| }); | |
| stateMachinesLambdaRole.attachInlinePolicy(scheduleManagementPermission); | |
| // Add permission for state machine lambdas to invoke the ScheduleManagement lambda | |
| const scheduleManagementPermission = new Policy(this, 'ScheduleManagementInvokePerms', { | |
| statements: [ | |
| new PolicyStatement({ | |
| effect: Effect.ALLOW, | |
| actions: [ | |
| 'lambda:InvokeFunction', | |
| ], | |
| resources: [ | |
| scheduleManagementLambda.functionArn, | |
| ], | |
| }) | |
| ] | |
| }); | |
| stateMachinesLambdaRole.attachInlinePolicy(scheduleManagementPermission); | |
| scheduleMonitoringLambda.grantInvoke(new ServicePrincipal('events.amazonaws.com')); |
| // Check if this is a scheduling-only update | ||
| const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling && | ||
| Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig | ||
| Object.keys(updateFields.autoScalingConfig).length === 1 && // only scheduling | ||
| Object.keys(updateFields.autoScalingConfig)[0] === 'scheduling'; | ||
|
|
||
| if (isSchedulingOnlyUpdate) { | ||
| // Use separate schedule API for scheduling-only updates | ||
| resetScheduleUpdate(); | ||
| updateScheduleMutation({ | ||
| modelId: props.selectedItems[0].modelId, | ||
| scheduleConfig: state.form.autoScalingConfig.scheduling | ||
| }); | ||
| } else { | ||
| // Handle autoScalingConfig if present (non-scheduling changes) | ||
| if (updateFields.autoScalingConfig) { | ||
| // Only pick instance-specific fields for autoScalingInstanceConfig | ||
| const instanceConfigFields = ['minCapacity', 'maxCapacity', 'desiredCapacity', 'cooldown', 'defaultInstanceWarmup']; | ||
| const autoScalingInstanceConfig = _.pick(updateFields.autoScalingConfig, instanceConfigFields); | ||
| const filteredInstanceConfig = _.pickBy(autoScalingInstanceConfig, (value) => value !== undefined); | ||
|
|
||
| // Only include autoScalingInstanceConfig if it has at least one instance-specific property | ||
| if (!_.isEmpty(filteredInstanceConfig)) { | ||
| updateRequest.autoScalingInstanceConfig = filteredInstanceConfig; | ||
| } | ||
| } |
The logic for detecting scheduling-only updates has a potential issue. The condition checks if updateFields.autoScalingConfig has only a scheduling key, but this doesn't account for cases where other fields in autoScalingConfig might be undefined or null. Consider using a more robust check:
| // Check if this is a scheduling-only update | |
| const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling && | |
| Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig | |
| Object.keys(updateFields.autoScalingConfig).length === 1 && // only scheduling | |
| Object.keys(updateFields.autoScalingConfig)[0] === 'scheduling'; | |
| if (isSchedulingOnlyUpdate) { | |
| // Use separate schedule API for scheduling-only updates | |
| resetScheduleUpdate(); | |
| updateScheduleMutation({ | |
| modelId: props.selectedItems[0].modelId, | |
| scheduleConfig: state.form.autoScalingConfig.scheduling | |
| }); | |
| } else { | |
| // Handle autoScalingConfig if present (non-scheduling changes) | |
| if (updateFields.autoScalingConfig) { | |
| // Only pick instance-specific fields for autoScalingInstanceConfig | |
| const instanceConfigFields = ['minCapacity', 'maxCapacity', 'desiredCapacity', 'cooldown', 'defaultInstanceWarmup']; | |
| const autoScalingInstanceConfig = _.pick(updateFields.autoScalingConfig, instanceConfigFields); | |
| const filteredInstanceConfig = _.pickBy(autoScalingInstanceConfig, (value) => value !== undefined); | |
| // Only include autoScalingInstanceConfig if it has at least one instance-specific property | |
| if (!_.isEmpty(filteredInstanceConfig)) { | |
| updateRequest.autoScalingInstanceConfig = filteredInstanceConfig; | |
| } | |
| } | |
| // Check if this is a scheduling-only update | |
| const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling && | |
| Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig | |
| Object.keys(updateFields.autoScalingConfig).filter(key => updateFields.autoScalingConfig[key] !== undefined).length === 1 && // only scheduling | |
| updateFields.autoScalingConfig.scheduling !== undefined; |
Alternatively, explicitly check that other ASG fields are undefined:
| // Check if this is a scheduling-only update | |
| const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling && | |
| Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig | |
| Object.keys(updateFields.autoScalingConfig).length === 1 && // only scheduling | |
| Object.keys(updateFields.autoScalingConfig)[0] === 'scheduling'; | |
| if (isSchedulingOnlyUpdate) { | |
| // Use separate schedule API for scheduling-only updates | |
| resetScheduleUpdate(); | |
| updateScheduleMutation({ | |
| modelId: props.selectedItems[0].modelId, | |
| scheduleConfig: state.form.autoScalingConfig.scheduling | |
| }); | |
| } else { | |
| // Handle autoScalingConfig if present (non-scheduling changes) | |
| if (updateFields.autoScalingConfig) { | |
| // Only pick instance-specific fields for autoScalingInstanceConfig | |
| const instanceConfigFields = ['minCapacity', 'maxCapacity', 'desiredCapacity', 'cooldown', 'defaultInstanceWarmup']; | |
| const autoScalingInstanceConfig = _.pick(updateFields.autoScalingConfig, instanceConfigFields); | |
| const filteredInstanceConfig = _.pickBy(autoScalingInstanceConfig, (value) => value !== undefined); | |
| // Only include autoScalingInstanceConfig if it has at least one instance-specific property | |
| if (!_.isEmpty(filteredInstanceConfig)) { | |
| updateRequest.autoScalingInstanceConfig = filteredInstanceConfig; | |
| } | |
| } | |
| // Check if this is a scheduling-only update | |
| const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling && | |
| Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig | |
| updateFields.autoScalingConfig.minCapacity === undefined && | |
| updateFields.autoScalingConfig.maxCapacity === undefined && | |
| updateFields.autoScalingConfig.desiredCapacity === undefined && | |
| updateFields.autoScalingConfig.cooldown === undefined && | |
| updateFields.autoScalingConfig.defaultInstanceWarmup === undefined; |
| useEffect(() => { | ||
| if (!isScheduleUpdating && isScheduleUpdateSuccess) { | ||
| notificationService.generateNotification(`Successfully updated schedule: ${state.form.modelId}`, 'success'); | ||
| props.setVisible(false); | ||
| props.setIsEdit(false); | ||
| props.setSelectedItems([]); | ||
| resetState(); | ||
| } | ||
| // eslint-disable-next-line react-hooks/exhaustive-deps | ||
| }, [isScheduleUpdating, isScheduleUpdateSuccess]); |
The useEffect hook for schedule update success omits values it uses (such as notificationService, props, and resetState) from its dependency array. While the eslint-disable comment suppresses the warning, this could lead to stale closures. Consider adding them to the dependency array or ensuring they're properly memoized:
| useEffect(() => { | |
| if (!isScheduleUpdating && isScheduleUpdateSuccess) { | |
| notificationService.generateNotification(`Successfully updated schedule: ${state.form.modelId}`, 'success'); | |
| props.setVisible(false); | |
| props.setIsEdit(false); | |
| props.setSelectedItems([]); | |
| resetState(); | |
| } | |
| // eslint-disable-next-line react-hooks/exhaustive-deps | |
| }, [isScheduleUpdating, isScheduleUpdateSuccess]); | |
| useEffect(() => { | |
| if (!isScheduleUpdating && isScheduleUpdateSuccess) { | |
| notificationService.generateNotification(`Successfully updated schedule: ${state.form.modelId}`, 'success'); | |
| props.setVisible(false); | |
| props.setIsEdit(false); | |
| props.setSelectedItems([]); | |
| resetState(); | |
| } | |
| // eslint-disable-next-line react-hooks/exhaustive-deps | |
| }, [isScheduleUpdating, isScheduleUpdateSuccess, notificationService]); |
| const reviewError = normalizeError('Model', | ||
| isCreateError ? createError : | ||
| isUpdateError ? updateError : | ||
| isScheduleUpdateError ? scheduleUpdateError : undefined); |
The error normalization logic now includes scheduleUpdateError, but there's no corresponding error handling UI or user feedback mechanism shown for schedule update failures. Ensure that scheduleUpdateError is properly displayed to the user when a schedule update fails, similar to how other errors are handled.
| model_config = model_item.get("model_config", {}) | ||
| auto_scaling_config = model_config.get("autoScalingConfig", {}) | ||
| scheduling_config = auto_scaling_config.get("scheduling", {}) | ||
|
|
||
| # Calculate next scheduled action if schedule is active | ||
| next_action = None | ||
| if scheduling_config.get("scheduleEnabled", False): | ||
| next_action = calculate_next_scheduled_action(scheduling_config) | ||
|
|
||
| return { | ||
| "statusCode": 200, | ||
| "body": json.dumps( | ||
| {"modelId": model_id, "scheduling": scheduling_config, "nextScheduledAction": next_action}, default=str | ||
| ), | ||
| "body": json.dumps({"modelId": model_id, "scheduling": scheduling_config}, default=str), |
The indentation is inconsistent in this block. Lines 169 and 171-173 have extra leading spaces that don't align with the surrounding code structure. This should be corrected to maintain consistent indentation.
| model_config = model_item.get("model_config", {}) | |
| auto_scaling_config = model_config.get("autoScalingConfig", {}) | |
| scheduling_config = auto_scaling_config.get("scheduling", {}) | |
| # Calculate next scheduled action if schedule is active | |
| next_action = None | |
| if scheduling_config.get("scheduleEnabled", False): | |
| next_action = calculate_next_scheduled_action(scheduling_config) | |
| return { | |
| "statusCode": 200, | |
| "body": json.dumps( | |
| {"modelId": model_id, "scheduling": scheduling_config, "nextScheduledAction": next_action}, default=str | |
| ), | |
| "body": json.dumps({"modelId": model_id, "scheduling": scheduling_config}, default=str), | |
| model_item = response["Item"] | |
| model_config = model_item.get("model_config", {}) | |
| auto_scaling_config = model_config.get("autoScalingConfig", {}) | |
| scheduling_config = auto_scaling_config.get("scheduling", {}) | |
| return { | |
| "statusCode": 200, | |
| "body": json.dumps({"modelId": model_id, "scheduling": scheduling_config}, default=str), | |
| } |
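For reference, here is a minimal sketch of what a calculate_next_scheduled_action helper along the lines assumed by this handler could look like for a RECURRING schedule. The config keys (dailySchedule, startTime, stopTime, timezone) follow the diffs above, but the body and return shape are illustrative rather than the PR's actual implementation.

```python
from datetime import datetime, timedelta
from typing import Optional
from zoneinfo import ZoneInfo


def calculate_next_scheduled_action(scheduling_config: dict) -> Optional[dict]:
    """Illustrative only: next START/STOP occurrence for a RECURRING schedule."""
    daily = scheduling_config.get("dailySchedule") or {}
    start_s, stop_s = daily.get("startTime"), daily.get("stopTime")
    if not (start_s and stop_s):
        return None

    tz = ZoneInfo(scheduling_config.get("timezone", "UTC"))
    now = datetime.now(tz)

    candidates = []
    for action, hhmm in (("START", start_s), ("STOP", stop_s)):
        hour, minute = map(int, hhmm.split(":"))
        when = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
        if when <= now:  # already passed today, so the next occurrence is tomorrow
            when += timedelta(days=1)
        candidates.append((when, action))

    when, action = min(candidates)
    return {"action": action, "time": when.isoformat()}
```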
| if schedule_config.scheduleType == ScheduleType.RECURRING: | ||
| # Create daily recurring schedule | ||
| if not schedule_config.dailySchedule: | ||
| raise ValueError("dailySchedule required for RECURRING_DAILY type") | ||
| raise ValueError("dailySchedule required for RECURRING type") | ||
|
|
||
| scheduled_action_arns.extend( | ||
| create_daily_scheduled_actions( | ||
| model_id, auto_scaling_group, schedule_config.dailySchedule, schedule_config.timezone | ||
| ) | ||
| ) | ||
|
|
||
| elif schedule_config.scheduleType == ScheduleType.EACH_DAY: | ||
| elif schedule_config.scheduleType == ScheduleType.DAILY: | ||
| # Create individual day schedules | ||
| if not schedule_config.weeklySchedule: | ||
| raise ValueError("weeklySchedule required for EACH_DAY type") | ||
| raise ValueError("weeklySchedule required for DAILY type") |
The indentation is inconsistent in this block. Lines 186-188 and 197-199 have extra leading spaces. Also double-check the type-to-field mapping: as written, ScheduleType.RECURRING requires dailySchedule (one start/stop time applied every day) and ScheduleType.DAILY requires weeklySchedule (per-day times). Confirm this pairing matches the renamed types described in the PR summary, since the two names are easy to confuse.
| if schedule_config.scheduleType == ScheduleType.RECURRING: | |
| # Create daily recurring schedule | |
| if not schedule_config.dailySchedule: | |
| raise ValueError("dailySchedule required for RECURRING_DAILY type") | |
| raise ValueError("dailySchedule required for RECURRING type") | |
| scheduled_action_arns.extend( | |
| create_daily_scheduled_actions( | |
| model_id, auto_scaling_group, schedule_config.dailySchedule, schedule_config.timezone | |
| ) | |
| ) | |
| elif schedule_config.scheduleType == ScheduleType.EACH_DAY: | |
| elif schedule_config.scheduleType == ScheduleType.DAILY: | |
| # Create individual day schedules | |
| if not schedule_config.weeklySchedule: | |
| raise ValueError("weeklySchedule required for EACH_DAY type") | |
| raise ValueError("weeklySchedule required for DAILY type") | |
| if schedule_config.scheduleType == ScheduleType.RECURRING: | |
| # Create daily recurring schedule | |
| if not schedule_config.dailySchedule: | |
| raise ValueError("dailySchedule required for RECURRING type") | |
| scheduled_action_arns.extend( | |
| create_daily_scheduled_actions( | |
| model_id, auto_scaling_group, schedule_config.dailySchedule, schedule_config.timezone | |
| ) | |
| ) | |
| elif schedule_config.scheduleType == ScheduleType.DAILY: | |
| # Create individual day schedules | |
| if not schedule_config.weeklySchedule: | |
| raise ValueError("weeklySchedule required for DAILY type") |
| # Create start action | ||
| start_cron = convert_to_utc_cron(day_schedule.startTime, timezone_name) | ||
| start_cron = time_to_cron(day_schedule.startTime) | ||
| start_action_name = f"{model_id}-daily-start" | ||
|
|
||
| try: | ||
| autoscaling_client.put_scheduled_update_group_action( | ||
| AutoScalingGroupName=auto_scaling_group, | ||
| ScheduledActionName=start_action_name, | ||
| Recurrence=start_cron, | ||
| TimeZone=timezone_name, |
The indentation is inconsistent in this block. Lines 289-291 and 293 have extra leading spaces that don't align with the surrounding code.
| # Create start action | |
| start_cron = convert_to_utc_cron(day_schedule.startTime, timezone_name) | |
| start_cron = time_to_cron(day_schedule.startTime) | |
| start_action_name = f"{model_id}-daily-start" | |
| try: | |
| autoscaling_client.put_scheduled_update_group_action( | |
| AutoScalingGroupName=auto_scaling_group, | |
| ScheduledActionName=start_action_name, | |
| Recurrence=start_cron, | |
| TimeZone=timezone_name, | |
| # Create start action | |
| start_cron = time_to_cron(day_schedule.startTime) | |
| start_action_name = f"{model_id}-daily-start" | |
| try: | |
| autoscaling_client.put_scheduled_update_group_action( | |
| AutoScalingGroupName=auto_scaling_group, | |
| ScheduledActionName=start_action_name, | |
| Recurrence=start_cron, | |
| TimeZone=timezone_name, |
| stop_action_name = f"{model_id}-daily-stop" | ||
|
|
||
| try: | ||
| autoscaling_client.put_scheduled_update_group_action( | ||
| AutoScalingGroupName=auto_scaling_group, | ||
| ScheduledActionName=stop_action_name, | ||
| Recurrence=stop_cron, | ||
| TimeZone=timezone_name, |
The indentation is inconsistent in this block. Lines 314-315 and 316 have extra leading spaces that don't align with the surrounding code.
| stop_action_name = f"{model_id}-daily-stop" | |
| try: | |
| autoscaling_client.put_scheduled_update_group_action( | |
| AutoScalingGroupName=auto_scaling_group, | |
| ScheduledActionName=stop_action_name, | |
| Recurrence=stop_cron, | |
| TimeZone=timezone_name, | |
| # Create stop action | |
| stop_cron = time_to_cron(day_schedule.stopTime) | |
| stop_action_name = f"{model_id}-daily-stop" | |
| try: | |
| autoscaling_client.put_scheduled_update_group_action( | |
| AutoScalingGroupName=auto_scaling_group, | |
| ScheduledActionName=stop_action_name, | |
| Recurrence=stop_cron, | |
| TimeZone=timezone_name, |
| start_action_name = f"{model_id}-{day_name}-start" | ||
|
|
||
| try: | ||
| autoscaling_client.put_scheduled_update_group_action( | ||
| AutoScalingGroupName=auto_scaling_group, | ||
| ScheduledActionName=start_action_name, | ||
| Recurrence=start_cron, | ||
| TimeZone=timezone_name, |
The indentation is inconsistent in this block. Lines 370-371 and 372 have extra leading spaces that don't align with the surrounding code.
| start_action_name = f"{model_id}-{day_name}-start" | |
| try: | |
| autoscaling_client.put_scheduled_update_group_action( | |
| AutoScalingGroupName=auto_scaling_group, | |
| ScheduledActionName=start_action_name, | |
| Recurrence=start_cron, | |
| TimeZone=timezone_name, | |
| # Create start action for this day | |
| start_cron = time_to_cron_with_day(day_schedule.startTime, day_num) | |
| start_action_name = f"{model_id}-{day_name}-start" | |
| try: | |
| autoscaling_client.put_scheduled_update_group_action( | |
| AutoScalingGroupName=auto_scaling_group, | |
| ScheduledActionName=start_action_name, | |
| Recurrence=start_cron, | |
| TimeZone=timezone_name, |
| stop_action_name = f"{model_id}-{day_name}-stop" | ||
|
|
||
| try: | ||
| autoscaling_client.put_scheduled_update_group_action( | ||
| AutoScalingGroupName=auto_scaling_group, | ||
| ScheduledActionName=stop_action_name, | ||
| Recurrence=stop_cron, | ||
| TimeZone=timezone_name, |
The indentation is inconsistent in this block. Lines 394-395 and 396 have extra leading spaces that don't align with the surrounding code.
| stop_action_name = f"{model_id}-{day_name}-stop" | |
| try: | |
| autoscaling_client.put_scheduled_update_group_action( | |
| AutoScalingGroupName=auto_scaling_group, | |
| ScheduledActionName=stop_action_name, | |
| Recurrence=stop_cron, | |
| TimeZone=timezone_name, | |
| # Create stop action for this day | |
| stop_cron = time_to_cron_with_day(day_schedule.stopTime, day_num) | |
| stop_action_name = f"{model_id}-{day_name}-stop" | |
| try: | |
| autoscaling_client.put_scheduled_update_group_action( | |
| AutoScalingGroupName=auto_scaling_group, | |
| ScheduledActionName=stop_action_name, | |
| Recurrence=stop_cron, | |
| TimeZone=timezone_name, |
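To see how the per-day pieces fit together, here is a hedged sketch of creating one start/stop pair per weekday with put_scheduled_update_group_action. The helper names, the day-number map, and the scale-to-zero capacities on the stop action are assumptions, not the PR's exact code.

```python
import boto3

autoscaling_client = boto3.client("autoscaling")

# Cron day-of-week numbers used by Auto Scaling recurrence expressions (0 = Sunday).
DAY_TO_CRON_NUM = {
    "sunday": 0, "monday": 1, "tuesday": 2, "wednesday": 3,
    "thursday": 4, "friday": 5, "saturday": 6,
}


def time_to_cron_with_day(time_str: str, day_of_week: int) -> str:
    hour, minute = map(int, time_str.split(":"))
    return f"{minute} {hour} * * {day_of_week}"


def put_day_actions(model_id: str, asg_name: str, day_name: str,
                    start_time: str, stop_time: str, timezone_name: str,
                    min_size: int, max_size: int, desired: int) -> list[str]:
    """Create one start and one stop scheduled action for a single weekday."""
    day_num = DAY_TO_CRON_NUM[day_name]
    created = []
    for suffix, cron, sizes in (
        ("start", time_to_cron_with_day(start_time, day_num),
         {"MinSize": min_size, "MaxSize": max_size, "DesiredCapacity": desired}),
        ("stop", time_to_cron_with_day(stop_time, day_num),
         {"MinSize": 0, "MaxSize": max_size, "DesiredCapacity": 0}),  # assumed stop sizing
    ):
        name = f"{model_id}-{day_name}-{suffix}"
        autoscaling_client.put_scheduled_update_group_action(
            AutoScalingGroupName=asg_name,
            ScheduledActionName=name,
            Recurrence=cron,
            TimeZone=timezone_name,
            **sizes,
        )
        created.append(name)
    return created
```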
| hour, minute = map(int, time_str.split(":")) | ||
| return f"{minute} {hour} * * *" | ||
|
|
||
| # Create timezone-aware datetime for today | ||
| tz = ZoneInfo(timezone_name) | ||
| today = datetime.now(tz).date() | ||
| local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz) | ||
|
|
||
| # Convert to UTC | ||
| utc_dt = local_dt.astimezone(dt_timezone.utc) | ||
|
|
||
| # Return cron expression (minute hour * * *) | ||
| return f"{utc_dt.minute} {utc_dt.hour} * * *" | ||
|
|
||
|
|
||
| def convert_to_utc_cron_weekdays(time_str: str, timezone_name: str) -> str: | ||
| """Convert local time to UTC cron expression for weekdays only (Mon-Fri)""" | ||
| from zoneinfo import ZoneInfo | ||
|
|
||
| # Parse time | ||
| def time_to_cron_with_day(time_str: str, day_of_week: int) -> str: | ||
| """Convert time string (HH:MM) to cron expression with day""" | ||
| hour, minute = map(int, time_str.split(":")) | ||
|
|
||
| # Create timezone-aware datetime | ||
| tz = ZoneInfo(timezone_name) | ||
| today = datetime.now(tz).date() | ||
| local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz) | ||
|
|
||
| # Convert to UTC | ||
| utc_dt = local_dt.astimezone(dt_timezone.utc) | ||
|
|
||
| # Return cron expression with weekdays (minute hour * * 1-5) | ||
| return f"{utc_dt.minute} {utc_dt.hour} * * 1-5" | ||
|
|
||
|
|
||
| def convert_to_utc_cron_with_day(time_str: str, timezone_name: str, day_of_week: int) -> str: | ||
| """Convert local time to UTC cron expression with specific day of week""" | ||
| from zoneinfo import ZoneInfo | ||
|
|
||
| # Parse time | ||
| hour, minute = map(int, time_str.split(":")) | ||
|
|
||
| # Create timezone-aware datetime | ||
| tz = ZoneInfo(timezone_name) | ||
| today = datetime.now(tz).date() | ||
| local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz) | ||
|
|
||
| # Convert to UTC | ||
| utc_dt = local_dt.astimezone(dt_timezone.utc) | ||
|
|
||
| # Return cron expression with day of week (minute hour * * day) | ||
| return f"{utc_dt.minute} {utc_dt.hour} * * {day_of_week}" | ||
| return f"{minute} {hour} * * {day_of_week}" |
The indentation is inconsistent in this block. Lines 421 and 427 have extra leading spaces that don't align with the function body.
| hour, minute = map(int, time_str.split(":")) | |
| return f"{minute} {hour} * * *" | |
| # Create timezone-aware datetime for today | |
| tz = ZoneInfo(timezone_name) | |
| today = datetime.now(tz).date() | |
| local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz) | |
| # Convert to UTC | |
| utc_dt = local_dt.astimezone(dt_timezone.utc) | |
| # Return cron expression (minute hour * * *) | |
| return f"{utc_dt.minute} {utc_dt.hour} * * *" | |
| def convert_to_utc_cron_weekdays(time_str: str, timezone_name: str) -> str: | |
| """Convert local time to UTC cron expression for weekdays only (Mon-Fri)""" | |
| from zoneinfo import ZoneInfo | |
| # Parse time | |
| def time_to_cron_with_day(time_str: str, day_of_week: int) -> str: | |
| """Convert time string (HH:MM) to cron expression with day""" | |
| hour, minute = map(int, time_str.split(":")) | |
| # Create timezone-aware datetime | |
| tz = ZoneInfo(timezone_name) | |
| today = datetime.now(tz).date() | |
| local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz) | |
| # Convert to UTC | |
| utc_dt = local_dt.astimezone(dt_timezone.utc) | |
| # Return cron expression with weekdays (minute hour * * 1-5) | |
| return f"{utc_dt.minute} {utc_dt.hour} * * 1-5" | |
| def convert_to_utc_cron_with_day(time_str: str, timezone_name: str, day_of_week: int) -> str: | |
| """Convert local time to UTC cron expression with specific day of week""" | |
| from zoneinfo import ZoneInfo | |
| # Parse time | |
| hour, minute = map(int, time_str.split(":")) | |
| # Create timezone-aware datetime | |
| tz = ZoneInfo(timezone_name) | |
| today = datetime.now(tz).date() | |
| local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz) | |
| # Convert to UTC | |
| utc_dt = local_dt.astimezone(dt_timezone.utc) | |
| # Return cron expression with day of week (minute hour * * day) | |
| return f"{utc_dt.minute} {utc_dt.hour} * * {day_of_week}" | |
| return f"{minute} {hour} * * {day_of_week}" | |
| def time_to_cron(time_str: str) -> str: | |
| """Convert time string (HH:MM) to cron expression""" | |
| hour, minute = map(int, time_str.split(":")) | |
| return f"{minute} {hour} * * *" | |
| def time_to_cron_with_day(time_str: str, day_of_week: int) -> str: | |
| """Convert time string (HH:MM) to cron expression with day""" | |
| hour, minute = map(int, time_str.split(":")) | |
| return f"{minute} {hour} * * {day_of_week}" |
| # Check if model_config.autoScalingConfig exists first | ||
| response = model_table.get_item(Key={"model_id": model_id}) | ||
| if "Item" not in response: | ||
| raise ValueError(f"Model {model_id} not found") | ||
|
|
||
| model_item = response["Item"] | ||
| auto_scaling_config_exists = "autoScalingConfig" in model_item | ||
| model_config_exists = "model_config" in model_item | ||
| auto_scaling_config_exists = model_config_exists and "autoScalingConfig" in model_item.get("model_config", {}) | ||
|
|
||
| if scheduling_config: | ||
| # Prepare the scheduling configuration for storage | ||
| schedule_data = scheduling_config.model_dump() | ||
| schedule_data["scheduledActionArns"] = scheduled_action_arns | ||
| schedule_data["scheduleEnabled"] = enabled | ||
| schedule_data["lastScheduleUpdate"] = datetime.now(dt_timezone.utc).isoformat() | ||
|
|
||
| schedule_data["scheduleConfigured"] = enabled | ||
| schedule_data["lastScheduleFailed"] = False | ||
|
|
||
| # Calculate next scheduled action | ||
| if enabled: | ||
| next_action = calculate_next_scheduled_action(schedule_data) | ||
| if next_action: | ||
| schedule_data["nextScheduledAction"] = next_action | ||
|
|
||
| if auto_scaling_config_exists: | ||
| # Update existing autoScalingConfig.scheduling | ||
| # Update existing model_config.autoScalingConfig.scheduling | ||
| model_table.update_item( | ||
| Key={"model_id": model_id}, | ||
| UpdateExpression="SET autoScalingConfig.scheduling = :scheduling", | ||
| UpdateExpression="SET model_config.autoScalingConfig.scheduling = :scheduling", | ||
| ExpressionAttributeValues={":scheduling": schedule_data}, | ||
| ) | ||
| else: | ||
| # Create autoScalingConfig with scheduling | ||
| # Create model_config.autoScalingConfig with scheduling | ||
| model_table.update_item( | ||
| Key={"model_id": model_id}, | ||
| UpdateExpression="SET autoScalingConfig = :autoScalingConfig", | ||
| UpdateExpression="SET model_config.autoScalingConfig = :autoScalingConfig", | ||
| ExpressionAttributeValues={":autoScalingConfig": {"scheduling": schedule_data}}, | ||
| ) | ||
| else: |
The indentation is inconsistent throughout this block. Lines 544-546, 548, 552-559, 561-567, 569-574 have extra leading spaces that don't align with the surrounding code structure.
| # Check if model_config.autoScalingConfig exists first | |
| response = model_table.get_item(Key={"model_id": model_id}) | |
| if "Item" not in response: | |
| raise ValueError(f"Model {model_id} not found") | |
| model_item = response["Item"] | |
| auto_scaling_config_exists = "autoScalingConfig" in model_item | |
| model_config_exists = "model_config" in model_item | |
| auto_scaling_config_exists = model_config_exists and "autoScalingConfig" in model_item.get("model_config", {}) | |
| if scheduling_config: | |
| # Prepare the scheduling configuration for storage | |
| schedule_data = scheduling_config.model_dump() | |
| schedule_data["scheduledActionArns"] = scheduled_action_arns | |
| schedule_data["scheduleEnabled"] = enabled | |
| schedule_data["lastScheduleUpdate"] = datetime.now(dt_timezone.utc).isoformat() | |
| schedule_data["scheduleConfigured"] = enabled | |
| schedule_data["lastScheduleFailed"] = False | |
| # Calculate next scheduled action | |
| if enabled: | |
| next_action = calculate_next_scheduled_action(schedule_data) | |
| if next_action: | |
| schedule_data["nextScheduledAction"] = next_action | |
| if auto_scaling_config_exists: | |
| # Update existing autoScalingConfig.scheduling | |
| # Update existing model_config.autoScalingConfig.scheduling | |
| model_table.update_item( | |
| Key={"model_id": model_id}, | |
| UpdateExpression="SET autoScalingConfig.scheduling = :scheduling", | |
| UpdateExpression="SET model_config.autoScalingConfig.scheduling = :scheduling", | |
| ExpressionAttributeValues={":scheduling": schedule_data}, | |
| ) | |
| else: | |
| # Create autoScalingConfig with scheduling | |
| # Create model_config.autoScalingConfig with scheduling | |
| model_table.update_item( | |
| Key={"model_id": model_id}, | |
| UpdateExpression="SET autoScalingConfig = :autoScalingConfig", | |
| UpdateExpression="SET model_config.autoScalingConfig = :autoScalingConfig", | |
| ExpressionAttributeValues={":autoScalingConfig": {"scheduling": schedule_data}}, | |
| ) | |
| else: | |
| # Check if model_config.autoScalingConfig exists first | |
| response = model_table.get_item(Key={"model_id": model_id}) | |
| if "Item" not in response: | |
| raise ValueError(f"Model {model_id} not found") | |
| model_item = response["Item"] | |
| model_config_exists = "model_config" in model_item | |
| auto_scaling_config_exists = model_config_exists and "autoScalingConfig" in model_item.get("model_config", {}) | |
| if scheduling_config: | |
| # Prepare the scheduling configuration for storage | |
| schedule_data = scheduling_config.model_dump() | |
| schedule_data["scheduledActionArns"] = scheduled_action_arns | |
| schedule_data["scheduleEnabled"] = enabled | |
| schedule_data["lastScheduleUpdate"] = datetime.now(dt_timezone.utc).isoformat() | |
| schedule_data["scheduleConfigured"] = enabled | |
| schedule_data["lastScheduleFailed"] = False | |
| if auto_scaling_config_exists: | |
| # Update existing model_config.autoScalingConfig.scheduling | |
| model_table.update_item( | |
| Key={"model_id": model_id}, | |
| UpdateExpression="SET model_config.autoScalingConfig.scheduling = :scheduling", | |
| ExpressionAttributeValues={":scheduling": schedule_data}, | |
| ) | |
| else: | |
| # Create model_config.autoScalingConfig with scheduling | |
| model_table.update_item( | |
| Key={"model_id": model_id}, | |
| UpdateExpression="SET model_config.autoScalingConfig = :autoScalingConfig", | |
| ExpressionAttributeValues={":autoScalingConfig": {"scheduling": schedule_data}}, | |
| ) | |
| else: | |
| # Remove scheduling configuration for always run behavior | |
| if auto_scaling_config_exists: | |
| model_table.update_item( | |
| Key={"model_id": model_id}, | |
| UpdateExpression="REMOVE model_config.autoScalingConfig.scheduling", | |
| ) |
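The two-branch write is needed because DynamoDB rejects a SET on a nested path when an intermediate map is missing. A conditional-write variant of the same idea, sketched with a placeholder table name, is shown below; it trades the up-front get_item for a ConditionExpression.

```python
import boto3
from botocore.exceptions import ClientError

model_table = boto3.resource("dynamodb").Table("model-table")  # placeholder table name


def store_schedule(model_id: str, schedule_data: dict) -> None:
    """Try the nested update first; fall back to creating the parent map."""
    try:
        model_table.update_item(
            Key={"model_id": model_id},
            UpdateExpression="SET model_config.autoScalingConfig.scheduling = :s",
            ConditionExpression="attribute_exists(model_config.autoScalingConfig)",
            ExpressionAttributeValues={":s": schedule_data},
        )
    except ClientError as e:
        if e.response["Error"]["Code"] != "ConditionalCheckFailedException":
            raise
        model_table.update_item(
            Key={"model_id": model_id},
            UpdateExpression="SET model_config.autoScalingConfig = :cfg",
            ExpressionAttributeValues={":cfg": {"scheduling": schedule_data}},
        )
```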
| model_config = model_item.get("model_config", {}) | ||
| auto_scaling_config = model_config.get("autoScalingConfig", {}) | ||
| scheduling_config = auto_scaling_config.get("scheduling", {}) | ||
|
|
||
| return scheduling_config.get("scheduledActionArns", []) |
The indentation is inconsistent in this block. Line 527 has extra leading spaces that don't align with the surrounding code.
| model_config = model_item.get("model_config", {}) | |
| auto_scaling_config = model_config.get("autoScalingConfig", {}) | |
| scheduling_config = auto_scaling_config.get("scheduling", {}) | |
| return scheduling_config.get("scheduledActionArns", []) | |
| model_item = response["Item"] | |
| model_config = model_item.get("model_config", {}) | |
| auto_scaling_config = model_config.get("autoScalingConfig", {}) | |
| scheduling_config = auto_scaling_config.get("scheduling", {}) |
| url: '/models/metadata/instances' | ||
| }) | ||
| }), | ||
| updateSchedule: builder.mutation<{message: string, modelId: string, scheduleEnabled: boolean}, {modelId: string, scheduleConfig: any}>({ |
The scheduleConfig parameter is typed as any. Consider using a more specific type definition (e.g., IScheduleConfig or similar) to improve type safety and enable better IDE support and compile-time validation.
| updateSchedule: builder.mutation<{message: string, modelId: string, scheduleEnabled: boolean}, {modelId: string, scheduleConfig: any}>({ | |
| updateSchedule: builder.mutation<{message: string, modelId: string, scheduleEnabled: boolean}, {modelId: string, scheduleConfig: IScheduleConfig}>({ |
| transformErrorResponse: (baseQueryReturnValue) => { | ||
| return { | ||
| name: 'Update Schedule Error', | ||
| message: baseQueryReturnValue.data?.type === 'RequestValidationError' ? | ||
| baseQueryReturnValue.data.detail.map((error) => error.msg).join(', ') : | ||
| baseQueryReturnValue.data.message | ||
| }; | ||
| }, |
The error transformation logic assumes baseQueryReturnValue.data exists and has specific properties. Add a null/undefined check to prevent potential runtime errors when the response structure is unexpected.
| transformErrorResponse: (baseQueryReturnValue) => { | |
| return { | |
| name: 'Update Schedule Error', | |
| message: baseQueryReturnValue.data?.type === 'RequestValidationError' ? | |
| baseQueryReturnValue.data.detail.map((error) => error.msg).join(', ') : | |
| baseQueryReturnValue.data.message | |
| }; | |
| }, | |
| transformErrorResponse: (baseQueryReturnValue) => { | |
| const data = baseQueryReturnValue.data; | |
| return { | |
| name: 'Update Schedule Error', | |
| message: data?.type === 'RequestValidationError' ? | |
| data.detail?.map((error) => error.msg).join(', ') ?? | |
| 'Validation error occurred' : | |
| data?.message ?? 'An error occurred while updating the schedule' | |
| }; | |
| }, |
| const validateDailySchedule = (dailySchedule?: IDaySchedule): string | undefined => { | ||
| if (!dailySchedule) return 'Recurring schedule must be configured when selected.'; | ||
|
|
||
| const { startTime, stopTime } = dailySchedule; | ||
|
|
||
| if (!startTime && !stopTime) { | ||
| return 'Recurring schedule must have both start and stop times.'; | ||
| } | ||
|
|
||
| if (!startTime) return 'Start time is required for recurring schedule.'; | ||
| if (!stopTime) return 'Stop time is required for recurring schedule.'; | ||
|
|
||
| if (!isValidTimeFormat(startTime)) return 'Start time must be in HH:MM format (24-hour).'; | ||
| if (!isValidTimeFormat(stopTime)) return 'Stop time must be in HH:MM format (24-hour).'; | ||
|
|
||
| // Use enhanced validation for time pair | ||
| const timePairError = validateTimePair(startTime, stopTime); | ||
| if (timePairError) return timePairError; | ||
|
|
||
| return undefined; | ||
| }; |
The function name and error messages are inconsistent. The function is named validateDailySchedule but it validates a recurring schedule (same time daily). The error messages reference "Recurring schedule" when they should reference "Daily schedule" or vice versa. This naming mismatch contradicts the PR description which states RECURRING is for same time daily.
| const validateDailySchedule = (dailySchedule?: IDaySchedule): string | undefined => { | |
| if (!dailySchedule) return 'Recurring schedule must be configured when selected.'; | |
| const { startTime, stopTime } = dailySchedule; | |
| if (!startTime && !stopTime) { | |
| return 'Recurring schedule must have both start and stop times.'; | |
| } | |
| if (!startTime) return 'Start time is required for recurring schedule.'; | |
| if (!stopTime) return 'Stop time is required for recurring schedule.'; | |
| if (!isValidTimeFormat(startTime)) return 'Start time must be in HH:MM format (24-hour).'; | |
| if (!isValidTimeFormat(stopTime)) return 'Stop time must be in HH:MM format (24-hour).'; | |
| // Use enhanced validation for time pair | |
| const timePairError = validateTimePair(startTime, stopTime); | |
| if (timePairError) return timePairError; | |
| return undefined; | |
| }; | |
| const validateRecurringSchedule = (dailySchedule?: IDaySchedule): string | undefined => { | |
| if (!dailySchedule) return 'Daily schedule must be configured when selected.'; | |
| const { startTime, stopTime } = dailySchedule; | |
| if (!startTime && !stopTime) { | |
| return 'Daily schedule must have both start and stop times.'; | |
| } | |
| if (!startTime) return 'Start time is required for daily schedule.'; | |
| if (!stopTime) return 'Stop time is required for daily schedule.'; | |
| if (!isValidTimeFormat(startTime)) return 'Start time must be in HH:MM format (24-hour).'; | |
| if (!isValidTimeFormat(stopTime)) return 'Stop time must be in HH:MM format (24-hour).'; | |
| // Use enhanced validation for time pair | |
| const timePairError = validateTimePair(startTime, stopTime); | |
| if (timePairError) return timePairError; | |
| return undefined; | |
| }; |
| const validateWeeklySchedule = (weeklySchedule?: IWeeklySchedule): { [key: string]: string } => { | ||
| const errors: { [key: string]: string } = {}; | ||
|
|
||
| if (!weeklySchedule) { | ||
| errors.general = 'Daily schedule must be configured when selected.'; | ||
| return errors; | ||
| } | ||
|
|
||
| const daysOfWeek: (keyof IWeeklySchedule)[] = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday']; | ||
| let hasAtLeastOneDay = false; | ||
|
|
||
| daysOfWeek.forEach((day) => { | ||
| const daySchedule = weeklySchedule[day]; | ||
| if (daySchedule && daySchedule.startTime && daySchedule.stopTime) { | ||
| hasAtLeastOneDay = true; | ||
|
|
||
| const { startTime, stopTime } = daySchedule; | ||
|
|
||
| // Validate time formats | ||
| if (startTime && !isValidTimeFormat(startTime)) { | ||
| errors[`${day}_startTime`] = 'Start time must be in HH:MM format (24-hour).'; | ||
| } | ||
|
|
||
| if (stopTime && !isValidTimeFormat(stopTime)) { | ||
| errors[`${day}_stopTime`] = 'Stop time must be in HH:MM format (24-hour).'; | ||
| } | ||
|
|
||
| // Use enhanced validation for time pair with clearer error messages | ||
| if (startTime && stopTime && isValidTimeFormat(startTime) && isValidTimeFormat(stopTime)) { | ||
| const timePairError = validateTimePair(startTime, stopTime); | ||
| if (timePairError) { | ||
| errors[`${day}_times`] = timePairError; | ||
| } | ||
| } | ||
| } else if (daySchedule && (daySchedule.startTime || daySchedule.stopTime)) { | ||
| // Partial schedule validation | ||
| if (daySchedule.startTime && !daySchedule.stopTime) { | ||
| errors[`${day}_stopTime`] = 'Stop time is required when Start time is provided.'; | ||
| } | ||
| if (daySchedule.stopTime && !daySchedule.startTime) { | ||
| errors[`${day}_startTime`] = 'Start time is required when Stop time is provided.'; | ||
| } | ||
| } | ||
| }); | ||
|
|
||
| if (!hasAtLeastOneDay) { | ||
| errors.general = 'At least one day must have a schedule configured.'; | ||
| } | ||
|
|
||
| return errors; | ||
| }; |
The function name and error messages are inconsistent. The function is named validateWeeklySchedule but it validates a daily schedule (different times per day). The error messages reference "Daily schedule" when they should reference "Weekly schedule" or vice versa. This naming mismatch contradicts the PR description which states DAILY is for different times per day.
| const validateWeeklySchedule = (weeklySchedule?: IWeeklySchedule): { [key: string]: string } => { | |
| const errors: { [key: string]: string } = {}; | |
| if (!weeklySchedule) { | |
| errors.general = 'Daily schedule must be configured when selected.'; | |
| return errors; | |
| } | |
| const daysOfWeek: (keyof IWeeklySchedule)[] = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday']; | |
| let hasAtLeastOneDay = false; | |
| daysOfWeek.forEach((day) => { | |
| const daySchedule = weeklySchedule[day]; | |
| if (daySchedule && daySchedule.startTime && daySchedule.stopTime) { | |
| hasAtLeastOneDay = true; | |
| const { startTime, stopTime } = daySchedule; | |
| // Validate time formats | |
| if (startTime && !isValidTimeFormat(startTime)) { | |
| errors[`${day}_startTime`] = 'Start time must be in HH:MM format (24-hour).'; | |
| } | |
| if (stopTime && !isValidTimeFormat(stopTime)) { | |
| errors[`${day}_stopTime`] = 'Stop time must be in HH:MM format (24-hour).'; | |
| } | |
| // Use enhanced validation for time pair with clearer error messages | |
| if (startTime && stopTime && isValidTimeFormat(startTime) && isValidTimeFormat(stopTime)) { | |
| const timePairError = validateTimePair(startTime, stopTime); | |
| if (timePairError) { | |
| errors[`${day}_times`] = timePairError; | |
| } | |
| } | |
| } else if (daySchedule && (daySchedule.startTime || daySchedule.stopTime)) { | |
| // Partial schedule validation | |
| if (daySchedule.startTime && !daySchedule.stopTime) { | |
| errors[`${day}_stopTime`] = 'Stop time is required when Start time is provided.'; | |
| } | |
| if (daySchedule.stopTime && !daySchedule.startTime) { | |
| errors[`${day}_startTime`] = 'Start time is required when Stop time is provided.'; | |
| } | |
| } | |
| }); | |
| if (!hasAtLeastOneDay) { | |
| errors.general = 'At least one day must have a schedule configured.'; | |
| } | |
| return errors; | |
| }; | |
| const validateDailySchedule = (weeklySchedule?: IWeeklySchedule): { [key: string]: string } => { | |
| const errors: { [key: string]: string } = {}; | |
| if (!weeklySchedule) { | |
| errors.general = 'Weekly schedule must be configured when selected.'; | |
| return errors; | |
| } | |
| const daysOfWeek: (keyof IWeeklySchedule)[] = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday']; | |
| let hasAtLeastOneDay = false; | |
| daysOfWeek.forEach((day) => { | |
| const daySchedule = weeklySchedule[day]; | |
| if (daySchedule && daySchedule.startTime && daySchedule.stopTime) { | |
| hasAtLeastOneDay = true; | |
| const { startTime, stopTime } = daySchedule; | |
| // Validate time formats | |
| if (startTime && !isValidTimeFormat(startTime)) { | |
| errors[`${day}_startTime`] = 'Start time must be in HH:MM format (24-hour).'; | |
| } | |
| if (stopTime && !isValidTimeFormat(stopTime)) { | |
| errors[`${day}_stopTime`] = 'Stop time must be in HH:MM format (24-hour).'; | |
| } | |
| // Use enhanced validation for time pair with clearer error messages | |
| if (startTime && stopTime && isValidTimeFormat(startTime) && isValidTimeFormat(stopTime)) { | |
| const timePairError = validateTimePair(startTime, stopTime); | |
| if (timePairError) { | |
| errors[`${day}_times`] = timePairError; | |
| } | |
| } | |
| } else if (daySchedule && (daySchedule.startTime || daySchedule.stopTime)) { | |
| // Partial schedule validation | |
| if (daySchedule.startTime && !daySchedule.stopTime) { | |
| errors[`${day}_stopTime`] = 'Stop time is required when Start time is provided.'; | |
| } | |
| if (daySchedule.stopTime && !daySchedule.startTime) { | |
| errors[`${day}_startTime`] = 'Start time is required when Stop time is provided.'; | |
| } | |
| } | |
| }); | |
| if (!hasAtLeastOneDay) { | |
| errors.general = 'At least one day must have a schedule configured.'; | |
| } | |
| return errors; | |
| }; |
| const showDailySchedule = showScheduleOptions && scheduleType === ScheduleType.RECURRING; | ||
| const showWeeklySchedule = showScheduleOptions && scheduleType === ScheduleType.DAILY; |
The logic for determining which schedule UI to show is reversed. Line 179 shows daily schedule when scheduleType === ScheduleType.RECURRING, but based on the PR description, RECURRING means same time daily (which should show the daily schedule UI). Line 180 shows weekly schedule when scheduleType === ScheduleType.DAILY, but DAILY means different times per day (which should show the weekly schedule UI). These assignments are backwards.
| const showDailySchedule = showScheduleOptions && scheduleType === ScheduleType.RECURRING; | |
| const showWeeklySchedule = showScheduleOptions && scheduleType === ScheduleType.DAILY; | |
| const showDailySchedule = showScheduleOptions && scheduleType === ScheduleType.DAILY; | |
| const showWeeklySchedule = showScheduleOptions && scheduleType === ScheduleType.RECURRING; |
| if (scheduleType === ScheduleType.RECURRING) { | ||
| const dailyError = validateDailySchedule(props.item.dailySchedule); | ||
| if (dailyError) { | ||
| errors.dailySchedule = dailyError; | ||
| } | ||
| } else if (scheduleType === ScheduleType.DAILY) { | ||
| const weeklyErrors = validateWeeklySchedule(props.item.weeklySchedule); | ||
| Object.assign(errors, weeklyErrors); | ||
| } |
The validation logic is calling the wrong validation functions for each schedule type. Line 199 calls validateDailySchedule when scheduleType === ScheduleType.RECURRING, but based on the corrected function names, it should call validateRecurringSchedule. Line 204 calls validateWeeklySchedule when scheduleType === ScheduleType.DAILY, but it should call validateDailySchedule.
| if (scheduleType === ScheduleType.RECURRING) { | |
| const dailyError = validateDailySchedule(props.item.dailySchedule); | |
| if (dailyError) { | |
| errors.dailySchedule = dailyError; | |
| } | |
| } else if (scheduleType === ScheduleType.DAILY) { | |
| const weeklyErrors = validateWeeklySchedule(props.item.weeklySchedule); | |
| Object.assign(errors, weeklyErrors); | |
| } | |
| if (scheduleType === ScheduleType.RECURRING) { | |
| const recurringError = validateRecurringSchedule(props.item.dailySchedule); | |
| if (recurringError) { | |
| errors.dailySchedule = recurringError; | |
| } | |
| } else if (scheduleType === ScheduleType.DAILY) { | |
| const dailyErrors = validateDailySchedule(props.item.weeklySchedule); | |
| Object.assign(errors, dailyErrors); | |
| } |
…anagement-ui-change
Files selected (11)
- ecs_model_deployer/src/lib/ecs-model.ts (2)
- lib/core/apiBaseConstruct.ts (2)
- lib/mcp/mcp-server-api.ts (2)
- lib/models/model-api.ts (8)
- lib/rag/ragConstruct.ts (2)
- lib/schema/constants.ts (1)
- lib/schema/index.ts (1)
- lib/serve/serveApplicationConstruct.ts (2)
- test/lambda/test_schedule_handlers.py (14)
- test/lambda/test_schedule_management.py (14)
- test/lambda/test_state_machine_schedule_handlers.py (12)
Files skipped from review due to trivial changes (1)
- lib/schema/constants.ts
Review comments generated (33)
- Review: 33
- LGTM: 0
Tips
Chat with AI reviewer (/reviewbot)
- Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
- Invite the bot into a review comment chain by tagging
/reviewbot in a reply.
Code suggestions
- The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
- You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.
Pausing incremental reviews
- Add
/reviewbot: ignore anywhere in the PR description to pause further reviews from the bot.
| new Rule(this, 'AutoScalingEventsRule', { | ||
| eventPattern: { | ||
| source: ['aws.autoscaling'], | ||
| detailType: [ | ||
| 'EC2 Instance Launch Successful', | ||
| 'EC2 Instance Terminate Successful' | ||
| ] | ||
| }, | ||
| targets: [new LambdaFunction(scheduleMonitoringLambda)], | ||
| description: 'Triggers ScheduleMonitoring Lambda when Auto Scaling Group instances launch or terminate successfully', | ||
| }); |
The CloudWatch Events rule is created but never stored or exported. If other constructs need to reference this rule (e.g., for adding additional targets or modifying it), consider storing it as a class property:
| new Rule(this, 'AutoScalingEventsRule', { | |
| eventPattern: { | |
| source: ['aws.autoscaling'], | |
| detailType: [ | |
| 'EC2 Instance Launch Successful', | |
| 'EC2 Instance Terminate Successful' | |
| ] | |
| }, | |
| targets: [new LambdaFunction(scheduleMonitoringLambda)], | |
| description: 'Triggers ScheduleMonitoring Lambda when Auto Scaling Group instances launch or terminate successfully', | |
| }); | |
| this.autoScalingEventsRule = new Rule(this, 'AutoScalingEventsRule', { | |
| eventPattern: { | |
| source: ['aws.autoscaling'], | |
| detailType: [ | |
| 'EC2 Instance Launch Successful', | |
| 'EC2 Instance Terminate Successful' | |
| ] | |
| }, | |
| targets: [new LambdaFunction(scheduleMonitoringLambda)], | |
| description: 'Triggers ScheduleMonitoring Lambda when Auto Scaling Group instances launch or terminate successfully', | |
| }); |
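For context on what the monitoring Lambda receives from this rule, here is a sketch of classifying the two targeted event types into model-status updates. The event shape follows the documented aws.autoscaling EventBridge events; the status strings and the classify_asg_event helper are illustrative placeholders, not the PR's actual handler.

```python
from typing import Optional


def classify_asg_event(event: dict) -> Optional[dict]:
    """Map an aws.autoscaling EventBridge event onto the status update the
    ScheduleMonitoring Lambda would perform (status strings are placeholders)."""
    detail_type = event.get("detail-type", "")
    asg_name = event.get("detail", {}).get("AutoScalingGroupName")
    if not asg_name:
        return None

    if detail_type == "EC2 Instance Launch Successful":
        return {"asg": asg_name, "status": "InService",
                "reason": "Instance launched by scheduled scale-up"}
    if detail_type == "EC2 Instance Terminate Successful":
        return {"asg": asg_name, "status": "Stopped",
                "reason": "Instance terminated by scheduled scale-down"}
    return None


sample = {
    "source": "aws.autoscaling",
    "detail-type": "EC2 Instance Launch Successful",
    "detail": {"AutoScalingGroupName": "my-model-asg", "StatusCode": "InProgress"},
}
print(classify_asg_event(sample))
```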
| const scheduleManagementLambda = new Function(this, 'ScheduleManagement', { | ||
| runtime: getDefaultRuntime(), | ||
| handler: 'models.scheduling.schedule_management.lambda_handler', | ||
| code: Code.fromAsset(lambdaPath), | ||
| layers: lambdaLayers, | ||
| environment: { | ||
| MODEL_TABLE_NAME: modelTable.tableName, | ||
| }, | ||
| role: stateMachinesLambdaRole, | ||
| vpc: vpc.vpc, | ||
| securityGroups: securityGroups, | ||
| timeout: Duration.minutes(5), | ||
| description: 'Manages Auto Scaling scheduled actions for LISA model scheduling', | ||
| }); | ||
|
|
||
| const scheduleMonitoringLambda = new Function(this, 'ScheduleMonitoring', { | ||
| runtime: getDefaultRuntime(), | ||
| handler: 'models.scheduling.schedule_monitoring.lambda_handler', | ||
| code: Code.fromAsset(lambdaPath), | ||
| layers: lambdaLayers, | ||
| environment: { | ||
| MODEL_TABLE_NAME: modelTable.tableName, | ||
| ECS_CLUSTER_NAME: `${config.deploymentPrefix}-ECS-Cluster`, | ||
| LISA_API_URL_PS_NAME: lisaServeEndpointUrlPs.parameterName, | ||
| MANAGEMENT_KEY_NAME: managementKeyName, | ||
| REST_API_VERSION: 'v2', | ||
| }, | ||
| role: stateMachinesLambdaRole, | ||
| vpc: vpc.vpc, | ||
| securityGroups: securityGroups, | ||
| timeout: Duration.minutes(5), | ||
| description: 'Processes Auto Scaling Group CloudWatch events to update model status', | ||
| }); |
Both scheduleManagementLambda and scheduleMonitoringLambda are created but only scheduleManagementLambda is passed to the state machine. Consider storing both as class properties to enable external access and potential future integrations:
| const scheduleManagementLambda = new Function(this, 'ScheduleManagement', { | |
| runtime: getDefaultRuntime(), | |
| handler: 'models.scheduling.schedule_management.lambda_handler', | |
| code: Code.fromAsset(lambdaPath), | |
| layers: lambdaLayers, | |
| environment: { | |
| MODEL_TABLE_NAME: modelTable.tableName, | |
| }, | |
| role: stateMachinesLambdaRole, | |
| vpc: vpc.vpc, | |
| securityGroups: securityGroups, | |
| timeout: Duration.minutes(5), | |
| description: 'Manages Auto Scaling scheduled actions for LISA model scheduling', | |
| }); | |
| const scheduleMonitoringLambda = new Function(this, 'ScheduleMonitoring', { | |
| runtime: getDefaultRuntime(), | |
| handler: 'models.scheduling.schedule_monitoring.lambda_handler', | |
| code: Code.fromAsset(lambdaPath), | |
| layers: lambdaLayers, | |
| environment: { | |
| MODEL_TABLE_NAME: modelTable.tableName, | |
| ECS_CLUSTER_NAME: `${config.deploymentPrefix}-ECS-Cluster`, | |
| LISA_API_URL_PS_NAME: lisaServeEndpointUrlPs.parameterName, | |
| MANAGEMENT_KEY_NAME: managementKeyName, | |
| REST_API_VERSION: 'v2', | |
| }, | |
| role: stateMachinesLambdaRole, | |
| vpc: vpc.vpc, | |
| securityGroups: securityGroups, | |
| timeout: Duration.minutes(5), | |
| description: 'Processes Auto Scaling Group CloudWatch events to update model status', | |
| }); | |
| this.scheduleManagementLambda = new Function(this, 'ScheduleManagement', { | |
| runtime: getDefaultRuntime(), | |
| handler: 'models.scheduling.schedule_management.lambda_handler', | |
| code: Code.fromAsset(lambdaPath), | |
| layers: lambdaLayers, | |
| environment: { | |
| MODEL_TABLE_NAME: modelTable.tableName, | |
| }, | |
| role: stateMachinesLambdaRole, | |
| vpc: vpc.vpc, | |
| securityGroups: securityGroups, | |
| timeout: Duration.minutes(5), | |
| description: 'Manages Auto Scaling scheduled actions for LISA model scheduling', | |
| }); | |
| this.scheduleMonitoringLambda = new Function(this, 'ScheduleMonitoring', { | |
| runtime: getDefaultRuntime(), | |
| handler: 'models.scheduling.schedule_monitoring.lambda_handler', | |
| code: Code.fromAsset(lambdaPath), | |
| layers: lambdaLayers, | |
| environment: { | |
| MODEL_TABLE_NAME: modelTable.tableName, | |
| ECS_CLUSTER_NAME: `${config.deploymentPrefix}-ECS-Cluster`, | |
| LISA_API_URL_PS_NAME: lisaServeEndpointUrlPs.parameterName, | |
| MANAGEMENT_KEY_NAME: managementKeyName, | |
| REST_API_VERSION: 'v2', | |
| }, | |
| role: stateMachinesLambdaRole, | |
| vpc: vpc.vpc, | |
| securityGroups: securityGroups, | |
| timeout: Duration.minutes(5), | |
| description: 'Processes Auto Scaling Group CloudWatch events to update model status', | |
| }); |
| @patch("models.scheduling.schedule_management.calculate_next_scheduled_action") | ||
| def test_update_operation_success( | ||
| self, mock_calculate_next, mock_autoscaling_client, mock_model_table, lambda_context, sample_schedule_config | ||
| self, mock_autoscaling_client, mock_model_table, lambda_context, sample_schedule_config |
The @patch decorator for calculate_next_scheduled_action was removed from the method signature. Verify that the test logic no longer depends on this mock. If the function is still being called internally, this could cause the test to fail or behave unexpectedly.
| # Mock successful Auto Scaling operations | ||
| mock_autoscaling_client.put_scheduled_update_group_action.return_value = {} | ||
| mock_autoscaling_client.describe_scheduled_actions.return_value = {"ScheduledUpdateGroupActions": []} | ||
| mock_autoscaling_client.describe_auto_scaling_groups.return_value = { | ||
| "AutoScalingGroups": [{"MinSize": 1, "MaxSize": 10, "DesiredCapacity": 3}] | ||
| } |
The mock setup was simplified by removing mock_calculate_next and describe_scheduled_actions mocks. Ensure that the actual implementation of lambda_handler doesn't call describe_scheduled_actions or that it's mocked elsewhere. If it does, this test may not properly validate the behavior.
| class TestHelperFunctions: | ||
| """Test helper functions.""" | ||
|
|
||
| result = calculate_next_scheduled_action(schedule_config) | ||
| def test_time_to_cron(self): | ||
| """Test time_to_cron function.""" | ||
| from models.scheduling.schedule_management import time_to_cron | ||
|
|
||
| # Just verify it returns a result | ||
| assert result is not None | ||
| assert result["action"] in ["START", "STOP"] | ||
|
|
||
| def test_timezone_conversion(self): | ||
| """Test timezone conversion in schedule calculation.""" | ||
| from models.scheduling.schedule_management import calculate_next_scheduled_action | ||
|
|
||
| schedule_config = { | ||
| "scheduleType": "RECURRING_DAILY", | ||
| "timezone": "America/New_York", | ||
| "dailySchedule": {"startTime": "09:00", "stopTime": "17:00"}, | ||
| } | ||
|
|
||
| result = calculate_next_scheduled_action(schedule_config) | ||
|
|
||
| # Just verify it returns a result | ||
| assert result is not None | ||
| assert result["action"] in ["START", "STOP"] | ||
| result = time_to_cron("09:30") | ||
| assert result == "30 9 * * *" | ||
|
|
||
| def test_no_schedule_type(self): | ||
| """Test None schedule type.""" | ||
| from models.scheduling.schedule_management import calculate_next_scheduled_action | ||
| result = time_to_cron("23:45") | ||
| assert result == "45 23 * * *" | ||
|
|
||
| schedule_config = {"scheduleType": None, "timezone": "UTC"} | ||
| def test_time_to_cron_with_day(self): | ||
| """Test time_to_cron_with_day function.""" | ||
| from models.scheduling.schedule_management import time_to_cron_with_day | ||
|
|
||
| result = calculate_next_scheduled_action(schedule_config) | ||
| # Monday is 1 | ||
| result = time_to_cron_with_day("09:30", 1) | ||
| assert result == "30 9 * * 1" | ||
|
|
||
| assert result is None | ||
| # Sunday is 0 | ||
| result = time_to_cron_with_day("14:15", 0) | ||
| assert result == "15 14 * * 0" |
The entire TestCalculateNextScheduledAction class was replaced with TestHelperFunctions containing simpler utility function tests. According to the PR changes, complex scheduling calculation tests were removed. However, verify that the removed tests for calculate_next_scheduled_action with different schedule types (RECURRING_DAILY, EACH_DAY, timezone conversion) are not critical for validating the scheduling logic, or ensure equivalent coverage exists elsewhere.
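If equivalent coverage is wanted, a small regression test along these lines could be added back. The module path and config keys follow the diffs above; the assertions are only a sketch of the removed behavior checks, not a verified test against the new implementation.

```python
import pytest


@pytest.mark.parametrize("timezone_name", ["UTC", "America/New_York"])
def test_next_action_for_recurring_schedule(timezone_name):
    """Illustrative: a RECURRING config should yield a START or STOP action."""
    from models.scheduling.schedule_management import calculate_next_scheduled_action

    schedule_config = {
        "scheduleType": "RECURRING",
        "timezone": timezone_name,
        "dailySchedule": {"startTime": "09:00", "stopTime": "17:00"},
    }

    result = calculate_next_scheduled_action(schedule_config)

    assert result is not None
    assert result["action"] in ("START", "STOP")


def test_next_action_handles_missing_type():
    """Illustrative: mirrors the removed assertion that a None type yields None."""
    from models.scheduling.schedule_management import calculate_next_scheduled_action

    assert calculate_next_scheduled_action({"scheduleType": None, "timezone": "UTC"}) is None
```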
| @patch("models.handler.schedule_handlers.schedule_management.delete_schedule") | ||
| def test_delete_schedule_lambda_error_handling( | ||
| self, | ||
| mock_delete_schedule, | ||
| mock_model_table, | ||
| mock_guardrails_table, | ||
| mock_autoscaling_client, | ||
| mock_stepfunctions_client, | ||
| sample_model_item, | ||
| ): | ||
| """Test DeleteSchedule Lambda error handling.""" | ||
| """Test DeleteSchedule schedule management error handling.""" | ||
| mock_model_table.get_item.return_value = {"Item": sample_model_item} | ||
|
|
||
| # Setup mock lambda error response | ||
| mock_lambda_response = {"StatusCode": 500, "Payload": MagicMock()} | ||
| mock_lambda_response["Payload"].read.return_value = json.dumps( | ||
| {"statusCode": 500, "body": {"message": "Lambda execution failed"}} | ||
| ).encode() | ||
| # Setup mock schedule management error response | ||
| mock_delete_schedule.return_value = {"statusCode": 500, "body": {"message": "Schedule management failed"}} | ||
|
|
||
| with patch("boto3.client") as mock_boto3: | ||
| mock_boto3.return_value.invoke.return_value = mock_lambda_response | ||
|
|
||
| handler = DeleteScheduleHandler( | ||
| autoscaling_client=mock_autoscaling_client, | ||
| stepfunctions_client=mock_stepfunctions_client, | ||
| model_table_resource=mock_model_table, | ||
| guardrails_table_resource=mock_guardrails_table, | ||
| ) | ||
| handler = DeleteScheduleHandler( | ||
| autoscaling_client=mock_autoscaling_client, | ||
| stepfunctions_client=mock_stepfunctions_client, | ||
| model_table_resource=mock_model_table, | ||
| guardrails_table_resource=mock_guardrails_table, | ||
| ) | ||
|
|
||
| with pytest.raises(ValueError, match="Failed to delete schedule: Lambda execution failed"): | ||
| handler(model_id="test-model") | ||
| with pytest.raises(ValueError, match="Failed to delete schedule: Schedule management failed"): | ||
| handler(model_id="test-model") |
The test mocks schedule_management.delete_schedule to return an error response. The expected error message is "Failed to delete schedule: Schedule management failed". Verify the handler code properly extracts the message from the response body.
| "scheduleType": "RECURRING", | ||
| "timezone": "UTC", | ||
| "dailySchedule": {"startTime": "09:00", "stopTime": "17:00"}, | ||
| } |
The schedule type constant has been updated from RECURRING_DAILY to RECURRING which aligns with the PR changes. However, verify that all test data throughout this file uses the updated constant name consistently. Consider checking if there are other test methods that might still reference the old constant.
| ).encode() | ||
| mock_lambda_client.invoke.return_value = mock_lambda_response | ||
| # Mock schedule management error response | ||
| mock_update_schedule.return_value = {"statusCode": 500, "body": {"message": "Schedule management failed"}} |
The mock response structure has changed from a nested JSON string to a direct dict for the body. Verify that the actual schedule_management.update_schedule() function returns {"statusCode": 500, "body": {...}} with body as a dict (not JSON string). If the actual implementation returns body as a JSON string, this mock will not match the real behavior.
| # Verify result (should pass through unchanged, no schedule management call) | ||
| assert result["modelId"] == "test-model" | ||
| mock_lambda_client.invoke.assert_not_called() | ||
| assert "scheduled_action_arns" not in result |
The comment on line 118 states "should pass through unchanged, no schedule management call" but there's no assertion verifying that mock_update_schedule was not called. Consider adding mock_update_schedule.assert_not_called() to verify the handler correctly skips schedule management when ASG is missing.
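A sketch of the suggested assertion in context is below. The patch target and the lambda_context fixture match the surrounding tests, while the handler name handle_update_schedule and the event shape are placeholders for whichever handler this test actually exercises.

```python
from unittest.mock import patch


@patch("models.state_machine.schedule_handlers.schedule_management.update_schedule")
def test_missing_asg_skips_schedule_management(mock_update_schedule, lambda_context):
    """Sketch only: verify the pass-through case never touches schedule management."""
    from models.state_machine.schedule_handlers import handle_update_schedule  # placeholder name

    event = {"modelId": "test-model"}  # no auto scaling group in the payload

    result = handle_update_schedule(event, lambda_context)

    assert result["modelId"] == "test-model"
    mock_update_schedule.assert_not_called()
```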
| @patch("models.state_machine.schedule_handlers.schedule_management.delete_schedule") | ||
| def test_handle_cleanup_schedule_success(self, mock_delete_schedule, lambda_context): | ||
| """Test successful schedule cleanup.""" | ||
| from models.state_machine.schedule_handlers import handle_cleanup_schedule | ||
|
|
||
| # Mock event | ||
| event = {"modelId": "test-model"} | ||
|
|
||
| # Mock successful lambda response | ||
| mock_lambda_response = {"StatusCode": 200, "Payload": MagicMock()} | ||
| mock_lambda_response["Payload"].read.return_value = json.dumps( | ||
| {"statusCode": 200, "body": json.dumps({"message": "Schedule deleted successfully"})} | ||
| ).encode() | ||
| mock_lambda_client.invoke.return_value = mock_lambda_response | ||
| # Mock successful schedule management response | ||
| mock_delete_schedule.return_value = { | ||
| "statusCode": 200, | ||
| "body": json.dumps({"message": "Schedule deleted successfully"}), | ||
| } | ||
|
|
||
| # Execute | ||
| result = handle_cleanup_schedule(event, lambda_context) | ||
|
|
||
| # Verify result | ||
| assert result["modelId"] == "test-model" | ||
|
|
||
| # Verify lambda invocation | ||
| mock_lambda_client.invoke.assert_called_once() | ||
| # Verify schedule management was called | ||
| mock_delete_schedule.assert_called_once() |
The test mocks schedule_management.delete_schedule but the old code mocked lambda_client. Ensure that the actual handle_cleanup_schedule implementation now calls schedule_management.delete_schedule() directly instead of invoking a Lambda function. The mock response structure (body as JSON string) should also be verified against the actual implementation.
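For clarity, this is roughly the handler shape the updated test assumes: a direct in-process call to schedule_management.delete_schedule instead of a boto3 Lambda invoke. The delete_schedule signature and the body parsing are assumptions; only the "Failed to delete schedule:" error prefix is taken from the tests above.

```python
import json
import logging

from models.scheduling import schedule_management

logger = logging.getLogger(__name__)


def handle_cleanup_schedule(event: dict, context) -> dict:
    """Illustrative shape of the handler the test above mocks."""
    model_id = event["modelId"]

    response = schedule_management.delete_schedule(model_id)  # assumed signature

    if response.get("statusCode") != 200:
        body = response.get("body")
        # Accept either a dict body or a JSON-encoded string body.
        parsed = json.loads(body) if isinstance(body, str) else (body or {})
        raise ValueError(f"Failed to delete schedule: {parsed.get('message', 'unknown error')}")

    logger.info("Deleted scheduled actions for %s", model_id)
    return event
```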
Attached demo recordings: Screen.Recording.2025-11-24.at.10.25.21.mov, update.demo.mov, ASG.auto.schedule.mov
Features
Changes
Frontend
- ScheduleConfig.tsx: Scheduling UI with validation
- AutoScalingConfig.tsx: Toggle between 24/7 and scheduled operation
- ModelManagementUtils.tsx: Schedule display in model cards
Backend
- schedule_management.py: Core scheduling logic with ASG integration
- schedule_monitoring.py: CloudWatch Events processing and LiteLLM lifecycle
- schedule_handlers.py: State machine integration
- domain_objects.py: Enhanced data models
Technical Details
LiteLLM Integration
Validation
UI Changes
Screenshots
Auto Scaling Configuration
Model Management Cards
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Summary (generated)
Summary of Changes
This pull request implements comprehensive schedule management features for model auto-scaling across the entire application stack. Below is an overview of the major changes:
Backend Implementation
Schedule Management Core
- EACH_DAY/RECURRING_DAILY renamed to DAILY/RECURRING
- autoScalingGroup → auto_scaling_group, status → model_status
New Functions and Modules
- register_litellm() and remove_litellm() for model lifecycle management
- schedule_handlers.py module implementing core scheduling logic and CloudWatch Events monitoring
- scale_immediately() function to enforce schedules upon model creation
- merge_schedule_data() to preserve existing schedule metadata during updates
- adjust_initial_capacity_for_schedule() integrated into handle_create_model_stack()
Data Structure Changes
- Scheduling configuration stored under model_config.autoScalingConfig.scheduling
- JobStatus Pydantic model for job status tracking
Validation and Error Handling
Infrastructure Changes
CDK Stack Updates
- ScheduleManagementLambda for handling scheduled ASG actions
- ScheduleMonitoringLambda for processing CloudWatch events
- PutScheduledUpdateGroupAction, DeleteScheduledAction, DescribeScheduledActions
Constants Refactoring
- APP_MANAGEMENT_KEY constant across infrastructure files for improved maintainability
Frontend UI Implementation
New Components
- ScheduleConfig.tsx component providing React form interface for schedule management
Features
- formatScheduleType() and formatScheduleDetails()
Type Definitions
- ScheduleType enum and schedule-related TypeScript types
- IAutoScalingConfig
- useUpdateScheduleMutation hook for schedule-only updates
Testing
All changes maintain backward compatibility through optional scheduling fields.