@nishi7409 nishi7409 commented Nov 21, 2025

Features

  • Daily scheduling (different times per day)
  • Recurring scheduling (same time daily)
  • Timezone support
  • Auto Scaling Group integration
  • LiteLLM automatic registration/deregistration
  • CloudWatch Events monitoring

Changes

Frontend

  • ScheduleConfig.tsx: Scheduling UI with validation
  • AutoScalingConfig.tsx: Toggle between 24/7 and scheduled operation
  • ModelManagementUtils.tsx: Schedule display in model cards
  • Added comprehensive test coverage

Backend

  • schedule_management.py: Core scheduling logic with ASG integration
  • schedule_monitoring.py: CloudWatch Events processing and LiteLLM lifecycle
  • schedule_handlers.py: State machine integration
  • domain_objects.py: Enhanced data models

Technical Details

LiteLLM Integration

  • Models register with LiteLLM when scaling up
  • Models deregister when scaling down
  • Retry logic with exponential backoff
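The retry behavior described above can be sketched as a small generic helper. This is a minimal illustration with hypothetical names (`call_with_backoff`, the delay constants), not the PR's actual retry code:

```python
import random
import time

def call_with_backoff(fn, max_attempts=4, base_delay=0.5):
    """Call fn(), retrying failures with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the last error
            # Delay doubles each attempt (0.5s, 1s, 2s, ...) plus small jitter
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

The jitter keeps many models that scale up simultaneously from hammering LiteLLM in lockstep.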

Validation

  • 24-hour time format validation
  • 2-hour minimum schedule windows
  • Timezone validation
  • At least one day required for daily schedules
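These rules can be sketched as a standalone validator. The helper name below is hypothetical; the PR's real checks live in the backend modules and the frontend Zod schemas:

```python
import re
from datetime import datetime
from zoneinfo import ZoneInfo

TIME_24H = re.compile(r"^([01]\d|2[0-3]):[0-5]\d$")  # HH:MM, 00:00-23:59

def validate_schedule_window(start: str, stop: str, tz_name: str) -> None:
    """Raise ValueError if the window violates the rules listed above."""
    for value in (start, stop):
        if not TIME_24H.match(value):
            raise ValueError(f"Not a 24-hour HH:MM time: {value!r}")
    try:
        ZoneInfo(tz_name)  # raises for unknown IANA names
    except Exception:
        raise ValueError(f"Unknown timezone: {tz_name!r}")
    delta = datetime.strptime(stop, "%H:%M") - datetime.strptime(start, "%H:%M")
    minutes = delta.seconds // 60  # wraps correctly when stop is past midnight
    if minutes < 120:
        raise ValueError("Schedule window must be at least 2 hours")
```

Note the `delta.seconds` trick: for a window like 23:00-01:30 the subtraction yields a negative day plus a positive seconds remainder, so the 2-hour check still works across midnight.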

UI Changes

  • Simple Auto Scaling toggle
  • Progressive disclosure of schedule options
  • Clean timezone and time selection
  • Improved model card display format

Screenshots

Auto Scaling Configuration


Model Management Cards


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Summary (generated)

Summary of Changes

This pull request implements comprehensive schedule management features for model auto-scaling across the entire application stack. Below is an overview of the major changes:

Backend Implementation

Schedule Management Core

  • Introduced daily and recurring scheduling with timezone support
  • Auto Scaling Group (ASG) integration for capacity management
  • LiteLLM automatic registration and deregistration lifecycle management
  • Enum simplification: EACH_DAY/RECURRING_DAILY renamed to DAILY/RECURRING
  • DynamoDB field updates: autoScalingGroup → auto_scaling_group, status → model_status

New Functions and Modules

  • register_litellm() and remove_litellm() for model lifecycle management
  • schedule_handlers.py module implementing core scheduling logic and CloudWatch Events monitoring
  • scale_immediately() function to enforce schedules upon model creation
  • merge_schedule_data() to preserve existing schedule metadata during updates
  • adjust_initial_capacity_for_schedule() integrated into handle_create_model_stack()

Data Structure Changes

  • Schedule data now stored under model_config.autoScalingConfig.scheduling
  • New JobStatus Pydantic model for job status tracking
  • Removed default fallback values from Lambda function name environment variable lookups
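The fail-fast lookup pattern behind that last item can be illustrated with a tiny helper; `required_env` and the variable name in the usage example are hypothetical, not the PR's identifiers:

```python
import os

def required_env(name: str) -> str:
    """Look up a required environment variable, failing fast when unset."""
    try:
        return os.environ[name]
    except KeyError:
        # Better to crash at startup than silently invoke a Lambda whose
        # name fell back to a default that may not exist.
        raise RuntimeError(f"Required environment variable {name} is not set")

# Hypothetical usage: fn_name = required_env("SCHEDULE_MANAGEMENT_FN")
```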

Validation and Error Handling

  • 24-hour format validation for time inputs
  • 2-hour minimum scheduling window enforcement
  • Timezone verification
  • Enhanced error handling and logging throughout

Infrastructure Changes

CDK Stack Updates

  • New ScheduleManagementLambda for handling scheduled ASG actions
  • New ScheduleMonitoringLambda for processing CloudWatch events
  • CloudWatch Events rule for ASG instance launch/terminate events
  • Expanded IAM permissions for Auto Scaling operations: PutScheduledUpdateGroupAction, DeleteScheduledAction, DescribeScheduledActions
  • State machine workflow modified to include schedule creation step after stack creation

Constants Refactoring

  • Replaced hardcoded strings with APP_MANAGEMENT_KEY constant across infrastructure files for improved maintainability

Frontend UI Implementation

New Components

  • ScheduleConfig.tsx component providing React form interface for schedule management
  • Support for daily and recurring scheduling with timezone selection
  • Progressive disclosure of schedule options based on user selections

Features

  • Timezone selection from IANA timezone database with fallback support
  • Per-day schedule configuration
  • Comprehensive validation for time formats and scheduling windows
  • New card definition sections for displaying schedule information
  • Utility functions: formatScheduleType() and formatScheduleDetails()

Type Definitions

  • New ScheduleType enum and schedule-related TypeScript types
  • Zod validation schemas for time format (HH:MM), minimum windows, and timezone requirements
  • Integration of scheduling configuration into IAutoScalingConfig

API Integration

  • New useUpdateScheduleMutation hook for schedule-only updates
  • Dedicated schedule API endpoint routing
  • Improved error handling and loading state management

Testing

  • Comprehensive test coverage for new scheduling functionality
  • Tests refactored to use schedule management functions directly
  • Edge case and error scenario coverage
  • ASG integration and LiteLLM lifecycle management tests
  • Exception handling and retry logic validation

All changes maintain backward compatibility through optional scheduling fields.


logger.error(f"Failed to check ASG state: {e}")
raise ValueError(f"Failed to check ASG {asg_name}: {str(e)}")

# Update model status


Inconsistent indentation detected. This line has extra leading whitespace:

Suggested change

```diff
-        # Update model status
+    # Update model status
```

```diff
 try:
     response = model_table.scan(
-        FilterExpression="autoScalingGroup = :asg_name",
+        FilterExpression="auto_scaling_group = :asg_name",
```


The field name was changed from autoScalingGroup to auto_scaling_group. Verify that this matches the actual DynamoDB schema. If the schema uses snake_case, this is correct, but if it uses camelCase, this will cause the scan to fail silently and return no results.

Comment on lines 321 to 335
```diff
 def update_model_status(model_id: str, new_status: ModelStatus, reason: str) -> None:
     """Update model status in DynamoDB"""
     try:
         # Convert enum to string value for DynamoDB
         status_str = new_status.value if hasattr(new_status, "value") else str(new_status)

         model_table.update_item(
             Key={"model_id": model_id},
-            UpdateExpression="SET #status = :status, lastStatusUpdate = :timestamp, statusReason = :reason",
-            ExpressionAttributeNames={"#status": "status"},
+            UpdateExpression="SET model_status = :status, lastStatusUpdate = :timestamp, statusReason = :reason",
             ExpressionAttributeValues={
-                ":status": new_status,
+                ":status": status_str,
                 ":timestamp": datetime.now(dt_timezone.utc).isoformat(),
                 ":reason": reason,
             },
         )
```


Inconsistent indentation detected throughout this function. Lines 321-335 have extra leading whitespace:

Suggested change

```python
def update_model_status(model_id: str, new_status: ModelStatus, reason: str) -> None:
    """Update model status in DynamoDB"""
    try:
        # Convert enum to string value for DynamoDB
        status_str = new_status.value if hasattr(new_status, "value") else str(new_status)
        model_table.update_item(
            Key={"model_id": model_id},
            UpdateExpression="SET model_status = :status, lastStatusUpdate = :timestamp, statusReason = :reason",
            ExpressionAttributeValues={
                ":status": status_str,
                ":timestamp": datetime.now(dt_timezone.utc).isoformat(),
                ":reason": reason,
            },
        )
        logger.info(f"Updated model {model_id} model_status to {status_str}: {reason}")
```

```diff
             Key={"model_id": model_id},
-            UpdateExpression="SET #status = :status, lastStatusUpdate = :timestamp, statusReason = :reason",
-            ExpressionAttributeNames={"#status": "status"},
+            UpdateExpression="SET model_status = :status, lastStatusUpdate = :timestamp, statusReason = :reason",
```


The DynamoDB field name was changed from status to model_status. Verify that this matches the actual DynamoDB schema. The old code used ExpressionAttributeNames to handle the reserved keyword status, but the new field name model_status is not a reserved word. Ensure this field exists in your DynamoDB table schema.

Comment on lines +436 to +550
```python
def register_litellm(model_id: str) -> None:
    """Register model with LiteLLM if missing"""
    try:
        model_key = {"model_id": model_id}
        ddb_item = model_table.get_item(Key=model_key, ConsistentRead=True).get("Item")

        if not ddb_item:
            logger.warning(f"Model {model_id} not found in DynamoDB")
            return

        # Check if already registered
        existing_litellm_id = ddb_item.get("litellm_id")
        if existing_litellm_id:
            logger.info(f"Model {model_id} already registered with LiteLLM")
            return

        model_url = ddb_item.get("model_url")
        if not model_url:
            logger.warning(f"Model {model_id} has no model_url")
            return

        # Initialize LiteLLM client
        secrets_manager = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config)
        iam_client = boto3.client("iam", region_name=os.environ["AWS_REGION"], config=retry_config)

        litellm_client = LiteLLMClient(
            base_uri=get_rest_api_container_endpoint(),
            verify=get_cert_path(iam_client),
            headers={
                "Authorization": secrets_manager.get_secret_value(
                    SecretId=os.environ.get("MANAGEMENT_KEY_NAME"), VersionStage="AWSCURRENT"
                )["SecretString"],
                "Content-Type": "application/json",
            },
        )

        litellm_config_str = os.environ.get("LITELLM_CONFIG_OBJ", json.dumps({}))
        try:
            litellm_params = json.loads(litellm_config_str)
            litellm_params = litellm_params.get("litellm_settings", {})
        except json.JSONDecodeError:
            litellm_params = {}

        model_name = ddb_item["model_config"]["modelName"]
        litellm_params["model"] = f"openai/{model_name}"
        litellm_params["api_base"] = model_url

        # Register with LiteLLM
        litellm_response = litellm_client.add_model(
            model_name=model_id,
            litellm_params=litellm_params,
        )

        litellm_id = litellm_response["model_info"]["id"]
        logger.info(f"Registered model {model_id} with LiteLLM: {litellm_id}")

        # Update DynamoDB with new litellm_id
        model_table.update_item(
            Key=model_key,
            UpdateExpression="SET litellm_id = :lid",
            ExpressionAttributeValues={":lid": litellm_id},
        )

    except Exception as e:
        logger.error(f"Failed to register {model_id} with LiteLLM: {e}", exc_info=True)


def remove_litellm(model_id: str) -> None:
    """Remove model from LiteLLM if registered"""
    try:
        model_key = {"model_id": model_id}
        ddb_item = model_table.get_item(Key=model_key, ConsistentRead=True).get("Item")

        if not ddb_item:
            logger.warning(f"Model {model_id} not found in DynamoDB")
            return

        litellm_id = ddb_item.get("litellm_id")
        if not litellm_id:
            logger.info(f"Model {model_id} has no LiteLLM registration to remove")
            return

        try:
            # Initialize LiteLLM client
            secrets_manager = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config)
            iam_client = boto3.client("iam", region_name=os.environ["AWS_REGION"], config=retry_config)

            litellm_client = LiteLLMClient(
                base_uri=get_rest_api_container_endpoint(),
                verify=get_cert_path(iam_client),
                headers={
                    "Authorization": secrets_manager.get_secret_value(
                        SecretId=os.environ.get("MANAGEMENT_KEY_NAME"), VersionStage="AWSCURRENT"
                    )["SecretString"],
                    "Content-Type": "application/json",
                },
            )

            # Remove from LiteLLM
            litellm_client.delete_model(identifier=litellm_id)
            logger.info(f"Removed model {model_id} from LiteLLM: {litellm_id}")

            # Clear litellm_id from DynamoDB
            model_table.update_item(
                Key=model_key,
                UpdateExpression="SET litellm_id = :li",
                ExpressionAttributeValues={":li": None},
            )

        except Exception as e:
            logger.error(f"Failed to remove {model_id} from LiteLLM: {e}", exc_info=True)

    except Exception as e:
        logger.error(f"Error in remove_litellm for {model_id}: {e}", exc_info=True)
```

The register_litellm() and remove_litellm() functions are new and contain significant logic for LiteLLM client initialization. Consider extracting the common LiteLLM client initialization code (lines 458-470 and 520-532) into a shared helper function to reduce duplication:

Suggested change

```python
def _get_litellm_client() -> LiteLLMClient:
    """Initialize and return a LiteLLM client"""
    secrets_manager = boto3.client("secretsmanager", region_name=os.environ["AWS_REGION"], config=retry_config)
    iam_client = boto3.client("iam", region_name=os.environ["AWS_REGION"], config=retry_config)
    return LiteLLMClient(
        base_uri=get_rest_api_container_endpoint(),
        verify=get_cert_path(iam_client),
        headers={
            "Authorization": secrets_manager.get_secret_value(
                SecretId=os.environ.get("MANAGEMENT_KEY_NAME"), VersionStage="AWSCURRENT"
            )["SecretString"],
            "Content-Type": "application/json",
        },
    )
```

Both register_litellm() and remove_litellm() would then replace their inline client setup with litellm_client = _get_litellm_client().

Comment on lines +82 to +169
```python
def adjust_initial_capacity_for_schedule(prepared_event: Dict[str, Any]) -> None:
    """Adjust Auto Scaling Group initial capacity based on schedule configuration"""
    try:
        # Check if scheduling is configured
        auto_scaling_config = prepared_event.get("autoScalingConfig", {})
        scheduling_config = auto_scaling_config.get("scheduling")

        if not scheduling_config or not scheduling_config.get("scheduleEnabled"):
            logger.info("No scheduling configured - using original capacity settings")
            return

        schedule_type = scheduling_config.get("scheduleType")
        timezone_name = scheduling_config.get("timezone")

        try:
            tz = ZoneInfo(timezone_name)
            now = datetime.now(tz)
            current_time = now.time()
            current_day = now.strftime("%A").lower()
            is_within_schedule = False

            if schedule_type == "RECURRING" and scheduling_config.get("dailySchedule"):
                # Daily recurring schedule
                daily_schedule = scheduling_config["dailySchedule"]
                start_time_str = daily_schedule.get("startTime")
                stop_time_str = daily_schedule.get("stopTime")

                if start_time_str and stop_time_str:
                    # Parse times
                    start_hour, start_minute = map(int, start_time_str.split(":"))
                    stop_hour, stop_minute = map(int, stop_time_str.split(":"))

                    start_time_obj = datetime.min.time().replace(hour=start_hour, minute=start_minute)
                    stop_time_obj = datetime.min.time().replace(hour=stop_hour, minute=stop_minute)

                    # Check if current time is within the schedule
                    if start_time_obj <= stop_time_obj:
                        # Normal schedule within same day
                        is_within_schedule = start_time_obj <= current_time <= stop_time_obj
                    else:
                        # Schedule crosses midnight
                        is_within_schedule = current_time >= start_time_obj or current_time <= stop_time_obj

            elif schedule_type == "DAILY" and scheduling_config.get("weeklySchedule"):
                # Daily schedule
                weekly_schedule = scheduling_config["weeklySchedule"]
                today_schedule = weekly_schedule.get(current_day)

                if today_schedule and today_schedule.get("startTime") and today_schedule.get("stopTime"):
                    start_time_str = today_schedule["startTime"]
                    stop_time_str = today_schedule["stopTime"]

                    # Parse times
                    start_hour, start_minute = map(int, start_time_str.split(":"))
                    stop_hour, stop_minute = map(int, stop_time_str.split(":"))

                    start_time_obj = datetime.min.time().replace(hour=start_hour, minute=start_minute)
                    stop_time_obj = datetime.min.time().replace(hour=stop_hour, minute=stop_minute)

                    # Check if current time is within the schedule
                    if start_time_obj <= stop_time_obj:
                        # Normal schedule within same day
                        is_within_schedule = start_time_obj <= current_time <= stop_time_obj
                    else:
                        # Schedule crosses midnight
                        is_within_schedule = current_time >= start_time_obj or current_time <= stop_time_obj

            # Adjust capacity based on schedule
            if is_within_schedule:
                logger.info(f"Current time {current_time} ({timezone_name}) is within scheduled hours")
                # Keep original capacity settings
            else:
                logger.info(f"Current time {current_time} ({timezone_name}) is outside scheduled hours")
                # Set desired capacity to 0 for deployment outside scheduled hours
                auto_scaling_config["minCapacity"] = 0
                # Keep maxCapacity at original value - CloudFormation requires maxCapacity > 0
                auto_scaling_config["desiredCapacity"] = 0

        except Exception as time_error:
            logger.error(f"Error processing schedule time logic: {time_error}", exc_info=True)
            # If we can't determine the schedule, use default capacity to be safe
            logger.info("Using original capacity settings due to schedule processing error")

    except Exception as e:
        logger.error(f"Error adjusting initial capacity for schedule: {e}", exc_info=True)
        # If scheduling logic fails, proceed with original capacity settings
        logger.info("Using original capacity settings due to scheduling error")
```


The adjust_initial_capacity_for_schedule() function has several issues:

  1. Time parsing vulnerability: Lines 111-112 and 135-136 use split(":") without validation. If the time format is invalid (e.g., "25:70"), constructing the time object will raise ValueError at runtime. The PR description mentions "24-hour time format validation" but this function doesn't validate the format before parsing.

  2. Roundabout time object creation: Lines 114-115 and 138-139 use datetime.min.time().replace() to build time objects. This works for time-only values, but it is needlessly indirect. Consider using the time() constructor directly:

Suggested change

```diff
-                    start_time_obj = datetime.min.time().replace(hour=start_hour, minute=start_minute)
-                    stop_time_obj = datetime.min.time().replace(hour=stop_hour, minute=stop_minute)
+                    start_time_obj = time(start_hour, start_minute)
+                    stop_time_obj = time(stop_hour, stop_minute)
```
  3. Missing import: The suggested time() constructor isn't imported in this module. Add from datetime import time to the imports.

  4. Inconsistent error handling: The nested try-except blocks (lines 96-163) catch all exceptions broadly. The inner handler at line 160 logs and continues, and the outer handler at line 165 does the same, so it is unclear from the logs which error path was taken. Consider more specific exception handling.

  5. Logic issue with schedule type check: Line 103 checks schedule_type == "RECURRING" but line 125 checks schedule_type == "DAILY". According to the PR changes summary, the enum was renamed from RECURRING_DAILY → RECURRING and EACH_DAY → DAILY. Verify these string values match the frontend enum values being sent.
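The window comparison under discussion can be isolated into a pure function, which makes the midnight-crossing branch easy to unit test. A sketch with hypothetical names, not the PR's code:

```python
from datetime import time

def is_within_window(now: time, start: time, stop: time) -> bool:
    """True if `now` falls inside the window, handling windows that cross midnight."""
    if start <= stop:
        # Window contained in a single day, e.g. 09:00-17:00
        return start <= now <= stop
    # Window crosses midnight, e.g. 22:00-02:00
    return now >= start or now <= stop
```

Extracting this would also let both the RECURRING and DAILY branches share one implementation instead of duplicating the comparison.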

Comment on lines +337 to +338
# Adjust initial capacity based on schedule if scheduling is configured
adjust_initial_capacity_for_schedule(prepared_event)


The call to adjust_initial_capacity_for_schedule() modifies prepared_event in-place. Ensure this function is called at the right point in the workflow—after all necessary configuration is set but before the event is used downstream. Verify that modifying minCapacity and desiredCapacity at this stage doesn't conflict with any subsequent CloudFormation stack creation logic.

Comment on lines +361 to +363
# Remove scheduling configuration from autoScalingConfig before sending to ECS deployer
if "autoScalingConfig" in prepared_event and "scheduling" in prepared_event["autoScalingConfig"]:
del prepared_event["autoScalingConfig"]["scheduling"]


The scheduling configuration is removed from autoScalingConfig before invoking the ECS deployer. This is intentional per the PR description, but ensure the ECS deployer Lambda doesn't expect this field. If it does, this could cause silent failures. Consider adding a log statement to confirm the removal:

Suggested change

```diff
-# Remove scheduling configuration from autoScalingConfig before sending to ECS deployer
 if "autoScalingConfig" in prepared_event and "scheduling" in prepared_event["autoScalingConfig"]:
+    logger.info("Removing scheduling configuration before ECS deployer invocation")
     del prepared_event["autoScalingConfig"]["scheduling"]
```

Comment on lines +372 to +402
```python
try:
    response = lambdaClient.invoke(
        FunctionName=os.environ["ECS_MODEL_DEPLOYER_FN_ARN"],
        Payload=json.dumps({"modelConfig": prepared_event}),
    )

except Exception as invoke_error:
    raise StackFailedToCreateException(
        json.dumps(
            {
                "error": f"Failed to invoke ECS Model Deployer Lambda: {str(invoke_error)}",
                "event": event,
                "invoke_error": str(invoke_error),
            }
        )
    )

try:
    payload = response["Payload"].read()
    payload = json.loads(payload)
except Exception as parse_error:
    raise StackFailedToCreateException(
        json.dumps(
            {
                "error": f"Failed to parse ECS Model Deployer response: {str(parse_error)}",
                "event": event,
                "raw_response": str(response["Payload"].read()),
                "parse_error": str(parse_error),
            }
        )
    )
```

The error handling for the ECS deployer invocation has been improved with try-except blocks, but there's a potential issue: at line 398, response["Payload"].read() is called again after already being read at line 390. The stream is consumed after the first read, so the second read will return an empty result. This will cause the error message to show an empty string. Remove the second read:

Suggested change
try:
response = lambdaClient.invoke(
FunctionName=os.environ["ECS_MODEL_DEPLOYER_FN_ARN"],
Payload=json.dumps({"modelConfig": prepared_event}),
)
except Exception as invoke_error:
raise StackFailedToCreateException(
json.dumps(
{
"error": f"Failed to invoke ECS Model Deployer Lambda: {str(invoke_error)}",
"event": event,
"invoke_error": str(invoke_error),
}
)
)
try:
payload = response["Payload"].read()
payload = json.loads(payload)
except Exception as parse_error:
raise StackFailedToCreateException(
json.dumps(
{
"error": f"Failed to parse ECS Model Deployer response: {str(parse_error)}",
"event": event,
"raw_response": str(response["Payload"].read()),
"parse_error": str(parse_error),
}
)
)
except Exception as parse_error:
raise StackFailedToCreateException(
json.dumps(
{
"error": f"Failed to parse ECS Model Deployer response: {str(parse_error)}",
"event": event,
"raw_response": payload,
"parse_error": str(parse_error),
}
)
)
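The read-once behavior of the response stream can be demonstrated with a plain in-memory stream; botocore's StreamingBody behaves the same way for sequential reads:

```python
import io

# Simulate the Lambda response payload stream.
stream = io.BytesIO(b'{"stackName": "demo"}')

first = stream.read()   # consumes the entire stream
second = stream.read()  # stream is exhausted; returns b''

print(first)   # b'{"stackName": "demo"}'
print(second)  # b''
```

This is why the payload bytes should be read exactly once (e.g. `payload_bytes = response["Payload"].read()`) and reused in both the parsing path and the error message.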

Comment on lines 404 to +421
stack_name = payload.get("stackName", None)

if not stack_name:
# Log the full payload for debugging
logger.error(f"ECS Model Deployer response: {payload}")
error_message = payload.get("errorMessage", "Unknown error")
error_type = payload.get("errorType", "Unknown error type")

error_message = payload.get("errorMessage")
error_type = payload.get("errorType")
trace = payload.get("trace", [])
raise StackFailedToCreateException(
json.dumps(
{
"error": f"Failed to create Model CloudFormation Stack. {error_type}: {error_message}",
"event": event,
"deployer_response": payload,
"debug_info": {
"error_type": error_type,
"error_message": error_message,
"stack_trace": trace,
"full_response": payload,
},

Indentation is inconsistent in the new hunk. Lines 404-410 have incorrect indentation (extra spaces). Ensure proper Python indentation alignment with the surrounding code.

from decimal import Decimal
from functools import cache
from typing import Any, Callable, cast, Dict, Optional, TypeVar, Union
from typing import Any, Callable, cast, Dict, List, Optional, TypeVar, Union

The import statement adds List to the typing imports. Verify that List is actually used in this module; if not, remove it to keep imports minimal. If it is used, consider the built-in list type hint instead, which is the preferred style on Python 3.9+.
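The distinction is small but worth illustrating; on Python 3.9+ the built-in types are directly parameterizable, so `typing.List` is only needed for older runtimes:

```python
from typing import List  # only needed on Python < 3.9

def names_legacy(items: List[str]) -> List[str]:
    return sorted(items)

def names_modern(items: list[str]) -> list[str]:  # preferred on 3.9+
    return sorted(items)

print(names_modern(["b", "a"]))  # ['a', 'b']
```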


response = lambda_client.invoke(
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"),
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"),

Missing default fallback value for SCHEDULE_MANAGEMENT_FUNCTION_NAME environment variable. If this environment variable is not set, os.environ.get() will return None, causing the Lambda invocation to fail. Consider adding a sensible default value or ensuring this variable is always defined in the deployment configuration.

Suggested change
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"),
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"),


response = lambda_client.invoke(
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"),
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"),

Missing default fallback value for SCHEDULE_MANAGEMENT_FUNCTION_NAME environment variable. If this environment variable is not set, os.environ.get() will return None, causing the Lambda invocation to fail. Consider adding a sensible default value or ensuring this variable is always defined in the deployment configuration.

Suggested change
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"),
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"),

Comment on lines +175 to +192
const scheduleMonitoringLambda = new Function(this, 'ScheduleMonitoring', {
runtime: getDefaultRuntime(),
handler: 'models.scheduling.schedule_monitoring.lambda_handler',
code: Code.fromAsset(lambdaPath),
layers: lambdaLayers,
environment: {
MODEL_TABLE_NAME: modelTable.tableName,
ECS_CLUSTER_NAME: `${config.deploymentPrefix}-ECS-Cluster`,
LISA_API_URL_PS_NAME: lisaServeEndpointUrlPs.parameterName,
MANAGEMENT_KEY_NAME: managementKeyName,
REST_API_VERSION: 'v2',
},
role: stateMachinesLambdaRole,
vpc: vpc.vpc,
securityGroups: securityGroups,
timeout: Duration.minutes(5),
description: 'Processes Auto Scaling Group CloudWatch events to update model status',
});

The scheduleMonitoringLambda environment variable ECS_CLUSTER_NAME is set to ${config.deploymentPrefix}-ECS-Cluster, but according to the PR description, the monitoring logic has been refactored to use Auto Scaling Group state checking instead of ECS-based status tracking. Verify that this environment variable is still needed or remove it if it is no longer used.
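If the monitoring Lambda does derive model status from ASG state, the core logic reduces to a capacity comparison. The sketch below is a hypothetical illustration: the status names and thresholds are assumptions, and in the real handler the inputs would come from `autoscaling.describe_auto_scaling_groups(...)` rather than plain integers:

```python
# Hypothetical mapping from ASG capacity to a model status; the status
# names (Stopped/InService/Updating) are assumptions for illustration.
def model_status_from_asg(desired_capacity: int, in_service: int) -> str:
    if desired_capacity == 0:
        return "Stopped"
    if in_service >= desired_capacity:
        return "InService"
    return "Updating"

# With boto3, desired_capacity and in_service would be derived from
# describe_auto_scaling_groups(...)["AutoScalingGroups"][0].
print(model_status_from_asg(2, 2))  # InService
print(model_status_from_asg(2, 0))  # Updating
print(model_status_from_asg(0, 0))  # Stopped
```

If this is all the monitor needs, ECS_CLUSTER_NAME can likely be dropped from the environment.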

Comment on lines +476 to +510
public addScheduleAwareService (
taskName: ECSTasks,
taskDefinition: TaskDefinition,
scheduleConfig?: {
scheduleEnabled: boolean;
scheduleType: string;
timezone: string;
}
): { service: Ec2Service; targetGroup?: ApplicationTargetGroup } {
const result = this.addTask(taskName, taskDefinition);
const { service } = result;

// Add schedule-related tags to the service
if (scheduleConfig?.scheduleEnabled) {
Tags.of(service).add('ScheduleEnabled', 'true');
Tags.of(service).add('ScheduleType', scheduleConfig.scheduleType);
Tags.of(service).add('Timezone', scheduleConfig.timezone);
Tags.of(service).add('ScheduleManaged', 'true');
} else {
Tags.of(service).add('ScheduleEnabled', 'false');
Tags.of(service).add('RunsAllTime', 'true');
}

// Add schedule-related environment variables
if (scheduleConfig?.scheduleEnabled) {
const container = service.taskDefinition.findContainer(createCdkId([taskName, 'Container']));
if (container) {
container.addEnvironment('SCHEDULE_ENABLED', 'true');
container.addEnvironment('SCHEDULE_TYPE', scheduleConfig.scheduleType);
container.addEnvironment('SCHEDULE_TIMEZONE', scheduleConfig.timezone);
}
}

return result;
}

The addScheduleAwareService() method adds schedule-related tags and environment variables to services, but there's a potential issue with the implementation:

  1. The method modifies the task definition by adding environment variables directly to the container (lines 501-506). This happens after the service is created via this.addTask(), which may have already started the task. Consider whether these environment variables should be set before service creation.

  2. The method searches for a container using createCdkId([taskName, 'Container']) (line 501), but there's no validation that this container exists or that the naming convention matches. If the container isn't found, the environment variables silently fail to be added. Consider adding explicit error handling or logging.

Suggested change
public addScheduleAwareService (
taskName: ECSTasks,
taskDefinition: TaskDefinition,
scheduleConfig?: {
scheduleEnabled: boolean;
scheduleType: string;
timezone: string;
}
): { service: Ec2Service; targetGroup?: ApplicationTargetGroup } {
const result = this.addTask(taskName, taskDefinition);
const { service } = result;
// Add schedule-related tags to the service
if (scheduleConfig?.scheduleEnabled) {
Tags.of(service).add('ScheduleEnabled', 'true');
Tags.of(service).add('ScheduleType', scheduleConfig.scheduleType);
Tags.of(service).add('Timezone', scheduleConfig.timezone);
Tags.of(service).add('ScheduleManaged', 'true');
} else {
Tags.of(service).add('ScheduleEnabled', 'false');
Tags.of(service).add('RunsAllTime', 'true');
}
// Add schedule-related environment variables
if (scheduleConfig?.scheduleEnabled) {
const container = service.taskDefinition.findContainer(createCdkId([taskName, 'Container']));
if (container) {
container.addEnvironment('SCHEDULE_ENABLED', 'true');
container.addEnvironment('SCHEDULE_TYPE', scheduleConfig.scheduleType);
container.addEnvironment('SCHEDULE_TIMEZONE', scheduleConfig.timezone);
}
}
return result;
}
public addScheduleAwareService (
taskName: ECSTasks,
taskDefinition: TaskDefinition,
scheduleConfig?: {
scheduleEnabled: boolean;
scheduleType: string;
timezone: string;
}
): { service: Ec2Service; targetGroup?: ApplicationTargetGroup } {
// Add schedule-related environment variables before service creation
if (scheduleConfig?.scheduleEnabled) {
const container = taskDefinition.findContainer(createCdkId([taskName, 'Container']));
if (container) {
container.addEnvironment('SCHEDULE_ENABLED', 'true');
container.addEnvironment('SCHEDULE_TYPE', scheduleConfig.scheduleType);
container.addEnvironment('SCHEDULE_TIMEZONE', scheduleConfig.timezone);
}
}
const result = this.addTask(taskName, taskDefinition);
const { service } = result;
// Add schedule-related tags to the service
if (scheduleConfig?.scheduleEnabled) {
Tags.of(service).add('ScheduleEnabled', 'true');
Tags.of(service).add('ScheduleType', scheduleConfig.scheduleType);
Tags.of(service).add('Timezone', scheduleConfig.timezone);
Tags.of(service).add('ScheduleManaged', 'true');
} else {
Tags.of(service).add('ScheduleEnabled', 'false');
Tags.of(service).add('RunsAllTime', 'true');
}
return result;
}

Comment on lines +434 to +457
public createScheduledAction (
actionName: string,
schedule: string,
minSize?: number,
maxSize?: number,
desiredCapacity?: number,
timezone?: string
): CfnScheduledAction {
const scheduledAction = new CfnScheduledAction(this, createCdkId([this.identifier, actionName, 'ScheduledAction']), {
autoScalingGroupName: this.autoScalingGroup.autoScalingGroupName,
recurrence: schedule,
...(minSize !== undefined && { minSize }),
...(maxSize !== undefined && { maxSize }),
...(desiredCapacity !== undefined && { desiredCapacity }),
...(timezone && { timeZone: timezone })
});

// Add tags to track the scheduled action
Tags.of(scheduledAction).add('ActionType', 'Schedule');
Tags.of(scheduledAction).add('LISACluster', this.identifier);
Tags.of(scheduledAction).add('CreatedBy', 'LISA-ScheduleManagement');

return scheduledAction;
}

The createScheduledAction() method uses conditional spread operators to optionally include capacity parameters (lines 445-447). However, a scheduled action is only meaningful if at least one of minSize, maxSize, or desiredCapacity is specified; with all three omitted, the action changes nothing. Consider adding validation to ensure at least one capacity parameter is provided:

Suggested change
public createScheduledAction (
actionName: string,
schedule: string,
minSize?: number,
maxSize?: number,
desiredCapacity?: number,
timezone?: string
): CfnScheduledAction {
const scheduledAction = new CfnScheduledAction(this, createCdkId([this.identifier, actionName, 'ScheduledAction']), {
autoScalingGroupName: this.autoScalingGroup.autoScalingGroupName,
recurrence: schedule,
...(minSize !== undefined && { minSize }),
...(maxSize !== undefined && { maxSize }),
...(desiredCapacity !== undefined && { desiredCapacity }),
...(timezone && { timeZone: timezone })
});
// Add tags to track the scheduled action
Tags.of(scheduledAction).add('ActionType', 'Schedule');
Tags.of(scheduledAction).add('LISACluster', this.identifier);
Tags.of(scheduledAction).add('CreatedBy', 'LISA-ScheduleManagement');
return scheduledAction;
}
public createScheduledAction (
actionName: string,
schedule: string,
minSize?: number,
maxSize?: number,
desiredCapacity?: number,
timezone?: string
): CfnScheduledAction {
if (minSize === undefined && maxSize === undefined && desiredCapacity === undefined) {
throw new Error('At least one of minSize, maxSize, or desiredCapacity must be specified for a scheduled action');
}
const scheduledAction = new CfnScheduledAction(this, createCdkId([this.identifier, actionName, 'ScheduledAction']), {
autoScalingGroupName: this.autoScalingGroup.autoScalingGroupName,
recurrence: schedule,
...(minSize !== undefined && { minSize }),
...(maxSize !== undefined && { maxSize }),
...(desiredCapacity !== undefined && { desiredCapacity }),
...(timezone && { timeZone: timezone })
});
// Add tags to track the scheduled action
Tags.of(scheduledAction).add('ActionType', 'Schedule');
Tags.of(scheduledAction).add('LISACluster', this.identifier);
Tags.of(scheduledAction).add('CreatedBy', 'LISA-ScheduleManagement');
return scheduledAction;
}

Comment on lines +136 to +144
try:
result = schedule_management.delete_schedule(payload)

result = json.loads(response["Payload"].read())
if result.get("statusCode") != 200:
error_message = result.get("body", {}).get("message", "Unknown error")
raise ValueError(f"Failed to delete schedule: {error_message}")

if response["StatusCode"] != 200 or result.get("statusCode") != 200:
error_message = result.get("body", {}).get("message", "Unknown error")
raise ValueError(f"Failed to delete schedule: {error_message}")
except Exception as e:
raise ValueError(f"Failed to delete schedule: {str(e)}")

Similar to the UpdateScheduleHandler, verify that schedule_management.delete_schedule() returns a dict with a statusCode key. The error handling assumes this structure, but if the function raises exceptions instead, the try-except should be adjusted accordingly.

import boto3
from botocore.config import Config

from ..scheduling import schedule_management

The import of the schedule_management module is added here, and the old lambda_client initialization has been removed. Ensure that the schedule_management module is available in the deployment package and that its functions (update_schedule, delete_schedule) correctly handle the payloads being passed.

Comment on lines +58 to +69
# Call schedule management function directly
payload = {
"operation": "update",
"modelId": model_id,
"scheduleConfig": scheduling_config,
"autoScalingGroup": auto_scaling_group,
}

response = lambda_client.invoke(
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME"),
InvocationType="RequestResponse",
Payload=json.dumps(payload),
)

result = json.loads(response["Payload"].read())
result = schedule_management.update_schedule(payload)

if result.get("statusCode") == 200:
result_body = json.loads(result["body"])
result_body = json.loads(result["body"]) if isinstance(result["body"], str) else result["body"]

The refactoring changes from Lambda invocation to direct function calls. However, there's an inconsistency in result handling:

  • Line 69 checks if result.get("statusCode") == 200:, but a direct function call may not return a dict with a statusCode key
  • The body parsing that follows assumes the result matches the Lambda response envelope format

Verify that schedule_management.update_schedule() returns a dict with statusCode and body keys, or adjust the result handling accordingly.
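One defensive option is to normalize the result once, accepting either shape. Both return shapes below are assumptions about schedule_management's contract, not confirmed behavior:

```python
import json

def normalize_schedule_result(result):
    """Accept a Lambda-style {'statusCode', 'body'} envelope or a plain dict."""
    if isinstance(result, dict) and "statusCode" in result:
        if result["statusCode"] != 200:
            raise ValueError(f"Schedule operation failed: {result.get('body')}")
        body = result.get("body", {})
        return json.loads(body) if isinstance(body, str) else body
    return result  # already a plain payload

print(normalize_schedule_result({"statusCode": 200, "body": '{"ok": true}'}))  # {'ok': True}
print(normalize_schedule_result({"scheduleArns": []}))  # {'scheduleArns': []}
```

This keeps the handlers working whether update_schedule/delete_schedule keep the Lambda envelope or are refactored to return plain dicts.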

Comment on lines 116 to +126
payload = {
"operation": "update",
"modelId": model_id,
"scheduleConfig": new_scheduling_config,
"autoScalingGroup": auto_scaling_group,
}

response = lambda_client.invoke(
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"),
InvocationType="RequestResponse",
Payload=json.dumps(payload),
)

result = json.loads(response["Payload"].read())
result = schedule_management.update_schedule(payload)

if result.get("statusCode") == 200:
result_body = json.loads(result["body"])
result_body = json.loads(result["body"]) if isinstance(result["body"], str) else result["body"]

Same issue as the previous hunk: the result handling assumes a Lambda response format with statusCode and body keys. Ensure schedule_management.update_schedule() returns the expected structure, or update the result parsing logic to match the actual return type.

Comment on lines 161 to +163
payload = {"operation": "delete", "modelId": model_id}

response = lambda_client.invoke(
FunctionName=os.environ.get("SCHEDULE_MANAGEMENT_FUNCTION_NAME", "LISA-ScheduleManagement"),
InvocationType="RequestResponse",
Payload=json.dumps(payload),
)

result = json.loads(response["Payload"].read())
result = schedule_management.delete_schedule(payload)

The result handling on line 164 (not shown but implied) checks result.get("statusCode") == 200:. Verify that schedule_management.delete_schedule() returns a dict with a statusCode key, or adjust the result handling to match the actual return type of the direct function call.

Comment on lines +194 to +208
// Add permission for state machine lambdas to invoke the ScheduleManagement lambda
const scheduleManagementPermission = new Policy(this, 'ScheduleManagementInvokePerms', {
statements: [
new PolicyStatement({
effect: Effect.ALLOW,
actions: [
'lambda:InvokeFunction',
],
resources: [
scheduleManagementLambda.functionArn,
],
})
]
});
stateMachinesLambdaRole.attachInlinePolicy(scheduleManagementPermission);

The inline policy ScheduleManagementInvokePerms grants permission to invoke scheduleManagementLambda, but there's no corresponding permission for the CloudWatch Events rule to invoke scheduleMonitoringLambda. Add a Lambda invoke permission for the CloudWatch Events rule:

Suggested change
// Add permission for state machine lambdas to invoke the ScheduleManagement lambda
const scheduleManagementPermission = new Policy(this, 'ScheduleManagementInvokePerms', {
statements: [
new PolicyStatement({
effect: Effect.ALLOW,
actions: [
'lambda:InvokeFunction',
],
resources: [
scheduleManagementLambda.functionArn,
],
})
]
});
stateMachinesLambdaRole.attachInlinePolicy(scheduleManagementPermission);
// Add permission for state machine lambdas to invoke the ScheduleManagement lambda
const scheduleManagementPermission = new Policy(this, 'ScheduleManagementInvokePerms', {
statements: [
new PolicyStatement({
effect: Effect.ALLOW,
actions: [
'lambda:InvokeFunction',
],
resources: [
scheduleManagementLambda.functionArn,
],
})
]
});
stateMachinesLambdaRole.attachInlinePolicy(scheduleManagementPermission);
scheduleMonitoringLambda.grantInvoke(new ServicePrincipal('events.amazonaws.com'));

Comment on lines +259 to +284
// Check if this is a scheduling-only update
const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling &&
Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig
Object.keys(updateFields.autoScalingConfig).length === 1 && // only scheduling
Object.keys(updateFields.autoScalingConfig)[0] === 'scheduling';

if (isSchedulingOnlyUpdate) {
// Use separate schedule API for scheduling-only updates
resetScheduleUpdate();
updateScheduleMutation({
modelId: props.selectedItems[0].modelId,
scheduleConfig: state.form.autoScalingConfig.scheduling
});
} else {
// Handle autoScalingConfig if present (non-scheduling changes)
if (updateFields.autoScalingConfig) {
// Only pick instance-specific fields for autoScalingInstanceConfig
const instanceConfigFields = ['minCapacity', 'maxCapacity', 'desiredCapacity', 'cooldown', 'defaultInstanceWarmup'];
const autoScalingInstanceConfig = _.pick(updateFields.autoScalingConfig, instanceConfigFields);
const filteredInstanceConfig = _.pickBy(autoScalingInstanceConfig, (value) => value !== undefined);

// Only include autoScalingInstanceConfig if it has at least one instance-specific property
if (!_.isEmpty(filteredInstanceConfig)) {
updateRequest.autoScalingInstanceConfig = filteredInstanceConfig;
}
}

The logic for detecting scheduling-only updates has a potential issue. The condition checks if updateFields.autoScalingConfig has only a scheduling key, but this doesn't account for cases where other fields in autoScalingConfig might be undefined or null. Consider using a more robust check:

Suggested change
// Check if this is a scheduling-only update
const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling &&
Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig
Object.keys(updateFields.autoScalingConfig).length === 1 && // only scheduling
Object.keys(updateFields.autoScalingConfig)[0] === 'scheduling';
if (isSchedulingOnlyUpdate) {
// Use separate schedule API for scheduling-only updates
resetScheduleUpdate();
updateScheduleMutation({
modelId: props.selectedItems[0].modelId,
scheduleConfig: state.form.autoScalingConfig.scheduling
});
} else {
// Handle autoScalingConfig if present (non-scheduling changes)
if (updateFields.autoScalingConfig) {
// Only pick instance-specific fields for autoScalingInstanceConfig
const instanceConfigFields = ['minCapacity', 'maxCapacity', 'desiredCapacity', 'cooldown', 'defaultInstanceWarmup'];
const autoScalingInstanceConfig = _.pick(updateFields.autoScalingConfig, instanceConfigFields);
const filteredInstanceConfig = _.pickBy(autoScalingInstanceConfig, (value) => value !== undefined);
// Only include autoScalingInstanceConfig if it has at least one instance-specific property
if (!_.isEmpty(filteredInstanceConfig)) {
updateRequest.autoScalingInstanceConfig = filteredInstanceConfig;
}
}
// Check if this is a scheduling-only update
const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling &&
Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig
Object.keys(updateFields.autoScalingConfig).filter(key => updateFields.autoScalingConfig[key] !== undefined).length === 1 && // only scheduling
updateFields.autoScalingConfig.scheduling !== undefined;

Alternatively, explicitly check that other ASG fields are undefined:

Suggested change
// Check if this is a scheduling-only update
const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling &&
Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig
Object.keys(updateFields.autoScalingConfig).length === 1 && // only scheduling
Object.keys(updateFields.autoScalingConfig)[0] === 'scheduling';
if (isSchedulingOnlyUpdate) {
// Use separate schedule API for scheduling-only updates
resetScheduleUpdate();
updateScheduleMutation({
modelId: props.selectedItems[0].modelId,
scheduleConfig: state.form.autoScalingConfig.scheduling
});
} else {
// Handle autoScalingConfig if present (non-scheduling changes)
if (updateFields.autoScalingConfig) {
// Only pick instance-specific fields for autoScalingInstanceConfig
const instanceConfigFields = ['minCapacity', 'maxCapacity', 'desiredCapacity', 'cooldown', 'defaultInstanceWarmup'];
const autoScalingInstanceConfig = _.pick(updateFields.autoScalingConfig, instanceConfigFields);
const filteredInstanceConfig = _.pickBy(autoScalingInstanceConfig, (value) => value !== undefined);
// Only include autoScalingInstanceConfig if it has at least one instance-specific property
if (!_.isEmpty(filteredInstanceConfig)) {
updateRequest.autoScalingInstanceConfig = filteredInstanceConfig;
}
}
// Check if this is a scheduling-only update
const isSchedulingOnlyUpdate = updateFields.autoScalingConfig?.scheduling &&
Object.keys(updateFields).length === 2 && // modelId + autoScalingConfig
updateFields.autoScalingConfig.minCapacity === undefined &&
updateFields.autoScalingConfig.maxCapacity === undefined &&
updateFields.autoScalingConfig.desiredCapacity === undefined &&
updateFields.autoScalingConfig.cooldown === undefined &&
updateFields.autoScalingConfig.defaultInstanceWarmup === undefined;

Comment on lines +349 to +358
useEffect(() => {
if (!isScheduleUpdating && isScheduleUpdateSuccess) {
notificationService.generateNotification(`Successfully updated schedule: ${state.form.modelId}`, 'success');
props.setVisible(false);
props.setIsEdit(false);
props.setSelectedItems([]);
resetState();
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [isScheduleUpdating, isScheduleUpdateSuccess]);

The useEffect hook for schedule update success references values that are missing from its dependency array (notificationService, props, and resetState, among others). While the eslint-disable comment suppresses the warning, this could lead to stale closures. Consider adding the referenced values to the dependency array or ensure they're properly memoized:

Suggested change
useEffect(() => {
if (!isScheduleUpdating && isScheduleUpdateSuccess) {
notificationService.generateNotification(`Successfully updated schedule: ${state.form.modelId}`, 'success');
props.setVisible(false);
props.setIsEdit(false);
props.setSelectedItems([]);
resetState();
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [isScheduleUpdating, isScheduleUpdateSuccess]);
useEffect(() => {
if (!isScheduleUpdating && isScheduleUpdateSuccess) {
notificationService.generateNotification(`Successfully updated schedule: ${state.form.modelId}`, 'success');
props.setVisible(false);
props.setIsEdit(false);
props.setSelectedItems([]);
resetState();
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [isScheduleUpdating, isScheduleUpdateSuccess, notificationService]);

Comment on lines +361 to +364
const reviewError = normalizeError('Model',
isCreateError ? createError :
isUpdateError ? updateError :
isScheduleUpdateError ? scheduleUpdateError : undefined);

The error normalization logic now includes scheduleUpdateError, but there's no corresponding error handling UI or user feedback mechanism shown for schedule update failures. Ensure that scheduleUpdateError is properly displayed to the user when a schedule update fails, similar to how other errors are handled.

Comment on lines +167 to +173
model_config = model_item.get("model_config", {})
auto_scaling_config = model_config.get("autoScalingConfig", {})
scheduling_config = auto_scaling_config.get("scheduling", {})

# Calculate next scheduled action if schedule is active
next_action = None
if scheduling_config.get("scheduleEnabled", False):
next_action = calculate_next_scheduled_action(scheduling_config)

return {
"statusCode": 200,
"body": json.dumps(
{"modelId": model_id, "scheduling": scheduling_config, "nextScheduledAction": next_action}, default=str
),
"body": json.dumps({"modelId": model_id, "scheduling": scheduling_config}, default=str),

The indentation is inconsistent in this block. Lines 169 and 171-173 have extra leading spaces that don't align with the surrounding code structure. This should be corrected to maintain consistent indentation.

Suggested change
model_config = model_item.get("model_config", {})
auto_scaling_config = model_config.get("autoScalingConfig", {})
scheduling_config = auto_scaling_config.get("scheduling", {})
# Calculate next scheduled action if schedule is active
next_action = None
if scheduling_config.get("scheduleEnabled", False):
next_action = calculate_next_scheduled_action(scheduling_config)
return {
"statusCode": 200,
"body": json.dumps(
{"modelId": model_id, "scheduling": scheduling_config, "nextScheduledAction": next_action}, default=str
),
"body": json.dumps({"modelId": model_id, "scheduling": scheduling_config}, default=str),
model_item = response["Item"]
model_config = model_item.get("model_config", {})
auto_scaling_config = model_config.get("autoScalingConfig", {})
scheduling_config = auto_scaling_config.get("scheduling", {})
return {
"statusCode": 200,
"body": json.dumps({"modelId": model_id, "scheduling": scheduling_config}, default=str),
}

Comment on lines +185 to +199
if schedule_config.scheduleType == ScheduleType.RECURRING:
# Create daily recurring schedule
if not schedule_config.dailySchedule:
raise ValueError("dailySchedule required for RECURRING_DAILY type")
raise ValueError("dailySchedule required for RECURRING type")

scheduled_action_arns.extend(
create_daily_scheduled_actions(
model_id, auto_scaling_group, schedule_config.dailySchedule, schedule_config.timezone
)
)

elif schedule_config.scheduleType == ScheduleType.EACH_DAY:
elif schedule_config.scheduleType == ScheduleType.DAILY:
# Create individual day schedules
if not schedule_config.weeklySchedule:
raise ValueError("weeklySchedule required for EACH_DAY type")
raise ValueError("weeklySchedule required for DAILY type")

The indentation is inconsistent in this block. Lines 186-188 and 197-199 have extra leading spaces. Note also that the type-to-field mapping is easy to misread: per the PR description, RECURRING means the same time every day (hence dailySchedule), while DAILY means different times per day (hence weeklySchedule), so the mapping shown here is consistent with the feature list even though the names suggest the opposite at first glance.

Suggested change
if schedule_config.scheduleType == ScheduleType.RECURRING:
# Create daily recurring schedule
if not schedule_config.dailySchedule:
raise ValueError("dailySchedule required for RECURRING_DAILY type")
raise ValueError("dailySchedule required for RECURRING type")
scheduled_action_arns.extend(
create_daily_scheduled_actions(
model_id, auto_scaling_group, schedule_config.dailySchedule, schedule_config.timezone
)
)
elif schedule_config.scheduleType == ScheduleType.EACH_DAY:
elif schedule_config.scheduleType == ScheduleType.DAILY:
# Create individual day schedules
if not schedule_config.weeklySchedule:
raise ValueError("weeklySchedule required for EACH_DAY type")
raise ValueError("weeklySchedule required for DAILY type")
if schedule_config.scheduleType == ScheduleType.RECURRING:
# Create daily recurring schedule
if not schedule_config.dailySchedule:
raise ValueError("dailySchedule required for RECURRING type")
scheduled_action_arns.extend(
create_daily_scheduled_actions(
model_id, auto_scaling_group, schedule_config.dailySchedule, schedule_config.timezone
)
)
elif schedule_config.scheduleType == ScheduleType.DAILY:
# Create individual day schedules
if not schedule_config.weeklySchedule:
raise ValueError("weeklySchedule required for DAILY type")

Comment on lines 289 to +298
# Create start action
start_cron = convert_to_utc_cron(day_schedule.startTime, timezone_name)
start_cron = time_to_cron(day_schedule.startTime)
start_action_name = f"{model_id}-daily-start"

try:
autoscaling_client.put_scheduled_update_group_action(
AutoScalingGroupName=auto_scaling_group,
ScheduledActionName=start_action_name,
Recurrence=start_cron,
TimeZone=timezone_name,

The indentation is inconsistent in this block. Lines 289-291 and 293 have extra leading spaces that don't align with the surrounding code.

Suggested change
# Create start action
start_cron = convert_to_utc_cron(day_schedule.startTime, timezone_name)
start_cron = time_to_cron(day_schedule.startTime)
start_action_name = f"{model_id}-daily-start"
try:
autoscaling_client.put_scheduled_update_group_action(
AutoScalingGroupName=auto_scaling_group,
ScheduledActionName=start_action_name,
Recurrence=start_cron,
TimeZone=timezone_name,
# Create start action
start_cron = time_to_cron(day_schedule.startTime)
start_action_name = f"{model_id}-daily-start"
try:
    autoscaling_client.put_scheduled_update_group_action(
        AutoScalingGroupName=auto_scaling_group,
        ScheduledActionName=start_action_name,
        Recurrence=start_cron,
        TimeZone=timezone_name,

Comment on lines 314 to +321
stop_action_name = f"{model_id}-daily-stop"

try:
autoscaling_client.put_scheduled_update_group_action(
AutoScalingGroupName=auto_scaling_group,
ScheduledActionName=stop_action_name,
Recurrence=stop_cron,
TimeZone=timezone_name,

The indentation is inconsistent in this block. Lines 314-315 and 316 have extra leading spaces that don't align with the surrounding code.

Suggested change
stop_action_name = f"{model_id}-daily-stop"
try:
autoscaling_client.put_scheduled_update_group_action(
AutoScalingGroupName=auto_scaling_group,
ScheduledActionName=stop_action_name,
Recurrence=stop_cron,
TimeZone=timezone_name,
# Create stop action
stop_cron = time_to_cron(day_schedule.stopTime)
stop_action_name = f"{model_id}-daily-stop"
try:
    autoscaling_client.put_scheduled_update_group_action(
        AutoScalingGroupName=auto_scaling_group,
        ScheduledActionName=stop_action_name,
        Recurrence=stop_cron,
        TimeZone=timezone_name,

Comment on lines 370 to +377
start_action_name = f"{model_id}-{day_name}-start"

try:
autoscaling_client.put_scheduled_update_group_action(
AutoScalingGroupName=auto_scaling_group,
ScheduledActionName=start_action_name,
Recurrence=start_cron,
TimeZone=timezone_name,

The indentation is inconsistent in this block. Lines 370-371 and 372 have extra leading spaces that don't align with the surrounding code.

Suggested change
start_action_name = f"{model_id}-{day_name}-start"
try:
autoscaling_client.put_scheduled_update_group_action(
AutoScalingGroupName=auto_scaling_group,
ScheduledActionName=start_action_name,
Recurrence=start_cron,
TimeZone=timezone_name,
# Create start action for this day
start_cron = time_to_cron_with_day(day_schedule.startTime, day_num)
start_action_name = f"{model_id}-{day_name}-start"
try:
    autoscaling_client.put_scheduled_update_group_action(
        AutoScalingGroupName=auto_scaling_group,
        ScheduledActionName=start_action_name,
        Recurrence=start_cron,
        TimeZone=timezone_name,

Comment on lines 394 to +401
stop_action_name = f"{model_id}-{day_name}-stop"

try:
autoscaling_client.put_scheduled_update_group_action(
AutoScalingGroupName=auto_scaling_group,
ScheduledActionName=stop_action_name,
Recurrence=stop_cron,
TimeZone=timezone_name,

The indentation is inconsistent in this block. Lines 394-395 and 396 have extra leading spaces that don't align with the surrounding code.

Suggested change
stop_action_name = f"{model_id}-{day_name}-stop"
try:
autoscaling_client.put_scheduled_update_group_action(
AutoScalingGroupName=auto_scaling_group,
ScheduledActionName=stop_action_name,
Recurrence=stop_cron,
TimeZone=timezone_name,
# Create stop action for this day
stop_cron = time_to_cron_with_day(day_schedule.stopTime, day_num)
stop_action_name = f"{model_id}-{day_name}-stop"
try:
    autoscaling_client.put_scheduled_update_group_action(
        AutoScalingGroupName=auto_scaling_group,
        ScheduledActionName=stop_action_name,
        Recurrence=stop_cron,
        TimeZone=timezone_name,

Comment on lines 421 to +428
hour, minute = map(int, time_str.split(":"))
return f"{minute} {hour} * * *"

# Create timezone-aware datetime for today
tz = ZoneInfo(timezone_name)
today = datetime.now(tz).date()
local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz)

# Convert to UTC
utc_dt = local_dt.astimezone(dt_timezone.utc)

# Return cron expression (minute hour * * *)
return f"{utc_dt.minute} {utc_dt.hour} * * *"


def convert_to_utc_cron_weekdays(time_str: str, timezone_name: str) -> str:
"""Convert local time to UTC cron expression for weekdays only (Mon-Fri)"""
from zoneinfo import ZoneInfo

# Parse time
def time_to_cron_with_day(time_str: str, day_of_week: int) -> str:
"""Convert time string (HH:MM) to cron expression with day"""
hour, minute = map(int, time_str.split(":"))

# Create timezone-aware datetime
tz = ZoneInfo(timezone_name)
today = datetime.now(tz).date()
local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz)

# Convert to UTC
utc_dt = local_dt.astimezone(dt_timezone.utc)

# Return cron expression with weekdays (minute hour * * 1-5)
return f"{utc_dt.minute} {utc_dt.hour} * * 1-5"


def convert_to_utc_cron_with_day(time_str: str, timezone_name: str, day_of_week: int) -> str:
"""Convert local time to UTC cron expression with specific day of week"""
from zoneinfo import ZoneInfo

# Parse time
hour, minute = map(int, time_str.split(":"))

# Create timezone-aware datetime
tz = ZoneInfo(timezone_name)
today = datetime.now(tz).date()
local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz)

# Convert to UTC
utc_dt = local_dt.astimezone(dt_timezone.utc)

# Return cron expression with day of week (minute hour * * day)
return f"{utc_dt.minute} {utc_dt.hour} * * {day_of_week}"
return f"{minute} {hour} * * {day_of_week}"

The indentation is inconsistent in this block. Lines 421 and 427 have extra leading spaces that don't align with the function body.

Suggested change
hour, minute = map(int, time_str.split(":"))
return f"{minute} {hour} * * *"
# Create timezone-aware datetime for today
tz = ZoneInfo(timezone_name)
today = datetime.now(tz).date()
local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz)
# Convert to UTC
utc_dt = local_dt.astimezone(dt_timezone.utc)
# Return cron expression (minute hour * * *)
return f"{utc_dt.minute} {utc_dt.hour} * * *"
def convert_to_utc_cron_weekdays(time_str: str, timezone_name: str) -> str:
"""Convert local time to UTC cron expression for weekdays only (Mon-Fri)"""
from zoneinfo import ZoneInfo
# Parse time
def time_to_cron_with_day(time_str: str, day_of_week: int) -> str:
"""Convert time string (HH:MM) to cron expression with day"""
hour, minute = map(int, time_str.split(":"))
# Create timezone-aware datetime
tz = ZoneInfo(timezone_name)
today = datetime.now(tz).date()
local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz)
# Convert to UTC
utc_dt = local_dt.astimezone(dt_timezone.utc)
# Return cron expression with weekdays (minute hour * * 1-5)
return f"{utc_dt.minute} {utc_dt.hour} * * 1-5"
def convert_to_utc_cron_with_day(time_str: str, timezone_name: str, day_of_week: int) -> str:
"""Convert local time to UTC cron expression with specific day of week"""
from zoneinfo import ZoneInfo
# Parse time
hour, minute = map(int, time_str.split(":"))
# Create timezone-aware datetime
tz = ZoneInfo(timezone_name)
today = datetime.now(tz).date()
local_dt = datetime.combine(today, datetime.min.time().replace(hour=hour, minute=minute), tzinfo=tz)
# Convert to UTC
utc_dt = local_dt.astimezone(dt_timezone.utc)
# Return cron expression with day of week (minute hour * * day)
return f"{utc_dt.minute} {utc_dt.hour} * * {day_of_week}"
return f"{minute} {hour} * * {day_of_week}"
def time_to_cron(time_str: str) -> str:
    """Convert time string (HH:MM) to cron expression"""
    hour, minute = map(int, time_str.split(":"))
    return f"{minute} {hour} * * *"


def time_to_cron_with_day(time_str: str, day_of_week: int) -> str:
    """Convert time string (HH:MM) to cron expression with day"""
    hour, minute = map(int, time_str.split(":"))
    return f"{minute} {hour} * * {day_of_week}"

Comment on lines +543 to 575
# Check if model_config.autoScalingConfig exists first
response = model_table.get_item(Key={"model_id": model_id})
if "Item" not in response:
raise ValueError(f"Model {model_id} not found")

model_item = response["Item"]
auto_scaling_config_exists = "autoScalingConfig" in model_item
model_config_exists = "model_config" in model_item
auto_scaling_config_exists = model_config_exists and "autoScalingConfig" in model_item.get("model_config", {})

if scheduling_config:
# Prepare the scheduling configuration for storage
schedule_data = scheduling_config.model_dump()
schedule_data["scheduledActionArns"] = scheduled_action_arns
schedule_data["scheduleEnabled"] = enabled
schedule_data["lastScheduleUpdate"] = datetime.now(dt_timezone.utc).isoformat()

schedule_data["scheduleConfigured"] = enabled
schedule_data["lastScheduleFailed"] = False

# Calculate next scheduled action
if enabled:
next_action = calculate_next_scheduled_action(schedule_data)
if next_action:
schedule_data["nextScheduledAction"] = next_action

if auto_scaling_config_exists:
# Update existing autoScalingConfig.scheduling
# Update existing model_config.autoScalingConfig.scheduling
model_table.update_item(
Key={"model_id": model_id},
UpdateExpression="SET autoScalingConfig.scheduling = :scheduling",
UpdateExpression="SET model_config.autoScalingConfig.scheduling = :scheduling",
ExpressionAttributeValues={":scheduling": schedule_data},
)
else:
# Create autoScalingConfig with scheduling
# Create model_config.autoScalingConfig with scheduling
model_table.update_item(
Key={"model_id": model_id},
UpdateExpression="SET autoScalingConfig = :autoScalingConfig",
UpdateExpression="SET model_config.autoScalingConfig = :autoScalingConfig",
ExpressionAttributeValues={":autoScalingConfig": {"scheduling": schedule_data}},
)
else:

The indentation is inconsistent throughout this block. Lines 544-546, 548, 552-559, 561-567, 569-574 have extra leading spaces that don't align with the surrounding code structure.

Suggested change
# Check if model_config.autoScalingConfig exists first
response = model_table.get_item(Key={"model_id": model_id})
if "Item" not in response:
raise ValueError(f"Model {model_id} not found")
model_item = response["Item"]
auto_scaling_config_exists = "autoScalingConfig" in model_item
model_config_exists = "model_config" in model_item
auto_scaling_config_exists = model_config_exists and "autoScalingConfig" in model_item.get("model_config", {})
if scheduling_config:
# Prepare the scheduling configuration for storage
schedule_data = scheduling_config.model_dump()
schedule_data["scheduledActionArns"] = scheduled_action_arns
schedule_data["scheduleEnabled"] = enabled
schedule_data["lastScheduleUpdate"] = datetime.now(dt_timezone.utc).isoformat()
schedule_data["scheduleConfigured"] = enabled
schedule_data["lastScheduleFailed"] = False
# Calculate next scheduled action
if enabled:
next_action = calculate_next_scheduled_action(schedule_data)
if next_action:
schedule_data["nextScheduledAction"] = next_action
if auto_scaling_config_exists:
# Update existing autoScalingConfig.scheduling
# Update existing model_config.autoScalingConfig.scheduling
model_table.update_item(
Key={"model_id": model_id},
UpdateExpression="SET autoScalingConfig.scheduling = :scheduling",
UpdateExpression="SET model_config.autoScalingConfig.scheduling = :scheduling",
ExpressionAttributeValues={":scheduling": schedule_data},
)
else:
# Create autoScalingConfig with scheduling
# Create model_config.autoScalingConfig with scheduling
model_table.update_item(
Key={"model_id": model_id},
UpdateExpression="SET autoScalingConfig = :autoScalingConfig",
UpdateExpression="SET model_config.autoScalingConfig = :autoScalingConfig",
ExpressionAttributeValues={":autoScalingConfig": {"scheduling": schedule_data}},
)
else:
# Check if model_config.autoScalingConfig exists first
response = model_table.get_item(Key={"model_id": model_id})
if "Item" not in response:
    raise ValueError(f"Model {model_id} not found")

model_item = response["Item"]
model_config_exists = "model_config" in model_item
auto_scaling_config_exists = model_config_exists and "autoScalingConfig" in model_item.get("model_config", {})

if scheduling_config:
    # Prepare the scheduling configuration for storage
    schedule_data = scheduling_config.model_dump()
    schedule_data["scheduledActionArns"] = scheduled_action_arns
    schedule_data["scheduleEnabled"] = enabled
    schedule_data["lastScheduleUpdate"] = datetime.now(dt_timezone.utc).isoformat()
    schedule_data["scheduleConfigured"] = enabled
    schedule_data["lastScheduleFailed"] = False

    if auto_scaling_config_exists:
        # Update existing model_config.autoScalingConfig.scheduling
        model_table.update_item(
            Key={"model_id": model_id},
            UpdateExpression="SET model_config.autoScalingConfig.scheduling = :scheduling",
            ExpressionAttributeValues={":scheduling": schedule_data},
        )
    else:
        # Create model_config.autoScalingConfig with scheduling
        model_table.update_item(
            Key={"model_id": model_id},
            UpdateExpression="SET model_config.autoScalingConfig = :autoScalingConfig",
            ExpressionAttributeValues={":autoScalingConfig": {"scheduling": schedule_data}},
        )
else:
    # Remove scheduling configuration for always run behavior
    if auto_scaling_config_exists:
        model_table.update_item(
            Key={"model_id": model_id},
            UpdateExpression="REMOVE model_config.autoScalingConfig.scheduling",
        )
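The existence check above matters because a DynamoDB SET on a deeply nested document path fails with a ValidationException when an intermediate map (here model_config.autoScalingConfig) does not exist. A small pure-function sketch of the branch, with the helper name invented for illustration:

```python
def build_schedule_update_args(model_id: str, schedule_data: dict, has_auto_scaling_config: bool) -> dict:
    """Choose update_item arguments based on whether the nested config map already exists."""
    if has_auto_scaling_config:
        # Safe to set the deeper path: autoScalingConfig already exists
        return {
            "Key": {"model_id": model_id},
            "UpdateExpression": "SET model_config.autoScalingConfig.scheduling = :scheduling",
            "ExpressionAttributeValues": {":scheduling": schedule_data},
        }
    # Create the whole autoScalingConfig map in one step instead
    return {
        "Key": {"model_id": model_id},
        "UpdateExpression": "SET model_config.autoScalingConfig = :autoScalingConfig",
        "ExpressionAttributeValues": {":autoScalingConfig": {"scheduling": schedule_data}},
    }
```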

Comment on lines +527 to 531
model_config = model_item.get("model_config", {})
auto_scaling_config = model_config.get("autoScalingConfig", {})
scheduling_config = auto_scaling_config.get("scheduling", {})

return scheduling_config.get("scheduledActionArns", [])

The indentation is inconsistent in this block. Line 527 has extra leading spaces that don't align with the surrounding code.

Suggested change
model_config = model_item.get("model_config", {})
auto_scaling_config = model_config.get("autoScalingConfig", {})
scheduling_config = auto_scaling_config.get("scheduling", {})
return scheduling_config.get("scheduledActionArns", [])
model_item = response["Item"]
model_config = model_item.get("model_config", {})
auto_scaling_config = model_config.get("autoScalingConfig", {})
scheduling_config = auto_scaling_config.get("scheduling", {})

url: '/models/metadata/instances'
})
}),
updateSchedule: builder.mutation<{message: string, modelId: string, scheduleEnabled: boolean}, {modelId: string, scheduleConfig: any}>({

The scheduleConfig parameter is typed as any. Consider using a more specific type definition (e.g., IScheduleConfig or similar) to improve type safety and enable better IDE support and compile-time validation.

Suggested change
updateSchedule: builder.mutation<{message: string, modelId: string, scheduleEnabled: boolean}, {modelId: string, scheduleConfig: any}>({
updateSchedule: builder.mutation<{message: string, modelId: string, scheduleEnabled: boolean}, {modelId: string, scheduleConfig: IScheduleConfig}>({

Comment on lines +83 to +90
transformErrorResponse: (baseQueryReturnValue) => {
return {
name: 'Update Schedule Error',
message: baseQueryReturnValue.data?.type === 'RequestValidationError' ?
baseQueryReturnValue.data.detail.map((error) => error.msg).join(', ') :
baseQueryReturnValue.data.message
};
},

The error transformation logic assumes baseQueryReturnValue.data exists and has specific properties. Add a null/undefined check to prevent potential runtime errors when the response structure is unexpected.

Suggested change
transformErrorResponse: (baseQueryReturnValue) => {
return {
name: 'Update Schedule Error',
message: baseQueryReturnValue.data?.type === 'RequestValidationError' ?
baseQueryReturnValue.data.detail.map((error) => error.msg).join(', ') :
baseQueryReturnValue.data.message
};
},
transformErrorResponse: (baseQueryReturnValue) => {
    const data = baseQueryReturnValue.data;
    return {
        name: 'Update Schedule Error',
        message: data?.type === 'RequestValidationError' ?
            data.detail?.map((error) => error.msg).join(', ') ?? 'Validation error occurred' :
            data?.message ?? 'An error occurred while updating the schedule'
    };
},

Comment on lines +101 to +121
const validateDailySchedule = (dailySchedule?: IDaySchedule): string | undefined => {
if (!dailySchedule) return 'Recurring schedule must be configured when selected.';

const { startTime, stopTime } = dailySchedule;

if (!startTime && !stopTime) {
return 'Recurring schedule must have both start and stop times.';
}

if (!startTime) return 'Start time is required for recurring schedule.';
if (!stopTime) return 'Stop time is required for recurring schedule.';

if (!isValidTimeFormat(startTime)) return 'Start time must be in HH:MM format (24-hour).';
if (!isValidTimeFormat(stopTime)) return 'Stop time must be in HH:MM format (24-hour).';

// Use enhanced validation for time pair
const timePairError = validateTimePair(startTime, stopTime);
if (timePairError) return timePairError;

return undefined;
};

The function name and error messages are inconsistent. The function is named validateDailySchedule but it validates a recurring schedule (same time daily). The error messages reference "Recurring schedule" when they should reference "Daily schedule" or vice versa. This naming mismatch contradicts the PR description which states RECURRING is for same time daily.

Suggested change
const validateDailySchedule = (dailySchedule?: IDaySchedule): string | undefined => {
if (!dailySchedule) return 'Recurring schedule must be configured when selected.';
const { startTime, stopTime } = dailySchedule;
if (!startTime && !stopTime) {
return 'Recurring schedule must have both start and stop times.';
}
if (!startTime) return 'Start time is required for recurring schedule.';
if (!stopTime) return 'Stop time is required for recurring schedule.';
if (!isValidTimeFormat(startTime)) return 'Start time must be in HH:MM format (24-hour).';
if (!isValidTimeFormat(stopTime)) return 'Stop time must be in HH:MM format (24-hour).';
// Use enhanced validation for time pair
const timePairError = validateTimePair(startTime, stopTime);
if (timePairError) return timePairError;
return undefined;
};
const validateRecurringSchedule = (dailySchedule?: IDaySchedule): string | undefined => {
    if (!dailySchedule) return 'Daily schedule must be configured when selected.';

    const { startTime, stopTime } = dailySchedule;

    if (!startTime && !stopTime) {
        return 'Daily schedule must have both start and stop times.';
    }

    if (!startTime) return 'Start time is required for daily schedule.';
    if (!stopTime) return 'Stop time is required for daily schedule.';

    if (!isValidTimeFormat(startTime)) return 'Start time must be in HH:MM format (24-hour).';
    if (!isValidTimeFormat(stopTime)) return 'Stop time must be in HH:MM format (24-hour).';

    // Use enhanced validation for time pair
    const timePairError = validateTimePair(startTime, stopTime);
    if (timePairError) return timePairError;

    return undefined;
};
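validateTimePair itself is outside this diff; given the PR's stated 2-hour minimum window, its core check presumably looks something like this Python sketch (function name, messages, and the minimum are assumptions):

```python
from typing import Optional

def validate_time_pair(start_time: str, stop_time: str, min_hours: int = 2) -> Optional[str]:
    """Return an error message for an invalid start/stop window, or None if valid."""
    start_h, start_m = map(int, start_time.split(":"))
    stop_h, stop_m = map(int, stop_time.split(":"))
    start_minutes = start_h * 60 + start_m
    stop_minutes = stop_h * 60 + stop_m
    if stop_minutes <= start_minutes:
        return "Stop time must be after start time."
    if stop_minutes - start_minutes < min_hours * 60:
        return f"Schedule window must be at least {min_hours} hours."
    return None
```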

Comment on lines +123 to +173
const validateWeeklySchedule = (weeklySchedule?: IWeeklySchedule): { [key: string]: string } => {
const errors: { [key: string]: string } = {};

if (!weeklySchedule) {
errors.general = 'Daily schedule must be configured when selected.';
return errors;
}

const daysOfWeek: (keyof IWeeklySchedule)[] = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday'];
let hasAtLeastOneDay = false;

daysOfWeek.forEach((day) => {
const daySchedule = weeklySchedule[day];
if (daySchedule && daySchedule.startTime && daySchedule.stopTime) {
hasAtLeastOneDay = true;

const { startTime, stopTime } = daySchedule;

// Validate time formats
if (startTime && !isValidTimeFormat(startTime)) {
errors[`${day}_startTime`] = 'Start time must be in HH:MM format (24-hour).';
}

if (stopTime && !isValidTimeFormat(stopTime)) {
errors[`${day}_stopTime`] = 'Stop time must be in HH:MM format (24-hour).';
}

// Use enhanced validation for time pair with clearer error messages
if (startTime && stopTime && isValidTimeFormat(startTime) && isValidTimeFormat(stopTime)) {
const timePairError = validateTimePair(startTime, stopTime);
if (timePairError) {
errors[`${day}_times`] = timePairError;
}
}
} else if (daySchedule && (daySchedule.startTime || daySchedule.stopTime)) {
// Partial schedule validation
if (daySchedule.startTime && !daySchedule.stopTime) {
errors[`${day}_stopTime`] = 'Stop time is required when Start time is provided.';
}
if (daySchedule.stopTime && !daySchedule.startTime) {
errors[`${day}_startTime`] = 'Start time is required when Stop time is provided.';
}
}
});

if (!hasAtLeastOneDay) {
errors.general = 'At least one day must have a schedule configured.';
}

return errors;
};

The function name and error messages are inconsistent. The function is named validateWeeklySchedule but it validates a daily schedule (different times per day). The error messages reference "Daily schedule" when they should reference "Weekly schedule" or vice versa. This naming mismatch contradicts the PR description which states DAILY is for different times per day.

Suggested change
const validateWeeklySchedule = (weeklySchedule?: IWeeklySchedule): { [key: string]: string } => {
const errors: { [key: string]: string } = {};
if (!weeklySchedule) {
errors.general = 'Daily schedule must be configured when selected.';
return errors;
}
const daysOfWeek: (keyof IWeeklySchedule)[] = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday'];
let hasAtLeastOneDay = false;
daysOfWeek.forEach((day) => {
const daySchedule = weeklySchedule[day];
if (daySchedule && daySchedule.startTime && daySchedule.stopTime) {
hasAtLeastOneDay = true;
const { startTime, stopTime } = daySchedule;
// Validate time formats
if (startTime && !isValidTimeFormat(startTime)) {
errors[`${day}_startTime`] = 'Start time must be in HH:MM format (24-hour).';
}
if (stopTime && !isValidTimeFormat(stopTime)) {
errors[`${day}_stopTime`] = 'Stop time must be in HH:MM format (24-hour).';
}
// Use enhanced validation for time pair with clearer error messages
if (startTime && stopTime && isValidTimeFormat(startTime) && isValidTimeFormat(stopTime)) {
const timePairError = validateTimePair(startTime, stopTime);
if (timePairError) {
errors[`${day}_times`] = timePairError;
}
}
} else if (daySchedule && (daySchedule.startTime || daySchedule.stopTime)) {
// Partial schedule validation
if (daySchedule.startTime && !daySchedule.stopTime) {
errors[`${day}_stopTime`] = 'Stop time is required when Start time is provided.';
}
if (daySchedule.stopTime && !daySchedule.startTime) {
errors[`${day}_startTime`] = 'Start time is required when Stop time is provided.';
}
}
});
if (!hasAtLeastOneDay) {
errors.general = 'At least one day must have a schedule configured.';
}
return errors;
};
const validateDailySchedule = (weeklySchedule?: IWeeklySchedule): { [key: string]: string } => {
    const errors: { [key: string]: string } = {};

    if (!weeklySchedule) {
        errors.general = 'Weekly schedule must be configured when selected.';
        return errors;
    }

    const daysOfWeek: (keyof IWeeklySchedule)[] = ['monday', 'tuesday', 'wednesday', 'thursday', 'friday', 'saturday', 'sunday'];
    let hasAtLeastOneDay = false;

    daysOfWeek.forEach((day) => {
        const daySchedule = weeklySchedule[day];
        if (daySchedule && daySchedule.startTime && daySchedule.stopTime) {
            hasAtLeastOneDay = true;

            const { startTime, stopTime } = daySchedule;

            // Validate time formats
            if (startTime && !isValidTimeFormat(startTime)) {
                errors[`${day}_startTime`] = 'Start time must be in HH:MM format (24-hour).';
            }

            if (stopTime && !isValidTimeFormat(stopTime)) {
                errors[`${day}_stopTime`] = 'Stop time must be in HH:MM format (24-hour).';
            }

            // Use enhanced validation for time pair with clearer error messages
            if (startTime && stopTime && isValidTimeFormat(startTime) && isValidTimeFormat(stopTime)) {
                const timePairError = validateTimePair(startTime, stopTime);
                if (timePairError) {
                    errors[`${day}_times`] = timePairError;
                }
            }
        } else if (daySchedule && (daySchedule.startTime || daySchedule.stopTime)) {
            // Partial schedule validation
            if (daySchedule.startTime && !daySchedule.stopTime) {
                errors[`${day}_stopTime`] = 'Stop time is required when Start time is provided.';
            }
            if (daySchedule.stopTime && !daySchedule.startTime) {
                errors[`${day}_startTime`] = 'Start time is required when Stop time is provided.';
            }
        }
    });

    if (!hasAtLeastOneDay) {
        errors.general = 'At least one day must have a schedule configured.';
    }

    return errors;
};

Comment on lines +179 to +180
const showDailySchedule = showScheduleOptions && scheduleType === ScheduleType.RECURRING;
const showWeeklySchedule = showScheduleOptions && scheduleType === ScheduleType.DAILY;

The logic for determining which schedule UI to show is reversed. Line 179 shows daily schedule when scheduleType === ScheduleType.RECURRING, but based on the PR description, RECURRING means same time daily (which should show the daily schedule UI). Line 180 shows weekly schedule when scheduleType === ScheduleType.DAILY, but DAILY means different times per day (which should show the weekly schedule UI). These assignments are backwards.

Suggested change
const showDailySchedule = showScheduleOptions && scheduleType === ScheduleType.RECURRING;
const showWeeklySchedule = showScheduleOptions && scheduleType === ScheduleType.DAILY;
const showDailySchedule = showScheduleOptions && scheduleType === ScheduleType.DAILY;
const showWeeklySchedule = showScheduleOptions && scheduleType === ScheduleType.RECURRING;

Comment on lines +198 to +206
if (scheduleType === ScheduleType.RECURRING) {
const dailyError = validateDailySchedule(props.item.dailySchedule);
if (dailyError) {
errors.dailySchedule = dailyError;
}
} else if (scheduleType === ScheduleType.DAILY) {
const weeklyErrors = validateWeeklySchedule(props.item.weeklySchedule);
Object.assign(errors, weeklyErrors);
}

The validation logic is calling the wrong validation functions for each schedule type. Line 199 calls validateDailySchedule when scheduleType === ScheduleType.RECURRING, but based on the corrected function names, it should call validateRecurringSchedule. Line 204 calls validateWeeklySchedule when scheduleType === ScheduleType.DAILY, but it should call validateDailySchedule.

Suggested change
if (scheduleType === ScheduleType.RECURRING) {
const dailyError = validateDailySchedule(props.item.dailySchedule);
if (dailyError) {
errors.dailySchedule = dailyError;
}
} else if (scheduleType === ScheduleType.DAILY) {
const weeklyErrors = validateWeeklySchedule(props.item.weeklySchedule);
Object.assign(errors, weeklyErrors);
}
if (scheduleType === ScheduleType.RECURRING) {
    const recurringError = validateRecurringSchedule(props.item.dailySchedule);
    if (recurringError) {
        errors.dailySchedule = recurringError;
    }
} else if (scheduleType === ScheduleType.DAILY) {
    const dailyErrors = validateDailySchedule(props.item.weeklySchedule);
    Object.assign(errors, dailyErrors);
}

@github-actions github-actions bot left a comment

Commits reviewed: files that changed from the base of the PR, between commits a3405dc and 51ca92a.
Files selected (11)
  • ecs_model_deployer/src/lib/ecs-model.ts (2)
  • lib/core/apiBaseConstruct.ts (2)
  • lib/mcp/mcp-server-api.ts (2)
  • lib/models/model-api.ts (8)
  • lib/rag/ragConstruct.ts (2)
  • lib/schema/constants.ts (1)
  • lib/schema/index.ts (1)
  • lib/serve/serveApplicationConstruct.ts (2)
  • test/lambda/test_schedule_handlers.py (14)
  • test/lambda/test_schedule_management.py (14)
  • test/lambda/test_state_machine_schedule_handlers.py (12)
Files skipped from review due to trivial changes (1)
  • lib/schema/constants.ts
Review comments generated (33)
  • Review: 33
  • LGTM: 0

Tips

Chat with AI reviewer (/reviewbot)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging /reviewbot in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Pausing incremental reviews

  • Add /reviewbot: ignore anywhere in the PR description to pause further reviews from the bot.

Comment on lines +210 to +220
new Rule(this, 'AutoScalingEventsRule', {
eventPattern: {
source: ['aws.autoscaling'],
detailType: [
'EC2 Instance Launch Successful',
'EC2 Instance Terminate Successful'
]
},
targets: [new LambdaFunction(scheduleMonitoringLambda)],
description: 'Triggers ScheduleMonitoring Lambda when Auto Scaling Group instances launch or terminate successfully',
});

The CloudWatch Events rule is created but never stored or exported. If other constructs need to reference this rule (e.g., for adding additional targets or modifying it), consider storing it as a class property:

Suggested change
new Rule(this, 'AutoScalingEventsRule', {
eventPattern: {
source: ['aws.autoscaling'],
detailType: [
'EC2 Instance Launch Successful',
'EC2 Instance Terminate Successful'
]
},
targets: [new LambdaFunction(scheduleMonitoringLambda)],
description: 'Triggers ScheduleMonitoring Lambda when Auto Scaling Group instances launch or terminate successfully',
});
this.autoScalingEventsRule = new Rule(this, 'AutoScalingEventsRule', {
    eventPattern: {
        source: ['aws.autoscaling'],
        detailType: [
            'EC2 Instance Launch Successful',
            'EC2 Instance Terminate Successful'
        ]
    },
    targets: [new LambdaFunction(scheduleMonitoringLambda)],
    description: 'Triggers ScheduleMonitoring Lambda when Auto Scaling Group instances launch or terminate successfully',
});

Comment on lines +160 to +192
const scheduleManagementLambda = new Function(this, 'ScheduleManagement', {
    runtime: getDefaultRuntime(),
    handler: 'models.scheduling.schedule_management.lambda_handler',
    code: Code.fromAsset(lambdaPath),
    layers: lambdaLayers,
    environment: {
        MODEL_TABLE_NAME: modelTable.tableName,
    },
    role: stateMachinesLambdaRole,
    vpc: vpc.vpc,
    securityGroups: securityGroups,
    timeout: Duration.minutes(5),
    description: 'Manages Auto Scaling scheduled actions for LISA model scheduling',
});

const scheduleMonitoringLambda = new Function(this, 'ScheduleMonitoring', {
    runtime: getDefaultRuntime(),
    handler: 'models.scheduling.schedule_monitoring.lambda_handler',
    code: Code.fromAsset(lambdaPath),
    layers: lambdaLayers,
    environment: {
        MODEL_TABLE_NAME: modelTable.tableName,
        ECS_CLUSTER_NAME: `${config.deploymentPrefix}-ECS-Cluster`,
        LISA_API_URL_PS_NAME: lisaServeEndpointUrlPs.parameterName,
        MANAGEMENT_KEY_NAME: managementKeyName,
        REST_API_VERSION: 'v2',
    },
    role: stateMachinesLambdaRole,
    vpc: vpc.vpc,
    securityGroups: securityGroups,
    timeout: Duration.minutes(5),
    description: 'Processes Auto Scaling Group CloudWatch events to update model status',
});


Both scheduleManagementLambda and scheduleMonitoringLambda are created but only scheduleManagementLambda is passed to the state machine. Consider storing both as class properties to enable external access and potential future integrations:

Suggested change
-const scheduleManagementLambda = new Function(this, 'ScheduleManagement', {
+this.scheduleManagementLambda = new Function(this, 'ScheduleManagement', {
     runtime: getDefaultRuntime(),
     handler: 'models.scheduling.schedule_management.lambda_handler',
     code: Code.fromAsset(lambdaPath),
     layers: lambdaLayers,
     environment: {
         MODEL_TABLE_NAME: modelTable.tableName,
     },
     role: stateMachinesLambdaRole,
     vpc: vpc.vpc,
     securityGroups: securityGroups,
     timeout: Duration.minutes(5),
     description: 'Manages Auto Scaling scheduled actions for LISA model scheduling',
 });
-const scheduleMonitoringLambda = new Function(this, 'ScheduleMonitoring', {
+this.scheduleMonitoringLambda = new Function(this, 'ScheduleMonitoring', {
     runtime: getDefaultRuntime(),
     handler: 'models.scheduling.schedule_monitoring.lambda_handler',
     code: Code.fromAsset(lambdaPath),
     layers: lambdaLayers,
     environment: {
         MODEL_TABLE_NAME: modelTable.tableName,
         ECS_CLUSTER_NAME: `${config.deploymentPrefix}-ECS-Cluster`,
         LISA_API_URL_PS_NAME: lisaServeEndpointUrlPs.parameterName,
         MANAGEMENT_KEY_NAME: managementKeyName,
         REST_API_VERSION: 'v2',
     },
     role: stateMachinesLambdaRole,
     vpc: vpc.vpc,
     securityGroups: securityGroups,
     timeout: Duration.minutes(5),
     description: 'Processes Auto Scaling Group CloudWatch events to update model status',
 });

Comment on lines 77 to +78
-@patch("models.scheduling.schedule_management.calculate_next_scheduled_action")
 def test_update_operation_success(
-    self, mock_calculate_next, mock_autoscaling_client, mock_model_table, lambda_context, sample_schedule_config
+    self, mock_autoscaling_client, mock_model_table, lambda_context, sample_schedule_config


The @patch decorator for calculate_next_scheduled_action was removed from the method signature. Verify that the test logic no longer depends on this mock. If the function is still being called internally, this could cause the test to fail or behave unexpectedly.

Comment on lines 87 to +91
# Mock successful Auto Scaling operations
mock_autoscaling_client.put_scheduled_update_group_action.return_value = {}
mock_autoscaling_client.describe_scheduled_actions.return_value = {"ScheduledUpdateGroupActions": []}
mock_autoscaling_client.describe_auto_scaling_groups.return_value = {
    "AutoScalingGroups": [{"MinSize": 1, "MaxSize": 10, "DesiredCapacity": 3}]
}


The mock setup was simplified by removing mock_calculate_next and describe_scheduled_actions mocks. Ensure that the actual implementation of lambda_handler doesn't call describe_scheduled_actions or that it's mocked elsewhere. If it does, this test may not properly validate the behavior.

Comment on lines +250 to +273
+class TestHelperFunctions:
+    """Test helper functions."""

-        result = calculate_next_scheduled_action(schedule_config)
-
-        # Just verify it returns a result
-        assert result is not None
-        assert result["action"] in ["START", "STOP"]
-
-    def test_timezone_conversion(self):
-        """Test timezone conversion in schedule calculation."""
-        from models.scheduling.schedule_management import calculate_next_scheduled_action
-
-        schedule_config = {
-            "scheduleType": "RECURRING_DAILY",
-            "timezone": "America/New_York",
-            "dailySchedule": {"startTime": "09:00", "stopTime": "17:00"},
-        }
-
-        result = calculate_next_scheduled_action(schedule_config)
-
-        # Just verify it returns a result
-        assert result is not None
-        assert result["action"] in ["START", "STOP"]
-
-    def test_no_schedule_type(self):
-        """Test None schedule type."""
-        from models.scheduling.schedule_management import calculate_next_scheduled_action
-
-        schedule_config = {"scheduleType": None, "timezone": "UTC"}
-
-        result = calculate_next_scheduled_action(schedule_config)
-
-        assert result is None
+    def test_time_to_cron(self):
+        """Test time_to_cron function."""
+        from models.scheduling.schedule_management import time_to_cron
+
+        result = time_to_cron("09:30")
+        assert result == "30 9 * * *"
+
+        result = time_to_cron("23:45")
+        assert result == "45 23 * * *"
+
+    def test_time_to_cron_with_day(self):
+        """Test time_to_cron_with_day function."""
+        from models.scheduling.schedule_management import time_to_cron_with_day
+
+        # Monday is 1
+        result = time_to_cron_with_day("09:30", 1)
+        assert result == "30 9 * * 1"
+
+        # Sunday is 0
+        result = time_to_cron_with_day("14:15", 0)
+        assert result == "15 14 * * 0"


The entire TestCalculateNextScheduledAction class was replaced with TestHelperFunctions containing simpler utility function tests. According to the PR changes, complex scheduling calculation tests were removed. However, verify that the removed tests for calculate_next_scheduled_action with different schedule types (RECURRING_DAILY, EACH_DAY, timezone conversion) are not critical for validating the scheduling logic, or ensure equivalent coverage exists elsewhere.
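The cron conversions these tests expect can be sketched in a few lines. This is a minimal illustration consistent with the asserted outputs, not the actual code in models.scheduling.schedule_management, which may handle validation and edge cases differently:

```python
def time_to_cron(hhmm: str) -> str:
    """Convert a 24-hour 'HH:MM' string to a daily cron expression."""
    hour, minute = hhmm.split(":")
    return f"{int(minute)} {int(hour)} * * *"

def time_to_cron_with_day(hhmm: str, day_of_week: int) -> str:
    """Convert 'HH:MM' plus a day-of-week (0 = Sunday) to a weekly cron expression."""
    hour, minute = hhmm.split(":")
    return f"{int(minute)} {int(hour)} * * {day_of_week}"

print(time_to_cron("09:30"))              # → 30 9 * * *
print(time_to_cron_with_day("14:15", 0))  # → 15 14 * * 0
```

Note that cron puts minutes before hours, which is exactly the inversion the test expectations encode ("09:30" → "30 9 * * *").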

Comment on lines +851 to +875
 @patch("models.handler.schedule_handlers.schedule_management.delete_schedule")
 def test_delete_schedule_lambda_error_handling(
     self,
     mock_delete_schedule,
     mock_model_table,
     mock_guardrails_table,
     mock_autoscaling_client,
     mock_stepfunctions_client,
     sample_model_item,
 ):
-    """Test DeleteSchedule Lambda error handling."""
+    """Test DeleteSchedule schedule management error handling."""
     mock_model_table.get_item.return_value = {"Item": sample_model_item}

-    # Setup mock lambda error response
-    mock_lambda_response = {"StatusCode": 500, "Payload": MagicMock()}
-    mock_lambda_response["Payload"].read.return_value = json.dumps(
-        {"statusCode": 500, "body": {"message": "Lambda execution failed"}}
-    ).encode()
+    # Setup mock schedule management error response
+    mock_delete_schedule.return_value = {"statusCode": 500, "body": {"message": "Schedule management failed"}}

-    with patch("boto3.client") as mock_boto3:
-        mock_boto3.return_value.invoke.return_value = mock_lambda_response
-
-        handler = DeleteScheduleHandler(
-            autoscaling_client=mock_autoscaling_client,
-            stepfunctions_client=mock_stepfunctions_client,
-            model_table_resource=mock_model_table,
-            guardrails_table_resource=mock_guardrails_table,
-        )
+    handler = DeleteScheduleHandler(
+        autoscaling_client=mock_autoscaling_client,
+        stepfunctions_client=mock_stepfunctions_client,
+        model_table_resource=mock_model_table,
+        guardrails_table_resource=mock_guardrails_table,
+    )

-        with pytest.raises(ValueError, match="Failed to delete schedule: Lambda execution failed"):
-            handler(model_id="test-model")
+    with pytest.raises(ValueError, match="Failed to delete schedule: Schedule management failed"):
+        handler(model_id="test-model")


The test mocks schedule_management.delete_schedule to return an error response. The expected error message is "Failed to delete schedule: Schedule management failed". Verify the handler code properly extracts the message from the response body.
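The extraction the test depends on might look like the following. This is a hedged sketch; the helper name and its wiring into DeleteScheduleHandler are assumptions, not the actual handler code:

```python
import json

def raise_on_schedule_error(response: dict) -> None:
    """Surface the error message from a schedule_management-style response dict."""
    if response.get("statusCode", 200) >= 400:
        body = response.get("body", {})
        if isinstance(body, str):  # tolerate a JSON-encoded body as well
            body = json.loads(body)
        message = body.get("message", "Unknown error")
        raise ValueError(f"Failed to delete schedule: {message}")
```

With the mocked response above, this raises `ValueError("Failed to delete schedule: Schedule management failed")`, matching the test's `pytest.raises` pattern.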

Comment on lines +61 to 64
    "scheduleType": "RECURRING",
    "timezone": "UTC",
    "dailySchedule": {"startTime": "09:00", "stopTime": "17:00"},
}


The schedule type constant has been updated from RECURRING_DAILY to RECURRING which aligns with the PR changes. However, verify that all test data throughout this file uses the updated constant name consistently. Consider checking if there are other test methods that might still reference the old constant.
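A quick sweep for stale constant names can catch this class of inconsistency. The snippet below is illustrative only; the directory layout is a stand-in, not the repository's actual test tree:

```python
import pathlib
import tempfile

# Stand-in test tree; in practice you would point this at the real test directory.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "test_schedule.py").write_text('schedule = {"scheduleType": "RECURRING"}\n')

# Flag any Python file still referencing the pre-rename constant.
stale = [p.name for p in tmp.rglob("*.py") if "RECURRING_DAILY" in p.read_text()]
print("stale references found" if stale else "clean")  # → clean
```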

-        ).encode()
-        mock_lambda_client.invoke.return_value = mock_lambda_response
+        # Mock schedule management error response
+        mock_update_schedule.return_value = {"statusCode": 500, "body": {"message": "Schedule management failed"}}


The mock response structure has changed from a nested JSON string to a direct dict for the body. Verify that the actual schedule_management.update_schedule() function returns {"statusCode": 500, "body": {...}} with body as a dict (not JSON string). If the actual implementation returns body as a JSON string, this mock will not match the real behavior.
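The two shapes require different access patterns, which is why the mock must match the real return type. A small demonstration of the mismatch, using the literal values from the test:

```python
import json

# Shape 1: body is a plain dict (what the new mock assumes).
dict_style = {"statusCode": 500, "body": {"message": "Schedule management failed"}}

# Shape 2: body is a JSON-encoded string (the old Lambda-payload style).
str_style = {"statusCode": 500, "body": json.dumps({"message": "Schedule management failed"})}

print(dict_style["body"]["message"])             # → Schedule management failed
print(json.loads(str_style["body"])["message"])  # → Schedule management failed
# Handler code written for one shape breaks on the other:
# str_style["body"]["message"] raises TypeError (string indices must be integers).
```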

Comment on lines +118 to +120
# Verify result (should pass through unchanged, no schedule management call)
assert result["modelId"] == "test-model"
mock_lambda_client.invoke.assert_not_called()
assert "scheduled_action_arns" not in result


The comment on line 118 states "should pass through unchanged, no schedule management call" but there's no assertion verifying that mock_update_schedule was not called. Consider adding mock_update_schedule.assert_not_called() to verify the handler correctly skips schedule management when ASG is missing.
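The suggested assertion behaves as follows in isolation; `mock_update_schedule` here is a fresh mock for demonstration, not the fixture from the test file:

```python
from unittest.mock import MagicMock

mock_update_schedule = MagicMock()
mock_update_schedule.assert_not_called()  # passes: the mock has never been invoked

mock_update_schedule("model-id")  # simulate the handler calling schedule management
try:
    mock_update_schedule.assert_not_called()
    outcome = "unexpectedly passed"
except AssertionError:
    outcome = "failed as expected"
print(outcome)  # → failed as expected
```

Adding this assertion makes the test fail loudly if the handler ever starts calling schedule management on the missing-ASG path.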

Comment on lines +197 to +218
 @patch("models.state_machine.schedule_handlers.schedule_management.delete_schedule")
 def test_handle_cleanup_schedule_success(self, mock_delete_schedule, lambda_context):
     """Test successful schedule cleanup."""
     from models.state_machine.schedule_handlers import handle_cleanup_schedule

     # Mock event
     event = {"modelId": "test-model"}

-    # Mock successful lambda response
-    mock_lambda_response = {"StatusCode": 200, "Payload": MagicMock()}
-    mock_lambda_response["Payload"].read.return_value = json.dumps(
-        {"statusCode": 200, "body": json.dumps({"message": "Schedule deleted successfully"})}
-    ).encode()
-    mock_lambda_client.invoke.return_value = mock_lambda_response
+    # Mock successful schedule management response
+    mock_delete_schedule.return_value = {
+        "statusCode": 200,
+        "body": json.dumps({"message": "Schedule deleted successfully"}),
+    }

     # Execute
     result = handle_cleanup_schedule(event, lambda_context)

     # Verify result
     assert result["modelId"] == "test-model"

-    # Verify lambda invocation
-    mock_lambda_client.invoke.assert_called_once()
+    # Verify schedule management was called
+    mock_delete_schedule.assert_called_once()


The test mocks schedule_management.delete_schedule but the old code mocked lambda_client. Ensure that the actual handle_cleanup_schedule implementation now calls schedule_management.delete_schedule() directly instead of invoking a Lambda function. The mock response structure (body as JSON string) should also be verified against the actual implementation.
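The refactor this review describes, a direct module call replacing a Lambda invocation, can be sketched as below. Function names mirror the test; the internals of `delete_schedule` and the error path are assumptions:

```python
import json

def delete_schedule(model_id: str) -> dict:
    """Stand-in for schedule_management.delete_schedule (assumed shape)."""
    return {"statusCode": 200, "body": json.dumps({"message": "Schedule deleted successfully"})}

def handle_cleanup_schedule(event: dict, context: object = None) -> dict:
    """Call schedule management directly instead of invoking a Lambda."""
    response = delete_schedule(event["modelId"])
    if response["statusCode"] != 200:
        body = json.loads(response["body"])
        raise ValueError(f"Failed to delete schedule: {body['message']}")
    return event

result = handle_cleanup_schedule({"modelId": "test-model"})
print(result["modelId"])  # → test-model
```

Note the body here is a JSON string (hence `json.loads` on the error path), which is the shape the test's mock returns; if the real `delete_schedule` returns a dict body instead, both the mock and this sketch would need adjusting.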

@nishi7409
Member Author

Screen.Recording.2025-11-24.at.10.25.21.mov
update.demo.mov
ASG.auto.schedule.mov
