-
Notifications
You must be signed in to change notification settings - Fork 3
Refactor deploying image logic #346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@Haivilo for the sake of documentation, can you provide details on performance gains as a result of this change in a comment? @Danidite and @vGsteiger can you engage with Steve to fix rough edges and bring the new capability to the project? Thanks |
Hi @engshahrad, the build time improves as the number of functions increase, which reaches ~50% reduction when you have 5 workflow functions. This was tested locally. Here is the improvement of example build time: For the performance related to migrating speed, I will discuss with @Danidite and get back to you. I was unable to retrieve the results efficiently. |
…ing and function registration - Changed `_copy_image_to_region` to `_copy_image_if_not_exists` and modified logic - Added logic to skip image copying if it already exists in the current region. - Updated Docker image naming to use `workflow_instance_id`, so every lambda function uses the same docker image. - Introduced `set_wrapped_function` method in `CaribouFunction` to register wrapped functions. - Enhanced `CaribouWorkflow` to pass `successor_function_name` in various methods for better tracking. - Added a new `generic_handler.py` to handle dynamic function routing in Lambda. - Updated deployment packager to include the new generic handler. - Adjusted tests to reflect changes in image copying logic.
b1eb883
to
f0b2c81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It overall looks very good, great work! However, I do have a few minor comments and concerns.
workflow = app.workflow | ||
|
||
# Get payload and target function | ||
payload = _get_payload(event) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this function was never used. The "result = target_function(event)" seems to read the original event. Is the target function suppose to take in the original event or the parsed payload?
# Get payload and target function | ||
payload = _get_payload(event) | ||
target_function_name = event.get("target") if isinstance(event, dict) else None | ||
target_function, func_name = _find_target_function(workflow, target_function_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function name is also never used here, is this intentional? If so maybe the _find_target_function(...) function should only return the target_function as its not used anywhere else.
target_function_name = event.get("target") if isinstance(event, dict) else None | ||
target_function, func_name = _find_target_function(workflow, target_function_name) | ||
|
||
_, _ = payload, func_name # Unused variables, for now disabled to run tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to remove this after the above fixes/changes, I temporarily added it to pass the compliance tests.
parts = deployed_image_uri.split("/") | ||
original_region = parts[0].split(".")[3] | ||
original_image_name = parts[1] | ||
|
||
ecr_client = self._client("ecr") | ||
new_region = ecr_client.meta.region_name | ||
if new_region == original_region: | ||
logger.info("Image already exists in the %s region, skipping copy", new_region) | ||
return deployed_image_uri |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am understanding it correctly, all this checks is if the new region does not equal to the original region name (home region I assume). So for scenarios where this is not the first re-deployment, would it still perform redundant ECR operations? If so would a better way to check simply just checking if the image already exist in the ECR via some boto3 call?
for func_name, caribou_func in workflow.functions.items(): | ||
if caribou_func.name == target_name: | ||
return caribou_func.wrapped_function, func_name | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, this else statement where if no target_name is provided it defaults to calling the entry point seems potentially risky for future maintenance. If the target is somehow lost in a future change, it may result in infinite loops (if they also bypass our other failsafes). A better approach might be to modify the invoker (client CLI) so that even for the first function, it must explicitly specify the target_name of the entry_point function. Then, in this code, you can simply raise an error and terminate if target_name is missing. This doesn't need to be implemented in this PR, just create a new issue for the change.
TODO
Fix integration tests
Summary of Changes
Image Mapping Changed from Per-Function to Per-Workflow
workflow_instance_id
instead of per function.Introduced Generic Lambda Handler
generic_handler.py
, a dynamic entrypoint that routes invocations based on thetarget
field in the event payload.Payload Format Enhancement
"target"
field indicating the function name to invoke, to support the generic Lambda handler.Minor Enhancements
workflow_instance_id
instead of full function name._copy_image_to_region
was renamed to_copy_image_if_not_exists
and now avoids redundant ECR operations when the image exists in the same region.