
Feature/aws managed capacity #1418

Open
wangsha wants to merge 173 commits into zappa:master from wangsha:feature/aws-managed-capacity

Conversation

@wangsha
Contributor

@wangsha wangsha commented Jan 26, 2026

Description

closes #1420
AWS Lambda Managed Instances Capacity Providers enable running Lambda functions on your own EC2 instances, giving tighter control over compute capacity, networking, and cost. Capacity Providers define the instance types and scaling behavior used to execute your functions.

This PR updates the project to support (and document) configuring Lambda Managed Instances via Capacity Providers, making it possible to opt into managed instance execution when you need predictable capacity or specific instance characteristics.

Key changes

Adds/updates configuration and documentation for defining a Lambda Capacity Provider.
Wires the new settings into the deployment flow (no behavior change unless explicitly enabled).

suriya and others added 30 commits July 17, 2021 16:39
When Zappa receives a compressed text/plain response from the
application, it tries to process it as a text response. Instead, Zappa
should treat the response as if it were a binary one and base-64 encode
the response body.

See issue #2080 binary_support logic in handler.py (0.51.0) broke compressed text response
Miserlou/Zappa#2080
version bump
Support AWS Lambdas ephemeral storage setting (zappa#1120)
Fix handling of gzip-encoded text response
version bump
… (currently openapi schema can't be passed as text.
@coveralls

coveralls commented Jan 27, 2026

Coverage Status

coverage: 73.732% (-0.3%) from 74.02%
when pulling adf4b17 on wangsha:feature/aws-managed-capacity
into 0c76f11 on zappa:master.

@monkut
Collaborator

monkut commented Jan 28, 2026

Can you create a "feature issue" for tracking purposes?

@wangsha
Contributor Author

wangsha commented Jan 28, 2026

added

Contributor

Copilot AI left a comment


Pull request overview

This PR adds support for AWS Lambda Managed Instances Capacity Providers, a feature that enables running Lambda functions on EC2 instances for more predictable capacity and cost optimization. The implementation includes configuration support, deployment/update workflows, validation logic, and documentation for this new capability.

Changes:

  • Adds capacity_provider_config configuration option to specify Lambda capacity provider settings
  • Implements mutual exclusion between VPC configurations and capacity providers with validation
  • Adds waiting logic for capacity provider state transitions during deployments
  • Updates function URL handling with improved return values and CloudFront custom domain cleanup

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 18 comments.

| File | Description |
| --- | --- |
| zappa/core.py | Core capacity provider implementation including create/update Lambda function support, VPC conflict validation, capacity provider polling logic, function URL improvements, and CloudFront cleanup |
| zappa/cli.py | CLI integration for capacity provider config loading, deploy/update command modifications, undeploy support for function URL custom domains, and certify command updates |
| tests/test_settings.yaml | Test configuration with capacity provider example |
| tests/test_wsgi_script_name_settings.py | Added EXCEPTION_HANDLER config for test consistency |
| tests/test_core.py | Comprehensive unit tests for capacity provider functionality including configuration, VPC rejection, concurrency handling, and state polling |
| tests/placebo/* | Mock AWS API responses for CloudFront and Route53 operations |
| README.md | Documentation for capacity_provider_config and function_url_domains settings |


Comment thread zappa/core.py
Comment on lines +3426 to +3428
if s["Sid"] in ["FunctionURLAllowPublicAccess"]:
    continue
delete_response = self.lambda_client.remove_permission(FunctionName=lambda_name, StatementId=s["Sid"])

Copilot AI Feb 17, 2026

This change removes ALL Lambda function permissions except FunctionURLAllowPublicAccess, not just CloudWatch Events permissions. This is a breaking change that will remove permissions for API Gateway, SNS, S3, and other AWS services that may have been configured. The original code specifically checked for principal.get("Service") == "events.amazonaws.com" to only remove CloudWatch Events permissions. This change should preserve other permissions beyond just FunctionURL permissions.

Suggested change
- if s["Sid"] in ["FunctionURLAllowPublicAccess"]:
-     continue
- delete_response = self.lambda_client.remove_permission(FunctionName=lambda_name, StatementId=s["Sid"])
+ # Always preserve Function URL public access statement
+ if s.get("Sid") in ["FunctionURLAllowPublicAccess"]:
+     continue
+ # Only remove permissions created for CloudWatch Events
+ principal = s.get("Principal", {})
+ service = None
+ if isinstance(principal, dict):
+     service = principal.get("Service")
+ # Service may be a string or a list
+ if isinstance(service, list):
+     if "events.amazonaws.com" not in service:
+         continue
+ elif service != "events.amazonaws.com":
+     continue
+ delete_response = self.lambda_client.remove_permission(
+     FunctionName=lambda_name,
+     StatementId=s["Sid"],
+ )

Comment thread zappa/core.py
response = self.cloudfront_client.update_distribution(
    DistributionConfig=new_config, Id=id, IfMatch=distribution["ETag"]
)
response

Copilot AI Feb 17, 2026

This statement on line 2750 has no effect and appears to be a leftover debug statement or incomplete code. The variable response is assigned but never used after this line. This line should either be removed or the intended logging/assertion should be added.

Suggested change
- response

Comment thread zappa/cli.py
Comment on lines 1290 to +1304
@@ -1292,13 +1297,11 @@ def update(self, source_zip=None, no_upload=False, docker_image_uri=None):

if endpoint_url != api_url:
deployed_string = deployed_string + " (" + api_url + ")"

if self.stage_config.get("touch", True):
self.zappa.wait_until_lambda_function_is_updated(function_name=self.lambda_name)
if api_url:
self.touch_endpoint(api_url)
elif endpoint_url:
self.touch_endpoint(endpoint_url)
touch_url = api_url

if self.stage_config.get("touch", True):
self.touch_endpoint(touch_url)

Copilot AI Feb 17, 2026

The variable touch_url is uninitialized when neither self.use_apigateway nor the api_url condition on line 1300 is true. If self.use_apigateway is False, touch_url is never set, and calling touch_endpoint(touch_url) on line 1304 will raise a NameError. Initialize touch_url = endpoint_url before the conditional blocks or ensure all code paths define it.

Comment thread zappa/core.py
Comment on lines +1405 to +1407
versions_in_lambda = self.list_lambda_function_versions(function_name)
versions_in_lambda.remove("$LATEST")


Copilot AI Feb 17, 2026

The special version identifier "$LATEST.PUBLISHED" is not a standard AWS Lambda version identifier. AWS Lambda uses "$LATEST" for the unpublished version and numeric strings (e.g., "1", "2", "3") for published versions. The identifier "$LATEST.PUBLISHED" appears to be specific to capacity provider functionality but is not documented in standard Lambda API. Verify this identifier is correct, and if it's capacity-provider-specific, add a comment explaining its purpose.

Suggested change
- versions_in_lambda = self.list_lambda_function_versions(function_name)
- versions_in_lambda.remove("$LATEST")
+ versions_in_lambda = self.list_lambda_function_versions(function_name)
+ # "$LATEST" is the standard, unpublished version and should never be deleted.
+ versions_in_lambda.remove("$LATEST")
+ # "$LATEST.PUBLISHED" is not part of the documented Lambda versioning API.
+ # It can appear when Lambda capacity providers are used; we ignore it here so
+ # that we do not attempt to delete it as a normal published version.

Comment thread zappa/cli.py
      function_url_config=self.function_url_config,
  )
- self.zappa.deploy_lambda_function_url(**kwargs)
+ endpoint_url = self.zappa.deploy_lambda_function_url(**kwargs)

Copilot AI Feb 17, 2026

In the deploy method, endpoint_url is assigned the return value of deploy_lambda_function_url on line 959, which returns a response dictionary, not a string URL. This will cause touch_endpoint on line 1015 to fail when trying to concatenate the response dictionary with a string path. The return value should be extracted as endpoint_url = self.zappa.deploy_lambda_function_url(**kwargs)["FunctionUrl"] to get the actual URL string.

Suggested change
- endpoint_url = self.zappa.deploy_lambda_function_url(**kwargs)
+ endpoint_url = self.zappa.deploy_lambda_function_url(**kwargs)["FunctionUrl"]

Comment thread zappa/core.py
  response = self.lambda_client.list_function_url_configs(FunctionName=function_name, MaxItems=50)
  if not response.get("FunctionUrlConfigs", []):
-     print("no function url configured on lambda, skip setting custom domains")
+     logger.info("no function url configured on lambda, skip setting custom domains")

Copilot AI Feb 17, 2026

Similar to the issue in undeploy_function_url_custom_domain, this method logs a message when no function URL configs exist (line 1872) but doesn't return early, then unconditionally accesses response["FunctionUrlConfigs"][0] on line 1873, which will raise an IndexError. Add a return statement after the log message on line 1872.

Suggested change
- logger.info("no function url configured on lambda, skip setting custom domains")
+ logger.info("no function url configured on lambda, skip setting custom domains")
+ return

Comment thread zappa/core.py
Comment on lines 1828 to +1856
@@ -1672,19 +1846,21 @@ def update_lambda_function_url(self, function_name, function_url_config):
)
else:
response = self.lambda_client.update_function_url_config(
FunctionName=function_name, AuthType=function_url_config["authorizer"]
FunctionName=function_name,
AuthType=function_url_config["authorizer"],
)
print("function URL address: {}".format(response["FunctionUrl"]))
logger.info("function URL address: {}".format(response["FunctionUrl"]))
self.update_function_url_policy(config["FunctionArn"], function_url_config)
return response["FunctionUrl"]
else:
self.deploy_lambda_function_url(function_name, function_url_config)
return self.deploy_lambda_function_url(function_name, function_url_config)

Copilot AI Feb 17, 2026

Inconsistent return type between deploy_lambda_function_url and update_lambda_function_url. The deploy_lambda_function_url method returns the full response object (line 1828), while update_lambda_function_url returns just response["FunctionUrl"] as a string (line 1854) in the update path, but falls back to calling deploy_lambda_function_url (line 1856) which returns the full response. This inconsistency will cause issues for callers expecting a consistent return type. Both methods should return the same type, preferably just the FunctionUrl string.

Comment thread zappa/core.py
function_arn: str,
capacity_provider_name: str,
function_state: str = "Active",
marker: str | None = None,

Copilot AI Feb 17, 2026

The type hint str | None uses the union type syntax with the pipe operator, which was introduced in Python 3.10. According to the PR description and README, this project supports Python 3.9. For Python 3.9 compatibility, use Optional[str] from the typing module instead, or use Union[str, None]. The code already imports Optional from typing on line 25.

Suggested change
- marker: str | None = None,
+ marker: Optional[str] = None,

Comment thread zappa/core.py
  logger.info("Updating Lambda function code..")

- kwargs = dict(FunctionName=function_name, Publish=publish)
+ kwargs = dict(FunctionName=function_name, Publish=publish, Architectures=[self.architecture])

Copilot AI Feb 17, 2026

According to AWS Lambda API documentation, the Architectures parameter is not valid for the update_function_code API call. It can only be set during function creation or via update_function_configuration. Including it here will cause an API error. Remove Architectures from the kwargs on line 1343, or ensure architecture updates happen via update_function_configuration instead.

Suggested change
- kwargs = dict(FunctionName=function_name, Publish=publish, Architectures=[self.architecture])
+ kwargs = dict(FunctionName=function_name, Publish=publish)

Comment thread zappa/core.py
Comment on lines +1269 to +1271
if capacity_provider_config:
    kwargs["CapacityProviderConfig"] = capacity_provider_config
    kwargs.pop("VpcConfig")

Copilot AI Feb 17, 2026

Same issue as in update_lambda_configuration: popping "VpcConfig" from kwargs after it's been added is redundant when using capacity providers. Since the validation on lines 1244-1248 already ensures VPC and capacity provider aren't used together, VpcConfig should either not be added to kwargs when capacity_provider_config is set, or this pattern should be documented with a comment explaining the reasoning.

Collaborator

@monkut monkut left a comment

PR Review: Feature/AWS Managed Capacity (#1418)

Thanks for the contribution! This adds support for AWS Lambda Managed Instances Capacity Providers — a useful feature for users needing predictable capacity on their own EC2 instances.

Overall the feature is well-structured with good test coverage for the new functionality. However, there are several issues that need to be addressed before merging, ranging from bugs that will cause runtime errors to a breaking behavioral change.


Critical Issues

1. _clear_policy() regression — will delete ALL Lambda permissions (zappa/core.py)

This is the most concerning change. The current code in master only removes CloudWatch Events permissions (created by schedule_events), intentionally preserving API Gateway, SNS, S3, and other service permissions. The PR changes this to delete every permission statement except FunctionURLAllowPublicAccess.

This will break any deployment that uses API Gateway, SNS triggers, S3 event notifications, or any other service-based permission. The original filter (principal.get("Service") == "events.amazonaws.com") was deliberate and well-documented.

The fix should preserve the original CloudWatch Events filter while also skipping FunctionURLAllowPublicAccess:

if s["Sid"] in ["FunctionURLAllowPublicAccess"]:
    continue
# Only remove CloudWatch Events permissions (created by schedule_events)
principal = s.get("Principal", {})
if isinstance(principal, dict) and principal.get("Service") == "events.amazonaws.com":
    delete_response = ...
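
To make the intended filter concrete, here is a self-contained sketch that only decides which statement IDs are safe to remove. It is an assumption-laden illustration, not the code in master: `sids_to_remove` is a hypothetical helper, and it treats a statement as removable only when its `Principal.Service` is (or includes) `events.amazonaws.com`.

```python
import json

EVENTS_SERVICE = "events.amazonaws.com"
PRESERVED_SIDS = {"FunctionURLAllowPublicAccess"}


def sids_to_remove(policy_json: str) -> list:
    """Return Sids of statements granted to CloudWatch Events (hypothetical helper)."""
    removable = []
    for statement in json.loads(policy_json).get("Statement", []):
        sid = statement.get("Sid")
        if sid is None or sid in PRESERVED_SIDS:
            continue  # never touch the Function URL public access statement
        principal = statement.get("Principal", {})
        # Principal may be a plain string such as "*", so guard before .get()
        service = principal.get("Service") if isinstance(principal, dict) else None
        # Service may be a single string or a list of strings
        services = service if isinstance(service, list) else [service]
        if EVENTS_SERVICE in services:
            removable.append(sid)
    return removable
```

Each returned Sid could then be passed to `lambda_client.remove_permission(FunctionName=..., StatementId=sid)`, which leaves API Gateway, SNS, and S3 permissions in place.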

2. Missing return after "no function url" check — will IndexError (zappa/core.py)

Both update_lambda_function_url_domains() (line ~1854) and undeploy_function_url_custom_domain() (line ~2783) log "no function url configured" but don't actually return. Execution continues and immediately hits response["FunctionUrlConfigs"][0] which raises IndexError on an empty list.

# Both methods need:
if not response.get("FunctionUrlConfigs", []):
    logger.info("no function url configured on lambda, skip...")
    return  # <-- missing
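
A fuller, runnable version of that guard might look like the following. It is a sketch only: `first_function_url` is a hypothetical helper that takes the Lambda client as a parameter so it can be exercised with a stub, whereas the PR's methods use `self.lambda_client`.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def first_function_url(lambda_client, function_name: str) -> Optional[str]:
    """Return the first configured function URL, or None when none exists."""
    response = lambda_client.list_function_url_configs(FunctionName=function_name, MaxItems=50)
    configs = response.get("FunctionUrlConfigs", [])
    if not configs:
        logger.info("no function url configured on lambda, skip setting custom domains")
        return None  # stop here instead of indexing into an empty list
    return configs[0]["FunctionUrl"]
```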

3. deploy_lambda_function_url() return value mismatch (zappa/cli.py:958)

In deploy(), the PR assigns endpoint_url = self.zappa.deploy_lambda_function_url(**kwargs) but deploy_lambda_function_url returns the full response dict, not a URL string. This will cause touch_endpoint() to fail. Should be:

endpoint_url = self.zappa.deploy_lambda_function_url(**kwargs)["FunctionUrl"]

4. Inconsistent return types between deploy_lambda_function_url and update_lambda_function_url

deploy_lambda_function_url returns the full response dict. update_lambda_function_url returns response["FunctionUrl"] (a string) in the update path but falls through to deploy_lambda_function_url (returning a dict) in the create path. Callers can't rely on the return type. Normalize both to return the URL string.
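
While both methods are being aligned, a small normalizing shim at the call site would keep callers safe. This is only a stopgap sketch; `as_function_url` is a hypothetical name and is not part of Zappa.

```python
from typing import Any, Mapping, Union


def as_function_url(result: Union[str, Mapping[str, Any]]) -> str:
    """Normalize a deploy/update result to the bare function URL string."""
    if isinstance(result, str):
        return result
    # Otherwise assume a boto3-style response dict carrying "FunctionUrl"
    return result["FunctionUrl"]
```

The cleaner long-term fix remains the one described above: have both methods return the URL string directly.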

5. update_domain_name() leaves res undefined when self.apigateway is falsy (zappa/core.py:~3353)

The method wraps the apigateway_client.update_domain_name() call in if self.apigateway: but then unconditionally returns res. If self.apigateway is falsy, res is never defined → NameError.
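
One way to avoid the NameError is to bind the result before the conditional. The sketch below is illustrative: `update_domain_name_sketch` is a hypothetical stand-in that receives the API Gateway client as an argument, with `None` playing the role of a falsy `self.apigateway`.

```python
from typing import Any, Optional


def update_domain_name_sketch(
    apigateway_client: Optional[Any],
    domain_name: str,
    patch_operations: list,
) -> Optional[dict]:
    """Return the update response, or None when API Gateway is not in use."""
    res = None  # bound on every path, so the final return cannot raise NameError
    if apigateway_client:
        res = apigateway_client.update_domain_name(
            domainName=domain_name,
            patchOperations=patch_operations,
        )
    return res
```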

6. Architectures added to update_function_code kwargs (zappa/core.py:1318)

Architectures is not a valid parameter for the update_function_code API. It can only be set via create_function or update_function_configuration. This will cause an API error on every update.

7. Python 3.9 compatibility — str | None type hint (zappa/core.py:1623)

The str | None union syntax requires Python 3.10+. The project supports Python 3.9. Use Optional[str] instead.


Moderate Issues

8. touch_url may be uninitialized (zappa/cli.py)

In the update() method, touch_url = endpoint_url is set, but endpoint_url itself may not be set if the function URL update path returns inconsistent types (see #4 above). The flow is fragile.
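
A less fragile flow would compute the URL to touch in one place and allow for "nothing to touch". The helper below is a hypothetical sketch, not code from the PR; it mirrors the previous behavior of preferring the API Gateway URL and falling back to the function URL.

```python
from typing import Optional


def pick_touch_url(api_url: Optional[str], endpoint_url: Optional[str]) -> Optional[str]:
    """Prefer the API Gateway URL, fall back to the function URL, else None."""
    return api_url or endpoint_url or None
```

The caller would then touch the endpoint only when the `touch` setting is enabled and the returned value is not `None`.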

9. Hardcoded time.sleep(10) (zappa/core.py:~1524)

A hardcoded 10-second sleep with no explanation. This should either be documented with a reason, or replaced with a polling/waiter approach consistent with the rest of the codebase.
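
A polling loop with a deadline is one possible replacement. The sketch below is generic and makes several assumptions: `wait_for_state` and its parameters are hypothetical, the caller supplies a `get_state` callable (for example one that reads the function's `State`), and `"Failed"` is treated as terminal.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    target: str = "Active",
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll get_state() until it returns target, failing on 'Failed' or timeout."""
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state == target:
            return
        if state == "Failed":
            raise RuntimeError("resource entered the Failed state")
        if clock() >= deadline:
            raise TimeoutError(f"state is still {state!r} after {timeout} seconds")
        sleep(interval)  # injectable so tests do not actually wait
```

Injecting `sleep` and `clock` keeps unit tests fast and deterministic.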

10. max() on potentially empty sequence (zappa/core.py:~1518)

max(int(v) for v in versions_in_lambda if v.isdigit()) will raise ValueError if there are no numeric versions. Add a guard:

numeric_versions = [int(v) for v in versions_in_lambda if v.isdigit()]
if numeric_versions:
    latest_version = max(numeric_versions)

11. Capacity provider name extraction is fragile (zappa/core.py:~1500-1510)

The ARN parsing logic (split("capacity-provider/") then rsplit("/")) is hard to follow and may not handle all formats. Consider a cleaner approach by splitting the ARN by : and then parsing the resource component, or using a regex.
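
For illustration, a split-then-regex parser could look like this. The exact capacity provider ARN layout is an assumption here: the sketch accepts a resource part of either `capacity-provider:<name>` or `capacity-provider/<name>` and ignores any trailing segments, and `capacity_provider_name` is a hypothetical helper name.

```python
import re

# Assumed resource layouts: "capacity-provider:<name>" or "capacity-provider/<name>[/...]"
_RESOURCE_RE = re.compile(r"^capacity-provider[:/](?P<name>[^:/]+)")


def capacity_provider_name(arn: str) -> str:
    """Extract the capacity provider name from an ARN, or raise ValueError."""
    # arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    match = _RESOURCE_RE.match(parts[5])
    if match is None:
        raise ValueError(f"not a capacity provider ARN: {arn!r}")
    return match.group("name")
```

Raising on anything unexpected surfaces format surprises early instead of silently producing a wrong name.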

12. Standalone expression response does nothing (zappa/core.py:~2802)

response = self.cloudfront_client.update_distribution(...)
response  # <-- this line does nothing, leftover debug?

Remove it.

13. f-string nested quotes (zappa/core.py:~1516)

function_arn=f"{response["FunctionArn"]}:{latest_version}"

This is a syntax error in Python <3.12. Use single quotes for the dict key inside the f-string:

function_arn=f"{response['FunctionArn']}:{latest_version}"

Minor / Style

14. log_type="Tail" if not self.capacity_provider_config else "None" (zappa/cli.py:~1579)

Why does log_type change to "None" when using capacity providers? This should have a comment explaining the reasoning — capacity provider log behavior differs from standard Lambda?

15. "$LATEST.PUBLISHED" version identifier (zappa/core.py:~1382)

This doesn't appear in standard AWS Lambda documentation. If it's capacity-provider-specific, add a comment explaining its purpose so future maintainers understand why it's handled.

16. Commit history is very messy

The branch has 80+ commits including many merges from master, unrelated changes (Python 3.11 support, werkzeug fixes, version bumps, etc.), and commits from other contributors' PRs that were merged into the fork. This makes the actual capacity provider changes hard to review. Consider squashing into a clean set of commits that only contain the capacity provider feature.

17. function_url_domains docs added to README but that setting is from a different feature

The README diff adds function_url_domains documentation — this appears to be from the function URL custom domains feature, not the capacity provider feature. If it's already in master, the diff is just noise. If not, it should be a separate PR.


Summary

The core capacity provider integration (config loading, create_lambda_function, update_lambda_configuration, wait_for_capacity_provider_response) is well-implemented with appropriate VPC validation and concurrency guards. The test coverage for these paths is good.

The main blocker is the _clear_policy() regression (#1), which will break existing deployments. The missing return statements (#2) and return type mismatches (#3, #4) will cause runtime errors. The Architectures in update_function_code (#6) will cause API errors on every update.

Recommend fixing issues #1-7 before merge, and squashing the commit history.

monkut

This comment was marked as off-topic.



Development

Successfully merging this pull request may close these issues.

[Feature] Support Lambda managed instances capacity providers

8 participants