Fix #5 unzip overwrite failure and add cache reuse#6
Conversation
When the awscli cache layer is restored by the lifecycle, the layer directory already contains the aws/ tree from the previous install. unzip without -o prompts to replace existing files; in non-interactive builds it reads EOF, chooses "None", and exits non-zero, triggering the "Failed to extract AWS CLI" error. Two fixes: - Add -o flag to unzip so it overwrites without prompting - Add cache reuse: if the cached binary exists and runs for the same architecture, skip the 64MB download+install entirely Also pass --update to aws/install to handle the case where the previous aws-cli directory already exists. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
🤖 NeetoBugWatch Review
I've analyzed your pull request and found some potential improvements that might be worth considering. Please review the specific line comments below for helpful suggestions.
💡 To trigger a re-review of this PR, add a comment with neeto-bugwatch-run on its own line (not inside a code block or blockquote).
Warning
Verdict: Needs review — 3 comments found, including major issues
Click to copy prompt for AI agent
Verdict: review_required
The following are code review comments from an AI reviewer that only has access
to the PR diff, not the full codebase. Treat every comment as a hypothesis,
not a fact. For each comment:
1. Investigate the verification questions against the full codebase first.
2. Go beyond the provided questions. Use your full codebase access to ask your
own questions, check related code paths, and consider edge cases that the
reviewer could not have seen from the diff alone.
3. Only implement the suggested fix if your investigation confirms the issue is real.
4. If the issue is not real, skip it entirely. Do not implement speculative fixes.
Prioritize by severity (critical > major > minor > trivial) and confidence score.
Only attend to comments with confidence below 0.5 if your investigation proves them valid.
[
{
"file_path": "bin/build",
"line_number": 93,
"comment": "The installation process leaves the extracted `aws/` installer directory inside `TARGET_DIR`. Since `TARGET_DIR` acts as the buildpack layer directory, leaving this large installer directory (150MB+) will bloat the cached layer and the final container image size.",
"confidence": 0.95,
"severity": "major",
"suggested_fix": "Remove the extracted installer directory by appending `rm -rf \"$TARGET_DIR/aws\"` when removing the zip file.",
"verification_questions": [
"Does `unzip awscli.zip` extract an `aws/` directory inside `TARGET_DIR`?",
"Are the files inside `TARGET_DIR/aws/` needed after the `TARGET_DIR/aws/install` script completes?"
]
},
{
"file_path": "bin/build",
"line_number": 68,
"comment": "The `curl` command lacks the `-f` (or `--fail`) flag. If the URL responds with an HTTP error (like 404), `curl` will still succeed (exit code 0) but save the HTML error page into `awscli.zip`. This obscures the root cause, as the script will confusingly fail at the `unzip` step instead.",
"confidence": 0.95,
"severity": "minor",
"suggested_fix": "Add the `-f` flag to `curl` (e.g., `curl -fL`) so that it exits with a non-zero status code on HTTP errors.",
"verification_questions": [
"What is the current behavior if `awscli_arch` evaluates to an invalid architecture resulting in a 404?",
"Would debugging be easier if the script failed immediately at the download step with an explicit HTTP error?"
]
},
{
"file_path": "bin/build",
"line_number": 61,
"comment": "Parsing TOML with `grep` and `sed` can be fragile. `grep 'architecture'` might accidentally match descriptions or comments containing the word 'architecture' before the actual key. Also, the `sed` regex strictly requires double quotes; if single quotes are used, it will fail to capture the value and fallback to the entire matched line, causing unnecessary cache misses.",
"confidence": 0.85,
"severity": "minor",
"suggested_fix": "Consider anchoring the grep search (e.g., `grep -E '^\\s*architecture\\s*='`) to match only the key, and relaxing the `sed` regex to handle optional single quotes, or use a tool like `yj` or `jq` if available.",
"verification_questions": [
"Could the TOML file contain the word 'architecture' in preceding lines, such as descriptions or comments?",
"Does the buildpack lifecycle ever write TOML values using single quotes instead of double quotes?"
]
}
]
| else | ||
| echo " ! awscli: Binary not found after installation" | ||
| exit 1 | ||
|
|
There was a problem hiding this comment.
The installation process leaves the extracted aws/ installer directory inside TARGET_DIR. Since TARGET_DIR acts as the buildpack layer directory, leaving this large installer directory (150MB+) will bloat the cached layer and the final container image size.
Click to see review metadata
{
"confidence": 0.95,
"severity": "major",
"suggested_fix": "Remove the extracted installer directory by appending `rm -rf \"$TARGET_DIR/aws\"` when removing the zip file.",
"verification_questions": [
"Does `unzip awscli.zip` extract an `aws/` directory inside `TARGET_DIR`?",
"Are the files inside `TARGET_DIR/aws/` needed after the `TARGET_DIR/aws/install` script completes?"
]
}| fi | ||
|
|
||
| if [[ "$cached_arch" == "$target_arch" ]] && [[ -f "$AWS_CLI_BIN" ]] && "$AWS_CLI_BIN" --version > /dev/null 2>&1; then | ||
| echo " awscli: Reusing cached AWS CLI for ${target_arch}" |
There was a problem hiding this comment.
The curl command lacks the -f (or --fail) flag. If the URL responds with an HTTP error (like 404), curl will still succeed (exit code 0) but save the HTML error page into awscli.zip. This obscures the root cause, as the script will confusingly fail at the unzip step instead.
Click to see review metadata
{
"confidence": 0.95,
"severity": "minor",
"suggested_fix": "Add the `-f` flag to `curl` (e.g., `curl -fL`) so that it exits with a non-zero status code on HTTP errors.",
"verification_questions": [
"What is the current behavior if `awscli_arch` evaluates to an invalid architecture resulting in a 404?",
"Would debugging be easier if the script failed immediately at the download step with an explicit HTTP error?"
]
}| if [ -f "$AWS_CLI_BIN" ]; then | ||
| # Reuse cached layer if binary exists and was built for the same architecture. | ||
| # Reading cached_arch from the TOML written by a previous build is reliable | ||
| # because the lifecycle restores it from the cache image before bin/build runs. |
There was a problem hiding this comment.
Parsing TOML with grep and sed can be fragile. grep 'architecture' might accidentally match descriptions or comments containing the word 'architecture' before the actual key. Also, the sed regex strictly requires double quotes; if single quotes are used, it will fail to capture the value and fallback to the entire matched line, causing unnecessary cache misses.
Click to see review metadata
{
"confidence": 0.85,
"severity": "minor",
"suggested_fix": "Consider anchoring the grep search (e.g., `grep -E '^\\s*architecture\\s*='`) to match only the key, and relaxing the `sed` regex to handle optional single quotes, or use a tool like `yj` or `jq` if available.",
"verification_questions": [
"Could the TOML file contain the word 'architecture' in preceding lines, such as descriptions or comments?",
"Does the buildpack lifecycle ever write TOML values using single quotes instead of double quotes?"
]
}Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
unzip -q awscli.zip(without-o) prompts to overwrite existing files when the cache is restored. In non-interactive builds it reads EOF, treats as "None", exits non-zero → "Failed to extract AWS CLI"-otounzipso it overwrites without prompting$AWS_CLI_BINexists, runs, and was built for the same architecture (checked via cached TOML metadata), skip the 64MB download + install entirely--updatetoaws/installfor the case where theaws-cli/directory already exists, and removes the zip after install to clean upBehavior change
Test plan
"Reusing cached AWS CLI for arm64"on second build🤖 Generated with Claude Code