-
Notifications
You must be signed in to change notification settings - Fork 16
fix!: safe rpm lock handling + AWS provider bump #82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThis PR updates infrastructure-as-code documentation and shell scripts to reflect newer AWS provider versions (>=6.0), bumps module dependencies (ssm_policy 2.0.1→2.0.2, tailscale_subnet_router 1.4.0→1.8.0), adds input variables for encryption and SSL enforcement, and significantly refactors the EC2 userdata script with improved error handling, centralized configuration, OS guards, and modular retry logic. Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
userdata.sh.tmpl (1)
51-77: Core RPM lock handling fix addresses the race condition effectively.The
wait_pkg_lockfunction properly waits on/var/lib/rpm/.rpm.lockusingfuserand checks for activednf/rpmprocesses—this aligns with the PR objective to avoid deleting the lock file.Minor concern with the fallback logic (lines 60-61): When
fuseris unavailable, checking file existence (-e) doesn't indicate the lock is held—the file may exist but be unlocked. Consider usinglsofas an alternative, or accept this as a conservative "better safe than sorry" fallback.Also, the PR description mentions checking for
yum, which isn't in thepgreplist. On AL2023,yumtypically symlinks todnf, so this is likely fine.Optional: Add lsof fallback if preferred
if command -v fuser >/dev/null 2>&1; then if fuser /var/lib/rpm/.rpm.lock >/dev/null 2>&1; then busy=true; fi + elif command -v lsof >/dev/null 2>&1; then + if lsof /var/lib/rpm/.rpm.lock >/dev/null 2>&1; then busy=true; fi else # Best-effort if fuser is missing if [ -e /var/lib/rpm/.rpm.lock ]; then busy=true; fi fi
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
README.mduserdata.sh.tmplversions.tf
🧰 Additional context used
📓 Path-based instructions (1)
**/*.tf
⚙️ CodeRabbit configuration file
**/*.tf: You're a Terraform expert who has thoroughly studied all the documentation from Hashicorp https://developer.hashicorp.com/terraform/docs and OpenTofu https://opentofu.org/docs/.
You have a strong grasp of Terraform syntax and prioritize providing accurate and insightful code suggestions.
As a fan of the Cloud Posse / SweetOps ecosystem, you incorporate many of their best practices https://docs.cloudposse.com/best-practices/terraform/ while balancing them with general Terraform guidelines.
Files:
versions.tf
🧠 Learnings (3)
📚 Learning: 2024-11-21T13:30:01.588Z
Learnt from: gberenice
Repo: masterpointio/terraform-aws-tailscale PR: 41
File: main.tf:0-0
Timestamp: 2024-11-21T13:30:01.588Z
Learning: In this Terraform module (`main.tf`), read permissions (`ssm:GetParameter`) for SSM parameters are managed by the SSM Agent module (`masterpointio/ssm-agent/aws`), so adding `ssm:GetParameter` permissions in the custom `ssm_policy` module is unnecessary.
Applied to files:
README.md
📚 Learning: 2024-11-18T13:00:36.189Z
Learnt from: gberenice
Repo: masterpointio/terraform-spacelift-aws-integrations PR: 2
File: versions.tf:5-8
Timestamp: 2024-11-18T13:00:36.189Z
Learning: If a Terraform module does not include any AWS resources, there's no need to declare the AWS provider.
Applied to files:
README.mdversions.tf
📚 Learning: 2024-10-30T16:38:33.362Z
Learnt from: gberenice
Repo: masterpointio/terraform-spacelift-automation PR: 3
File: modules/spacelift-automation/variables.tf:0-0
Timestamp: 2024-10-30T16:38:33.362Z
Learning: Spacelift requires a specific Terraform version number; it does not support version constraints like "~> 1.7.0".
Applied to files:
README.mdversions.tf
🔇 Additional comments (8)
README.md (2)
160-176: Documentation updates are consistent with code changes.The Requirements, Providers, and Modules sections correctly reflect:
- AWS provider
>= 6.0(matchesversions.tf)- Module version bumps for
ssm_policy(2.0.2) andtailscale_subnet_router(1.8.0)These appear to be auto-generated by the pre-commit terraform-docs hook.
196-197: New security-focused inputs are a good addition.The
allow_encrypted_uploads_onlyandallow_ssl_requests_onlyinputs enhance the S3 bucket security posture for session logging, supporting encryption and SSL enforcement requirements.userdata.sh.tmpl (5)
1-6: Solid shell script initialization with strict error handling.The
set -Eeuo pipefailis the right choice for infrastructure scripts—it ensures errors are caught early. The logging setup with tee and logger provides good observability.
13-25: Robust OS compatibility guard with sensible fallback.Good defensive practice to verify the script runs only on Amazon Linux 2023. The primary
/etc/os-releasecheck with a fallback to/etc/system-releasehandles edge cases well.
79-89: Clean retry helper with exponential backoff.The
retryfunction with exponential backoff (up to ~62s total) combined withwait_pkg_lockprovides robust package installation. Thepkgwrapper keeps the installation commands concise.
96-104: Tailscale installation flow is well-structured.The sequence—install prerequisites, add repo, refresh cache, install—follows best practices. Using
|| trueformakecacheis appropriate since cache refresh shouldn't block installation.
117-135: Tailscale startup sequence handles readiness and secrets appropriately.Good practices:
- Readiness wait loop before
tailscale up(40s max)set +xto prevent authkey exposure in logsNote on line 134: The
|| { echo "WARNING..."; }makestailscale upfailures non-fatal. This is reasonable for resilience (the instance stays functional for debugging), but transient network issues during provisioning could leave the router unconnected. Consider whether this aligns with your operational preferences—some teams prefer a hard failure here.versions.tf (1)
7-7: AWS provider constraint is appropriate, but the issue description needs correction.Updating to AWS provider
>= 6.0is a good move. In v6.0, thedata.aws_regiondata source deprecated itsnameattribute in favor ofregion. If Issue #80 involvesdata.aws_regionattribute access, ensure any references have been updated from.nameto.regionto remain compatible with v6+.
Gowiem
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gberenice question on provider bump!
| aws = { | ||
| source = "hashicorp/aws" | ||
| version = ">= 5.0" | ||
| version = ">= 6.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gberenice you mentioned that the AWS Provider constraint wasn't updated properly -- when did that happen and what was the usage of AWS that changed that required the bump? Technically this is should be a major provider rev I believe, so I want to make sure we're doing this right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Gowiem This happened within PR #76, and caused this recently reported issue: This object has no argument, nested block, or exported attribute named "region".
PR title is already configured for a major bump.
🤖 I have created a release *beep* *boop* --- ## [2.0.0](v1.11.0...v2.0.0) (2026-01-06) ### ⚠ BREAKING CHANGES * safe rpm lock handling + AWS provider bump ([#82](#82)) ### Bug Fixes * safe rpm lock handling + AWS provider bump ([#82](#82)) ([ce80388](ce80388)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: masterpointbot[bot] <177651640+masterpointbot[bot]@users.noreply.github.com> Co-authored-by: Veronika Gnilitska <[email protected]>
what
why
/var/lib/rpm/.rpm.lockcan corrupt the RPM DB. Also, other holders (yum, rpm, or any process holding the lock) aren’t caught.references
Summary by CodeRabbit
New Features
Chores
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.