Skip to content

Conversation

@epappas
Copy link
Member

@epappas epappas commented Dec 10, 2025

Summary by CodeRabbit

  • New Features

    • Comprehensive input validation framework for names, IDs, namespaces, descriptions, image refs, env var names, ports, and hostnames.
  • Bug Fixes

    • Stronger mount-path validation to prevent traversal/injection.
    • Tightened Kubernetes RBAC for secrets and removed unnecessary admin port exposure.
  • Security

    • Mandatory vulnerability scanning in CI with critical/high fail-on-findings and ignore-list support.
    • Added HTTP security headers, timeouts, container hardening, and health probes.

✏️ Tip: You can customize this high-level summary in your review settings.

* Add comprehensive input validation functions for user-provided data
* Implement validate_name, validate_user_id, validate_namespace
* Add validate_description, validate_image_reference, validate_env_var_name
* Include validate_port and validate_hostname with DNS compliance
* Prevent shell metacharacter injection in image references
* Export validation functions from mod.rs
* Add exhaustive unit tests for all validation functions
* Add validate_mount_path function to reject malicious paths
* Require absolute paths, reject empty/relative paths
* Block path traversal via '..' detection
* Restrict to safe characters (alphanumeric, dash, underscore, dot, slash)
* Enforce maximum path length of 4096 characters
* Add single quotes around mount path in shell script for defense
* Add CommandWrapError::InvalidMountPath error variant
* Add comprehensive security tests for mount path validation
* Add tests for command injection via user args (shlex escaping)
* Include tests for semicolon, pipe, backtick, newline injection
* Limit secrets access to named resources only (basilica-r2-credentials, etc.)
* Split configmap permissions: read-only for config, write for dynamic routes
* Add namespace get/list/watch/create permission for operator
* Restrict tenant role secret access to storage credential secrets
* Separate create verb (cannot use resourceNames) from update/patch
* Apply principle of least privilege to reduce blast radius
* Bind admin interface to localhost only (127.0.0.1)
* Add HTTP/1.1 security settings to prevent request smuggling
* Reject headers with underscores to mitigate header injection
* Set stream/request timeouts and max headers count
* Add security response headers (X-Content-Type-Options, X-Frame-Options)
* Enable pod-level security context with runAsNonRoot (UID 101)
* Set container-level securityContext with read-only root filesystem
* Drop all capabilities and prevent privilege escalation
* Add emptyDir volume for writable tmp directory
* Remove admin port from external exposure
* Add liveness and readiness probes
* Add Trivy scanner after each Docker image build step
* Scan validator, miner, api, operator, billing, payments, cli images
* Configure to fail on CRITICAL and HIGH severity vulnerabilities
* Use .trivyignore.yaml for vulnerability exceptions
* Add scanning to both test builds and production image builds
* Output results in table format for readability
@epappas epappas self-assigned this Dec 10, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 10, 2025

Walkthrough

Adds Trivy vulnerability scans to CI, a new input validation module with multiple validators and tests, mount-path validation and secure command wrapping in the operator, tightened RBAC rules for secrets/namespaces, Envoy timeout/security and pod hardening, and a .trivyignore.yaml file.

Changes

Cohort / File(s) Summary
CI/CD Security Scanning
.github/workflows/ci.yml
Adds Trivy vulnerability scanning steps to multiple CI jobs and the manual build-images path; scans built images (validator, miner, api, operator, billing, payments, cli and per-image variants), uses table format, severity CRITICAL,HIGH, exit-code 1, and .trivyignore.yaml to ignore listed items.
Trivy Ignore Configuration
.trivyignore.yaml
New Trivy ignore file with vulnerabilities list (empty) and explanatory comments/examples for ignore entries (id, statement, expires).
Input Validation Module
crates/basilica-common/src/validation/input.rs
crates/basilica-common/src/validation/mod.rs
Adds input validation module and re-exports; implements public validators: validate_name, validate_user_id, validate_namespace, validate_description, validate_image_reference, validate_env_var_name, validate_port, validate_hostname; returns detailed ValidationError variants and includes comprehensive tests.
Operator Mount Path & Command Wrapping
crates/basilica-operator/src/controllers/storage_utils.rs
Adds InvalidMountPath(String) error variant and validate_mount_path(path: &str) function; wrap_command_with_fuse_wait now validates mount paths and quotes interpolated mount_path in generated wait script; extensive security tests for path validation and shell-escaping edge cases added.
RBAC: Operator ClusterRole
orchestrator/k8s/core/rbac/operator-rbac.yaml
Removes broad create/list/watch/patch/update secrets rule; adds read-only secrets:get for specific resourceNames (basilica-r2-credentials, basilica-s3-credentials, basilica-gcs-credentials); adds namespaces rule (get,list,watch,create); removes ConfigMap write rule previously used for Envoy routing.
RBAC: Tenant Role
orchestrator/k8s/core/rbac/operator-tenant-role.yaml
Replaces broad secrets verbs with: read-only get,watch limited by resourceNames for the three credentials; a create rule for secrets to allow copying into tenant namespaces; and a patch,update rule scoped to the three credential resourceNames.
Envoy Config: Security & Timeouts
orchestrator/k8s/networking/envoy/envoy-configmap.yaml
Adds HTTP connection manager settings: HTTP/1.1 restrictions (disable HTTP/1.0, disallow absolute URLs), common protocol options (idle_timeout, max headers, header strictness, max stream duration), stream/request timeouts, and local_reply_config to inject security headers for certain status codes; binds Admin interface to localhost.
Envoy Deployment: Pod/Container Hardening
orchestrator/k8s/networking/envoy/envoy-deployment.yaml
Adds pod and container securityContext (runAsNonRoot, runAsUser/group, fsGroup, seccompProfile, readOnlyRootFilesystem, no privilege escalation, drop ALL caps), removes admin port exposure, adds liveness/readiness probes on /health:8080, adds EmptyDir tmp volume mounted at /tmp for writable admin logs.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review focus areas:
    • CI changes: confirm each Trivy step references the correct image names and that failing on CRITICAL/HIGH is intended.
    • Validation module: verify character sets, length bounds, and ValidationError variants match product requirements and internationalization needs.
    • Mount path validation and shell escaping: ensure the validation rules cover legitimate mount patterns and the generated shell logic is safe against injection.
    • RBAC manifests: verify resourceNames and verb splits don't inadvertently block required workflows (secret creation/updates and namespace creation).
    • Envoy config & Deployment: ensure timeouts, header injections, admin binding, and securityContext settings align with operational constraints and monitoring/health tooling.

Poem

🐰 I scanned each image with a vigilant eye,
I fenced off bad paths where mischief might lie,
Names and ports checked, secrets given fine-grain,
Envoy tucked in, probes to guard every lane—
Hopping home safe, I nibble a carrot of code. 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main objective of the PR: security hardening for k3s. The changes span input validation, RBAC rules, Envoy security configs, deployment security contexts, and vulnerability scanning—all directly aligned with security hardening.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/k3s-security-hardening

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (10)
orchestrator/k8s/networking/envoy/envoy-configmap.yaml (2)

49-51: Request timeout value may be overly permissive.

The request_timeout of 3600s (1 hour) is very lenient. Consider whether this aligns with your use case or should be reduced to a more typical value (e.g., 30s or 120s) to defend against slowloris attacks and resource exhaustion.


72-78: Consider narrowing the status code filter scope.

The status code filter with GE 100 applies security headers to all HTTP responses, including 1xx informational responses (e.g., 100 Continue). Consider using GE 200 or GE 400 to target only responses that should carry security headers (success, client errors, server errors), excluding informational responses.

orchestrator/k8s/networking/envoy/envoy-deployment.yaml (1)

81-92: Consider explicit probe timeout and failure thresholds.

The health probes lack explicit timeoutSeconds and failureThreshold values. The defaults (timeout: 1s, failureThreshold: 3) may be too tight for a proxy under load. Consider adding explicit values (e.g., timeoutSeconds: 5, failureThreshold: 3) to make failure detection more resilient.

crates/basilica-common/src/validation/input.rs (1)

105-161: Clarify the doc comment to match implementation.

The doc comment on line 110 states "Start with a letter" but the implementation specifically requires a lowercase letter (line 130). Consider updating the comment for precision.

Apply this diff:

 /// Namespace names must:
 /// - Be 1-63 characters
 /// - Contain only lowercase alphanumeric or dash
-/// - Start with a letter
+/// - Start with a lowercase letter
 /// - Not end with a dash
crates/basilica-operator/src/controllers/storage_utils.rs (6)

4-47: Strict mount path validation is good for security; double‑check it aligns with all expected real paths

The validator gives you a very tight surface (absolute, length‑bounded, ASCII only, [/._-], and no ..), which is great for injection resistance. The trade‑off is that it will now reject anything with other punctuation (e.g. ~, +, @, :) or non‑ASCII, and any path containing .. even when it’s not used for traversal (e.g. /foo..bar).

Before this ships broadly, I’d suggest reviewing where mount_path originates (PVC/CSI mount paths, hostPaths, etc.) and confirming that this restricted charset and the blanket contains("..") rule won’t reject legitimate paths already in use. If needed, it might be safer to (a) document the exact allowed pattern and/or (b) relax the character set slightly while still excluding shell‑significant characters.


57-90: wait_script interpolation is safe under current rules; consider future‑proofing if validation is relaxed later

Given validate_mount_path only allows ASCII alnum plus /._- and disallows quotes and $, interpolating {mount_path} directly into single‑quoted contexts (echo '…{mount_path}…', [ -f '{mount_path}/.fuse_ready' ]) is safe.

If you ever broaden the allowed character set (e.g., to permit spaces or quotes), this section will become a latent injection sink. It might be worth either (a) adding a short comment explicitly tying the allowed charset to these quoting assumptions, or (b) routing mount_path through a minimal shell‑safe quoting helper so this block doesn’t depend on the validator’s exact rules.


204-356: Mount path validation tests are solid; consider a couple of boundary/behavior clarifications

The negative tests cover a good spread of injection vectors (relative paths, traversal, shell metacharacters, newlines) and the positive cases with -, _, and . exercise the intended “safe” paths.

Two small follow‑ups you might consider:

  • Add an explicit max‑length test (length exactly 4096 and length 4097) to lock in the intended boundary behavior.
  • Add a test documenting what you expect for benign double‑dot usage like /data..backup so future readers know that contains("..") is intentionally rejecting that too (or adjust the validation if that should be allowed).

363-567: Command/arg injection tests are excellent but quite dependent on shlex’s exact quoting strategy

The coverage of semicolons, $(), backticks, pipes, redirects, newlines, quotes, env vars, and brace expansion is great from a security perspective. The main fragility I see is that several assertions bake in the exact string representation produced by shlex::try_join (e.g., expecting specific single vs. double quotes or escaped quotes), which might change between shlex versions even if the underlying security property (single, non‑splitting argument) is preserved.

To make these tests more resilient while still security‑focused, you could:

  • Prefer properties like “payload appears exactly once and only within a quoted segment” and “there is no unquoted ;/|/\n following exec …” over checking literal quoting style.
  • Optionally, factor out some helper(s) that, for a given payload, checks that there is no unquoted metacharacter in the exec line, so future changes to quoting don’t require touching dozens of tests.

Because this depends on the exact behavior of shlex::try_join, it’d be good to confirm against the crate’s docs or current behavior before tightening expectations further.


573-585: Empty command vector semantics may be surprising; consider treating it as invalid input

test_empty_command_array currently asserts that Some(vec![]) succeeds and yields a script containing exec . In many shells exec without a command is effectively a no‑op, but the intent at the API level is unclear: a caller supplying Some(vec![]) arguably provided an invalid “command”.

You might want to either:

  • Explicitly treat Some(vec![]) as invalid and return a ShellEscapeError/new error variant, or
  • Document that Some(vec![]) is accepted and intentionally results in “just wait, then fall through with a bare exec”.

Right now the behavior is a bit magical and not obviously useful to callers.


587-612: Edge‑case tests are good; the “unicode” test isn’t actually exercising non‑ASCII

The path‑with‑spaces test looks good and validates the shlex behavior you care about. For test_unicode_in_args, though, the argument "Hello" is pure ASCII, so it doesn’t actually stress unicode handling.

If you want this to cover unicode properly, consider changing the payload to something like "Héllø", "こんにちは", or an emoji, and then asserting that it passes through as a single argument (without needing to depend heavily on exact quoting form).

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b095cf and a510f59.

📒 Files selected for processing (8)
  • .github/workflows/ci.yml (13 hunks)
  • crates/basilica-common/src/validation/input.rs (1 hunks)
  • crates/basilica-common/src/validation/mod.rs (1 hunks)
  • crates/basilica-operator/src/controllers/storage_utils.rs (3 hunks)
  • orchestrator/k8s/core/rbac/operator-rbac.yaml (2 hunks)
  • orchestrator/k8s/core/rbac/operator-tenant-role.yaml (1 hunks)
  • orchestrator/k8s/networking/envoy/envoy-configmap.yaml (2 hunks)
  • orchestrator/k8s/networking/envoy/envoy-deployment.yaml (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-13T15:34:46.544Z
Learnt from: epappas
Repo: one-covenant/basilica PR: 184
File: crates/basilica-validator/src/rental/billing.rs:129-130
Timestamp: 2025-10-13T15:34:46.544Z
Learning: In the basilica-validator crate, rental_id is a String type (not UUID) throughout the system - in RentalInfo struct, database rentals.id column, and all rental operations. The persistence layer methods should accept &str for rental_id parameters, not &Uuid, since they immediately convert to String for SQL binding anyway.

Applied to files:

  • crates/basilica-common/src/validation/input.rs
🧬 Code graph analysis (2)
crates/basilica-common/src/validation/mod.rs (1)
crates/basilica-common/src/validation/input.rs (8)
  • validate_description (168-190)
  • validate_env_var_name (248-283)
  • validate_hostname (304-355)
  • validate_image_reference (199-241)
  • validate_name (16-59)
  • validate_namespace (112-161)
  • validate_port (286-296)
  • validate_user_id (67-103)
crates/basilica-common/src/validation/input.rs (4)
crates/basilica-sdk-python/python/basilica/exceptions.py (1)
  • ValidationError (122-153)
crates/basilica-sdk-python/python/basilica/deployment.py (1)
  • user_id (199-201)
crates/basilica-common/src/utils/docker_validation.rs (1)
  • image (38-39)
crates/basilica-sdk-python/python/basilica/_basilica.pyi (1)
  • port (234-234)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: test-python-sdk (3.13)
  • GitHub Check: test-python-sdk (3.12)
  • GitHub Check: test-python-sdk (3.10)
  • GitHub Check: test-python-sdk (3.11)
  • GitHub Check: build-payments (stable)
  • GitHub Check: build-billing (stable)
  • GitHub Check: build-cli (stable)
  • GitHub Check: build-validator (stable)
  • GitHub Check: build-operator (stable)
  • GitHub Check: build-api (stable)
  • GitHub Check: build-miner (stable)
🔇 Additional comments (16)
.github/workflows/ci.yml (2)

242-249: Pin Trivy action to a specific version instead of @master.

Using @master causes the workflow to pull the latest unvetted changes from the action repository on each run. This introduces reproducibility issues and potential security risks. Pinning to a specific release version ensures consistent behavior and allows for intentional updates.

Search for the latest stable release of aquasecurity/trivy-action and pin all instances to a specific version (e.g., @v0.X.Y format). You can find available versions at: https://github.com/aquasecurity/trivy-action/releases

Once you identify the appropriate version, replace all @master references:

-        uses: aquasecurity/trivy-action@master
+        uses: aquasecurity/[email protected]

This change applies to all Trivy scanning steps in the build jobs and the build-images job.

Also applies to: 304-311, 366-373, 422-429, 569-576, 679-686


242-249: Comprehensive Trivy vulnerability scanning adds strong security posture.

The addition of mandatory Trivy scanning across all container image builds is excellent security hardening. The configuration correctly:

  • Scans for CRITICAL and HIGH severity vulnerabilities
  • Fails the build (exit-code: 1) on findings, enforcing preventative security
  • Uses table format for clear vulnerability reports
  • Applies consistent scanning across all services (validator, miner, api, operator, billing, payments, cli)

This ensures no images with known critical/high vulnerabilities reach deployment stages.

Also applies to: 304-311, 366-373, 422-429, 569-576, 679-686, 833-840, 857-864, 881-888, 905-912, 927-934, 949-956, 972-979

orchestrator/k8s/networking/envoy/envoy-configmap.yaml (1)

93-93: Admin interface localhost binding is correct.

Restricting the admin interface to 127.0.0.1 effectively prevents external access to sensitive Envoy admin endpoints. This aligns well with removing admin port exposure from the deployment.

Also applies to: 98-98

orchestrator/k8s/networking/envoy/envoy-deployment.yaml (3)

57-57: Container security context is appropriately hardened.

The combination of allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, and dropping all capabilities provides strong defense-in-depth. The /tmp volume mount at line 79-80 correctly supports the read-only root filesystem while preserving writable storage for logs.

Also applies to: 59-64


72-80: Volume configuration is appropriate for security posture.

The emptyDir volume for /tmp correctly supports the read-only root filesystem while providing ephemeral writable storage for admin logs. The volume lifecycle is properly tied to the pod.

Also applies to: 108-110


18-25: No action required—pod securityContext is correctly configured.

The Envoy image envoyproxy/envoy:v1.30 uses UID=101 and GID=101 as its default non-root user. The pod-level securityContext with runAsUser: 101, runAsGroup: 101, and fsGroup: 101 is aligned with the image's default and requires no changes.

Likely an incorrect or invalid review comment.

orchestrator/k8s/core/rbac/operator-rbac.yaml (1)

25-27: Namespace access rules are appropriately scoped for operator responsibilities.

The unscoped create verb for namespaces is necessary for an operator managing tenant provisioning. This is an appropriate use case where ClusterRole-level access is justified.

orchestrator/k8s/core/rbac/operator-tenant-role.yaml (1)

18-43: Secrets RBAC rules correctly implement least-privilege with appropriate scoping.

The split across read (lines 20-27), create (lines 31-33), and patch/update (lines 37-43) rules properly implements the least-privilege principle. The constraint note about resourceNames not working with create (lines 30-31) is correctly handled. The four credentials (basilica-r2-credentials, basilica-s3-credentials, basilica-gcs-credentials, basilica-storage-config) are appropriate for tenant namespace operations and intentionally differ from the cluster-wide operator-rbac.yaml, which manages platform-level configs like envoy settings.

crates/basilica-common/src/validation/mod.rs (1)

1-9: LGTM!

Clean module setup that exposes the new input validators through the public API. The structure follows the existing pattern for other validation modules.

crates/basilica-common/src/validation/input.rs (7)

16-59: LGTM!

The validation logic correctly enforces length limits, blocks control characters, and prevents leading/trailing whitespace. The allowance of spaces and dots in names is appropriate for general resource names like API keys, though note this wouldn't be suitable for Kubernetes resource names.


67-103: LGTM!

The validator appropriately handles federated identity formats (e.g., "github|12345", "user@domain") by allowing pipe and at-sign characters while maintaining security through control character blocking and length limits.


168-190: LGTM!

The validation appropriately allows newlines, tabs, and carriage returns for multi-line descriptions while blocking other potentially dangerous control characters.


248-283: LGTM!

Correctly implements POSIX environment variable naming conventions (must start with letter or underscore, contain only alphanumeric and underscore).


286-296: LGTM!

The validator ensures the port is non-zero and within the valid range (1-65535). Not restricting privileged ports (1-1023) is appropriate here, as that's typically an application-level policy decision.


304-355: LGTM!

Correctly implements DNS hostname validation per RFC 1035 standards (max 253 chars total, max 63 chars per label, no leading/trailing hyphens in labels, alphanumeric and hyphens only).


357-569: LGTM!

Comprehensive test coverage for all validators with appropriate focus on security scenarios (shell injection attempts, control characters, length limits). The tests cover both valid inputs and various invalid edge cases.

- Add .trivyignore.yaml for Trivy vulnerability scanning configuration
- Update validate_image_reference() documentation to accurately describe
  injection prevention validation behavior (not strict Docker format)
- Remove non-existent basilica-operator-config from Secrets RBAC rules
- Remove all ConfigMap RBAC rules (operator never accesses ConfigMaps;
  Envoy config is managed by basilica-api service)
- Remove non-existent basilica-storage-config from tenant role RBAC
- Apply rustfmt formatting changes
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
crates/basilica-operator/src/controllers/storage_utils.rs (2)

40-45: Consider refining the path traversal check.

The current contains("..") check would also reject legitimate paths containing double dots in filenames (e.g., /data/file..bak), though this is rare for mount paths. For stricter precision, consider checking for actual traversal patterns:

-    // Prevent path traversal
-    if path.contains("..") {
+    // Prevent path traversal - check for ".." as a path component
+    if path.split('/').any(|component| component == "..") {

This would allow file..txt while still rejecting /../ traversal attempts. However, the current stricter approach is also defensible for mount paths.


575-582: Consider validating non-empty command when provided.

An empty command vector (Some(vec![])) results in exec with nothing after it, which in shell simply continues execution (exec with no args returns success). This might mask bugs where a caller passes Some(vec![]) instead of None. Consider either:

  1. Treating Some(vec![]) the same as None
  2. Returning an error for empty command vectors
// Option 1: Normalize at the start of wrap_command_with_fuse_wait
let user_command = user_command.filter(|v| !v.is_empty());
crates/basilica-common/src/validation/input.rs (2)

9-61: Docstring for validate_name should mention dot (.) as an allowed character.

Implementation and tests allow . in names, and the error message explicitly lists it, but the doc bullet only mentions alphanumeric, dash, underscore, and space. To avoid confusion, update the doc to match behavior (or tighten the implementation, but the existing usage suggests . is intended to be allowed).

-/// - Contain only alphanumeric, dash, underscore, or space
+/// - Contain only alphanumeric, dash, underscore, space, or dot

Also applies to: 369-408


306-363: Hostname validation is correct; consider an extra test for invalid characters.

The implementation matches the documented DNS-style rules (label length, hyphen placement, character set) and the tests cover empty labels, hyphen edge cases, and label length. To lock in the invalid-character behavior, you might add a test for something like an underscore.

#[test]
fn test_validate_hostname_invalid_chars() {
    assert!(validate_hostname("invalid_host").is_err());
}

Also applies to: 550-577

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a510f59 and 50fab6c.

📒 Files selected for processing (5)
  • .trivyignore.yaml (1 hunks)
  • crates/basilica-common/src/validation/input.rs (1 hunks)
  • crates/basilica-operator/src/controllers/storage_utils.rs (3 hunks)
  • orchestrator/k8s/core/rbac/operator-rbac.yaml (1 hunks)
  • orchestrator/k8s/core/rbac/operator-tenant-role.yaml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .trivyignore.yaml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-13T15:34:46.544Z
Learnt from: epappas
Repo: one-covenant/basilica PR: 184
File: crates/basilica-validator/src/rental/billing.rs:129-130
Timestamp: 2025-10-13T15:34:46.544Z
Learning: In the basilica-validator crate, rental_id is a String type (not UUID) throughout the system - in RentalInfo struct, database rentals.id column, and all rental operations. The persistence layer methods should accept &str for rental_id parameters, not &Uuid, since they immediately convert to String for SQL binding anyway.

Applied to files:

  • crates/basilica-common/src/validation/input.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: build-payments (stable)
  • GitHub Check: test-python-sdk (3.13)
  • GitHub Check: test-python-sdk (3.10)
  • GitHub Check: test-python-sdk (3.12)
  • GitHub Check: test-python-sdk (3.11)
  • GitHub Check: build-validator (stable)
  • GitHub Check: build-billing (stable)
  • GitHub Check: build-miner (stable)
  • GitHub Check: build-api (stable)
  • GitHub Check: build-cli (stable)
  • GitHub Check: build-operator (stable)
  • GitHub Check: quick-checks
🔇 Additional comments (14)
orchestrator/k8s/core/rbac/operator-tenant-role.yaml (3)

18-27: ✓ Proper least-privilege read access with explicit resource names.

The read-only rule correctly restricts access to three named storage credentials using resourceNames, preventing access to other secrets in the namespace. Using resourceNames whenever possible is a Kubernetes best practice for restricting access to specific named instances of a resource.


28-33: ✓ Create rule correctly separates from read and update operations.

Since resourceNames cannot be used with create verbs (because resource names are not present in create API requests), the rule must not include resourceNames. The separate rule here properly implements this requirement. The comment correctly explains the limitation.


34-42: ✓ Update operations properly constrained to named credentials.

The dedicated patch/update rule correctly restricts modifications to only the three storage credentials, preventing unintended updates to other secrets in the namespace.

orchestrator/k8s/core/rbac/operator-rbac.yaml (3)

25-27: ✓ Namespace create permission appropriately scoped in ClusterRole.

The new namespace rule grants get, list, watch, and create verbs at cluster scope, which is appropriate for an operator that provisions tenant namespaces. If an Operator needs to deploy resources in any namespace, it requires a cluster role with permissions to create those resources. This implementation is standard for multi-tenant operators.


29-37: ✓ Cluster-wide read-only access to storage credentials.

The read-only rule with explicit resourceNames correctly restricts the operator to reading only the three storage credentials from the basilica-system namespace for distribution to tenant namespaces. This is more restrictive than the tenant-role variant (which also permits watch), reflecting the cluster-scoped nature of the operator's credential management responsibility.

Verify that the three referenced credentials (basilica-r2-credentials, basilica-s3-credentials, basilica-gcs-credentials) are consistently named across:

  1. Any Kubernetes secret manifests in the codebase
  2. Any Helm values or configuration that creates or references these secrets
  3. The tenant-role variant in this same PR (to confirm matching names)

This ensures the RBAC rules will function as intended when the operator attempts to read and copy these secrets to tenant namespaces.


1-74: All problematic ConfigMap RBAC rules have been successfully removed.

The file no longer contains any of the three flagged rules from the previous review:

  • ✓ No rule for basilica-operator-config ConfigMap
  • ✓ No rules for basilica-envoy-routes or basilica-envoy-clusters ConfigMaps
  • ✓ No cluster-level ConfigMap write permissions

The current operator-rbac.yaml contains only legitimate, scoped RBAC rules for core resources, secrets (read-only with resource restrictions), PodDisruptionBudgets, batch jobs, networking policies, Basilica CRDs, and deployments.

crates/basilica-operator/src/controllers/storage_utils.rs (4)

4-11: LGTM!

The error type is well-structured with thiserror and provides clear, descriptive error messages for both failure modes.


62-90: Good security design with defense-in-depth.

The validation-first approach with validate_mount_path at the start ensures early failure. The comment on lines 71-73 correctly explains why direct interpolation is safe after validation. Using single quotes on line 77 provides additional defense-in-depth.


204-538: Excellent security test coverage.

The tests comprehensively cover:

  • Mount path validation (empty, relative, traversal, various injection patterns)
  • Command/args escaping via shlex (semicolons, substitutions, pipes, redirects, quotes, null bytes)
  • Real-world attack payloads

Well-organized with clear section headers.


96-99: Shlex version correctly supports try_join API.

The code properly uses shlex::try_join with the pinned dependency shlex = "1.3" in Cargo.toml, which is well above the minimum version 1.2.0 where this fallible variant was introduced. No changes needed.

crates/basilica-common/src/validation/input.rs (4)

63-105: User ID validation logic and tests are consistent and conservative.

The constraints (length, limited ASCII set, and explicit control-character rejection) align with the docs and tests, and are appropriately strict for identifiers that may flow into logs or labels.

Also applies to: 409-434


197-249: Image reference validation and docs now align and look solid for injection hardening.

The function clearly scopes itself to “safe for use in shell commands”, bans common metacharacters, restricts to a tight ASCII set, and explicitly defers strict OCI parsing to validate_docker_image(). Tests cover typical valid forms and several injection patterns; behavior matches the documentation.

Also applies to: 493-513


251-291: Env var name validation matches POSIX-style conventions and is well tested.

The rules (leading letter/underscore, alphanumeric/underscore only, reasonable max length) are standard, and the tests cover both valid names and typical invalid patterns (digit-start, hyphen, dot, space).

Also applies to: 515-535


293-304: Port validation is minimal but correct given u16 typing.

Rejecting only 0 and otherwise accepting the full 1–65535 range is consistent with typical TCP/UDP constraints, and the tests exercise both valid bounds and the zero case.

Also applies to: 536-548

@epappas epappas merged commit 50fab6c into main Dec 10, 2025
10 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants