-
Notifications
You must be signed in to change notification settings - Fork 7
feat(aggregator): managed machines sync #288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds end-to-end VIP rental support: new VIP config, CSV/S3/Mock data sources, in-memory cache, DB-backed rental operations, a reconciliation poller with optional billing, server wiring, CLI/SDK flags, and a DB migration for VIP rentals. Changes
Sequence Diagram(s)sequenceDiagram
participant Server as Server Startup
participant DataSource as DataSource (CSV/S3/Mock)
participant Cache as VipCache
participant DB as PostgreSQL
participant Poller as VipPoller
participant Billing as BillingClient (optional)
Server->>DataSource: init (config or mock)
Server->>Cache: rebuild_from_db()
Cache->>DB: query active VIP rentals
DB-->>Cache: VipRental rows
Server->>Poller: new(config, DataSource, Cache, DB, markup, billing?)
Server->>Poller: poll_once() [initial]
Poller->>DataSource: fetch_vip_rows()
DataSource-->>Poller: Vec<VipCsvRow>
loop per row
Poller->>Poller: validate_row()
alt valid
Poller->>Cache: get(vip_id)
alt cache hit
Poller->>DB: update_vip_rental_metadata() [if needed]
Poller->>Cache: insert/update
else cache miss
Poller->>DB: get_vip_rental_by_machine_id()
alt db found
Poller->>Cache: insert(relinked)
else new rental
Poller->>Poller: prepare_vip_rental()
Poller->>Billing: register_with_billing() [optional]
Poller->>DB: insert_vip_rental()
Poller->>Cache: insert(new)
end
end
else invalid
Poller->>Poller: skip row
end
end
loop stale rentals
Poller->>Billing: finalize_rental_billing() [best-effort]
Poller->>DB: close_vip_rental()
Poller->>Cache: remove(vip_id)
end
Poller-->>Server: PollStats
Server->>Server: spawn VipPollerTask (periodic)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Nitpick comments (13)
crates/basilica-api/migrations/017_vip_rentals.sql (1)
1-18: ConsiderCREATE INDEX CONCURRENTLY(+-- no-transaction) to avoid write locks in prod. Ifsecure_cloud_rentalsis large/hot, the currentCREATE INDEXmay block.+-- no-transaction -- VIP Rental Support @@ -CREATE INDEX IF NOT EXISTS idx_secure_cloud_rentals_is_vip +CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_secure_cloud_rentals_is_vip ON secure_cloud_rentals(is_vip) WHERE is_vip = TRUE; @@ -CREATE INDEX IF NOT EXISTS idx_secure_cloud_rentals_provider_instance_id_vip +CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_secure_cloud_rentals_provider_instance_id_vip ON secure_cloud_rentals(provider_instance_id) WHERE provider_instance_id LIKE 'vip:%';crates/basilica-api/src/api/routes/secure_cloud.rs (1)
585-613: VIP stop prevention logic is correct; consider destructuring the tuple to avoid.0/.1/.2footguns.- let rental: (String, chrono::DateTime<Utc>, bool) = sqlx::query_as( + let (owner_user_id, created_at, is_vip): (String, chrono::DateTime<Utc>, bool) = sqlx::query_as( "SELECT user_id, created_at, is_vip FROM secure_cloud_rentals WHERE id = $1", ) @@ - if rental.0 != auth.user_id { + if owner_user_id != auth.user_id { @@ - if rental.2 { + if is_vip { @@ - let duration = stop_time.signed_duration_since(rental.1); + let duration = stop_time.signed_duration_since(created_at);crates/basilica-aggregator/src/vip/task.rs (1)
24-69: Consider adding graceful shutdown support.The polling loop runs indefinitely with no way to stop it gracefully. If the application needs to shut down cleanly (e.g., for zero-downtime deployments), consider accepting a cancellation token or shutdown signal.
+use tokio_util::sync::CancellationToken; + pub struct VipPollerTask { poller: Arc<VipPoller>, interval: Duration, + cancel_token: CancellationToken, } - pub fn start(self) -> JoinHandle<()> { + pub fn start(self) -> JoinHandle<()> { tokio::spawn(async move { let mut timer = interval(self.interval); timer.set_missed_tick_behavior(MissedTickBehavior::Skip); // ... loop { - timer.tick().await; + tokio::select! { + _ = self.cancel_token.cancelled() => { + tracing::info!("VIP poller task shutting down"); + break; + } + _ = timer.tick() => {} + } // ... rest of polling logic } }) }crates/basilica-aggregator/src/db.rs (1)
477-498: UseDeploymentStatusenum instead of string literal for consistency.Other queries in this file use
DeploymentStatus::*.as_str()(e.g.,get_all_active_deploymentsat line 417). Using the enum here ensures consistency and catches typos at compile time.pub async fn list_vip_rentals(&self) -> Result<Vec<Deployment>> { let rows = sqlx::query( r#" SELECT id, user_id, provider, provider_instance_id, offering_id, instance_type, location_code, status, hostname, ssh_key_id, ip_address, connection_info, raw_response, error_message, created_at, updated_at, is_vip FROM secure_cloud_rentals - WHERE is_vip = TRUE AND status != 'deleted' + WHERE is_vip = TRUE AND status != $1 ORDER BY created_at DESC "#, ) + .bind(DeploymentStatus::Deleted.as_str()) .fetch_all(&self.pool) .await?;crates/basilica-aggregator/src/vip/mod.rs (1)
8-13: Consider selective re-exports for clearer public API.Glob re-exports (
pub use *) work but can make it harder to understand the module's public surface. If this is an internal module, it's fine. For external consumers, selective re-exports improve discoverability.Example of selective re-exports:
pub use cache::VipCache; pub use poller::{VipPoller, PollStats}; pub use sheets::{GoogleSheetsClient, SheetsError}; pub use task::VipPollerTask; pub use types::{VipSheetRow, VipCacheEntry}; // etc.crates/basilica-aggregator/src/config.rs (1)
169-184: Consider validating VIP config invalidate()when enabled.The
validate()method checks provider configuration but doesn't validate VIP config. If VIP is enabled but misconfigured (e.g.,enabled = truewith emptygoogle_sheet_id), the error will only surface at runtime. Adding validation here would provide earlier feedback.pub fn validate(&self) -> Result<()> { // Check at least one provider is enabled (has auth configured) let any_enabled = self.providers.datacrunch.is_enabled() || self.providers.hyperstack.is_enabled() || self.providers.lambda.is_enabled() || self.providers.hydrahost.is_enabled(); if !any_enabled { return Err(AggregatorError::Config( "At least one provider must be configured (has auth)".to_string(), )); } + // Validate VIP config if enabled + if self.vip.enabled && !self.vip.is_configured() { + return Err(AggregatorError::Config( + "VIP is enabled but missing google_sheet_id or google_credentials_path".to_string(), + )); + } + Ok(()) }crates/basilica-aggregator/src/vip/sheets.rs (2)
60-65: Consider adding retry logic or timeout for API calls.The Google Sheets API call has no timeout or retry logic. Transient network issues could cause the poll cycle to fail. Since the caller (VipPollerTask) continues on error, this may be acceptable, but adding resilience here would reduce error noise in logs.
You could wrap the API call with a timeout and/or use a retry library:
use tokio::time::timeout; let result = timeout( Duration::from_secs(30), self.hub.spreadsheets().values_get(&self.sheet_id, range).doit() ).await .map_err(|_| SheetsError::Api("Request timed out".to_string()))? .map_err(|e| SheetsError::Api(e.to_string()))?;
67-67: Log when sheet returns no data.If
valuesisNone, the code silently returns an empty vector. This could mask configuration issues (wrong sheet ID, permissions, etc.). Consider logging at debug/warn level.- let values = result.1.values.unwrap_or_default(); + let values = match result.1.values { + Some(v) => v, + None => { + tracing::debug!("Google Sheet returned no values - sheet may be empty or inaccessible"); + return Ok(Vec::new()); + } + };crates/basilica-aggregator/src/vip/cache.rs (2)
22-80: VipCache API looks solid; consider whether async is necessary for all methods.
Given this is in-memory only, the ergonomics are good; cloning records on read is a reasonable tradeoff for avoiding lock leakage.
86-97: Considerquery_as!/query!(or tests) to catch schema drift early.
Right now column/type mismatches will surface only at runtime.Also applies to: 180-193
crates/basilica-aggregator/src/vip/poller.rs (1)
238-267: Re-link path likely needs a metadata sync as well.
When DB has an existing rental but cache is empty, you re-link without ensuring DB metadata matches the sheet. Consider callingupdate_vip_rental_metadatahere too (even unconditionally).crates/basilica-aggregator/src/vip/rental_ops.rs (2)
17-25: Define a rounding/quantization policy for marked-up hourly rates (if used for billing).
Currently this can produce arbitrary scale depending on input.
219-238: Assumes at most one active VIP rental per machine; enforce with DB constraints.
Without a uniqueness constraint,fetch_optionalcan behave unpredictably if duplicates exist.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (17)
crates/basilica-aggregator/Cargo.toml(1 hunks)crates/basilica-aggregator/src/config.rs(7 hunks)crates/basilica-aggregator/src/db.rs(9 hunks)crates/basilica-aggregator/src/lib.rs(1 hunks)crates/basilica-aggregator/src/models.rs(1 hunks)crates/basilica-aggregator/src/service.rs(1 hunks)crates/basilica-aggregator/src/vip/cache.rs(1 hunks)crates/basilica-aggregator/src/vip/mod.rs(1 hunks)crates/basilica-aggregator/src/vip/poller.rs(1 hunks)crates/basilica-aggregator/src/vip/rental_ops.rs(1 hunks)crates/basilica-aggregator/src/vip/sheets.rs(1 hunks)crates/basilica-aggregator/src/vip/task.rs(1 hunks)crates/basilica-aggregator/src/vip/types.rs(1 hunks)crates/basilica-api/migrations/017_vip_rentals.sql(1 hunks)crates/basilica-api/src/api/routes/secure_cloud.rs(2 hunks)crates/basilica-api/src/config/mod.rs(3 hunks)crates/basilica-api/src/server.rs(2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-13T15:34:46.544Z
Learnt from: epappas
Repo: one-covenant/basilica PR: 184
File: crates/basilica-validator/src/rental/billing.rs:129-130
Timestamp: 2025-10-13T15:34:46.544Z
Learning: In the basilica-validator crate, rental_id is a String type (not UUID) throughout the system - in RentalInfo struct, database rentals.id column, and all rental operations. The persistence layer methods should accept &str for rental_id parameters, not &Uuid, since they immediately convert to String for SQL binding anyway.
Applied to files:
crates/basilica-aggregator/src/vip/types.rscrates/basilica-api/src/api/routes/secure_cloud.rscrates/basilica-aggregator/src/vip/rental_ops.rs
🧬 Code graph analysis (8)
crates/basilica-aggregator/src/models.rs (1)
crates/basilica-billing/src/domain/rentals.rs (1)
provider_instance_id(208-212)
crates/basilica-aggregator/src/vip/task.rs (2)
crates/basilica-aggregator/src/vip/cache.rs (1)
new(16-20)crates/basilica-aggregator/src/vip/poller.rs (1)
new(48-60)
crates/basilica-aggregator/src/config.rs (1)
crates/basilica-api/src/config/mod.rs (6)
default(54-62)default(76-81)default(95-100)default(114-119)default(142-148)default(185-190)
crates/basilica-aggregator/src/db.rs (2)
crates/basilica-aggregator/src/models.rs (1)
format_vip_machine_id(155-157)crates/basilica-aggregator/src/vip/rental_ops.rs (1)
sqlx(80-112)
crates/basilica-aggregator/src/vip/sheets.rs (2)
crates/basilica-aggregator/src/vip/cache.rs (1)
new(16-20)crates/basilica-aggregator/src/vip/poller.rs (1)
new(48-60)
crates/basilica-aggregator/src/vip/cache.rs (1)
crates/basilica-aggregator/src/models.rs (2)
extract_vip_machine_id(160-162)from(206-214)
crates/basilica-aggregator/src/vip/rental_ops.rs (1)
crates/basilica-aggregator/src/models.rs (1)
format_vip_machine_id(155-157)
crates/basilica-aggregator/src/vip/poller.rs (3)
crates/basilica-aggregator/src/vip/rental_ops.rs (5)
close_vip_rental(132-217)insert_vip_rental(70-128)prepare_vip_rental(47-66)update_vip_rental_metadata(241-290)sqlx(80-112)crates/basilica-aggregator/src/vip/cache.rs (2)
new(16-20)default(196-198)crates/basilica-aggregator/src/vip/sheets.rs (1)
new(30-53)
🔇 Additional comments (14)
crates/basilica-aggregator/src/lib.rs (1)
12-12: Publicvipmodule export looks fine. Adds the intended integration surface at the crate root.crates/basilica-aggregator/src/service.rs (1)
528-528: Good: explicitis_vip: falseon normal deployments. Prevents accidental VIP semantics on standard provisioning.crates/basilica-api/src/config/mod.rs (1)
131-135: Config wiring looks correct and backwards-compatible (#[serde(default)]).Also applies to: 141-148, 350-351
crates/basilica-aggregator/Cargo.toml (1)
46-49: No additional configuration required.google-sheets4v5 andyup-oauth2v9 both default to thehyper-rustlsTLS stack, which aligns with the workspace's rustls-only strategy for MUSL compatibility. The existing implementation incrates/basilica-aggregator/src/vip/sheets.rsalready correctly constructs thehyper_rustls::HttpsConnector, confirming proper async runtime and TLS integration.crates/basilica-aggregator/src/vip/task.rs (1)
10-13: LGTM! Well-structured polling task.The struct cleanly encapsulates the poller and interval configuration. Converting
poll_interval_secstoDurationat construction time is a good practice.crates/basilica-aggregator/src/db.rs (2)
167-198: LGTM! Deployment creation properly includesis_vip.The INSERT statement and binding are correctly updated to support the new
is_vipfield.
454-475: LGTM! VIP machine lookup by prefixed ID.The method correctly uses
format_vip_machine_idto construct the lookup key, maintaining consistency with how VIP rentals are stored inrental_ops.rs.crates/basilica-aggregator/src/config.rs (1)
107-141: LGTM! Well-structured VIP configuration.The
VipConfigstruct has sensible defaults and theis_configured()method properly validates that all required fields are present when VIP is enabled. The#[serde(default)]attributes ensure backwards compatibility with existing config files.crates/basilica-aggregator/src/vip/sheets.rs (3)
9-21: LGTM! Comprehensive error types.The
SheetsErrorenum covers credential handling, authentication, API errors, and row parsing with good context (row number and message).
28-53: LGTM! Service account authentication setup.The authentication flow correctly reads the service account key, builds the authenticator, and configures the HTTPS connector with native roots.
88-140: LGTM! Clear row parsing with good documentation.The column mapping is well-documented in the comment. The
get_colclosure provides consistent error handling for required fields, and optionalnotesis handled appropriately.crates/basilica-aggregator/src/vip/types.rs (1)
1-57: Type definitions are clean and appropriately scoped.
Also aligns with system-wide “rental_id is a String” usage. (Based on retrieved learnings.)crates/basilica-aggregator/src/vip/rental_ops.rs (2)
44-67:rental_idas String is consistent with the broader system. (Based on retrieved learnings.)
Uuid::new_v4().to_string()is fine if UUID format is acceptable for downstream consumers.
130-217: Clarify intent: Is data loss acceptable if archival fails?Both
terminated_secure_cloud_rentalsandsecure_cloud_rentalstables are properly defined via migrations (015 and 013 respectively), with matching column schemas. However, the function silently loses data if archival fails—archival errors are logged but ignored (line 189-196), while deletion proceeds unconditionally (line 198-208). If archival encounters any error (permission, constraint violation, network issue), the rental record is deleted without being preserved in the archive table.If this best-effort archival is intentional, document it in a comment. If not, consider:
- Gating deletion on successful archival
- Making archival failures fatal to close_vip_rental
- Adding startup validation that both tables exist and have expected columns
Automatically sync team-managed GPU machines from a Google Sheet inventory into the rental system with polling, fail-safe updates, and startup recovery.
605ae25 to
57953a2
Compare
|
@coderabbitai resolve |
✅ Actions performedComments resolved. |
Add ability to test VIP managed machines without Google Sheets: - VipDataSource trait abstracts data fetching (sheets vs mock) - MockVipDataSource provides configurable test data - Enable via `vip.mock_user_id` in config Also fixes VIP rental display in list rentals endpoint.
Switch VIP machine inventory from Google Sheets API to CSV files: - Support local CSV files for development/testing - Support S3-hosted CSV for production - Remove google-sheets4 and yup-oauth2 dependencies - Add csv and aws-sdk-s3 dependencies
Update type naming and documentation to reflect CSV data source instead of Google Sheets terminology throughout the VIP module.
VIP rentals were using a hardcoded 20% markup while secure cloud rentals used the configurable pricing_config.secure_cloud_markup_percent (default 10%). This caused billing inconsistency between the two rental types. Now VIP rentals receive the same markup percentage from the API's PricingConfig, ensuring consistent pricing across all secure cloud rental types.
Remove get_rental_by_vip_machine_id and list_vip_rentals as they have no call sites in the codebase.
Simplifies VIP row filtering by using a boolean active field (0/1) instead of comparing against a configurable ready_value string. Also excludes VIP rentals from health checks since they're manually managed.
- Add vcpu_count and system_memory_gb fields to VIP CSV schema - Store and retrieve specs from rental raw_response JSON - Expose is_vip flag in API response for client awareness - Prevent users from stopping VIP rentals via exclude_vip filter
Resolve conflicts between VIP rental feature and SSH key schema changes: - Adopt main's ssh_public_key storage (dropped ssh_key_id FK) - Preserve VIP rental support (is_vip column, raw_response) - Renumber VIP migration to 018 - Remove obsolete ssh_key_id nullable migration
Registers VIP rentals with billing service before DB insert and finalizes billing when machines are removed from CSV.
VIP enablement is now determined by presence of the config rather than an explicit enabled boolean field.
Allows overriding the default AWS region when fetching VIP CSV from S3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/basilica-cli/src/cli/handlers/gpu_rental.rs (1)
692-694: Inconsistent VIP access control for restart operation.VIP rentals are excluded from the
downcommand (line 624 usesexclude_vip=true) but included in therestartcommand (line 692 usesexclude_vip=false). Since both are control operations that modify rental state, this creates an inconsistent authorization boundary.If VIP rentals are managed machines that users cannot stop, they likely should not be able to restart them either. The comment "though restart may fail" suggests this inconsistency may cause issues.
🔎 Recommended fix
// Single rental restart (no --all flag as per requirements) - // exclude_vip=false: VIP rentals included in selector (though restart may fail for secure cloud) + // exclude_vip=true: VIP rentals excluded from restart (consistent with down command) let (rental_id, _compute_type) = - resolve_target_rental_unified(target, None, &api_client, false).await?; + resolve_target_rental_unified(target, None, &api_client, true).await?;
♻️ Duplicate comments (10)
crates/basilica-api/src/server.rs (2)
134-140: VIP rentals should be excluded from billing telemetry.The query at lines 134-140 sends synthetic telemetry for all running rentals but doesn't exclude VIP rentals. VIP rentals are managed separately by the VIP poller and shouldn't receive synthetic billing telemetry.
This was flagged in a previous review.
SELECT r.id, COALESCE(o.gpu_count, 1) as gpu_count FROM secure_cloud_rentals r LEFT JOIN gpu_offerings o ON r.offering_id = o.id - WHERE r.status = 'running' + WHERE r.status = 'running' AND (r.is_vip = FALSE OR r.is_vip IS NULL)
232-234: VIP rentals should be excluded from credit exhaustion checks.The credit exhaustion query at line 233 doesn't filter out VIP rentals. VIP rentals have their own billing lifecycle managed by the VIP poller.
This was flagged in a previous review.
- match sqlx::query_as("SELECT id FROM secure_cloud_rentals WHERE status = 'running'") + match sqlx::query_as("SELECT id FROM secure_cloud_rentals WHERE status = 'running' AND (is_vip = FALSE OR is_vip IS NULL)")crates/basilica-aggregator/src/models.rs (1)
152-153: Add#[serde(default)]tois_vipfor backward compatibility.This issue was raised in a previous review. Without
#[serde(default)], deserializing older JSON or test fixtures that lack theis_vipfield will fail.pub updated_at: DateTime<Utc>, + #[serde(default)] pub is_vip: bool, }crates/basilica-aggregator/src/vip/cache.rs (1)
115-161: Lifetime bugs in JSON fallbacks and unsafe numeric casts remain unfixed.The issues previously flagged are still present:
- Lines 119, 148:
.unwrap_or(&row.ip_address.clone().unwrap_or_default())takes a reference to a temporary, which may not compile or could cause undefined behavior.- Lines 124, 153, 157, 161:
as u16/as u32casts can silently truncate values fromu64.Recommended fix
for row in rows { if let Some(vip_machine_id) = row .provider_instance_id .as_ref() .and_then(|id| extract_vip_machine_id(id)) { let vip_machine_id = vip_machine_id.to_string(); + let ip_fallback = row.ip_address.clone().unwrap_or_default(); + let instance_type_fallback = row.instance_type.clone().unwrap_or_default(); let connection = if let Some(conn_json) = &row.connection_info { let ssh_host = conn_json .get("ssh_host") .and_then(|v| v.as_str()) - .unwrap_or(&row.ip_address.clone().unwrap_or_default()) + .unwrap_or(&ip_fallback) .to_string(); let ssh_port = conn_json .get("ssh_port") .and_then(|v| v.as_u64()) - .unwrap_or(22) as u16; + .and_then(|p| u16::try_from(p).ok()) + .unwrap_or(22);Apply similar changes for
gpu_count,vcpu_count, andsystem_memory_gb.crates/basilica-aggregator/src/vip/poller.rs (5)
93-116: Stale-removal logic remains dangerously narrow.
seen_idsonly contains rows that are both active AND valid. Rows present on the sheet but skipped due toactive=falseor validation failure will be incorrectly treated as removed, triggering rental closure—a destructive side effect.Populate
seen_idsfor ALL rows present on the sheet (before filtering), and use a separate collection for active+valid machines.
146-148:poll_successis hardcoded totrueeven when errors occur.When
process_machine()orremove_stale_rental()fails (lines 120-127, 132-140), the error is logged butpoll_success = trueis still reported. This makes failures invisible to monitoring systems.Track errors with a flag and set
poll_success = !had_errors.
166-168: Avoid logging fullVipCsvRow; it contains user identifiers.
tracing::warn!(row = ?row, ...)can leakassigned_userinto logs, which is a compliance/privacy concern.- tracing::warn!(row = ?row, "Invalid row: missing vip_machine_id"); + tracing::warn!("Invalid row: missing vip_machine_id");
206-248:last_seen_atnot refreshed when metadata is unchanged.A machine with unchanged metadata won't have its
last_seen_atupdated, making it appear stale for observability purposes even though it's actively present in the data source.Consider always refreshing
last_seen_atto maintain an accurate heartbeat.
320-331:hourly_ratechanges are silently ignored.
metadata_differs()excludeshourly_ratefrom comparison, so rate changes in the CSV won't trigger updates. Sincehourly_rateis stored in the DB (viaraw_response) and used in billing calculations (seeapply_markup()in rental_ops.rs), this could lead to billing discrepancies.Either add
hourly_ratetometadata_differs()to propagate changes, or document that rates are only captured at creation time.crates/basilica-aggregator/src/vip/rental_ops.rs (1)
76-77: Documentation mismatch with actual usage.This concern was already raised in a previous review: the documentation states this is "typically called from basilica-api after billing registration succeeds," but the poller actually inserts rentals before billing registration, potentially creating orphan DB records if billing fails.
🧹 Nitpick comments (7)
crates/basilica-cli/src/cli/handlers/gpu_rental_helpers.rs (1)
242-244: Consider a more specific error message when VIP filtering excludes all rentals.When
exclude_vipis true and all active rentals are VIP, the generic "No active rentals found" message might confuse users who can see VIP rentals elsewhere. A more specific message could improve the user experience.🔎 Proposed enhancement for clearer error messaging
+ // Check if we filtered out VIP rentals + let has_vip_only = exclude_vip + && secure_rentals.as_ref() + .map(|s| s.rentals.iter().all(|r| r.is_vip)) + .unwrap_or(false); + if unified_items.is_empty() { - return Err(eyre!("No active rentals found")); + return if has_vip_only { + Err(eyre!("No eligible rentals found. Only VIP rentals are active.")) + } else { + Err(eyre!("No active rentals found")) + }; }crates/basilica-api/migrations/018_vip_rentals.sql (1)
5-18: Consider a CHECK constraint to enforce theis_vipand prefix relationship.The migration establishes a semantic link between
is_vip = TRUEand thevip:prefix inprovider_instance_id, but doesn't enforce it at the database level. Adding a CHECK constraint would prevent data inconsistencies if application logic has bugs or if data is modified directly.🔎 Suggested CHECK constraint
ALTER TABLE secure_cloud_rentals ADD COLUMN IF NOT EXISTS is_vip BOOLEAN NOT NULL DEFAULT FALSE; + +-- Enforce that is_vip flag matches provider_instance_id prefix +ALTER TABLE secure_cloud_rentals ADD CONSTRAINT chk_vip_prefix_consistency + CHECK ( + (is_vip = TRUE AND provider_instance_id LIKE 'vip:%') OR + (is_vip = FALSE AND provider_instance_id NOT LIKE 'vip:%') + );crates/basilica-aggregator/src/vip/types.rs (1)
23-42: Consider gatingtest_machinewith#[cfg(test)]or moving to a test module.The
test_machineconstructor is markedpuband available in production builds. If it's only intended for testing, consider:+#[cfg(test)] impl VipCsvRow { /// Create a test machine row with sensible defaults pub fn test_machine(id: &str, user: &str) -> Self {Alternatively, if it's needed for integration tests in other crates, the current approach is acceptable.
crates/basilica-aggregator/src/vip/task.rs (1)
24-71: Consider storing theJoinHandlefor graceful shutdown.The
start()method returns aJoinHandle<()>, but inserver.rs(lines 1042-1043 and 1106-1107), the handle is immediately dropped. While dropped handles don't cancel the task (it continues running), this means:
- No way to gracefully stop the poller on server shutdown
- No way to detect if the polling task panicked
If graceful shutdown isn't required, this is acceptable. Otherwise, consider storing the handle in the
Serverstruct'sbackground_tasksJoinSet.crates/basilica-api/src/api/routes/secure_cloud.rs (1)
253-260: Potential precision loss when parsing VIP hourly rate.The hourly rate is stored as a string in
raw_responsebut parsed tof64, which can lose precision for monetary values. SinceDecimalis used elsewhere for rates, consider usingDecimalparsing for consistency:- let vip_hourly_rate = raw_response - .as_ref() - .and_then(|r| r.get("hourly_rate")) - .and_then(|v| v.as_str()) - .and_then(|s| s.parse::<f64>().ok()) - .unwrap_or(0.0); + let vip_hourly_rate = raw_response + .as_ref() + .and_then(|r| r.get("hourly_rate")) + .and_then(|v| v.as_str()) + .and_then(|s| s.parse::<Decimal>().ok()) + .and_then(|d| d.to_f64()) + .unwrap_or(0.0);This is a minor issue since the final value is
f64anyway for the response, but usingDecimalparsing first preserves precision during the parse step.crates/basilica-aggregator/src/vip/cache.rs (1)
199-203: Return value may be misleading when rows are skipped.
countis set torows.len()before filtering, but the function returnscountafter potentially skipping rows that failextract_vip_machine_id. Consider returningentries.len()to reflect the actual number of records rebuilt into the cache.Suggested fix
- tracing::info!(count = count, "Rebuilt VIP cache from database"); + let rebuilt_count = entries.len(); + tracing::info!(db_rows = count, rebuilt = rebuilt_count, "Rebuilt VIP cache from database"); - Ok(count) + Ok(rebuilt_count)crates/basilica-aggregator/src/config.rs (1)
157-164: Consider validating S3 configuration completeness.
has_s3_source()requires boths3_bucketands3_keyto be non-empty, but doesn't verifys3_region. While the AWS SDK can use default region from the credential chain, explicitly documenting or validating this would prevent runtime surprises./// Check if S3 source is configured + /// Note: s3_region is optional as AWS SDK falls back to default credential chain region pub fn has_s3_source(&self) -> bool { self.s3_bucket .as_ref() .map(|s| !s.is_empty()) .unwrap_or(false) && self.s3_key.as_ref().map(|s| !s.is_empty()).unwrap_or(false) }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (21)
.gitignore(1 hunks)crates/basilica-aggregator/Cargo.toml(2 hunks)crates/basilica-aggregator/src/config.rs(7 hunks)crates/basilica-aggregator/src/db.rs(9 hunks)crates/basilica-aggregator/src/lib.rs(1 hunks)crates/basilica-aggregator/src/models.rs(4 hunks)crates/basilica-aggregator/src/service.rs(1 hunks)crates/basilica-aggregator/src/vip/cache.rs(1 hunks)crates/basilica-aggregator/src/vip/csv.rs(1 hunks)crates/basilica-aggregator/src/vip/mod.rs(1 hunks)crates/basilica-aggregator/src/vip/poller.rs(1 hunks)crates/basilica-aggregator/src/vip/rental_ops.rs(1 hunks)crates/basilica-aggregator/src/vip/task.rs(1 hunks)crates/basilica-aggregator/src/vip/types.rs(1 hunks)crates/basilica-api/migrations/018_vip_rentals.sql(1 hunks)crates/basilica-api/src/api/routes/secure_cloud.rs(6 hunks)crates/basilica-api/src/config/mod.rs(3 hunks)crates/basilica-api/src/server.rs(4 hunks)crates/basilica-cli/src/cli/handlers/gpu_rental.rs(4 hunks)crates/basilica-cli/src/cli/handlers/gpu_rental_helpers.rs(3 hunks)crates/basilica-sdk/src/types.rs(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- crates/basilica-aggregator/src/service.rs
- crates/basilica-aggregator/Cargo.toml
- crates/basilica-aggregator/src/lib.rs
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-13T15:34:46.544Z
Learnt from: epappas
Repo: one-covenant/basilica PR: 184
File: crates/basilica-validator/src/rental/billing.rs:129-130
Timestamp: 2025-10-13T15:34:46.544Z
Learning: In the basilica-validator crate, rental_id is a String type (not UUID) throughout the system - in RentalInfo struct, database rentals.id column, and all rental operations. The persistence layer methods should accept &str for rental_id parameters, not &Uuid, since they immediately convert to String for SQL binding anyway.
Applied to files:
crates/basilica-cli/src/cli/handlers/gpu_rental.rscrates/basilica-sdk/src/types.rscrates/basilica-aggregator/src/vip/cache.rscrates/basilica-aggregator/src/vip/types.rscrates/basilica-aggregator/src/vip/rental_ops.rscrates/basilica-api/src/api/routes/secure_cloud.rs
🧬 Code graph analysis (7)
crates/basilica-cli/src/cli/handlers/gpu_rental.rs (1)
crates/basilica-cli/src/cli/handlers/gpu_rental_helpers.rs (1)
resolve_target_rental_unified(131-286)
crates/basilica-aggregator/src/models.rs (1)
crates/basilica-billing/src/domain/rentals.rs (1)
provider_instance_id(208-212)
crates/basilica-aggregator/src/vip/cache.rs (2)
crates/basilica-aggregator/src/models.rs (1)
extract_vip_machine_id(164-166)crates/basilica-aggregator/src/vip/rental_ops.rs (1)
sqlx(89-124)
crates/basilica-aggregator/src/vip/rental_ops.rs (1)
crates/basilica-aggregator/src/models.rs (1)
format_vip_machine_id(159-161)
crates/basilica-aggregator/src/vip/csv.rs (2)
crates/basilica-aggregator/src/vip/poller.rs (1)
new(55-71)crates/basilica-aggregator/src/vip/types.rs (1)
test_machine(25-41)
crates/basilica-aggregator/src/vip/poller.rs (1)
crates/basilica-aggregator/src/vip/rental_ops.rs (5)
close_vip_rental(144-230)get_vip_rental_by_machine_id(233-251)insert_vip_rental(78-140)update_vip_rental_metadata(254-306)sqlx(89-124)
crates/basilica-api/src/api/routes/secure_cloud.rs (2)
crates/basilica-billing/src/domain/rentals.rs (1)
offering_id(223-225)crates/basilica-api/src/api/middleware/balance.rs (1)
hourly_cost_with_markup(94-100)
🔇 Additional comments (36)
.gitignore (1)
137-137: LGTM!Adding
dev/to.gitignoreis appropriate for excluding development-time artifacts from version control. This aligns well with the existing patterns in the file.crates/basilica-cli/src/cli/handlers/gpu_rental.rs (3)
229-230: LGTM: Appropriate VIP inclusion for status viewing.Correctly allows VIP rentals to be viewed in status checks. Read-only operations should include VIP rentals for visibility.
362-363: LGTM: Appropriate VIP inclusion for log viewing.Correctly allows VIP rentals to be viewed in log operations. Read-only operations should include VIP rentals for observability.
624-626: LGTM: Appropriate VIP exclusion for stop operations.Correctly excludes VIP rentals from user-initiated stop operations. Managed machines should not be controllable by regular users for destructive operations.
crates/basilica-sdk/src/types.rs (1)
566-568: Field implementation is correct; verify stop rental endpoint enforces the is_vip check.The
is_vip: boolfield with#[serde(default)]is properly implemented for backward compatibility with clear documentation. However, ensure the stop rental endpoint or service handler explicitly checks this field and rejects stop requests for VIP rentals (managed machines). This is a critical security-related field that requires corresponding authorization logic in the API layer.crates/basilica-cli/src/cli/handlers/gpu_rental_helpers.rs (3)
686-688: LGTM! VIP access via SSH is correctly preserved.The implementation correctly passes
exclude_vip = falsewhen selecting a rental for SSH access, ensuring VIP rentals remain accessible. The inline comment clearly documents this behavior.
220-223: VIP exclusion logic looks correct for secure cloud.The implementation correctly skips VIP rentals when
exclude_vipis true. Note that this filtering is applied only to secure cloud rentals, not community cloud rentals (lines 191-210), which suggests VIPs are a secure-cloud-only concept.Verify that:
- The
is_vipfield exists on the secure cloud rental type and is correctly populated- VIP rentals are indeed exclusive to secure cloud (confirming the asymmetric filtering is intentional)
#!/bin/bash # Description: Verify the is_vip field exists on rental types and confirm VIP is secure-cloud-only # Find the rental type definition and check for is_vip field ast-grep --pattern 'struct $_ { $$$ is_vip: $_, $$$ }' # Search for VIP-related code to understand scope rg -n -C 3 '\bis_vip\b' --type rust
127-127: All call sites ofresolve_target_rental_unifiedhave been updated with the newexclude_vipparameter. The function signature requires 4 parameters, and all 5 call sites in the codebase pass the parameter correctly in the proper order, with context-appropriate values (true for the down command, false for other operations).crates/basilica-api/migrations/018_vip_rentals.sql (3)
5-6: LGTM!The
is_vipcolumn addition is well-designed with proper defaults and idempotency safeguards.
8-11: LGTM!The partial index on
is_vipis an excellent optimization if VIP rentals are uncommon, as it keeps the index small while accelerating VIP-specific queries.
13-16: The partial index onprovider_instance_idis redundant. A general index on this column already exists in migration 013. For equality lookups on VIP provider_instance_ids, the existing general index is more efficient since the query planner won't use a partial index when a general index covers the same operation on a larger dataset. Consider removing this partial index to avoid index maintenance overhead.crates/basilica-aggregator/src/vip/types.rs (1)
1-82: LGTM - Well-structured VIP data models.The type definitions are clean with appropriate separation between CSV input (
VipCsvRow), connection details (VipConnectionInfo), display metadata (VipDisplayInfo), validated state (ValidVipMachine), and cache records (VipRentalRecord). The use ofDecimalforhourly_rateensures precision for monetary values.crates/basilica-api/src/config/mod.rs (2)
131-146: LGTM - VIP configuration properly integrated.The
vipfield is correctly marked with#[serde(default)]for backward compatibility, and theDefaultimplementation initializes it toNone. This allows existing configurations to work without the VIP section.
350-350: LGTM - Config propagation is correct.The VIP configuration is properly cloned and passed through to the aggregator config.
crates/basilica-aggregator/src/db.rs (2)
415-429: LGTM - VIP exclusion in active deployments query.The
get_all_active_deploymentscorrectly excludes VIP rentals with(is_vip = FALSE OR is_vip IS NULL), handling both explicitfalsevalues andNULLfor backward compatibility with existing rows.
166-198: LGTM - is_vip properly bound in deployment creation.The
create_deploymentfunction correctly bindsis_vipat position 17, matching the updated INSERT statement.crates/basilica-aggregator/src/models.rs (2)
155-166: LGTM - VIP machine ID helpers are well-designed.The
VIP_MACHINE_ID_PREFIXconstant and associated helper functions (format_vip_machine_id,extract_vip_machine_id) provide a clean API for handling VIP-specific identifiers stored inprovider_instance_id.
15-17: LGTM - Provider::Vip variant properly integrated.The new
Vipvariant is correctly added with string conversion support in bothas_str()andFromStr, enabling VIP rentals to be stored withprovider = "vip".crates/basilica-aggregator/src/vip/task.rs (1)
35-69: LGTM - Resilient polling loop with good observability.The loop correctly:
- Uses
MissedTickBehavior::Skipto avoid backlog buildup- Measures and logs poll duration
- Logs changes at INFO level, no-changes at DEBUG level
- Continues polling after errors (non-fatal handling)
crates/basilica-api/src/server.rs (3)
1137-1144: LGTM - Health check correctly excludes VIP rentals.The secure cloud health check query properly filters out VIP rentals with
WHERE is_vip = FALSE. VIP machines are manually managed and don't need automated health checks.
993-1124: VIP poller startup logic is well-structured.The initialization properly:
- Logs configuration state for debugging
- Handles mock mode separately with clear logging
- Supports both S3 and local CSV sources
- Rebuilds cache from DB before starting
- Performs initial poll before starting periodic task
- Handles errors gracefully without crashing startup
1362-1398: LGTM - Mock VIP data helper is appropriate for testing.The
create_mock_vip_datafunction provides realistic test data with sensible GPU configurations and pricing.crates/basilica-api/src/api/routes/secure_cloud.rs (3)
644-650: LGTM - VIP rental stop guard is appropriate.VIP rentals are manually managed, so preventing user-initiated stops with a clear error message directing them to support is the correct behavior.
229-305: LGTM - VIP metadata extraction logic is correct.The conditional handling for VIP vs regular rentals properly extracts GPU details, resource specs, and pricing from different sources (raw_response for VIP, gpu_offerings for regular). The fallback values are sensible.
39-41: LGTM - RentalQueryRow correctly extended for VIP support.The tuple type alias is updated to include
is_vipandraw_responsefields, matching the updated SQL query.crates/basilica-aggregator/src/vip/cache.rs (1)
14-79: LGTM - Cache API is well-designed.The basic cache operations using
RwLock<HashMap>are correctly implemented with appropriate async read/write lock semantics. Theget_ids_not_inmethod elegantly supports stale entry detection for the polling sync logic.crates/basilica-aggregator/src/vip/mod.rs (1)
1-16: LGTM - Clean module organization.The module structure follows Rust best practices with clear separation of concerns and convenient re-exports for the public API surface.
crates/basilica-aggregator/src/config.rs (1)
107-181: LGTM - Well-designed configuration structure.The
VipConfigstruct provides flexible support for multiple data source modes (mock, local CSV, S3) with appropriate defaults and helper methods. The#[serde(default)]annotations ensure backward compatibility.crates/basilica-aggregator/src/vip/csv.rs (2)
46-77: LGTM - Clean data source abstraction.The
CsvDataSourceprovides a well-designed abstraction over local files and S3 with proper async handling. The S3 integration correctly uses the AWS SDK's default credential chain with optional region override.
245-298: LGTM - MockVipDataSource is well-suited for testing.The mock implementation with
Arc<RwLock<Vec>>provides thread-safe mutation methods (set_rows,add_row,remove_row,update_row) that enable comprehensive test scenarios for the poller.crates/basilica-aggregator/src/vip/poller.rs (2)
54-71: LGTM - Constructor provides clean dependency injection.The
VipPoller::new()constructor accepts all required dependencies including optional billing client, enabling flexible composition and testability.
359-416: LGTM - Billing integration follows correct order of operations.The billing registration happens BEFORE the DB insert (lines 282-295), ensuring we don't create orphaned DB records if billing fails. The
register_with_billingmethod correctly constructs the protobuf request with appropriate resource specs.crates/basilica-aggregator/src/vip/rental_ops.rs (4)
1-27: LGTM! Solid error handling and monetary calculation.The error enum is well-structured with thiserror, and
apply_markupcorrectly usesDecimalfor monetary calculations with proper overflow checking and conversion validation.
29-74: LGTM! Proper rental preparation with markup.The
PreparedVipRentalstruct andprepare_vip_rentalfunction correctly handle rental ID generation and markup application. The use ofStringforrental_idis consistent with the system-wide convention.
176-221: Verify best-effort archival strategy is acceptable for compliance/audit requirements.The function archives to
terminated_secure_cloud_rentalson a best-effort basis (line 203), logging failures but proceeding with deletion from the active table regardless of archival success. If archiving fails due to database errors or constraints, the rental record is permanently lost after deletion.Confirm this design is acceptable given your compliance and audit requirements. If the terminated table is critical for billing history or audit trails, consider failing the closure operation when archiving fails.
232-306: LGTM! Clean query functions with proper parameterization.Both
get_vip_rental_by_machine_idandupdate_vip_rental_metadataare well-implemented with proper SQL parameterization, appropriate filtering, and suitable logging levels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
crates/basilica-api/src/api/routes/secure_cloud.rs (2)
279-286: Consider logging when VIP hourly_rate is missing or malformed.Silent fallback to
0.0could mask data integrity issues in VIP rentals. When the hourly rate cannot be parsed, a warning log would help diagnose issues with VIP rental data.🔎 Proposed improvement
// VIP hourly rate from raw_response (marked up rate stored at insert time) - let vip_hourly_rate = raw_response + let vip_hourly_rate = match raw_response .as_ref() .and_then(|r| r.get("hourly_rate")) .and_then(|v| v.as_str()) .and_then(|s| s.parse::<f64>().ok()) - .unwrap_or(0.0); + { + Some(rate) => rate, + None => { + tracing::warn!( + "VIP rental {} missing or invalid hourly_rate in raw_response, defaulting to 0.0", + rental_id + ); + 0.0 + } + };
647-676: VIP stop guard correctly implemented.The guard appropriately blocks user-initiated stops for VIP rentals with a clear error message.
Minor readability suggestion: consider using a named struct instead of tuple indices (
rental.0,rental.2) to improve code clarity, though this follows the existing pattern in the codebase.🔎 Optional: Use named struct for clarity
+ struct RentalOwnership { + user_id: String, + created_at: chrono::DateTime<Utc>, + is_vip: bool, + } + // 1. Get rental and verify ownership - let rental: (String, chrono::DateTime<Utc>, bool) = sqlx::query_as( + let rental: RentalOwnership = sqlx::query_as( "SELECT user_id, created_at, is_vip FROM secure_cloud_rentals WHERE id = $1", ) // ... fetch logic ... - if rental.0 != auth.user_id { + if rental.user_id != auth.user_id { // ... } - if rental.2 { + if rental.is_vip { // ... } - let duration = stop_time.signed_duration_since(rental.1); + let duration = stop_time.signed_duration_since(rental.created_at);crates/basilica-aggregator/src/vip/csv.rs (2)
110-131: Consider validating CSV header columns.The parser assumes a specific column order but doesn't validate the header row matches expectations. If the CSV column order differs, data will be silently parsed into wrong fields.
Optional enhancement to validate headers
Before the loop at line 117, add header validation:
fn parse_csv(&self, content: &str) -> Result<Vec<VipCsvRow>, DataSourceError> { let mut reader = csv::Reader::from_reader(content.as_bytes()); // Validate headers let headers = reader.headers() .map_err(|e| DataSourceError::CsvParse(e.to_string()))?; let expected = vec![ "vip_machine_id", "assigned_user", "active", "ssh_host", "ssh_port", "ssh_user", "gpu_type", "gpu_count", "region", "hourly_rate", "vcpu_count", "system_memory_gb", "notes" ]; if headers.len() < expected.len() - 1 { // -1 because notes is optional return Err(DataSourceError::CsvParse( format!("Expected at least {} columns, found {}", expected.len() - 1, headers.len()) )); } let mut rows = Vec::new(); // ... rest of method }
156-165: Consider case-insensitive parsing for the active field.The current implementation only accepts lowercase "true"/"false". CSV data from various sources might use "True", "FALSE", etc.
Optional: make parsing case-insensitive
- let active = match active_str.as_str() { - "1" | "true" => true, - "0" | "false" => false, + let active = match active_str.to_lowercase().as_str() { + "1" | "true" => true, + "0" | "false" => false, other => { return Err(DataSourceError::RowParse { row: row_num, message: format!("Invalid 'active' value '{}', expected 0/1", other), }) } };
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (4)
crates/basilica-aggregator/src/vip/csv.rs(1 hunks)crates/basilica-api/src/api/routes/secure_cloud.rs(6 hunks)crates/basilica-api/src/config/mod.rs(3 hunks)crates/basilica-api/src/server.rs(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/basilica-api/src/config/mod.rs
- crates/basilica-api/src/server.rs
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-13T15:34:46.544Z
Learnt from: epappas
Repo: one-covenant/basilica PR: 184
File: crates/basilica-validator/src/rental/billing.rs:129-130
Timestamp: 2025-10-13T15:34:46.544Z
Learning: In the basilica-validator crate, rental_id is a String type (not UUID) throughout the system - in RentalInfo struct, database rentals.id column, and all rental operations. The persistence layer methods should accept &str for rental_id parameters, not &Uuid, since they immediately convert to String for SQL binding anyway.
Applied to files:
crates/basilica-api/src/api/routes/secure_cloud.rs
🧬 Code graph analysis (2)
crates/basilica-api/src/api/routes/secure_cloud.rs (2)
crates/basilica-api/src/api/query.rs (1)
gpu_type(21-25)crates/basilica-api/src/api/middleware/balance.rs (1)
hourly_cost_with_markup(94-100)
crates/basilica-aggregator/src/vip/csv.rs (1)
crates/basilica-aggregator/src/vip/types.rs (1)
test_machine(25-41)
🔇 Additional comments (8)
crates/basilica-api/src/api/routes/secure_cloud.rs (3)
39-41: LGTM!The type alias correctly extends
RentalQueryRowwith the new VIP-related fields, maintaining proper ordering that matches the SQL query.
178-196: LGTM!The SQL query correctly fetches the new VIP fields. The
LEFT JOINis appropriate since VIP rentals may not have correspondinggpu_offeringsentries.
359-359: LGTM!The
is_vipfield is correctly propagated to the response item.crates/basilica-aggregator/src/vip/csv.rs (5)
9-27: LGTM! Clean error model and trait design.The error types comprehensively cover all failure modes (file I/O, CSV parsing, row-level errors, S3, UTF-8), and the async trait is properly bounded for concurrent use.
174-176: The previous concern about these fields has been resolved.The doc comment at line 137 correctly documents only
notesas optional, and the implementation correctly treatsvcpu_countandsystem_memory_gbas required fields usingget_col(). The code and documentation are now consistent.
182-217: Excellent error handling for field parsing.Each numeric field parse includes the row number and specific field name in error messages, making CSV validation issues easy to diagnose. Using
Decimalforhourly_rateis the correct choice for financial data.
245-298: LGTM! Comprehensive mock implementation for testing.The mock provides all necessary mutations (add, remove, update, set) and correctly uses
Arc<RwLock<>>for thread-safe access. Cloning the vec on fetch (line 296) is appropriate for test scenarios where data volumes are small.
64-66: The region configuration is correct. The AWS SDK for Rust usesaws_sdk_s3::config::Region::new()to create Region instances for the config builder method, and this pattern works correctly with both static and dynamic region strings.
Prevent silent mutations of critical VIP rental fields by detecting when immutable data (assigned user, connection info, display specs) changes mid-rental. Conflicting rentals are now closed rather than updated. Also adds terminated rental lookup to block ID reuse with different data, fixes billing finalization to block on errors, and properly rolls back DB inserts when billing registration fails.
Only finalize+close when assigned_user changes; update SSH info, GPU specs, and other metadata in-place to avoid premature billing stops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (4)
crates/basilica-aggregator/src/vip/rental_ops.rs (1)
76-78: Update documentation to reflect actual execution order.The docstring states this is "typically called from basilica-api after billing registration succeeds" (line 77), but the poller actually calls
insert_vip_rentalfirst at line 623, then attempts billing registration at line 634. If billing fails,delete_vip_rentalis called to rollback (lines 642-650). The documentation should reflect this DB-first, billing-second pattern with rollback on failure.crates/basilica-aggregator/src/vip/cache.rs (1)
114-204: Fix lifetime bugs in JSON fallbacks and use checked numeric conversions.Lines 119, 148:
unwrap_or(&row.ip_address.clone().unwrap_or_default())and similar patterns take references to temporaries, causing lifetime issues.Lines 124, 153, 157, 161: Numeric casts using
as u16/as u32can silently truncate large values from JSON.Additionally, line 205 logs
count(total rows fetched) rather thanentries.len()(rows successfully inserted), which can differ if some rows fail validation.crates/basilica-aggregator/src/vip/poller.rs (2)
103-199: Track all sheet rows separately from validation/processing outcomes to prevent incorrect stale removal.
seen_ids(line 125) is only populated for machines that both validate successfully (line 151) AND don't error during processing. This means CSV rows that are present but temporarily invalid or encounter transient processing errors will be treated as "stale" at line 167 and have their rentals closed—a destructive operation triggered by data quality issues rather than actual removal from the sheet.Additionally,
poll_successis hardcoded totrueat line 184 even whenprocess_machine(line 155) orremove_stale_rental(line 170) return errors, making failures invisible to monitoring.
201-207: Avoid logging full VipCsvRow; it contains user identifiers.Line 205:
tracing::warn!(row = ?row, ...)logs the completeVipCsvRow, which includesassigned_user—a user identifier that should not appear in logs. Log onlyvip_machine_idinstead, consistent with lines 209 and 213.
🧹 Nitpick comments (1)
crates/basilica-aggregator/src/vip/poller.rs (1)
494-503: Refreshlast_seen_aton every poll cycle for accurate heartbeat tracking.When a cached rental's metadata hasn't changed (line 503),
last_seen_atis not updated, making the field stale for observability and debugging. While this doesn't affect functional behavior (stale removal is based onseen_ids, notlast_seen_at), updatinglast_seen_aton every poll cycle would provide accurate "last seen" timestamps for monitoring.🔎 Proposed fix: Always refresh last_seen_at
// Normalize cached hourly rate to marked-up value if cached.display.hourly_rate != expected_marked_up { let mut updated = cached.clone(); updated.display.hourly_rate = expected_marked_up; updated.last_seen_at = Utc::now(); self.cache.insert(updated).await; + } else { + // Refresh last_seen_at even when metadata matches + let mut updated = cached.clone(); + updated.last_seen_at = Utc::now(); + self.cache.insert(updated).await; } return Ok(ProcessOutcome::Seen);
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
crates/basilica-aggregator/src/vip/cache.rs(1 hunks)crates/basilica-aggregator/src/vip/poller.rs(1 hunks)crates/basilica-aggregator/src/vip/rental_ops.rs(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-13T15:34:46.544Z
Learnt from: epappas
Repo: one-covenant/basilica PR: 184
File: crates/basilica-validator/src/rental/billing.rs:129-130
Timestamp: 2025-10-13T15:34:46.544Z
Learning: In the basilica-validator crate, rental_id is a String type (not UUID) throughout the system - in RentalInfo struct, database rentals.id column, and all rental operations. The persistence layer methods should accept &str for rental_id parameters, not &Uuid, since they immediately convert to String for SQL binding anyway.
Applied to files:
crates/basilica-aggregator/src/vip/rental_ops.rscrates/basilica-aggregator/src/vip/cache.rs
🧬 Code graph analysis (2)
crates/basilica-aggregator/src/vip/rental_ops.rs (1)
crates/basilica-aggregator/src/models.rs (1)
format_vip_machine_id(159-161)
crates/basilica-aggregator/src/vip/poller.rs (3)
crates/basilica-aggregator/src/models.rs (2)
format_vip_machine_id(159-161)from(210-218)crates/basilica-aggregator/src/vip/rental_ops.rs (5)
apply_markup(19-27)close_vip_rental(173-259)delete_vip_rental(144-169)insert_vip_rental(78-140)sqlx(89-124)crates/basilica-aggregator/src/vip/csv.rs (1)
new(253-257)
🔇 Additional comments (13)
crates/basilica-aggregator/src/vip/rental_ops.rs (6)
18-27: LGTM! Proper overflow protection.The markup calculation correctly uses
checked_multo prevent overflow and handles conversion failures appropriately.
49-74: LGTM!The preparation logic correctly generates a rental ID, applies markup, and bundles all necessary data for insertion.
142-169: LGTM!The rollback deletion correctly removes the rental without archiving, which is appropriate since the rental never fully started due to billing failure.
171-259: LGTM!The three-step close operation (update → archive → delete) is properly structured. Best-effort archiving with non-fatal errors is a reasonable approach.
261-280: LGTM!The query correctly filters for active VIP rentals and returns the necessary identifiers.
282-335: LGTM!The metadata update correctly refreshes connection and display information. Note that
hourly_rateis stored inraw_responsebut doesn't trigger billing recalculations—this appears intentional as rates are applied only at rental creation.crates/basilica-aggregator/src/vip/cache.rs (2)
9-79: LGTM!The cache API provides all necessary operations with proper async synchronization using
RwLock.
210-230: LGTM!The database row struct and Default implementation follow standard patterns.
crates/basilica-aggregator/src/vip/poller.rs (5)
21-31: LGTM!The error enum provides appropriate variants for all failure modes in the poller lifecycle.
33-69: LGTM!The stats and snapshot types provide clear tracking of poll outcomes and proper separation between DB and domain models.
71-101: LGTM!The poller structure uses proper dependency injection, and the optional billing client enables graceful degradation when billing is unavailable.
236-284: LGTM!The helper functions correctly apply markup, compare user assignments, and match metadata. The
parse_decimalfunction handles multiple JSON representations gracefully.
697-807: LGTM!The billing integration correctly sequences operations (billing finalization before DB close) and handles errors appropriately. Blocking closes on billing finalization ensures users are charged correctly, with retries on subsequent poll cycles.
Summary by CodeRabbit
New Features
Chores
Chore (CLI)
✏️ Tip: You can customize this high-level summary in your review settings.