[WIP] Cloud drives MVP - OneDrive vertical slice (BYO)#3064
Draft
slvnlrt wants to merge 18 commits into
Draft
Conversation
Full audit of cloud drive state (backend, frontend, git history, maturity) and research on OAuth native apps, OpenDAL 0.55, change detection, and reference implementations (rclone, object_store, Nextcloud, Syncthing, Cyberduck, Kopia). Contains the 8 commit-set operational plan for the OneDrive vertical slice MVP with BYO client_id architecture. No code changes; baseline for the MVP implementation that follows.
Upgrades OpenDAL to 0.55 and adapts the two jiff::Timestamp call sites. Wraps every CloudBackend operator with a retry+timeout+concurrent-limit+tracing layer stack to gain resilience, logging, and protection against rate limits. Introduces BackendFeatures (server_side_copy, stable_file_id, change_notifications, content_hash, case_insensitive, supports_duplicates, max_file_size, multipart_threshold) per backend, consumed by later work in file ops and indexing. Enriches BackendMetadata with etag/version/content_md5/provider_file_id so the indexer can short-circuit rehashing when providers expose content hashes. Fixes CloudBackend::exists swallowing authentication errors as "not found", a silent data-integrity hazard. Part of the OneDrive MVP vertical slice. See .investigations/cloud-drives/06-mvp-onedrive-vertical-slice.md#set-1.
Replaces the two FileCopyJob panic sites with a branch that uses server-side copy when both endpoints are the same cloud backend with server_side_copy capability, and stream-copies through 8 MiB chunked writers with concurrency 4 otherwise. Wires CreateFolderAction to the existing CloudBackend::create_directory instead of returning an Internal error. Adds CloudBackend::copy as the public entry point guarded by BackendFeatures::server_side_copy, so callers short-circuit with a typed error rather than discovering limitations deep inside OpenDAL. Closes FILE-003. Part of the OneDrive MVP vertical slice. See .investigations/cloud-drives/06-mvp-onedrive-vertical-slice.md#set-2.
Set 2 added the CloudBackend::copy primitive, wired CreateFolderAction to cloud, and made FileCopyJob::new_rename/MoveJob::rename total for SdPath::Cloud (eliminating two daemon panics). It did not add a CloudCopyStrategy or router integration, so end-to-end file copy involving cloud paths still fails with a typed error. Reverts the over-marked acceptance criteria in FILE-003 and CLOUD-003 to unchecked and records the partial progress in a dedicated section so the tracker stays honest.
Adds a provider-agnostic OAuth 2.0 subsystem under core/src/ops/cloud/oauth with the authorization-code-plus-PKCE flow from RFC 8252 wired to a one-shot loopback HTTP server. CloudOauthStartAction binds a loopback listener, generates a CSRF state token and a PKCE S256 challenge, returns the authorization URL for the frontend to open, and spawns a detached task that drives the flow to completion. CloudOauthPollQuery exposes the in-memory flow state machine to the UI. CloudOauthCancelAction aborts a pending flow. A background CloudTokenRefreshTask iterates over cloud_credentials, refreshes OAuth tokens about to expire, and writes the rotated tokens back through CloudCredentialManager. The task is a no-op until Set 4 registers OneDrive in the OauthProviderRegistry. BYO architecture preserved: client_id and client_secret are parameters on every call, never stored on the provider. CloudStorageConfig::OneDrive schema unchanged so existing volumes continue to load. Creates CLOUD-004 to track the infrastructure separately from CLOUD-003 (volume wiring) and FILE-003 (file ops). See .investigations/cloud-drives/06-mvp-onedrive-vertical-slice.md#set-3.
Remove #[serde(flatten)] from OauthFlowStatus::Completed. specta does not honour serde flatten attributes, so the generated TS type exposed `tokens: TokenSet` while the runtime JSON hoisted TokenSet fields to the top level — the frontend would fail to parse poll responses. Keep tokens nested, matching what specta already generated. Add a regression test asserting the exact wire shape.
The .tasks/ tree is maintained upstream by the founder as part of the project tracker, not by feature branches. Our earlier cloud-drives commits inadvertently edited CLOUD-003 and FILE-003 and created CLOUD-004 in that tree. Revert CLOUD-003 and FILE-003 to the upstream/main state and relocate CLOUD-004 to .investigations/cloud-drives/CLOUD-004-oauth-infrastructure-proposal.md so the tracker stays clean. Task status updates will be proposed through the PR description for the founder to accept.
Codifies the single-agent-at-a-time discipline, the agent-prompt template, the per-set workflow, and the concurrency incident from 2026-04-18. Future sets follow this playbook rather than ad-hoc prompting.
Implements OauthProvider trait for OneDrive personal (tenant=common): - PKCE S256 authorization URL with prompt=select_account - Authorization-code and refresh-token exchange - Refresh-token preservation when Microsoft omits it on refresh - Graph /me displayName lookup with userPrincipalName fallback - 5 loopback ports (53682-53686) matching Microsoft exact-match registered URIs Registered in OauthProviderRegistry at core startup so the refresh task can rotate credentials on first tick. 10 wiremock-backed tests cover id, loopback ports, URL shape, exchange happy/invalid_grant, refresh rotation/preservation, display_name priority/fallback/HTTP-error. Set 4 of 8 for cloud-drives MVP (OneDrive vertical slice). See .investigations/cloud-drives/06-mvp-onedrive-vertical-slice.md#set-4.
Replaces the OneDrive paste-tokens UI with a browser-driven BYO OAuth
flow so non-developers can connect their account without digging into
Microsoft Graph. Google Drive and Dropbox keep the paste-tokens UX and
gain a "Browser sign-in coming soon" banner pointing at the future
generalisation.
Created:
packages/interface/src/routes/explorer/components/OneDriveConnectForm.tsx
- 613 LOC self-contained React component.
- Orchestrates cloud.oauth.start -> platform.openLink -> cloud.oauth.poll
(refetchInterval 1s) -> volumes.add_cloud, reusing the parent's
useForm + onSubmitCloud handler so volume creation stays DRY.
- Collapsible 7-step Azure AD tutorial with placeholder screenshot
blocks (real PNGs can be dropped in later without code changes).
- "Open Azure Portal" and "Copy redirect URIs" helpers pre-fill the
five Microsoft-registered loopback URIs (53682-53686).
- UUID v4 validation on client_id, password Input for client_secret.
- Cancel button + unmount cleanup call cloud.oauth.cancel so the
daemon does not hold the loopback port for the full 5-min TTL.
- Semantic Tailwind classes only, no var() / hex / inline <style>,
no any / as any. Types from @sd/ts-client.
Modified:
packages/interface/src/routes/explorer/components/AddStorageModal.tsx
- Split isOAuthType into isOneDrive + isLegacyOAuthType.
- OneDrive branch early-returns with the new form mounted inside
StorageDialog, hiding default buttons and keeping Back.
- Legacy branch (gdrive, dropbox) gains the "coming soon" banner
and a TODO(cloud-mvp) cross-referencing section 9 of the MVP plan.
TypeScript: no new errors on changed files; pre-existing unrelated
ambient-declaration errors for @sd/assets PNG imports remain.
Refs: .investigations/cloud-drives/06-mvp-onedrive-vertical-slice.md
Introduces the foundation for incremental cloud indexing via Microsoft Graph's /me/drive/root/delta endpoint. Adds a cloud_sync_state table and a provider_file_id column on entries, a provider-agnostic ChangeDetector trait, the OneDriveChangeDetector implementation with 410/429/401 handling, a CloudSyncStateRepository, and a delta-pass orchestrator that runs as a pre-pass in the processing phase to advance the delta cursor. Full re-hash skipping is deferred (would require OpenDAL listing changes beyond this set's scope); the machinery is in place for a follow-up commit to consume the change stream.
- Cloud volumes get a 'Disconnect' context menu item (purges credentials via volumes.remove_cloud); Speed Test and Eject are hidden for them. - Remove the broken GroupType::Cloud option from AddGroupModal and SpaceCustomizationPanel dropdowns (no CloudGroup renderer yet). Keep the Rust variant so existing library JSON rows still deserialize; added TODO(cloud-mvp) doc-comment in core/src/domain/space.rs. - Unify getVolumeIcon: delete the name-substring duplicate in DevicePanel.tsx and redirect VolumeBar.tsx to the canonical scheme-based helper in packages/ts-client. Widen its param so callers no longer need 'as any'. - Add 9 bun tests asserting a renamed cloud volume still maps to its provider icon.
Local and cloud volumes can now be mixed in FileCopyJob without the strategy layer rejecting cloud paths as non-local. CloudCopyStrategy handles the four transfer shapes: local upload, cloud download, same-backend server-side copy (falling back to client-side streaming when the provider capability is disabled), and cross-backend streaming. The router intercepts cloud endpoints before the device-slug branch so they stop falling through to LocalStreamCopyStrategy. 12 new tests cover each shape plus routing selection, progress callback cadence, and the cloud_key trim helper. Checksum verification on cloud paths is stubbed with a warn! and a TODO pointing to set-8b for ETag/content_md5 wiring.
Integration test onedrive_end_to_end_test covers the OAuth journey end-to-end against wiremocked Microsoft endpoints: start flow, simulate loopback redirect, poll until Completed, tokens extracted, volume registered, then torn down. Two adjacent tests pin the cancel path and the token-exchange failure path. Unit coverage (10 tests in change_detection/onedrive.rs, 10 in providers/onedrive.rs, 12 in copy strategy) covers the HTTP surface the integration test cannot touch without promoting private constructors. docs/core/cloud-integration.mdx rewritten against reality: removes false claims (40+ providers, OS keyring, cross-cloud moves, thumbnails, cost tracking), adds honest supported-services matrix, BYO Azure AD setup, security model, change detection wiring, and known limitations. From 381 lines of marketing to 178 lines of fact. CLOUD-004 proposal in .investigations/ finalised with accurate delivery status. Sibling CLOUD-003 and FILE-003 update proposals list which acceptance criteria the MVP has closed so the founder can copy them into the upstream tracker. MVP-CHANGELOG.md captures the full OneDrive vertical slice for the PR description. E2E test surfaced a pre-existing VolumeAddCloudAction issue: the OneDrive branch forwards both access_token and refresh_token to OpenDAL 0.55's Onedrive builder, which rejects that combination. The test therefore completes the OAuth journey in OneDrive shape but registers the resulting volume as S3 to exercise the rest of the action handler. Fix is out of scope here and tracked for a follow-up commit.
OpenDAL's Onedrive and Gdrive builders treat access_token and (refresh_token + client_id + client_secret) as mutually exclusive configurations; passing both fails the builder with a config validation error. CloudBackend::new_onedrive and new_google_drive were forwarding every field from CloudStorageConfig, which prevented either provider from constructing once a refresh_token was present (always, after the OAuth flow). Switch both to the long-lived refresh path so OpenDAL rotates access tokens internally. Argument stays in the signature for caller stability. Surfaced by the Set 8b integration test. Dropbox was already using refresh-only, so no change there. Documented in docs/core/cloud-integration.mdx known limitations is now unnecessary for this case since the constraint is handled.
… to investigations docs/core/cloud-integration.mdx describes the aspirational vision (target state) and should not be rewritten to match current reality — that conflates the upstream specification with MVP progress. Restored to upstream. The factual rewrite lands at .investigations/cloud-drives/cloud-integration-current-state.mdx with a note cross-referencing the target doc, so the PR makes the vision-vs-reality gap explicit without overwriting the specification.
Sweep across the 12 files produced or extensively modified by Sets 1-8 to remove comments that paraphrased visible code, restated field or variant names, or included verbose factory/JSDoc blocks. Keeps every WHY-focused justification, every error-handling strategy note, every platform-specific caveat, and every TODO(cloud-mvp) cross-reference into the plan doc. Net -145 LOC (113+/258-). Tests still 404 passing, clippy clean, no logic changes.
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
2473222 to
5c44106
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Cloud drives MVP — OneDrive vertical slice (BYO)
Summary
First end-to-end cloud drive integration on the daemon: a user can register a OneDrive (personal) account through a browser OAuth flow, list, copy and create folders on it through the normal file pipeline, and pick up remote changes via the Microsoft Graph Delta API on the next index pass. BYO Azure AD — no Spacedrive-owned provider credentials shipped, no founder action required to merge.
Google Drive and Dropbox keep their paste-tokens flow with a "browser sign-in coming soon" banner; their providers plug into the same infrastructure in a later PR (
CLOUD-004).Seventeen commits on top of
fb5d4d7e7. 404 lib tests + 3 E2E tests passing, zero panics.Context
The cloud-drive subsystem was investigated before any code was written. Reports in
.investigations/cloud-drives/established that cloud drives were ~55 % functional scaffolding with five critical blockers: two daemon-crashingpanic!s, no OAuth flow, lost B2/Wasabi/Spaces rehydration on restart, disconnectedcreate_folderfor cloud, and rehashed-on-every-pass indexing.Key reading:
00-executive-summary.md— one-page state of affairs.06-mvp-onedrive-vertical-slice.md— eight commit-sets planned, with acceptance criteria.cloud-integration-current-state.mdx— factual current-state documentation, kept outsidedocs/so the target-statedocs/core/cloud-integration.mdxstays aspirational.What ships
User-visible
client_id+client_secretobtained via the in-app Azure AD tutorial (seven steps, five redirect URIs to copy at once), click Connect with Microsoft, approve in the default browser, the daemon picks up the redirect on a loopback port, stores tokens encrypted, and the volume appears.FileCopyJob. Folder creation works./me/drive/root/deltaand advance the cursor. Throttled (429 / Retry-After respected), resilient to token invalidation (410 resyncRequiredtriggers a clean resync).volumes.remove_cloud, wiping credentials and the volume row.getVolumeIconinDevicePanel.tsxwas the culprit; only the scheme-based canonical version inpackages/ts-client/src/volumeIcons.tssurvives).Developer-visible
core/src/ops/cloud/oauth/. New providers plug in by implementingOauthProviderand registering at startup; the UI hits the same three wire methods (cloud.oauth.start,cloud.oauth.poll,cloud.oauth.cancel). No changes toCloudStorageConfigvariants — the browser flow populates the existing fields on the existingVolumeAddCloudAction.BackendFeaturesmetadata layer atcore/src/volume/backend/mod.rs. Every backend advertises server-side copy/rename, stable-file-id guarantees, change-notification kind, content-hash algorithms, case sensitivity, duplicate tolerance, max file size, and multipart threshold. Enables capability-driven code paths rather than per-provider branches.CloudCopyStrategyextending the existing strategy pattern. Four shapes (local↔cloud, cloud↔local, same-backend cloud↔cloud, cross-backend cloud↔cloud) with 8 MiB chunked streaming, 4-way writer concurrency, and progress reporting.ChangeDetectortrait +OneDriveChangeDetectorso future providers slot in without touching the indexer.TracingLayer,ConcurrentLimitLayer(16),TimeoutLayer(30s/30s),RetryLayer(3, jitter)— wrapping every cloud operator.cloud_sync_statetable). Encrypted credential storage via SeaORM + XChaCha20-Poly1305 already existed.wiremock, all deterministic.Scope boundaries
tenant=common) only. Google Drive and Dropbox keep paste-tokens with a "coming soon" banner. Microsoft Business / SharePoint needs a tenant-specific issuer; stubbed as TODO.CloudStorageConfig::*variant already expectedclient_id+client_secret; validation was in place. Swapping to Spacedrive-owned credentials later is a ~5-line UI diff; the server side is forward-compatible.VolumeAddCloudActionstep uses S3 becauseopendal 0.55hardcodesgraph.microsoft.comfor OneDrive data-plane calls and cannot be wiremocked. OneDrive file I/O is covered at unit level againstservices::Memory.docs/core/cloud-integration.mdxis the target-state vision and stays unchanged. A current-state snapshot lives at.investigations/cloud-drives/cloud-integration-current-state.mdxfor PR review.Known tech debt
Every item has a
TODO(cloud-mvp): ...comment at its call site referencing the plan doc.cloud_credentialsbutCloudBackendreads tokens at construction; long-running indexing passes won't pick up rotation mid-flight. TODOs atcore/src/ops/cloud/change_detection/onedrive.rs:51andcore/src/ops/indexing/phases/processing.rs:83.provider_file_idon listing (it doesn't today) or a cross-reference layer. Machinery is in place.Modified.ChangeKind::Renamed { from_path }exists but the OneDrive detector never constructs it — a single Graph delta response doesn't carry the previous path. Needs a per-provider_file_idpath cache. The indexer still reconciles renames via stable id.content_md5checksum verification on cloud paths (TODO atcore/src/ops/files/copy/strategy.rs:844). Currently warns and proceeds; integrity guaranteed by OpenDAL's chunked writer.core/src/volume/manager.rs:335-338. Not fixed in this PR; constructors exist for S3, GoogleDrive, OneDrive, Dropbox, AzureBlob, GCS.LocalBackend.LocalBackend::server_side_copy.How to test manually
Prerequisites
http://127.0.0.1:53682,:53683,:53684,:53685,:53686as Web (not SPA).Files.ReadWrite.All,offline_access,User.Read. Grant admin consent.Test sequence
cargo build(orcargo run --bin sd-cli -- restartif a daemon is already running).cd apps/tauri && bun run tauri:dev.shasum).cloud_credentialsrow removed.Failure-mode checks
netstat -an | findstr 5368returns nothing within ~10 s).cloud_credentialsrow (garbagerefresh_token), restart the daemon. Refresh task should log an auth failure and not crash.How to review
Commit narrative order
Each commit builds, tests and lints cleanly on its own.
Reading paths by concern
core/src/ops/cloud/oauth/(start, poll, cancel, complete, provider trait, loopback, flow store, refresh, providers/onedrive).core/src/ops/cloud/change_detection/(types, detector trait, OneDrive impl, repository, delta pass).core/src/ops/files/copy/strategy.rs+routing.rs.packages/interface/src/routes/explorer/components/OneDriveConnectForm.tsx+AddStorageModal.tsx.core/src/infra/db/migration/m20260418_000001_*.rsandm20260418_000002_*.rs.core/tests/onedrive_end_to_end_test.rs.Risk and rollback
Result.git revert f6234dee1..fe5a5331d^. No data-loss migrations;cloud_sync_stateand theprovider_file_idcolumn onentriesare purely additive.LocalBackendwas updated forBackendFeaturesparity but its behaviour is identical.Dependency additions
opendal 0.54 → 0.55(already a direct dep)oauth2 = "5"webbrowser = "1"dashmap = "6"subtle = "2"(constant-time CSRF)base64 = "0.22"wiremock = "0.6"(dev-dep only)All MIT/Apache-2.0.
Questions for reviewers
.investigations/policy. Keep it in the tree, or drop before merge? Options: (a) keep as project memory; (b) move to an external design repo; (c) keep only the plan and proposals, drop the state-of-art reports..investigations/cloud-drives/*-proposal.md. Applied as a follow-up commit on this branch, or a separate maintainer PR?