[Research] Shipper Integration Plan #906

@EffortlessSteven

Shipper Analysis

What Shipper Does

Shipper is a publishing reliability layer for Rust workspaces that focuses on making publishing safe to start and safe to re-run. It handles:

  • Deterministic publish planning: Builds dependency-first ordering for workspace crates
  • Pre-flight checks: Git cleanliness, publishability, registry reachability, version existence
  • Optional ownership verification: Validates crate permissions when token is available
  • Resumable publishing: Persists progress to disk, supports interruption and resume
  • Retry/backoff handling: Applies exponential backoff for retryable failures
  • Publish verification: Confirms completion via registry API before declaring success
  • Evidence capture: Maintains event logs and receipts for audit trails
  • Readiness checking: Ensures registry visibility before publishing dependent crates
  • Parallel publishing: Supports concurrent publishing of independent packages
  • Configurable policies: Safe/balanced/fast modes with configurable timeouts
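The dependency-first ordering in the first bullet can be illustrated with a small topological sort. This is a minimal sketch, not shipper's actual planner: the `publish_order` function and the crate names in the usage note are hypothetical, and it assumes every listed dependency is itself a workspace crate present in the map.

```rust
use std::collections::BTreeMap;

/// Returns crate names in dependency-first order, or None if no valid order
/// exists (a cycle, or a dependency that is missing from the map).
/// `deps` maps each workspace crate to the workspace crates it depends on.
fn publish_order(deps: &BTreeMap<&str, Vec<&str>>) -> Option<Vec<String>> {
    // Number of not-yet-placed dependencies per crate. BTreeMap iterates in
    // sorted key order, which keeps the result deterministic.
    let mut pending: BTreeMap<&str, usize> = deps
        .iter()
        .map(|(name, ds)| (*name, ds.len()))
        .collect();
    let mut order = Vec::with_capacity(deps.len());

    while !pending.is_empty() {
        // Pick the alphabetically first crate whose dependencies are all placed.
        let next = pending
            .iter()
            .find(|(_, n)| **n == 0)
            .map(|(name, _)| *name)?;
        pending.remove(next);
        order.push(next.to_string());

        // Every remaining crate that depends on `next` has one fewer blocker.
        for (name, ds) in deps {
            if let Some(n) = pending.get_mut(name) {
                *n -= ds.iter().filter(|d| **d == next).count();
            }
        }
    }
    Some(order)
}
```

For a hypothetical workspace where `cli` depends on `core` and `types`, and `core` depends on `types`, this yields `types`, `core`, `cli`. Ties between independent crates are broken alphabetically, so the same input always produces the same plan.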

What Shipper Does NOT Do

Shipper is intentionally narrow and does NOT:

  • Bump versions
  • Generate changelogs
  • Create git tags
  • Create GitHub releases

These are left to other tools (cargo-release, release-plz, or custom workflows).

Configuration

Shipper uses a .shipper.toml configuration file in the workspace root with sections for:

  • [policy]: Publishing mode (safe/balanced/fast)
  • [verify]: Verification mode (workspace/package/none)
  • [readiness]: Registry visibility checks (api/index/both)
  • [retry]: Retry behavior (max_attempts, delays, strategy)
  • [flags]: allow_dirty, skip_ownership_check, strict_ownership
  • [parallel]: Concurrent publishing settings
  • [registry]: Custom registry configuration

Auth Tokens

Shipper resolves tokens from the same places Cargo uses:

  • CARGO_REGISTRY_TOKEN (crates.io)
  • CARGO_REGISTRIES_<NAME>_TOKEN (other registries)
  • $CARGO_HOME/credentials.toml (created by cargo login)

Tokens are treated as opaque strings and sent in the Authorization header, matching Cargo's registry API model.
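The lookup order above can be sketched as follows. This is an illustration under assumptions, not shipper's code: `token_env_var` and `resolve_token` are hypothetical names, the environment and credentials-file lookups are injected as closures so the sketch stays self-contained, and the variable naming follows Cargo's documented convention (registry name uppercased, dashes replaced with underscores).

```rust
/// Name of the environment variable Cargo consults for a registry token.
/// `None` means crates.io.
fn token_env_var(registry: Option<&str>) -> String {
    match registry {
        None => "CARGO_REGISTRY_TOKEN".to_string(),
        Some(name) => format!(
            "CARGO_REGISTRIES_{}_TOKEN",
            name.to_uppercase().replace('-', "_")
        ),
    }
}

/// Resolves a token: the environment variable wins, then the fallback
/// (for example a value parsed from $CARGO_HOME/credentials.toml).
fn resolve_token(
    registry: Option<&str>,
    env: impl Fn(&str) -> Option<String>,
    credentials_file: impl Fn() -> Option<String>,
) -> Option<String> {
    env(&token_env_var(registry)).or_else(credentials_file)
}
```

In real use, `env` could be `|key| std::env::var(key).ok()`. Injecting it makes the precedence easy to check in CI without touching the process environment.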

Current tokmd Release Workflow

From issue #901 and RELEASE.md:

Version Management

cargo xtask bump <MAJOR.MINOR.PATCH>
  • Updates [workspace.package].version in root Cargo.toml
  • Updates all [workspace.dependencies] versions for internal crates
  • Updates Node package.json files (tokmd-node, npm package)
  • Supports optional schema version bumps with --schema flag

Pre-flight Checks

cargo xtask publish --dry-run
  • Git clean check
  • Workspace version consistency validation
  • CHANGELOG.md version existence check
  • Full workspace tests (--all-features)
  • Local package validation (cargo package --list) per crate
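As an illustration of the CHANGELOG.md version existence check, here is a minimal sketch. It is not the xtask implementation: `changelog_has_version` is a hypothetical name, and the heading format (Markdown headings such as `## [1.2.3] - 2025-01-01` or `## v1.2.3`) is an assumption about how the changelog is laid out.

```rust
/// True if some Markdown heading in `changelog` names exactly `version`.
/// Matching whole tokens avoids false positives such as "1.1" matching "1.10.0".
fn changelog_has_version(changelog: &str, version: &str) -> bool {
    changelog
        .lines()
        .filter(|line| line.starts_with('#'))
        .any(|line| {
            // Split the heading on anything that cannot be part of a version.
            line.split(|c: char| {
                !(c.is_ascii_alphanumeric() || c == '.' || c == '-' || c == '+')
            })
            .any(|token| token.trim_start_matches('v') == version)
        })
}
```

Only heading lines are considered, so a version that is merely mentioned in a bullet does not satisfy the check.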

Publishing

cargo xtask publish --yes
  • Publishes all crates in dependency order (computed from workspace topology)
  • Supports --from <crate> to resume from a specific crate
  • Optional --tag flag to create git tag

Tag-Driven Release (Production Path)

git tag vX.Y.Z
git push origin vX.Y.Z

Triggers .github/workflows/release.yml which:

  • Builds cross-platform release binaries (Linux, macOS, Windows, WASM)
  • Creates GitHub release with artifacts and auto-generated notes
  • Runs cargo xtask publish --yes --skip-tests --verbose to publish to crates.io
  • Builds and pushes Docker image to GHCR
  • Updates major version tag

Integration Strategy

Goal

Replace tokmd's custom xtask publish implementation with shipper for the actual crates.io publishing operations, while keeping tokmd's version bump, its tokmd-specific preflight checks (CHANGELOG validation and full workspace tests), and the tag-driven release workflow.

What Changes

  1. xtask publish: Delegate to shipper for plan/preflight/publish operations
  2. CI workflow: Use shipper in release.yml instead of current xtask publish
  3. Configuration: Add .shipper.toml for shipper settings
  4. Resume handling: Use shipper's state/resume instead of custom --from logic
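The difference between the two resume models in item 4 can be sketched as below. Both functions are hypothetical illustrations: shipper's real state format is not described here, so the "state" is modeled as a plain set of crate names already published.

```rust
use std::collections::BTreeSet;

/// --from style resume: everything from `from` onward, in plan order.
/// Returns an empty list if `from` is not in the plan.
fn remaining_from<'a>(plan: &[&'a str], from: &str) -> Vec<&'a str> {
    plan.iter()
        .copied()
        .skip_while(|name| *name != from)
        .collect()
}

/// State-based resume: every planned crate not yet recorded as published.
fn remaining_by_state<'a>(plan: &[&'a str], published: &BTreeSet<&str>) -> Vec<&'a str> {
    plan.iter()
        .copied()
        .filter(|name| !published.contains(*name))
        .collect()
}
```

With a hypothetical plan `types, core, utils, cli` where `utils` succeeded but `core` failed, `remaining_from(plan, "core")` returns `core, utils, cli` and would attempt `utils` again, while `remaining_by_state` returns only `core, cli`. This matters most once independent crates are published in parallel, because completed crates are no longer a contiguous prefix of the plan.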

What Stays the Same

  1. Version bumping: Keep cargo xtask bump unchanged
  2. CHANGELOG validation: Keep CHANGELOG.md checks
  3. Full workspace tests: Keep --all-features test validation
  4. Tag-driven releases: Keep git tag trigger mechanism
  5. GitHub releases: Keep release binary builds and artifact generation
  6. Node package publishing: Keep manual npm publish for now

Required Changes

1. xtask/src/tasks/publish.rs

Replace custom publish implementation with shipper CLI wrapper:

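use std::process::Command;

// Assumes xtask already depends on anyhow; PublishArgs is defined elsewhere in xtask.
use anyhow::{bail, Context, Result};

pub fn run(args: PublishArgs) -> Result<()> {
    // Keep version bumping logic if present
    // Keep CHANGELOG validation
    // Keep workspace tests

    // Replace plan computation with shipper
    if args.plan {
        return run_shipper("plan");
    }

    // Replace preflight with shipper
    if args.dry_run {
        return run_shipper("preflight");
    }

    // Replace publish execution with shipper
    if args.yes {
        run_shipper("publish")?;
    }

    Ok(())
}

/// Runs `shipper <subcommand>`; fails if it cannot be launched or exits non-zero.
fn run_shipper(subcommand: &str) -> Result<()> {
    let mut cmd = Command::new("shipper");
    cmd.arg(subcommand);
    // Add appropriate flags based on args
    let status = cmd
        .status()
        .with_context(|| format!("failed to launch `shipper {subcommand}`"))?;
    if !status.success() {
        bail!("`shipper {subcommand}` failed with {status}");
    }
    Ok(())
}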

2. .shipper.toml

Create configuration file matching tokmd's current behavior:

schema_version = "shipper.config.v1"

[policy]
mode = "safe"  # Verify+strict, matches current xtask behavior

[verify]
mode = "workspace"  # Safest option

[readiness]
enabled = true
method = "both"  # Most reliable for 60+ crate workspace
max_total_wait = "5m"
poll_interval = "2s"

[retry]
max_attempts = 6
base_delay = "2s"
max_delay = "2m"
strategy = "exponential"

[flags]
allow_dirty = false
skip_ownership_check = false
strict_ownership = true  # Fail if ownership checks fail

[parallel]
enabled = true  # Enable for 60+ crate workspace
max_concurrent = 4
per_package_timeout = "30m"
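
To make the [retry] values concrete, the sketch below computes a plain doubling schedule from `base_delay` and `max_delay`. It is an assumption about what "exponential" means here: shipper may add jitter or count attempts differently, and `backoff_delay` is a hypothetical helper, not part of shipper.

```rust
use std::time::Duration;

/// Delay before retry number `retry` (1-based): `base` doubled each time,
/// capped at `max`. Overflow is treated as "use the cap".
fn backoff_delay(retry: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.saturating_pow(retry.saturating_sub(1));
    base.checked_mul(factor).map_or(max, |d| d.min(max))
}
```

Under this model, max_attempts = 6 allows five retries with waits of 2s, 4s, 8s, 16s, and 32s, about 62 seconds in total per crate, so the 2m cap would only come into play from a seventh retry onward.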

3. .github/workflows/release.yml

Update publish job:

publish-crates:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: dtolnay/rust-toolchain@stable  # actions-rs/toolchain is archived and unmaintained
      with:
        toolchain: stable
    - name: Install shipper
      run: cargo install --locked shipper-cli
    - name: Publish to crates.io
      env:
        CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
      run: |
        shipper plan
        shipper preflight
        shipper publish

4. Documentation Updates

  • Update RELEASE.md to reference shipper instead of the xtask publish implementation
  • Add note about .shipper.toml configuration
  • Document resume command (shipper resume) for failure recovery

Migration Path

Phase 1: Evaluate (Low Risk)

  1. Run shipper plan in tokmd workspace to verify dependency ordering
  2. Run shipper preflight to check compatibility with existing checks
  3. Compare plan output with current xtask publish plan
  4. Generate .shipper.toml via shipper config init and tune settings

Acceptance Criteria:

  • Shipper plan matches xtask plan ordering
  • Preflight passes without errors
  • Configuration captures all required settings
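One caveat for the first criterion: two correct planners can emit different but equally valid orderings when crates are independent of each other. A looser comparison checks that both plans cover the same crates and both respect dependencies. The sketch below is a hypothetical helper for that comparison and assumes the dependency map is extracted separately, for example from cargo metadata.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns the first (crate, dependency) pair that `plan` publishes in the
/// wrong order, or None if every crate comes after its dependencies.
fn first_violation(
    plan: &[&str],
    deps: &BTreeMap<&str, Vec<&str>>,
) -> Option<(String, String)> {
    let mut published: BTreeSet<&str> = BTreeSet::new();
    for name in plan.iter().copied() {
        if let Some(ds) = deps.get(name) {
            if let Some(dep) = ds.iter().copied().find(|d| !published.contains(*d)) {
                return Some((name.to_string(), dep.to_string()));
            }
        }
        published.insert(name);
    }
    None
}

/// Two plans are equivalent if they contain the same crates and both are
/// valid dependency-first orders.
fn plans_equivalent(a: &[&str], b: &[&str], deps: &BTreeMap<&str, Vec<&str>>) -> bool {
    let a_set: BTreeSet<&str> = a.iter().copied().collect();
    let b_set: BTreeSet<&str> = b.iter().copied().collect();
    a_set == b_set && first_violation(a, deps).is_none() && first_violation(b, deps).is_none()
}
```

If an exact match is required instead, a plain slice comparison is enough. The looser check avoids failing Phase 1 on a harmless tie-breaking difference.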

Phase 2: Add Shipper as Alternative (Low Risk)

  1. Add --use-shipper flag to xtask publish
  2. Implement dual-path logic: use shipper if flag set, otherwise current code
  3. Test dry-run with shipper
  4. Test actual publish in test crate environment

Acceptance Criteria:

  • Both paths work correctly
  • Shipper produces same results as xtask
  • Resume functionality verified
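The dual-path logic can stay small if the default is a single value that Phase 3 flips. A minimal sketch, assuming the --use-shipper flag from this phase and the --legacy flag from Phase 3 are parsed into booleans; the `Backend` enum and `select_backend` function are hypothetical names.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Backend {
    Legacy,
    Shipper,
}

/// Phase 2 passes `Backend::Legacy` as the default (opt in with --use-shipper).
/// Phase 3 passes `Backend::Shipper` as the default (opt out with --legacy).
fn select_backend(default: Backend, use_shipper: bool, legacy: bool) -> Result<Backend, String> {
    match (use_shipper, legacy) {
        (true, true) => Err("--use-shipper and --legacy are mutually exclusive".to_string()),
        (true, false) => Ok(Backend::Shipper),
        (false, true) => Ok(Backend::Legacy),
        (false, false) => Ok(default),
    }
}
```

Keeping the selection in one function means the Phase 3 switch is a one-line change to the default, and Phase 4 removes the function along with the legacy path.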

Phase 3: Shipper as Default (Medium Risk)

  1. Change default to use shipper, keep --legacy flag for fallback
  2. Update release workflow to use shipper directly (bypass xtask)
  3. Run production release with shipper
  4. Monitor for issues, have rollback ready

Acceptance Criteria:

  • Production release succeeds with shipper
  • No regression in publishing reliability
  • Resume functionality tested in production

Phase 4: Remove Legacy (Low Risk)

  1. Remove xtask publish implementation
  2. Remove fallback flags
  3. Update documentation
  4. Archive old code

Acceptance Criteria:

  • Code cleanup complete
  • Documentation updated
  • No legacy code remains

Risks and Mitigations

Risk 1: Shipper Behavior Differences

Risk: Shipper's retry/backoff or readiness checks behave differently than xtask's custom logic.

Mitigation:

  • Phase 1 evaluation identifies differences early
  • Phase 2 dual-path allows A/B testing
  • Phase 3 runs with rollback ready

Risk 2: Configuration Complexity

Risk: .shipper.toml adds new configuration surface that needs maintenance.

Mitigation:

  • Generate initial config with defaults
  • Document all settings
  • Keep xtask flags as thin wrappers initially

Risk 3: Resume Compatibility

Risk: Shipper's resume state format is incompatible with xtask's --from logic.

Mitigation:

  • Phase 2 validates resume works correctly
  • Document migration path from xtask state to shipper state
  • Keep legacy resume available during migration

Risk 4: CI Timeout Changes

Risk: Shipper's parallel publishing or readiness checks change CI duration significantly.

Mitigation:

  • Phase 2 measures timing differences
  • Configure parallel settings appropriately for CI constraints
  • Use balanced policy if safe is too slow
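
To reason about how readiness checks affect CI time, it helps to see the shape of a bounded polling loop. The sketch below is a generic illustration, not shipper's implementation: `wait_until_visible` is a hypothetical name, and the visibility check and sleep are injected so the timing can be measured without a registry.

```rust
use std::time::Duration;

/// Polls `is_visible` until it returns true or another poll would exceed
/// `max_total_wait`. Returns the time spent waiting, or None on timeout.
fn wait_until_visible(
    mut is_visible: impl FnMut() -> bool,
    mut sleep: impl FnMut(Duration),
    poll_interval: Duration,
    max_total_wait: Duration,
) -> Option<Duration> {
    // Guard against a zero interval, which would otherwise spin forever.
    let step = poll_interval.max(Duration::from_millis(1));
    let mut waited = Duration::from_secs(0);
    loop {
        if is_visible() {
            return Some(waited);
        }
        if waited + step > max_total_wait {
            return None;
        }
        sleep(step);
        waited += step;
    }
}
```

With the draft settings (poll_interval = "2s", max_total_wait = "5m"), the wait per crate in this model is bounded by five minutes, and a crate that becomes visible on the third poll costs four seconds. In production the `sleep` argument would be `std::thread::sleep`.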

Risk 5: Token Resolution Differences

Risk: Shipper's token resolution behaves differently from Cargo's in CI.

Mitigation:

  • Test token resolution in CI environment early
  • Verify CARGO_REGISTRY_TOKEN secret works correctly
  • Check shipper doctor output in CI

Open Questions

  1. Should shipper handle CHANGELOG.md validation, or should it stay in xtask preflight?
    • Recommendation: Keep it in xtask preflight for now; shipper focuses on publishing.
  2. Should shipper handle full workspace tests, or should they stay separate?
    • Recommendation: Keep them separate. Shipper handles publish-specific checks; xtask handles the full test suite.
  3. Should we enable parallel publishing immediately or start sequential?
    • Recommendation: Start sequential (enabled = false under [parallel]) and enable parallel publishing in Phase 2 once a baseline is established.
  4. Should .shipper.toml be committed to the repo?
    • Recommendation: Yes, committed with defaults that match production requirements.
  5. How should Node package publishing be handled alongside shipper?
    • Recommendation: Keep it separate for now, since shipper is Rust-only. An integrated workflow can be revisited later.

Next Steps

  1. Phase 1 Evaluation: Run shipper plan and preflight in tokmd workspace
  2. Generate Config: Create .shipper.toml via shipper config init
  3. Compare Plans: Verify shipper plan ordering matches xtask plan
  4. File Tracking Issue: Split into sub-issues for each phase if needed
  5. Draft PR: Prepare initial integration with --use-shipper flag

Context: This issue summarizes research on integrating shipper into tokmd's release workflow to improve publishing reliability and reduce custom code maintenance.
