Overview
This issue summarizes research findings on tokmd's testing infrastructure and recommends concrete improvements to reduce CI time, increase test coverage visibility, and enhance automation.
Current CI/Test Setup Inventory
CI Configuration (.github/workflows/ci.yml)
Jobs (15+):
- MSRV Check (ubuntu-latest)
- Build & Test (ubuntu-latest, windows-latest - matrix)
- Build & Test (macos-latest - push only)
- Feature Boundaries (ubuntu-latest)
- Wasm Compile & Test (ubuntu-latest)
- Quality Gate (ubuntu-latest)
- Cargo Deny (ubuntu-latest)
- Typos (ubuntu-latest)
- Proptest Smoke (ubuntu-latest)
- Publish Plan (ubuntu-latest)
- Version Consistency (ubuntu-latest)
- Docs Check (ubuntu-latest)
- Nix PR Package Gate (ubuntu-latest)
- Mutation Testing (Required, PR only)
- CI (Required - aggregates all above)
Additional Workflows:
- test-action.yml - GitHub action testing
- mutants.yml - Full mutation testing (on-demand)
- fuzz.yml - Nightly fuzz testing (9 targets)
- cockpit.yml - PR cockpit report generation
Test Setup Across Crates
Scale:
- 67 workspace members (crates)
- 57 test directories (crates/*/tests/)
- 928 integration test files (crates/*/tests/*.rs)
Testing Tools:
- Property testing: proptest (256 cases, 10s timeout per case)
- Snapshot testing: insta v1.47.0 (configured in 14+ crates)
- Mutation testing: cargo-mutants v26.1.2 (PR-scoped, changed files only)
- Fuzz testing: cargo-fuzz (nightly builds, 9 targets)
Test Types:
- Unit tests (inline in src/)
- Integration tests (tests/ directories)
- Property-based tests (tests/properties.rs)
- Snapshot tests (insta assertions)
- Fuzz targets (fuzz/ directory)
Current Test Execution
Command: cargo test --all-features --verbose
Platforms:
- Ubuntu latest (primary)
- Windows latest (matrix)
- macOS latest (push only, slower runner)
- Wasm32 (via wasm-pack)
Performance:
- Test binaries run one after another (cargo test parallelizes only within a binary, not across binaries)
- No test categorization (unit vs integration vs slow)
- No coverage measurement
Identified Bottlenecks and Gaps
1. Test Parallelization (High Impact)
Problem:
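- By default each of the 928 integration test files compiles to its own test binary, and cargo test runs those binaries one after another on each platform
- cargo test already runs the tests inside a binary on multiple threads, so the sequential cost sits at the binary level
- Large test suite = longer CI time, especially on slower runners (macOS)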
Evidence:
# Current CI runs plain cargo test (one binary at a time) on 3 platforms
- name: Run tests
  run: cargo test --all-features --verbose
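To make the bottleneck concrete, the sketch below models wall-clock time when test binaries run one after another versus on a fixed number of parallel workers. The durations and worker count are hypothetical, and the greedy longest-first assignment is only a rough model, not a description of how cargo or nextest schedule work.

```rust
/// Wall-clock time when every test binary runs one after another.
fn sequential_secs(durations: &[u64]) -> u64 {
    durations.iter().sum()
}

/// Approximate wall-clock time with `workers` parallel slots, using a greedy
/// longest-first assignment (a model only, not any runner's real scheduler).
fn parallel_secs(durations: &[u64], workers: usize) -> u64 {
    assert!(workers > 0, "need at least one worker");
    let mut sorted = durations.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a)); // longest first
    let mut load = vec![0u64; workers];
    for d in sorted {
        // Give the next binary to the currently least-loaded worker.
        let idx = (0..workers).min_by_key(|&i| load[i]).unwrap();
        load[idx] += d;
    }
    load.into_iter().max().unwrap_or(0)
}

fn main() {
    // Hypothetical per-binary durations in seconds.
    let durations = [30, 20, 20, 10, 10, 5, 5];
    println!("sequential: {}s", sequential_secs(&durations));
    println!("4 workers:  {}s", parallel_secs(&durations, 4));
}
```

In this model the parallel time can never drop below the longest single binary, which is why a few very slow test binaries limit the achievable speed-up.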
2. Missing Coverage Measurement (Medium Impact)
Problem:
- No code coverage visibility
- No coverage gates (e.g., require 80% coverage)
- Cannot track coverage trends over time
- Mutation testing exists but doesn't provide coverage metrics
Evidence:
- No cargo-tarpaulin or cargo-llvm-cov in workflows
- No coverage artifacts or reports
- No coverage comments on PRs
3. Limited Integration Test Automation (Medium Impact)
Problem:
- No matrix for testing different feature combinations
- Feature boundary tests are hand-written per crate rather than generated systematically
- No explicit test categorization (unit/integration/e2e)
Evidence:
# Only runs "all-features" and "no-default-features" for tokmd-analysis
- name: tokmd-analysis with all features
  run: cargo test -p tokmd-analysis --all-features --verbose
- name: tokmd-analysis with no default features
  run: cargo test -p tokmd-analysis --no-default-features --verbose
4. Slow Test Identification (Low Impact)
Problem:
- No test categorization (slow tests not isolated)
- No test duration tracking
- All tests run in every CI job
Evidence:
- No #[ignore] attributes or other slow-test markers found (Rust has no built-in #[slow] attribute)
- No test timing reports in CI output
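Until the test runner reports timings, slow tests can be spotted by hand. A minimal sketch using only the standard library; the helper names `timed` and `is_slow` and the 50 ms budget are hypothetical and not part of tokmd:

```rust
use std::time::{Duration, Instant};

/// Runs `f`, returning its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// True when a measured duration exceeds the budget for a "fast" test.
fn is_slow(elapsed: Duration, budget: Duration) -> bool {
    elapsed > budget
}

fn main() {
    let budget = Duration::from_millis(50); // hypothetical threshold
    let (sum, elapsed) = timed(|| (1..=1_000u64).sum::<u64>());
    println!("sum={sum} took {elapsed:?} slow={}", is_slow(elapsed, budget));
}
```

Wrapping a handful of suspect tests this way is enough to decide which ones deserve an #[ignore] marker before any tooling changes land.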
5. Snapshot Test Workflow Gaps (Low Impact)
Problem:
- insta is configured but no CI workflow for snapshot review
- No automated snapshot update process
- Risk of snapshot drift
Recommended Improvements
Priority 1: Adopt cargo-nextest for Parallelization ⚡
Implementation:
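# Install nextest
- name: Install cargo-nextest
  uses: taiki-e/install-action@v2
  with:
    tool: cargo-nextest
# Run test binaries in parallel
- name: Run tests (nextest)
  run: cargo nextest run --all-features --workspace --verbose
# nextest does not run doctests, so keep a separate step for them
- name: Run doctests
  run: cargo test --doc --all-features --workspace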
Benefits:
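- Estimated 3-5x faster test execution on multi-core runners (not yet measured; the gain depends on core count and how evenly test time is spread across binaries)
- Better test failure reporting
- Per-test timing data out of the box
- Test partitioning (--partition) for sharding across parallel CI jobs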
Estimated Impact:
- Current: ~10-15 minutes per platform
- With nextest: ~3-5 minutes per platform
- Overall test-time reduction: 60-70% (estimate, to be confirmed by measurement)
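The sharding idea can be illustrated with a small sketch: each test name is hashed to one of N shards so that separate CI jobs run disjoint subsets. This shows the general technique only; it is not nextest's actual partitioning algorithm, and the test names are hypothetical. FNV-1a is hand-rolled here so that assignments stay stable across runs and toolchain versions.

```rust
/// FNV-1a 64-bit hash: deterministic across runs and platforms.
fn fnv1a(s: &str) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in s.bytes() {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3); // FNV prime
    }
    h
}

/// Zero-based shard index for a test name.
fn shard_of(test_name: &str, shards: u64) -> u64 {
    assert!(shards > 0, "need at least one shard");
    fnv1a(test_name) % shards
}

/// The subset of `tests` that the given shard should run.
fn select<'a>(tests: &[&'a str], shard: u64, shards: u64) -> Vec<&'a str> {
    tests
        .iter()
        .copied()
        .filter(|t| shard_of(t, shards) == shard)
        .collect()
}

fn main() {
    // Hypothetical test names.
    let tests = ["cli::help", "scan::walk", "format::json", "format::md"];
    for shard in 0..2 {
        println!("shard {shard}: {:?}", select(&tests, shard, 2));
    }
}
```

Hash-based assignment keeps a test in the same shard when unrelated tests are added or removed, at the cost of shards that are not perfectly balanced.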
Priority 2: Add Code Coverage Measurement 📊
Implementation (Option A - cargo-tarpaulin):
- name: Generate coverage report
  run: |
    cargo install cargo-tarpaulin
    cargo tarpaulin --workspace --all-features \
      --out Xml --output-dir ./coverage \
      --ignore-tests --timeout 300
- name: Upload coverage to Codecov
  uses: codecov/codecov-action@v4
  with:
    files: ./coverage/cobertura.xml
Implementation (Option B - cargo-llvm-cov):
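- name: Generate coverage report (llvm-cov)
  run: |
    # --html and --lcov are mutually exclusive: collect once, render twice
    # (assumes cargo-llvm-cov is already installed on the runner)
    cargo llvm-cov --workspace --all-features --no-report
    cargo llvm-cov report --html
    cargo llvm-cov report --lcov --output-path lcov.info
- name: Upload coverage artifacts
  uses: actions/upload-artifact@v4
  with:
    name: coverage-report
    path: target/llvm-cov/html/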
Benefits:
- Visibility into test coverage gaps
- Coverage trends over time
- PR comments showing coverage impact
- Can set minimum coverage thresholds
Estimated Impact:
- 15-30% overhead per job (acceptable trade-off)
- Enables coverage gates (e.g., require 80% line coverage)
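A coverage gate reduces to one comparison once the report is parsed. The sketch below sums the `LF:` (lines found) and `LH:` (lines hit) records of an LCOV report and checks the result against a minimum. The function names and the sample report are hypothetical, and a real setup would more likely rely on the coverage tool's own threshold option than on a hand-written parser.

```rust
/// Line coverage in percent from LCOV text, summing the `LF:` (lines found)
/// and `LH:` (lines hit) records. Returns None when nothing was instrumented.
fn line_coverage_pct(lcov: &str) -> Option<f64> {
    let (mut found, mut hit) = (0u64, 0u64);
    for line in lcov.lines() {
        let line = line.trim();
        if let Some(n) = line.strip_prefix("LF:") {
            found += n.parse::<u64>().unwrap_or(0);
        } else if let Some(n) = line.strip_prefix("LH:") {
            hit += n.parse::<u64>().unwrap_or(0);
        }
    }
    if found == 0 {
        None
    } else {
        Some(hit as f64 * 100.0 / found as f64)
    }
}

/// True when the report meets the minimum line coverage.
fn passes_gate(lcov: &str, min_pct: f64) -> bool {
    line_coverage_pct(lcov).map_or(false, |pct| pct >= min_pct)
}

fn main() {
    // Hypothetical two-file report: 90 of 100 and 30 of 50 lines hit.
    let report = "SF:src/a.rs\nLF:100\nLH:90\nend_of_record\n\
                  SF:src/b.rs\nLF:50\nLH:30\nend_of_record\n";
    println!("coverage: {:?}", line_coverage_pct(report));
    println!("passes 80% gate: {}", passes_gate(report, 80.0));
}
```

Note that the gate applies to the workspace total: in the sample, one file at 60% still passes because the other lifts the aggregate to the threshold.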
Priority 3: Test Categorization & Smart CI 🎯
Implementation:
// Mark slow or expensive tests
#[test]
#[ignore = "slow - run only in nightly CI"]
fn expensive_integration_test() { /* ... */ }
// Mark integration tests (assumes the crate depends on tokio with its macros feature)
#[tokio::test]
#[ignore = "integration - run in dedicated job"]
async fn api_integration_test() { /* ... */ }
CI Workflow:
# Split into faster unit tests and slower integration tests
# (checkout and toolchain/nextest setup steps omitted)
unit-tests:
  runs-on: ubuntu-latest
  steps:
    - run: cargo nextest run --workspace --all-features --no-fail-fast --lib
integration-tests:
  runs-on: ubuntu-latest
  steps:
    - run: cargo nextest run --workspace --all-features --test-threads=1 --test '*'
slow-tests:
  if: github.event_name == 'schedule' || contains(github.event.head_commit.message, '[run-slow]')
  runs-on: ubuntu-latest
  steps:
    - run: cargo nextest run --workspace --all-features --run-ignored only
Benefits:
- Faster PR feedback (unit tests run first)
- Reduced resource usage
- Better test organization
Priority 4: Feature Matrix Testing 🧪
Implementation:
feature-matrix:
  runs-on: ubuntu-latest
  strategy:
    matrix:
      features:
        - "--all-features"
        - "--no-default-features"
        - "--features git"
        - "--features wasm"
  steps:
    - run: cargo nextest run --workspace ${{ matrix.features }}
Benefits:
- Catch feature boundary issues earlier
- Ensure feature flags work in isolation
- Prevent feature combinatorial bugs
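To see what combinatorial coverage would involve, the sketch below enumerates every subset of a feature list and renders each as cargo arguments. The feature names `git` and `wasm` are taken from the matrix above; whether every crate in the workspace defines them is an assumption, and the helper `feature_args` is hypothetical.

```rust
/// Every subset of `features`, rendered as cargo arguments.
/// The empty subset becomes a bare `--no-default-features`.
fn feature_args(features: &[&str]) -> Vec<String> {
    let n = features.len();
    assert!(n < 16, "the power set grows as 2^n; keep the list short");
    (0u32..(1 << n))
        .map(|mask| {
            // Bit i of `mask` decides whether feature i is enabled.
            let chosen: Vec<&str> = features
                .iter()
                .enumerate()
                .filter(|&(i, _)| mask & (1 << i) != 0)
                .map(|(_, f)| *f)
                .collect();
            if chosen.is_empty() {
                "--no-default-features".to_string()
            } else {
                format!("--no-default-features --features {}", chosen.join(","))
            }
        })
        .collect()
}

fn main() {
    for args in feature_args(&["git", "wasm"]) {
        println!("cargo nextest run --workspace {args}");
    }
}
```

Two features yield four combinations and ten would yield 1,024, so a full power set is practical only for a short list. Tools such as cargo-hack offer a feature-powerset mode that may be a better fit than a hand-maintained matrix.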
Priority 5: Snapshot Test Automation 📸
Implementation:
# In CI, check snapshots (don't update)
env:
  INSTA_UPDATE: "no"  # quoted so YAML cannot read it as a boolean
# Fail on any snapshot mismatch; the test output shows the diff
# (cargo insta review is interactive, so it is not usable in CI)
- name: Check snapshots
  run: cargo insta test --check
# On explicit approval, update snapshots
- name: Update snapshots
  if: github.event_name == 'workflow_dispatch'
  env:
    INSTA_UPDATE: always
  run: cargo insta test --accept --unreferenced=auto
Benefits:
- Prevent snapshot drift
- Clear review process for snapshot changes
- Automated snapshot updates with approval
Estimated CI Time Reduction
Current State (Estimates)
- Ubuntu Build & Test: ~8-12 minutes
- Windows Build & Test: ~10-15 minutes (slower runner)
- macOS Build & Test: ~12-18 minutes (slowest runner)
- Total test time, summed across the three platforms: ~30-45 minutes
With cargo-nextest (Priority 1)
- Ubuntu Build & Test: ~2-4 minutes (-70%)
- Windows Build & Test: ~3-5 minutes (-70%)
- macOS Build & Test: ~4-6 minutes (-70%)
- Total test time, summed across the three platforms: ~9-15 minutes (-60-70%)
With All Improvements (Priorities 1-3)
- Unit tests (fast): ~2-3 minutes total (all platforms)
- Integration tests (medium): ~4-6 minutes total
- Slow tests (nightly): ~10-15 minutes (runs only on schedule)
- PR feedback time: ~6-9 minutes (-70-80%)
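The percentages quoted in this section can be checked against the minute ranges directly. A small sketch of that arithmetic, using only the estimates listed above (none of them are measurements):

```rust
/// Percentage reduction when a duration drops from `before` to `after`.
fn reduction_pct(before: f64, after: f64) -> f64 {
    (before - after) * 100.0 / before
}

fn main() {
    // (low, high) estimates in minutes, copied from the lists above.
    let current = (30.0, 45.0);
    let nextest_only = (9.0, 15.0);
    let all_priorities = (6.0, 9.0);

    for (label, est) in [("nextest only", nextest_only), ("priorities 1-3", all_priorities)] {
        println!(
            "{label}: {:.0}% (low end), {:.0}% (high end)",
            reduction_pct(current.0, est.0),
            reduction_pct(current.1, est.1),
        );
    }
}
```

Comparing low end with low end and high end with high end gives roughly 67-70% for nextest alone and 80% for Priorities 1-3, which is consistent with the ranges stated above and sits at the upper edge of the 70-80% figure.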
Implementation Roadmap
Phase 1: Quick Wins (1-2 days)
- Replace cargo test with cargo nextest run
Phase 2: Coverage (2-3 days)
Phase 3: Test Organization (3-5 days)
Phase 4: Advanced (1-2 days)
Next Steps
- Decision Point: Do we want to adopt cargo-nextest? (High confidence, low risk)
- Decision Point: Which coverage tool? (tarpaulin = simpler, llvm-cov = faster/more accurate)
- Decision Point: Should we gate on minimum coverage? (Requires consensus on threshold)
- Action: Start with Phase 1 (nextest adoption) as a proof-of-concept
Related Issues
- None yet (this is the research baseline)
References