@BowTiedRadone (Contributor) commented Dec 31, 2025

This PR implements failure persistence and regression testing for Rendezvous. When a test fails, the failing seed and the configuration used are now automatically saved to .rendezvous-regressions/, so users can replay them on subsequent runs to prevent regressions.

Key additions

  • New --regr CLI flag to control test execution: if present, run only the regression tests
  • Persistent failure storage grouped by test type (invariant or property-based test, stored under the "invariant" and "test" keys) with automatic replay
  • Failures sorted by timestamp for chronological tracking

Sample failure file
.rendezvous-regressions/ST1PQHQKV0RJXZFY1DGX8MNSNYVE3VGZJSRTPGZGM.counter.json:

{
  "invariant": [
    {
      "seed": 901717247,
      "dial": "example/sip010.cjs",
      "numRuns": 15,
      "timestamp": 1767886534833
    },
    {
      "seed": -1374686468,
      "numRuns": 9,
      "timestamp": 1767886531457
    },
    {
      "seed": 1298457354,
      "dial": "example/sip010.cjs",
      "numRuns": 20,
      "timestamp": 1767883389583
    }
  ],
  "test": [
    {
      "seed": 1656313995,
      "numRuns": 6,
      "timestamp": 1767886553125
    },
    {
      "seed": 64830639,
      "numRuns": 11,
      "timestamp": 1767886546477
    },
    {
      "seed": 1593583466,
      "numRuns": 3,
      "timestamp": 1767886542907
    }
  ]
}
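
To make the file layout concrete, here is a minimal sketch of reading such a file and listing the saved failures for one test type, newest first, as in the sample. The helper name `loadRegressions` is hypothetical and is not taken from the PR; the sketch works on an in-memory JSON string rather than on the real `.rendezvous-regressions/` directory.

```javascript
// Sample content mirroring the file above (trimmed for brevity).
const fileContent = JSON.stringify({
  invariant: [
    { seed: -1374686468, numRuns: 9, timestamp: 1767886531457 },
    { seed: 901717247, dial: "example/sip010.cjs", numRuns: 15, timestamp: 1767886534833 },
  ],
  test: [{ seed: 1656313995, numRuns: 6, timestamp: 1767886553125 }],
});

// Hypothetical helper (not from the PR): return the saved failures for one
// test type ("invariant" or "test"), sorted newest first by timestamp.
function loadRegressions(json, type) {
  const parsed = JSON.parse(json);
  const entries = Array.isArray(parsed[type]) ? parsed[type] : [];
  // Copy before sorting so the parsed object is not mutated.
  return [...entries].sort((a, b) => b.timestamp - a.timestamp);
}

const invariantRegressions = loadRegressions(fileContent, "invariant");
// Logs the seeds 901717247 and -1374686468, in that order.
console.log(invariantRegressions.map((r) => r.seed));
```

Entries without a `dial` field, like the second one above, simply omit the key, so a reader of the file should treat it as optional.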

This brings Rendezvous closer to production-grade fuzzing by ensuring discovered bugs stay caught, while improving internal code organization and maintainability.

Closes #130.

@BowTiedRadone force-pushed the feat/failure-persistence branch from 97dbfad to 0fa8979 on January 5, 2026 19:10
@BowTiedRadone force-pushed the feat/failure-persistence branch from 65ce54d to ea642ea on January 6, 2026 21:57
@BowTiedRadone (Contributor, Author) commented

Since --mode is now binary and --mode=new has no effect, a --regr option would be more suitable. I'll update the PR accordingly.

@BowTiedRadone changed the title from "[DRAFT] Add failure persistence support for regression testing" to "Add failure persistence support for regression testing" on Jan 8, 2026
@BowTiedRadone marked this pull request as ready for review on January 8, 2026 18:57
@BowTiedRadone requested a review from a team as a code owner on January 8, 2026 18:57
// If the number of runs that failed is less than 100, set it to the
// default value of 100. If more runs were needed to reproduce the
// failure, use the number of runs that failed.
runs: regression.numRuns < 100 ? 100 : regression.numRuns,

@moodmosaic (Collaborator) commented Jan 12, 2026

If a failure occurred at run 50, why force 100 runs? This could mask issues where failures only occur early in the sequence.

@BowTiedRadone (Contributor, Author) commented

This is an extra safety mechanism. A historical failure might have surfaced after only 5 runs, but the same sequence of events (seed) can be considered passing only if, under the user's current configuration (seed, dial, etc.), it passes at least the default number of runs used by Rendezvous. You raised a great point, though: this behavior should be documented more explicitly.
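
The floor described here can be sketched in isolation. This is an illustration only: `replayRuns` and `DEFAULT_RUNS` are hypothetical names, and the value 100 is taken from the comment in the reviewed snippet. `Math.max` gives the same result as the ternary under review.

```javascript
// Default run count, as quoted in the reviewed code comment (assumed here).
const DEFAULT_RUNS = 100;

// Hypothetical helper (not from the PR): number of runs used when replaying
// a saved failure. Never fewer than the default, never fewer than the number
// of runs recorded for the original failure.
function replayRuns(regression, defaultRuns = DEFAULT_RUNS) {
  return Math.max(regression.numRuns, defaultRuns);
}

console.log(replayRuns({ seed: 901717247, numRuns: 15 })); // 100
console.log(replayRuns({ seed: 1, numRuns: 150 })); // 150
```

A failure recorded at run 50 is therefore replayed with 100 runs, which is the case the review question asks about.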

@BowTiedRadone (Contributor, Author) commented Jan 12, 2026

Let me know what you think, but please also keep in mind that only unique seeds are stored per test type (invariant/test). If a new failure occurs with the same seed but a different number of runs, it won't be persisted.
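
The uniqueness rule can be illustrated with a small pure function. This is a sketch under assumptions, not the PR's code: `persistFailure` is a hypothetical name, the store is a plain object shaped like the sample failure file, and entries are kept newest first.

```javascript
// Hypothetical helper (not from the PR): add a failure to the store unless
// the same seed is already recorded for that test type.
function persistFailure(store, type, failure) {
  const existing = store[type] ?? [];
  if (existing.some((f) => f.seed === failure.seed)) {
    // Same seed already stored for this type: nothing is persisted,
    // even if numRuns differs.
    return store;
  }
  const updated = [...existing, failure].sort((a, b) => b.timestamp - a.timestamp);
  return { ...store, [type]: updated };
}

const initial = { invariant: [{ seed: 7, numRuns: 5, timestamp: 1000 }], test: [] };

// Same seed, different numRuns: the stored entry keeps numRuns = 5.
const afterDuplicate = persistFailure(initial, "invariant", { seed: 7, numRuns: 50, timestamp: 2000 });

// Same seed under the other test type is a separate entry.
const afterOtherType = persistFailure(initial, "test", { seed: 7, numRuns: 50, timestamp: 2000 });
```

Under this rule, the stored `numRuns` is whatever the first recorded failure for that seed had, which is one reason a replay floor matters.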

Development

Successfully merging this pull request may close these issues.

Support failure persistence via seed replay