Skip to content

Conversation

@AlexJones0
Copy link
Contributor

@AlexJones0 AlexJones0 commented Feb 5, 2026

This PR makes a few changes to the sparse-fsm-encode.py util script with the primary goals of improving the reproducibility of the generated encodings, as well as cleaning up a few other miscellaneous things (long arguments, code formatting). See the commit messages for more details.

Importantly:

  • We record the git hash (where it exists), as otherwise we could have issues where encodings are generated on one version of the script but committed after some meaningful change to the script. This was motivated by the encodings in https://github.com/lowRISC/opentitan/blob/b70dde65943eb0dc10b4a1efaf9282b808e8fd6c/sw/device/silicon_creator/lib/drivers/alert.h, which were generated before the changes in f2c848c but committed after, making this difficult to reproduce without bisecting.
  • We record the Python version to account for changes in the standard library (e.g. to handle the case where random could unexpectedly have a breaking change).
  • We record whether the --avoid-zero option was given, as this changes the script's operation.
  • We pin the random seed version to 2 (the current version), to avoid breakages if a new seed version is implemented in the future.

Generated encodings now look something like:

/*
 * Encoding generated at commit c0750210a1 using Python 3.10.19 with:
 * $ ./util/design/sparse-fsm-encode.py --language=c \
 *     --seed 3775359077 --distance 2 --states 5 --bits 8
 *
 * Hamming distance histogram:
 *
 *  0: --
 *  1: --
 *  2: ||||||||||||| (20.00%)
 *  3: ||||||||||||| (20.00%)
 *  4: |||||||||||||||||||| (30.00%)
 *  5: ||||||||||||| (20.00%)
 *  6: |||||| (10.00%)
 *  7: --
 *  8: --
 *
 * Minimum Hamming distance: 2
 * Maximum Hamming distance: 6
 * Minimum Hamming weight: 3
 * Maximum Hamming weight: 6
 */
typedef enum my_state {
  kMyState0 = 0xee,
  kMyState1 = 0x64,
  kMyState2 = 0xa7,
  kMyState3 = 0x32,
  kMyState4 = 0x70,
} my_state_t;

jwnrt added 3 commits February 5, 2026 15:11
The short arguments are too ambiguous.

Signed-off-by: James Wainwright <[email protected]>
Allows the stdout to be copied directly without the instructions.

Signed-off-by: James Wainwright <[email protected]>
The generated code did not pass `rustfmt`.

Signed-off-by: James Wainwright <[email protected]>
jwnrt and others added 4 commits February 5, 2026 16:38
There is no strict guarantee that the semantics of Python's `random`
module (or the rest of the standard library) will remain consistent
across versions. Although breaking changes to randomness are handled
with care and provide appropriate version arguments, we should still log
the Python version for the sake of clear reproducibility and not losing
any information.

Signed-off-by: Alex Jones <[email protected]>
This avoids the skew issue where an FSM encoding is generated using the
script, then some change to the encoding logic is made (and committed),
and then later the FSM encoding is commited. In such a case, without
bisecting to find the relevant encoding logic change, you cannot
reproduce the results.

By recording the commit hash you can always reproduce by using the
provided command, git commit and Python version.

Signed-off-by: Alex Jones <[email protected]>
This hasn't changed since Python 3.2, but we really should be pinning
the random seed version to ensure deterministic and reproducible output
across different Python versions. This way, even if a new seed version
is introduced, we should be using the same seeding logic and so
deterministically generating the same numbers, even on later Python
versions.

Signed-off-by: Alex Jones <[email protected]>
Copy link
Contributor

@rswarbrick rswarbrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, and I've just gone through carefully and couldn't find anything I didn't love! Nice!

Copy link
Contributor

@jwnrt jwnrt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

@AlexJones0 AlexJones0 added this pull request to the merge queue Feb 6, 2026
Merged via the queue into lowRISC:master with commit 284ec94 Feb 6, 2026
77 of 78 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants