[util] Improve sparse-fsm-encode.py reproducibility + small cleanups
#29259
Merged
The short arguments are too ambiguous. Signed-off-by: James Wainwright <[email protected]>
Allows the stdout to be copied directly without the instructions. Signed-off-by: James Wainwright <[email protected]>
The generated code did not pass `rustfmt`. Signed-off-by: James Wainwright <[email protected]>
jwnrt
reviewed
Feb 5, 2026
Signed-off-by: James Wainwright <[email protected]>
There is no strict guarantee that the semantics of Python's `random` module (or the rest of the standard library) will remain consistent across versions. Although breaking changes to randomness are handled with care and provide appropriate version arguments, we should still log the Python version for the sake of clear reproducibility and not losing any information. Signed-off-by: Alex Jones <[email protected]>
This avoids the skew issue where an FSM encoding is generated using the script, then some change to the encoding logic is made (and committed), and then later the FSM encoding is committed. In such a case, without bisecting to find the relevant encoding logic change, you cannot reproduce the results. By recording the commit hash you can always reproduce by using the provided command, git commit and Python version. Signed-off-by: Alex Jones <[email protected]>
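As a rough illustration of the two commits above, recording the Python version and the git commit hash next to the command could look something like the sketch below. This is not the code from the PR: the helper names `git_commit_hash` and `provenance_lines`, the `//` comment format, and the "unknown" fallback are all assumptions made for the example.

```python
import platform
import subprocess


def git_commit_hash() -> str:
    """Return the hash of HEAD, or "unknown" if git is unavailable (assumed fallback)."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is not installed, or we are not inside a git checkout.
        return "unknown"
    return result.stdout.strip()


def provenance_lines(command: str) -> list[str]:
    """Build header lines recording how an encoding was generated.

    The exact wording and comment style here are hypothetical.
    """
    return [
        f"// Command: {command}",
        f"// Python version: {platform.python_version()}",
        f"// Git commit: {git_commit_hash()}",
    ]
```

With all three pieces recorded, someone reproducing an encoding can check out the same commit, use the same Python version, and rerun the same command.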
This hasn't changed since Python 3.2, but we really should be pinning the random seed version to ensure deterministic and reproducible output across different Python versions. This way, even if a new seed version is introduced, we should be using the same seeding logic and so deterministically generating the same numbers, even on later Python versions. Signed-off-by: Alex Jones <[email protected]>
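For reference, pinning the seed version uses the `version` parameter of `random.seed`, which is part of the standard library. The sketch below is only an illustration under assumptions: the `make_rng` helper and the `SEED_VERSION` constant are hypothetical names, and the script's actual seeding code may be structured differently.

```python
import random

# Version 2 has been the default seeding algorithm since Python 3.2.
# Passing it explicitly keeps the behaviour fixed even if the default changes.
SEED_VERSION = 2


def make_rng(seed) -> random.Random:
    """Return a generator seeded with an explicitly pinned seed version."""
    rng = random.Random()
    rng.seed(seed, version=SEED_VERSION)
    return rng


# Two generators built from the same seed produce the same sequence.
first = make_rng(123)
second = make_rng(123)
same = [first.getrandbits(32) for _ in range(4)] == [
    second.getrandbits(32) for _ in range(4)
]
```

Note that the seed version mainly affects how `str`, `bytes` and `bytearray` seeds are converted to an integer, so the pin matters most if the seed is ever passed in one of those forms.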
8e49add to
c075021
rswarbrick
approved these changes
Feb 5, 2026
Contributor
rswarbrick
left a comment
This looks good to me, and I've just gone through carefully and couldn't find anything I didn't love! Nice!
jwnrt
approved these changes
Feb 5, 2026
Contributor
jwnrt
left a comment
LGTM, thanks
This PR makes a few changes to the `sparse-fsm-encode.py` util script with the primary goals of improving the reproducibility of the generated encodings, as well as cleaning up a few other miscellaneous things (long arguments, code formatting). See the commit messages for more details.

Importantly:
- […] `random` could unexpectedly have a breaking change).
- […] `--avoid-zero` option was given, as this changes the script's operation.

Generated encodings now look something like: