Continuous Benchmarking using Github Actions #2134

Merged
merged 6 commits into from
Jun 11, 2025

Conversation

pablo-gf
Contributor

@pablo-gf pablo-gf commented May 5, 2025

This PR aims to provide a new benchmarking approach for liboqs. It uses the Continuous Benchmarking action from Marketplace, like the mlkem-native repository. For speed benchmarking of the current algorithms in the library, 3 new files are included:

  • scripts/parse_liboqs_speed.py: Retrieves the benchmarking data from speed_kem and speed_sig and writes it to a JSON file in the format required by the Continuous Benchmarking action.

  • workflows/kem-bench.yml: Iterates through the different KEM algorithms executing the speed test and gathering its information using parse_liboqs_speed.py. It then pushes the output json file to a gh-pages branch using the Continuous Benchmarking action.

  • workflows/sig-bench.yml: Same as kem-bench.yml, but for signature algorithms.
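
For readers unfamiliar with this kind of parsing step, here is a minimal sketch of what such a script might do. It is not the code in scripts/parse_liboqs_speed.py: the function name `parse_speed_table`, the sample table, and the assumption that the speed tools print pipe-delimited rows with the mean time in microseconds in the fourth column are all illustrative. The output shape (a list of objects with `name`, `unit`, and `value`) is what the action's custom "smaller is better" input is understood to accept.

```python
import json


def parse_speed_table(text):
    """Convert pipe-delimited speed rows into benchmark entries.

    Assumed (hypothetical) row layout:
    operation | iterations | total time (s) | mean time (us) | ...
    """
    entries = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) < 4:
            continue  # not a table row
        try:
            mean_us = float(cells[3])
        except ValueError:
            continue  # header or separator row
        entries.append({"name": cells[0], "unit": "us", "value": mean_us})
    return entries


# Made-up sample input, only to exercise the parser.
SAMPLE = """\
Operation | Iterations | Total time (s) | Time (us): mean
--------- | ----------:| --------------:| ---------------:
keygen    |      10000 |          3.000 |          30.5
encaps    |      10000 |          3.000 |          35.25
decaps    |      10000 |          3.000 |          40.0
"""

if __name__ == "__main__":
    # Print the JSON document that a workflow step could hand to the action.
    print(json.dumps(parse_speed_table(SAMPLE), indent=2))
```

Header and separator rows are skipped because their fourth cell does not parse as a number, so the parser does not need to know the exact header text.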

To complete the benchmarking setup, a new gh-pages branch must be created so that the workflows can generate and continuously update a GitHub page that visualizes the benchmarking results. I have adapted the HTML file to include some additional features here. I can include these changes once the new gh-pages branch for liboqs is set up. You can see an example of what the final output should look like here.

Let me know if you have any questions or suggestions!

  • Does this PR change the input/output behaviour of a cryptographic algorithm (i.e., does it change known answer test values)? (If so, a version bump will be required from x.y.z to x.(y+1).0.)
  • Does this PR change the list of algorithms available -- either adding, removing, or renaming? Does this PR otherwise change an API? (If so, PRs in fully supported downstream projects dependent on these, i.e., oqs-provider will also need to be ready for review and merge by the time this is merged.)

pablo-gf added 3 commits May 5, 2025 16:23
Signed-off-by: Pablo Gutiérrez Félix <[email protected]>
Signed-off-by: Pablo Gutiérrez Félix <[email protected]>
Member

@SWilson4 SWilson4 left a comment

Thanks for your patience, @pablo-gf! I've taken a look at the GitHub warnings; I think I have an idea how to resolve each of them.

@dstebila
Member

I think it would be good if the build & configuration information that's currently in the expandable "Latest commit build information" is displayed directly at the top of the page.

In the pop-ups that show when you hover over a datapoint, it looks like all the commits have been authored by you. Is that placeholder information? Or should something else be showing here?

Signed-off-by: Pablo Gutiérrez <[email protected]>
@pablo-gf
Contributor Author

@dstebila I have added the build information at the top, let me know if that works: https://pablo-gf.github.io/liboqs/dev/bench/. As for your second comment, that is placeholder information. The idea is that those pop-ups will contain the details of each commit made to the library starting from the first commit after this continuous benchmarking framework is deployed.

@dstebila
Member

Looks good, thanks!

Signed-off-by: Pablo Gutiérrez <[email protected]>
@pablo-gf
Contributor Author

@SWilson4 I fixed the security warnings that popped up after my last commit. Let me know if you have any comments or suggestions. As I mentioned at the beginning, to make the entire process work we would also need to create a new gh-pages branch so that the workflows can generate and continuously update a GitHub page that visualizes the benchmarking results.

@SWilson4
Member

Thanks for the updates, @pablo-gf! Are you able to merge this branch into main of your fork so we can test the commit-to-main flow and see if it's working as expected?

Signed-off-by: Pablo Gutiérrez <[email protected]>
@pablo-gf
Contributor Author

@SWilson4 @dstebila The tests now run successfully in the main branch of my fork (except for the basic downstream tests, as expected). I had to adjust a couple of minor details. Let me know if you have any suggestions before moving this PR to ready for review.

@pablo-gf pablo-gf marked this pull request as ready for review June 10, 2025 14:10
Member

@SWilson4 SWilson4 left a comment

LGTM. @pablo-gf is there anything remaining to be done from an admin point of view to enable this?

@pablo-gf
Contributor Author

Thank you @SWilson4! What's left is to create a branch called gh-pages so that the benchmarking results are posted there.

@SWilson4
Member

Done! I also set up branch protection so that we don't accidentally delete it.

@pablo-gf
Contributor Author

Awesome! Should be good to merge now.

@SWilson4
Member

Merging, thanks @pablo-gf for the contribution!

@SWilson4 SWilson4 merged commit d745d35 into open-quantum-safe:main Jun 11, 2025
79 checks passed
@dstebila
Member

In a merge today we got a whole bunch of alerts about speed regression, which I think is related to this continuous benchmarking PR landing. @pablo-gf do you have any idea what's going on here?

@pablo-gf
Contributor Author

pablo-gf commented Jun 12, 2025

Yes, @dstebila. There is an option to raise an alert if an algorithm's speed drops by more than a certain amount (I believe it's a specific percentage). I can definitely look into that if it's something we are not interested in.

@dstebila
Member

In principle I think we'd be interested in that. But the alerts being thrown in the commit I linked to seem to be too sensitive. Would you be able to check how the thresholds are configured?

@pablo-gf
Contributor Author

Yes @dstebila , here is some information from the page:

"This action can raise an alert to the commit when its benchmark results are worse than previous exceeding a specified threshold. By default, this action marks the result as performance regression when it is worse than the previous exceeding 200% threshold. For example, if the previous benchmark result was 100 iter/ns and this time it is 230 iter/ns, it means 230% worse than the previous and an alert will happen. The threshold can be changed by alert-threshold input."

I believe the current parameters are set to alert-threshold: 50%. I set it that way to make sure it was working during testing, but let me know the value you would like it to be increased to. I believe the default value in the ml-kem repository is 103%.
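
To make the threshold semantics concrete, here is a small sketch of the comparison as the quoted passage describes it for a smaller-is-better metric: the new value is divided by the previous one and the ratio is compared against the threshold. The helper names `parse_threshold` and `is_regression` are hypothetical and this is not the action's actual code; it only mirrors the quoted example (100 then 230 exceeds a 200% threshold).

```python
def parse_threshold(text):
    """Turn a percentage string such as '200%' into a ratio (2.0)."""
    return float(text.strip().rstrip("%")) / 100.0


def is_regression(previous, current, threshold):
    """Smaller-is-better check: alert when current/previous exceeds the threshold."""
    return current / previous > parse_threshold(threshold)


if __name__ == "__main__":
    previous = 100.0
    # Compare a faster run, a slightly slower run, and a much slower run
    # against the three thresholds mentioned in this thread.
    for current in (90.0, 104.0, 230.0):
        for threshold in ("200%", "50%", "103%"):
            flagged = is_regression(previous, current, threshold)
            print(f"prev={previous} cur={current} threshold={threshold} alert={flagged}")
```

Under this reading, a 50% threshold flags any run whose value is more than half the previous one, so even a run that got 10% faster would raise an alert. If the action behaves as quoted, that would account for the flood of alerts.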

@SWilson4
Member

In the past we deemed 15% as an acceptable variation. I think I would lower it now that we have some stable algorithms in the library—a 14% performance drop might not be cause for alarm for MAYO, but it certainly would be for ML-KEM. How about setting alert-threshold to 105% for now? If we find that we're getting a bunch of warnings for more experimental algs, we can raise it further.

@pablo-gf
Contributor Author

@SWilson4 Sounds good. Would you like me to create a new PR for that?

@SWilson4
Member

That would be great, thanks @pablo-gf.
