Continuous Benchmarking using GitHub Actions #2134


Status: Draft. Wants to merge 5 commits into base branch main.
6 changes: 6 additions & 0 deletions .github/workflows/commit-to-main.yml
Expand Up @@ -22,3 +22,9 @@ jobs:
  basic-downstream:
    uses: ./.github/workflows/downstream-basic.yml
    secrets: inherit

  call-kem-benchmarking:
    uses: ./.github/workflows/kem-bench.yml

  call-sig-benchmarking:
    uses: ./.github/workflows/sig-bench.yml
121 changes: 121 additions & 0 deletions .github/workflows/kem-bench.yml
@@ -0,0 +1,121 @@
name: kem benchmark

on:
  workflow_dispatch:
  workflow_call:

permissions:
  contents: read

jobs:
  build:
    runs-on: ubuntu-latest
Review comment (Member):

Eventually we'll want to have this working on multiple runners. How scalable is the approach taken here?

I walked through the PQCP setup with Matthias and Ry last week; once open-quantum-safe/tsc#180 lands we'll be able to work with a similar setup.

Reply (@pablo-gf, Contributor, Author, May 21, 2025):

@SWilson4 Initially, my idea would be to create another matrix for the different runners that we will be using for benchmarking, but of course any ideas are welcome. I am not too familiar with external runners, but we could follow an approach similar to https://github.com/pq-code-package/mlkem-native, where there are different workflows depending on the runner selected, and they all call an action which executes the benchmarking.

Also, I fixed your other comments. Let me know if you have any other feedback!

    steps:
      # Checkout repository
      - name: Checkout repository
        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # pin@v4
        with:
          fetch-depth: 0

      # Set up dependencies
      - name: Install dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y cmake ninja-build gcc g++ python3 python3-pip
          sudo apt-get install -y python3-cpuinfo

      # Build the speed_kem binary only
      - name: Build speed_kem binary
        run: |
          mkdir -p build
          cd build
          cmake -GNinja .. -DBUILD_SHARED_LIBS=OFF
          ninja speed_kem

      # Copy the parse_liboqs_speed.py script
      - name: Copy parse_liboqs_speed.py
        run: |
          cp scripts/parse_liboqs_speed.py build/tests/

      # Upload the built binary and script as an artifact
      - name: Upload artifacts
        uses: actions/upload-artifact@1746f4ab65b179e0ea60a494b83293b640dd5bba # pin@v4
        with:
          name: built-binary
          path: build/tests/

  benchmark:
    needs: build
    runs-on: ubuntu-latest
    permissions:
      contents: write
    strategy:
      matrix:
        algorithm: [ # List of available KEMs to benchmark
          "BIKE-L1",
          "BIKE-L3",
          "BIKE-L5",
          "Classic-McEliece-348864",
          "Classic-McEliece-348864f",
          "Classic-McEliece-460896",
          "Classic-McEliece-460896f",
          "Classic-McEliece-6688128",
          "Classic-McEliece-6688128f",
          "Classic-McEliece-6960119",
          "Classic-McEliece-6960119f",
          "Classic-McEliece-8192128",
          "Classic-McEliece-8192128f",
          "Kyber512",
          "Kyber768",
          "Kyber1024",
          "ML-KEM-512",
          "ML-KEM-768",
          "ML-KEM-1024",
          "sntrup761",
          "FrodoKEM-640-AES",
          "FrodoKEM-640-SHAKE",
          "FrodoKEM-976-AES",
          "FrodoKEM-976-SHAKE",
          "FrodoKEM-1344-AES",
          "FrodoKEM-1344-SHAKE"
        ]
      max-parallel: 1 # Run jobs sequentially so the benchmark action's pull-push operations below do not race

    steps:
      # Ensure the repository is checked out
      - name: Checkout repository
        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # pin@v4
        with:
          fetch-depth: 0

      # Download the built binary and script
      - name: Download artifacts
        uses: actions/download-artifact@1746f4ab65b179e0ea60a494b83293b640dd5bba # pin@v4
        with:
          name: built-binary
          path: build/tests/

      # Set execute permissions for the binary
      - name: Set execute permissions
        run: chmod +x build/tests/speed_kem

      # Run speed_kem tests for each algorithm
      - name: Run speed_kem tests
        run: |
          cd build/tests
          ./speed_kem "${{matrix.algorithm}}" > ${{matrix.algorithm}}_output.txt
          python3 parse_liboqs_speed.py ${{matrix.algorithm}}_output.txt --algorithm ${{matrix.algorithm}}

      # Push to GitHub Pages using github-action-benchmark
      - name: Store benchmark result
        uses: benchmark-action/github-action-benchmark@d48d326b4ca9ba73ca0cd0d59f108f9e02a381c7
        with:
          name: ${{matrix.algorithm}}
          tool: "customSmallerIsBetter"
          output-file-path: build/tests/${{matrix.algorithm}}_formatted.json
          github-token: ${{ secrets.GITHUB_TOKEN }}
          auto-push: true
          comment-on-alert: true
          summary-always: true
          alert-threshold: 50%
          comment-always: true
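For reference, the "customSmallerIsBetter" tool of github-action-benchmark expects output-file-path to point at a JSON array of objects, each with a "name", a "unit", and a numeric "value" (smaller is better), plus an optional "extra" string shown in the chart tooltip. A minimal Python sketch of a file in that shape; the numbers and build-info text are made up for illustration, not real liboqs measurements:

```python
import json

# One illustrative entry in the shape consumed by customSmallerIsBetter.
# The cycle count and "extra" build info below are hypothetical.
entry = {
    "name": "ML-KEM-768 keygen",   # benchmark series name
    "value": 23456,                # hypothetical mean CPU cycles
    "unit": "cycles",              # displayed on the chart's y-axis
    "extra": "Compiler: gcc | ",   # hypothetical build configuration
}

# The workflow expects <algorithm>_formatted.json next to the binary.
with open("ML-KEM-768_formatted.json", "w") as f:
    json.dump([entry], f)
```

The action appends each such entry to a data series keyed by "name" on the GitHub Pages branch, which is why the parser below emits one object per operation (keygen, encaps, decaps).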
151 changes: 151 additions & 0 deletions .github/workflows/sig-bench.yml
@@ -0,0 +1,151 @@
name: sig benchmark

on:
  workflow_dispatch:
  workflow_call:

permissions:
  contents: read

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Checkout repository
      - name: Checkout repository
        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # pin@v4
        with:
          fetch-depth: 0

      # Set up dependencies
      - name: Install dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y cmake ninja-build gcc g++ python3 python3-pip
          sudo apt-get install -y python3-cpuinfo

      # Build the speed_sig binary only
      - name: Build speed_sig binary
        run: |
          mkdir -p build
          cd build
          cmake -GNinja .. -DBUILD_SHARED_LIBS=OFF
          ninja speed_sig

      # Copy the parse_liboqs_speed.py script
      - name: Copy parse_liboqs_speed.py
        run: |
          cp scripts/parse_liboqs_speed.py build/tests/

      # Upload the built binary and script as an artifact
      - name: Upload artifacts
        uses: actions/upload-artifact@1746f4ab65b179e0ea60a494b83293b640dd5bba # pin@v4
        with:
          name: built-sig-binary
          path: build/tests/

  benchmark:
    needs: build
    runs-on: ubuntu-latest
    permissions:
      contents: write
    strategy:
      matrix:
        algorithm: [ # List of available signature schemes to benchmark
          "Dilithium2",
          "Dilithium3",
          "Dilithium5",
          "ML-DSA-44",
          "ML-DSA-65",
          "ML-DSA-87",
          "Falcon-512",
          "Falcon-1024",
          "Falcon-padded-512",
          "Falcon-padded-1024",
          "SPHINCS+-SHA2-128f-simple",
          "SPHINCS+-SHA2-128s-simple",
          "SPHINCS+-SHA2-192f-simple",
          "SPHINCS+-SHA2-192s-simple",
          "SPHINCS+-SHA2-256f-simple",
          "SPHINCS+-SHA2-256s-simple",
          "SPHINCS+-SHAKE-128f-simple",
          "SPHINCS+-SHAKE-128s-simple",
          "SPHINCS+-SHAKE-192f-simple",
          "SPHINCS+-SHAKE-192s-simple",
          "SPHINCS+-SHAKE-256f-simple",
          "SPHINCS+-SHAKE-256s-simple",
          "MAYO-1",
          "MAYO-2",
          "MAYO-3",
          "MAYO-5",
          "cross-rsdp-128-balanced",
          "cross-rsdp-128-fast",
          "cross-rsdp-128-small",
          "cross-rsdp-192-balanced",
          "cross-rsdp-192-fast",
          "cross-rsdp-192-small",
          "cross-rsdp-256-balanced",
          "cross-rsdp-256-fast",
          "cross-rsdp-256-small",
          "cross-rsdpg-128-balanced",
          "cross-rsdpg-128-fast",
          "cross-rsdpg-128-small",
          "cross-rsdpg-192-balanced",
          "cross-rsdpg-192-fast",
          "cross-rsdpg-192-small",
          "cross-rsdpg-256-balanced",
          "cross-rsdpg-256-fast",
          "cross-rsdpg-256-small",
          "OV-Is",
          "OV-Ip",
          "OV-III",
          "OV-V",
          "OV-Is-pkc",
          "OV-Ip-pkc",
          "OV-III-pkc",
          "OV-V-pkc",
          "OV-Is-pkc-skc",
          "OV-Ip-pkc-skc",
          "OV-III-pkc-skc",
          "OV-V-pkc-skc"
        ]
      max-parallel: 1 # Run jobs sequentially so the benchmark action's pull-push operations below do not race

    steps:
      # Ensure the repository is checked out
      - name: Checkout repository
        uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # pin@v4
        with:
          fetch-depth: 0

      # Download the built binary and script
      - name: Download artifacts
        uses: actions/download-artifact@1746f4ab65b179e0ea60a494b83293b640dd5bba # pin@v4
        with:
          name: built-sig-binary
          path: build/tests/

      # Set execute permissions for the binary
      - name: Set execute permissions
        run: chmod +x build/tests/speed_sig

      # Run speed_sig tests for each algorithm
      - name: Run speed_sig tests
        run: |
          cd build/tests
          ./speed_sig "${{matrix.algorithm}}" > ${{matrix.algorithm}}_output.txt
          python3 parse_liboqs_speed.py ${{matrix.algorithm}}_output.txt --algorithm ${{matrix.algorithm}}

      # Push to GitHub Pages using github-action-benchmark
      - name: Store benchmark result
        uses: benchmark-action/github-action-benchmark@d48d326b4ca9ba73ca0cd0d59f108f9e02a381c7
        with:
          name: ${{matrix.algorithm}}
          tool: "customSmallerIsBetter"
          output-file-path: build/tests/${{matrix.algorithm}}_formatted.json
          github-token: ${{ secrets.GITHUB_TOKEN }}
          auto-push: true
          comment-on-alert: true
          summary-always: true
          alert-threshold: 50%
          comment-always: true
71 changes: 71 additions & 0 deletions scripts/parse_liboqs_speed.py
@@ -0,0 +1,71 @@
# SPDX-License-Identifier: MIT

import json
import re
import argparse
from enum import Enum

class State(Enum):
    starting = 0
    config = 1
    parsing = 2

data = []

# Parse command-line arguments
parser = argparse.ArgumentParser(description="Parse speed_kem output and extract cycles.")
parser.add_argument("logfile", help="Log file to parse")
parser.add_argument("--algorithm", help="Algorithm name (e.g., BIKE-L1)", required=True)
args = parser.parse_args()

fn = args.logfile
alg = args.algorithm
state = State.starting

config = ''

with open(fn) as fp:
    while True:
        line = fp.readline()
        if not line:
            break
        # Remove trailing whitespace, including the newline
        line = line.rstrip()
        if state == State.starting:
            if line.startswith("Configuration info"):
                state = State.config
                fp.readline()
        elif state == State.config:
            if line == "":  # Blank line (newline was stripped above): skip forward
                fp.readline()
                fp.readline()
            if line.startswith("-------"):
                state = State.parsing
            elif line.startswith("Started at"):
                fp.readline()
            elif ":" in line:
                # Retrieve build configuration as "key: value | key: value | ..."
                config = config + line[:line.index(":")] + ": " + line[line.index(":") + 1:].lstrip() + " | "

        elif state == State.parsing:
            if line.startswith("Ended"):  # Finish
                break
            else:
                alg = line[:line.index(" ")]
                p = re.compile(r'\S+\s*\|')  # raw string avoids an invalid-escape warning
                for _ in range(3):  # Iterate through the different operations under each algorithm
                    x = p.findall(fp.readline().rstrip())
                    tag = x[0][:x[0].index(" ")]  # keygen, encaps, decaps
                    iterations = float(x[1][:x[1].index(" ")])  # Iterations
                    total_t = float(x[2][:x[2].index(" ")])  # Total time
                    mean_t = float(x[3][:x[3].index(" ")])  # Mean time in microseconds
                    cycles = int(x[5][:x[5].index(" ")])  # Cycles
                    val = iterations / total_t  # Number of iterations per second

                    data.append({"name": alg + " " + tag, "value": cycles, "unit": "cycles", "extra": config})
        else:
            print("Unknown state: %s" % (line))

# Dump data; use the CLI-provided algorithm name so the file name matches the
# <algorithm>_formatted.json path the workflows pass to output-file-path
output_file = f"{args.algorithm}_formatted.json"
with open(output_file, 'w') as outfile:
    json.dump(data, outfile)
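The tokenizing step in the parsing state is compact, so here is a minimal sketch of how the regex r'\S+\s*\|' splits one results row into cells. The sample line below is illustrative of the pipe-separated table layout, not verbatim speed_kem output, and the numbers are made up:

```python
import re

# Each match is "token, optional padding, pipe"; slicing up to the first
# space recovers the bare token, as the parser above does.
p = re.compile(r'\S+\s*\|')

# Hypothetical row: operation, iterations, total time (s), mean time (us),
# pop. stdev, mean CPU cycles.
sample = "keygen     |     961077 |      3.000 |      3.122 |      0.633 |       7283 |"
cells = p.findall(sample)

tag = cells[0][:cells[0].index(" ")]                # operation name: "keygen"
iterations = float(cells[1][:cells[1].index(" ")])  # iteration count
cycles = int(cells[5][:cells[5].index(" ")])        # mean CPU cycles
```

Note that this slicing relies on every matched cell containing a space before the pipe; a cell like "7283|" with no padding would make the index(" ") call raise ValueError, which is worth keeping in mind if the table format ever changes.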