-
Notifications
You must be signed in to change notification settings - Fork 3
Add script to summarize dashboard results in a table #93
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughAdds util/summarize.py, a script that iterates git commits affecting dashboard/dashboard-results.yml, loads each version, counts PASS statuses for FP*-prefixed checks across ontologies, and writes a date-keyed TSV (summary.tsv) with totals per check and overall totals. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant Script as summarize.py
participant Git as git
participant YAML as PyYAML
participant FS as File System
User->>Script: Run main()
Script->>Git: git log --follow --reverse dashboard/dashboard-results.yml
Git-->>Script: [ (date, commit) ... ]
loop For each commit
Script->>Git: git show <commit>:dashboard/dashboard-results.yml
Git-->>Script: YAML content
Script->>YAML: yaml.safe_load(content)
YAML-->>Script: data (ontologies, results)
Script->>Script: summarize_run(data)\n- count PASS for FP* keys\n- compute totals
end
Script->>FS: Write summary.tsv (tab-delimited rows with date + counts)
FS-->>Script: Success
Script-->>User: Exit
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
util/summarize.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.13.3)
util/summarize.py
1-1: Shebang is present but file is not executable
(EXE001)
18-18: subprocess call: check for execution of untrusted input
(S603)
19-19: Starting a process with a partial executable path
(S607)
61-61: subprocess call: check for execution of untrusted input
(S603)
62-62: Starting a process with a partial executable path
(S607)
🔇 Additional comments (2)
util/summarize.py (2)
58-67: LGTM!The function correctly retrieves file content from a specific commit and safely parses it using
yaml.safe_load, which prevents arbitrary code execution from malicious YAML.
14-34: Confirm subprocess usage in list_commits is safe
list_commitsis only invoked with a hardcoded path inmain(), so there’s no untrusted input risk.
| @@ -0,0 +1,94 @@ | |||
| #!/usr/bin/env python3 | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the script executable.
The shebang is present but the file lacks execute permissions.
Run this command to fix:
chmod +x util/summarize.py🧰 Tools
🪛 Ruff (0.13.3)
1-1: Shebang is present but file is not executable
(EXE001)
🤖 Prompt for AI Agents
In util/summarize.py at line 1 the shebang is present but the file lacks execute
permissions; make the script executable by setting the user (and optionally
group/other) execute bit, e.g. run chmod +x util/summarize.py from the repo root
and verify the permission change with ls -l util/summarize.py.
| def summarize_run(data): | ||
| """Given a dashboard-results YAML | ||
| return a dictionary from principles to counts | ||
| of FP* checks that passed, | ||
| and the total number of ontologies.""" | ||
| result = defaultdict(int) | ||
| for ontology in data["ontologies"]: | ||
| result["total"] += 1 | ||
| if "results" not in ontology: | ||
| continue | ||
| for key, value in ontology["results"].items(): | ||
| if not key.startswith("FP"): | ||
| continue | ||
| if "status" not in value: | ||
| continue | ||
| if value["status"] != "PASS": | ||
| continue | ||
| result[key] += 1 | ||
| return result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add validation for required data structure.
The function accesses data["ontologies"] without validation. If the YAML file is malformed or missing this key in an old commit, the script will crash with a KeyError.
Apply this diff to add validation:
def summarize_run(data):
"""Given a dashboard-results YAML
return a dictionary from principles to counts
of FP* checks that passed,
and the total number of ontologies."""
result = defaultdict(int)
+ if "ontologies" not in data:
+ return result
for ontology in data["ontologies"]:
result["total"] += 1
if "results" not in ontology:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def summarize_run(data): | |
| """Given a dashboard-results YAML | |
| return a dictionary from principles to counts | |
| of FP* checks that passed, | |
| and the total number of ontologies.""" | |
| result = defaultdict(int) | |
| for ontology in data["ontologies"]: | |
| result["total"] += 1 | |
| if "results" not in ontology: | |
| continue | |
| for key, value in ontology["results"].items(): | |
| if not key.startswith("FP"): | |
| continue | |
| if "status" not in value: | |
| continue | |
| if value["status"] != "PASS": | |
| continue | |
| result[key] += 1 | |
| return result | |
| def summarize_run(data): | |
| """Given a dashboard-results YAML | |
| return a dictionary from principles to counts | |
| of FP* checks that passed, | |
| and the total number of ontologies.""" | |
| result = defaultdict(int) | |
| if "ontologies" not in data: | |
| return result | |
| for ontology in data["ontologies"]: | |
| result["total"] += 1 | |
| if "results" not in ontology: | |
| continue | |
| for key, value in ontology["results"].items(): | |
| if not key.startswith("FP"): | |
| continue | |
| if "status" not in value: | |
| continue | |
| if value["status"] != "PASS": | |
| continue | |
| result[key] += 1 | |
| return result |
🤖 Prompt for AI Agents
In util/summarize.py around lines 37 to 55, the function reads
data["ontologies"] without validating input; add a guard that verifies data is a
dict and contains an "ontologies" key whose value is a list. If the check fails,
either raise a clear ValueError (e.g. "invalid dashboard-results: missing or
non-list 'ontologies'") or return an empty defaultdict(int) so the caller can
handle malformed YAML, then proceed with the existing loop using the validated
value.
| fieldnames = list(rows[-1].keys()) | ||
| fieldnames.sort(key=str.casefold) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle empty commits list.
If no commits are found for the file (e.g., file doesn't exist at path, or git history issue), rows[-1] will raise an IndexError.
Apply this diff to add validation:
fieldnames = list(rows[-1].keys())
+ if not rows:
+ print(f"No commits found for {path}")
+ return
+ fieldnames = list(rows[-1].keys())
fieldnames.sort(key=str.casefold)Wait, let me fix the placement:
+ if not rows:
+ print(f"No commits found for {path}")
+ return
+
fieldnames = list(rows[-1].keys())
fieldnames.sort(key=str.casefold)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fieldnames = list(rows[-1].keys()) | |
| fieldnames.sort(key=str.casefold) | |
| if not rows: | |
| print(f"No commits found for {path}") | |
| return | |
| fieldnames = list(rows[-1].keys()) | |
| fieldnames.sort(key=str.casefold) |
🤖 Prompt for AI Agents
In util/summarize.py around lines 83-84, the code assumes rows is non-empty and
uses rows[-1] which will raise IndexError when no commits are found; update the
function to first check if rows is empty and handle that case (e.g., return
early with an empty result or raise a clear ValueError/RuntimeError with a
descriptive message), otherwise proceed to compute fieldnames from rows[-1] and
sort them; ensure callers of this function can handle the new empty-case
behavior.
|
@jamesaoverton, have you generated the TSV file? Running this script would be a bit more complicated because we use a Docker image. I might run it only on the other repository used for the releases. |
|
I barely remember this PR. I wanted a graph to show improvement in the dashboard scores over time, so I wrote this script, and put it here in case it's useful. You can do what you want with it, or just close the PR without merging. |
Read the git history of the
dashboard/dashboard-results.ymlfile,sum the number of passes for each FP* check,
and generate a
summary.tsvtable with rows for each date.Summary by CodeRabbit