Skip to content

Conversation

@tristan-f-r
Copy link
Collaborator

@tristan-f-r tristan-f-r commented Oct 10, 2025

We also make graph statistics lazy. Laziness isn't used in summary.py, but I assume that we'll have more computationally expensive graph statistics as SPRAS develops, especially when it can take long to compute for our larger graphs.

Most importantly, this separates graph statistics into a separate function, so we can reuse the code for graph heuristic pruning.

@tristan-f-r tristan-f-r added tuning Workflow-spanning algorithm tuning refactor Changes that don't actually improve anything except for code quality. labels Oct 10, 2025
@read-the-docs-community
Copy link

read-the-docs-community bot commented Oct 10, 2025

Documentation build overview

📚 spras | 🛠️ Build #30218572 | 📁 Comparing c675ece against latest (c3b02cd)


🔍 Preview build

No files changed.

@tristan-f-r tristan-f-r requested a review from ntalluri October 14, 2025 17:38
@tristan-f-r tristan-f-r added the P-medium medium prirotity; this is needed for some external service or another PR label Oct 14, 2025
@tristan-f-r
Copy link
Collaborator Author

tristan-f-r commented Oct 14, 2025

Building on top of this PR allows me to add graph heuristics.

Most likely, every tuning PR will be at least marked with P-medium unless it's an end result.

@github-actions github-actions bot added the merge-conflict This PR has merge conflicts. label Oct 30, 2025
@tristan-f-r tristan-f-r mentioned this pull request Oct 30, 2025
1 task
@github-actions github-actions bot removed the merge-conflict This PR has merge conflicts. label Oct 30, 2025
@agitter
Copy link
Collaborator

agitter commented Nov 7, 2025

Before I can review the implementation of the change, I need to better understand what problem we are tying to solve with the change. Where will laziness be needed in the future?

we can reuse the code for graph heuristic pruning

Do we envision calling graph statistic computation twice per graph? After we compute these statistics on a graph once, shouldn't that be sufficient for an entire pass of a workflow?

@tristan-f-r
Copy link
Collaborator Author

I was going to ask @ntalluri about this, since I wasn't quite sure if we will have expensive graph heuristics or not.

Do we envision calling graph statistic computation twice per graph? After we compute these statistics on a graph once, shouldn't that be sufficient for an entire pass of a workflow?

I did decouple this from analysis: summary: enabled: true, and I imagined it like this. I didn't think about that, though: would it make sense to have graph summary statistics always enabled the moment any heuristics are enabled?

@agitter
Copy link
Collaborator

agitter commented Nov 8, 2025

There could be more than one way to design this sensibly. One would be that if heuristics are enabled in the config file, that automatically generates the graph summary table. The produces more output than requested, which is slightly undesirable.

Another could be to move the heuristic calculations inside each --parameters> subdirectory, which may be where you are headed. If that is written as a file for that one pathway, it could be consumed for heuristics (or used for heuristics and then written to disk). Later, if the graph summary table is requested, it would grab the precomputed statistics from those files in the subdirectories.

@tristan-f-r
Copy link
Collaborator Author

I'll mark this as a draft for now and design something in line with your second proposal.

@tristan-f-r tristan-f-r marked this pull request as draft November 8, 2025 08:06
@ntalluri ntalluri removed the P-medium medium prirotity; this is needed for some external service or another PR label Nov 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

refactor Changes that don't actually improve anything except for code quality. tuning Workflow-spanning algorithm tuning

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants