refactor: separate statistic computation #411
we also make it lazy
|
Building on top of this PR allows me to add graph heuristics. Most likely, every |
|
Before I can review the implementation of the change, I need to better understand what problem we are trying to solve with it. Where will laziness be needed in the future?
Do we envision calling graph statistic computation twice per graph? After we compute these statistics on a graph once, shouldn't that be sufficient for an entire pass of a workflow? |
|
I was going to ask @ntalluri about this, since I wasn't quite sure if we will have expensive graph heuristics or not.
I did decouple this from |
|
There could be more than one way to design this sensibly. One would be that enabling heuristics in the config file automatically generates the graph summary table. This produces more output than requested, which is slightly undesirable. Another could be to move the heuristic calculations inside each --parameters> subdirectory, which may be where you are headed. If the statistics are written as a file for that one pathway, that file could be consumed for heuristics (or the statistics could be used for heuristics and then written to disk). Later, if the graph summary table is requested, it would grab the precomputed statistics from those files in the subdirectories. |
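To make the second proposal concrete, here is a minimal sketch of a per-pathway statistics file that is computed once and then reused. Everything in it is an assumption for illustration, not existing SPRAS code: the file name `graph-statistics.json`, the helper names `compute_statistics` and `load_or_compute_statistics`, and the choice of JSON as the on-disk format are all hypothetical.

```python
import json
from pathlib import Path

# Hypothetical file name; the real layout of the subdirectories may differ.
STATS_FILENAME = "graph-statistics.json"


def compute_statistics(edges):
    """Compute a few cheap statistics from an iterable of undirected (u, v) edges."""
    nodes = set()
    edge_set = set()
    for u, v in edges:
        nodes.update((u, v))
        # frozenset makes (A, B) and (B, A) count as the same undirected edge.
        edge_set.add(frozenset((u, v)))
    return {"nodes": len(nodes), "edges": len(edge_set)}


def load_or_compute_statistics(subdir, edges):
    """Return statistics for one pathway, reusing the file in subdir if it exists."""
    stats_path = Path(subdir) / STATS_FILENAME
    if stats_path.exists():
        # Precomputed by an earlier step (for example, heuristics): just read it.
        return json.loads(stats_path.read_text())
    stats = compute_statistics(edges)
    stats_path.parent.mkdir(parents=True, exist_ok=True)
    stats_path.write_text(json.dumps(stats))
    return stats
```

With this shape, the heuristics step and the summary table would both call `load_or_compute_statistics`, and whichever runs first pays the computation cost.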
|
I'll mark this as a draft for now and design something in line with your second proposal. |
We also make graph statistics lazy. Laziness isn't used in summary.py, but I assume that we'll have more computationally expensive graph statistics as SPRAS develops, and those can take a long time to compute on our larger graphs.
Most importantly, this moves graph statistic computation into a separate function, so we can reuse the code for graph heuristic pruning.
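As a rough illustration of what "lazy" means here, the sketch below defers each statistic until it is first requested and caches the result afterwards. It is not the code in this PR: the class `GraphStatistics`, its attribute names, and the `computed` bookkeeping list are hypothetical, and it works on a plain edge list so that it runs with the standard library alone.

```python
from functools import cached_property


class GraphStatistics:
    """Hypothetical container whose statistics are computed only on first access."""

    def __init__(self, edges):
        self._edges = [tuple(e) for e in edges]
        self.computed = []  # records which statistics were actually evaluated

    @cached_property
    def adjacency(self):
        # Undirected adjacency sets, built once and shared by the statistics below.
        adj = {}
        for u, v in self._edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        return adj

    @cached_property
    def number_of_nodes(self):
        self.computed.append("number_of_nodes")
        return len(self.adjacency)

    @cached_property
    def number_of_connected_components(self):
        self.computed.append("number_of_connected_components")
        seen = set()
        count = 0
        for start in self.adjacency:
            if start in seen:
                continue
            count += 1
            # Depth-first traversal marks every node reachable from start.
            stack = [start]
            seen.add(start)
            while stack:
                node = stack.pop()
                for neighbor in self.adjacency[node]:
                    if neighbor not in seen:
                        seen.add(neighbor)
                        stack.append(neighbor)
        return count
```

A caller that only needs one statistic, such as a heuristic that filters on component count, never triggers the others, and repeated accesses reuse the cached value.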