Conversation

@ntalluri
Collaborator

Working on #370

@ntalluri ntalluri changed the title feat: adding new ensemble parameter tuning baseline feat: add new ensemble parameter tuning baseline Aug 14, 2025
@read-the-docs-community

read-the-docs-community bot commented Aug 14, 2025

Documentation build overview

📚 spras | 🛠️ Build #29331954 | 📁 Comparing a57ef19 against latest (2857eb1)


🔍 Preview build

Show files changed (3 files in total): 📝 3 modified | ➕ 0 added | ➖ 0 deleted
File Status
genindex.html 📝 modified
fordevs/modules.html 📝 modified
fordevs/spras.html 📝 modified

@ntalluri
Collaborator Author

Current visualization on the EGFR dataset:

[Screenshot taken 2025-08-14]

@ntalluri ntalluri closed this Aug 14, 2025
@ntalluri ntalluri reopened this Aug 14, 2025
@tristan-f-r tristan-f-r added the tuning Workflow-spanning algorithm tuning label Aug 15, 2025
@agitter
Collaborator

agitter commented Aug 15, 2025

I thought this pull request was for something else. Could we rename it to be more like "feat: add input nodes as evaluation baseline"?

# TODO what if the node_ensemble is all frequency = 0.0, that will be the new source/target/prize/ baseline?

# Set frequency to 1.0 for matching nodes
prize_node_ensemble_df.loc[
Collaborator Author

@ntalluri ntalluri Aug 18, 2025

Make it one baseline per dataset, not per algorithm: all of the sources/prizes/targets/actives = 1.0 and everything else = 0.0, but still calculate the precision and recall.
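The per-dataset baseline described above could be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code; the column names `NODEID` and `frequency` and the toy node IDs are assumptions.

```python
import pandas as pd

# Hypothetical sketch: a per-dataset baseline "node ensemble" in which every
# input node (source/target/prize/active) gets frequency 1.0 and every other
# network node gets frequency 0.0.
network_nodes = pd.DataFrame({"NODEID": ["A", "B", "C", "D"]})
input_nodes = {"A", "C"}  # this dataset's sources/targets/prizes/actives

baseline_ensemble = network_nodes.copy()
baseline_ensemble["frequency"] = 0.0
baseline_ensemble.loc[baseline_ensemble["NODEID"].isin(input_nodes), "frequency"] = 1.0
```

Precision and recall could then be computed from this ensemble the same way as for any algorithm's ensemble.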

Collaborator Author

This will be algorithm-agnostic.

Collaborator Author

Another option that may also be needed is to evaluate only the internal nodes (set the source/target/prize frequencies to 0.0).

Collaborator Author

We might need both of these: one as a baseline in the current ensemble PR, and new figures across all of the evaluation for evaluating only the internal nodes.

@ntalluri ntalluri changed the title feat: add new ensemble parameter tuning baseline feat: add input nodes as evaluation baseline Aug 19, 2025
@ntalluri ntalluri added the analysis (Analysis of PRA outputs) and needed for benchmarking (Priority PRs needed for the benchmarking paper) labels and removed the tuning (Workflow-spanning algorithm tuning) label Aug 19, 2025
Collaborator

@agitter agitter left a comment

This is getting close.

# the Input_Nodes_Baseline PR curve highlights their overlap with the gold standard.
if prc_input_nodes_baseline_df is None:
input_nodes_set = set(input_nodes['NODEID'])
input_nodes_gold_intersection = input_nodes_set & gold_standard_nodes # TODO should this be all inputs nodes or the intersection with the gold standard for this baseline? I think it should be the intersection
Collaborator

Very good question. For a synthetic dataset like Panther pathways, the inputs are sampled from the pathway nodes that make up the gold standard so it doesn't matter.

For an omics input like EGFR, it matters substantially. What makes you prefer the intersection? I was inclined to say all input nodes because we cannot have a baseline algorithm that makes use of gold standard information as part of its ranking. I could create a valid pathway reconstruction algorithm that takes the input nodes and simply returns those. I can't use a gold standard in a valid pathway reconstruction algorithm, however.

Collaborator Author

In my opinion, the only input nodes that matter for evaluation are those that overlap with the gold standard. While it's true that an algorithm could trivially return all input nodes, our precision-recall evaluation is only defined with respect to the gold standard. Input nodes that aren't in the gold standard don't contribute to true positives, so including them in the baseline wouldn't be meaningful.

That said, I also see the case for using all input nodes as it represents a valid baseline algorithm where an algorithm could simply return the given inputs without any reconstruction.

Maybe the difference is that the intersection provides an upper bound, while all input nodes provides a lower bound on what you could do without doing any reconstruction and we should provide both?

Collaborator Author

@ntalluri ntalluri Aug 28, 2025

Think through 3 baselines and explain why each of these can be used/needed:

  1. no intersection: the input nodes by themselves
  2. the intersection of the gold standard and the input nodes
  3. the gold standard by itself (which I think is what the baseline is)
  • or do we want to make an ensemble with the gold standard as 1.0 and everything else 0.0 and do a PRC?

Collaborator

Deciding this is the last point of feedback and last decision to finalize. Then I can do a last careful review and we should be ready to merge.

In our meeting, we discussed how options 1 and 2 will give the same recall. The only difference is that one will have some precision value and the other always has precision of 1.0.
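A toy illustration of that point (all node IDs and set memberships below are hypothetical, not from any SPRAS dataset): both baselines recover the same true positives, so recall matches, while the intersection baseline has precision fixed at 1.0 by construction.

```python
# Option 1 = all input nodes; option 2 = input nodes intersected with the
# gold standard. Toy data for illustration only.
input_nodes = {"A", "B", "C", "D"}
gold_standard = {"B", "D", "E"}

def precision_recall(predicted: set, gold: set) -> tuple:
    # Standard set-based precision and recall against a gold standard
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

p1, r1 = precision_recall(input_nodes, gold_standard)                  # option 1
p2, r2 = precision_recall(input_nodes & gold_standard, gold_standard)  # option 2
# r1 == r2 (same true positives), while p2 == 1.0 by construction
```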

@ntalluri ntalluri requested a review from agitter August 28, 2025 15:37
@github-actions github-actions bot added the merge-conflict This PR has merge conflicts. label Oct 3, 2025
@agitter
Collaborator

agitter commented Oct 3, 2025

@ntalluri is this waiting for my review or your updates? We haven't touched it in a while.

Comment on lines 318 to 324
raise ValueError(
-    f"Cannot compute PR curve or generate node ensemble. Input network for dataset \"{dataset_file.split('-')[0]}\" is empty."
+    f"Cannot compute PR curve or generate node ensemble. The input network is empty."
)
if node_table.empty:
raise ValueError(
-    f"Cannot compute PR curve or generate node ensemble. Gold standard associated with dataset \"{dataset_file.split('-')[0]}\" is empty."
+    f"Cannot compute PR curve or generate node ensemble. The gold standard is empty."
)
Collaborator

We should still supply the dataset name in this function to preserve the error information.

Comment on lines +406 to +407
# Dropping last elements because scikit-learn adds (1, 0) to precision/recall for plotting, not tied to real thresholds
prc_input_nodes_baseline_data = {
Collaborator

(The comment should be moved up)

Suggested change
# Dropping last elements because scikit-learn adds (1, 0) to precision/recall for plotting, not tied to real thresholds
prc_input_nodes_baseline_data = {
# Dropping last elements because scikit-learn adds (1, 0) to precision/recall for plotting, not tied to real thresholds
# https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_curve.html#sklearn.metrics.precision_recall_curve:~:text=Returns%3A-,precision,predictions%20with%20score%20%3E%3D%20thresholds%5Bi%5D%20and%20the%20last%20element%20is%200.,-thresholds
prc_input_nodes_baseline_data = {
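For reference, a minimal sketch of the scikit-learn behavior the comment describes (toy labels and scores, not the PR's data): `precision_recall_curve` appends a final (precision=1, recall=0) point that has no corresponding threshold, so it is dropped before plotting.

```python
from sklearn.metrics import precision_recall_curve

# Toy binary labels and prediction scores for illustration only
y_true = [1, 0, 1, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.2]

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
# Drop the synthetic final point so every remaining (precision, recall)
# pair maps to a real threshold in `thresholds`.
precision, recall = precision[:-1], recall[:-1]
```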

def get_interesting_input_nodes(self) -> pd.DataFrame:
"""
Returns: a table listing the input nodes considered as starting points for pathway reconstruction algorithms,
restricted to nodes that have at least one of the specified attributes.
Collaborator

Suggested change
restricted to nodes that have at least one of the specified attributes.
or all of the nodes in this dataset that have at least one 'interesting' attribute as specified in-code.
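The selection logic that docstring describes could look roughly like this. This is a hypothetical sketch assuming boolean attribute columns; the column names and toy data are assumptions, not SPRAS's actual schema.

```python
import pandas as pd

# Hypothetical sketch: keep nodes that have at least one "interesting"
# attribute (sources/targets/actives modeled here as boolean columns).
node_table = pd.DataFrame({
    "NODEID": ["A", "B", "C"],
    "sources": [True, False, False],
    "targets": [False, False, True],
    "active": [False, False, False],
})
interesting_attrs = ["sources", "targets", "active"]
# A row qualifies if any of its interesting-attribute columns is True
interesting_nodes = node_table[node_table[interesting_attrs].any(axis=1)]
```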

@param node_ensembles: dict of the pre-computed node_ensemble(s)
@param node_table: gold standard nodes
@param input_nodes: the input nodes (sources, targets, prizes, actives) used for a specific dataset
Collaborator

Suggested change
@param input_nodes: the input nodes (sources, targets, prizes, actives) used for a specific dataset
@param input_nodes: the input nodes (usually from `Dataset#get_interesting_input_nodes`) used for a specific dataset

@tristan-f-r tristan-f-r added the awaiting-author Author of the PR needs to fix something from a review / etc. label Oct 14, 2025
@ntalluri
Collaborator Author

@ntalluri is this waiting for my review or your updates? We haven't touched it in a while.

@agitter I had a couple of things to think about and fix for this PR; it hasn't been my main priority to finish yet.


Labels

analysis: Analysis of PRA outputs
awaiting-author: Author of the PR needs to fix something from a review / etc.
merge-conflict: This PR has merge conflicts.
needed for benchmarking: Priority PRs needed for the benchmarking paper

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants