Skip to content

Add operation to derive chunks#147

Merged
dkhofer merged 7 commits intomainfrom
derive-chunks
Feb 24, 2025
Merged

Add operation to derive chunks#147
dkhofer merged 7 commits intomainfrom
derive-chunks

Conversation

@dkhofer
Copy link
Contributor

@dkhofer dkhofer commented Feb 13, 2025

Example usage:

cargo run -- init
cargo run -- import --fasta fixtures/simple.fa
# Create two example integration points
cargo run -- update --fasta fixtures/aa.fa --start 3 --end 5 --region-name m123 --new-sample test1
cargo run -- update --fasta fixtures/aa.fa --start 15 --end 20 --region-name m123 --sample test1 --new-sample test2
# Derive chunks with breakpoints in between the edits
cargo run -- derive-chunks --new-sample splits --region m123 --breakpoints 1,8,25 --sample test2
# View one of the new chunks
cargo run -- view -s splits m123.2


#[allow(clippy::too_many_arguments)]
pub fn derive_subgraph(
pub fn derive_chunks(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes to this method are just generalizing from one subgraph/chunk to many

conn,
PATH_START_NODE_ID,
-1,
0,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using -1 was the wrong choice, it prevents edges from the start node from showing up when the graph is loaded. I have a to-do to look through the code and see where else this is being done

@dkhofer dkhofer requested a review from Chris7 February 19, 2025 16:14
new_sample_name: &str,
region_name: &str,
backbone: Option<&str>,
breakpoints: &str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason this is a string vs. a vec here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, and vec is better. I moved the parsing into main.rs

Copy link
Contributor

@Chris7 Chris7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments. I think the range functions are pretty similar but have different behaviors around the edges, which seems like it might make sense to unify them. For errors, I think we need to start using the errors defined more vs. panics (you can see the stuff like FastaError and what I just did in #150). It lets us test the code more effectively, sets us up better for library code, and lets various callers handle the errors in the way that makes sense.

end: path_length,
});

chunk_ranges
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this supposed to be sorted in some way? or a check that the breakpoints are within the length of the path?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like this and the above are pretty similar, should these be merged to one function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworked a bunch of it and moved the range calculation into main.rs, it was mostly straightforward and I think it's better to have it there

if (start_coordinate < 0 || start_coordinate > current_path_length)
|| (end_coordinate < 0 || end_coordinate > current_path_length)
{
panic!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you are using result here, can you define error functions for these? It makes it so we can test the code better and handle errors better by delegating the message/etc. to whatever invoked the function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, done. I have a to-do to do that wherever else it's needed

@dkhofer dkhofer merged commit 7e37ee6 into main Feb 24, 2025
1 check passed
@dkhofer dkhofer deleted the derive-chunks branch February 24, 2025 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants