Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide consuming movement methods so that NodeMut can act as a cursor. #33

Merged
merged 2 commits into from
Aug 27, 2024
Merged

Provide consuming movement methods so that NodeMut can act as a cursor. #33

merged 2 commits into from
Aug 27, 2024

Conversation

adamreichold
Copy link
Member

@adamreichold adamreichold commented Aug 26, 2024

Still a draft because I added only the into_next_sibling method instead of handling all axes until we agreed that this is a helpful approach. Added the missing methods after trying out a bit deduplication of internals.

@LoZack19
Copy link
Contributor

This is perfect. As I said, into_* functions are very much needed in NodeMut, I like this approach

@cfvescovo
Copy link
Member

This is an excellent approach! Huge thanks @adamreichold, your contributions are invaluable!

@adamreichold adamreichold marked this pull request as ready for review August 26, 2024 14:45
Copy link
Contributor

@LoZack19 LoZack19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work! I appreciate the symmetry you've achieved. Additionally, I think the axis and into_axis functions enhance the code's readability.

@LoZack19
Copy link
Contributor

Do you think that we could also take advantage of this to implement mutable and owned iterators for NodeMut?

@LoZack19 LoZack19 linked an issue Aug 27, 2024 that may be closed by this pull request
@cfvescovo cfvescovo changed the title RFC: Provide consuming movement methods so that NodeMut can act as a cursor. Provide consuming movement methods so that NodeMut can act as a cursor. Aug 27, 2024
@adamreichold
Copy link
Member Author

Do you think that we could also take advantage of this to implement mutable and owned iterators for NodeMut?

You mean something like

impl<'a, T: 'a> NodeMut<'a, T> {
  pub fn children(&self) -> ChildrenMut<'a, T>;
}

which yields distinct instances of NodeMut<'a, T> as items?

If so, then I don't think these methods help to implement this safely as while these methods can "advance" a NodeMut through the children of a given node, I cannot hand out that current NodeMut by returning it from fn next(&mut self) -> Option<NodeMut<'a, T>>; because I then have no basis to compute the next child node when next is called again.

This would only work with a lending iterator which yields a &'b mut NodeMut<'a, T> borrowing from &'b self which coincidentally is exactly the pattern that std's CursorMut::next uses.

@adamreichold
Copy link
Member Author

Realising how important it is to hold on to an existing mutable node reference, I added a second commit which extends the signature of these methods to

fn into_axis<F>(mut self, f: F) -> Result<Self, Self>;

return the current reference as Err(self) if navigation is not possible so that information is never lost if the calling code wants to recover from such a situation.

@LoZack19
Copy link
Contributor

If so, then I don't think these methods help to implement this safely as while these methods can "advance" a NodeMut through the children of a given node, I cannot hand out that current NodeMut by returning it from fn next(&mut self) -> Option<NodeMut<'a, T>>; because I then have no basis to compute the next child node when next is called again.

This is true. NodeMut needs to remain inside the state of the iterator in order to compute the next one. This excludes the prospect of an owned iterator. It could still be useful, though, to have a mutable iterator which follows the CursorMut::next pattern, as you suggested.

@adamreichold
Copy link
Member Author

adamreichold commented Aug 27, 2024

It could still be useful, though, to have a mutable iterator which follows the CursorMut::next pattern, as you suggested.

Note that this would not be an iterator though, i.e. not an implementation of the Iterator trait, because that trait cannot support lending (items can borrow from the container, but not from the iterator itself).

NodeMut basically is our equivalent to CursorMut, and our equivalent to CursorMut::next is

node = match node.into_next_sibling() {
  Ok(node) => node,
  Err(node) => node,
};

We could of course provide methods like

pub fn to_next_sibling(&mut self) -> Result<(), ()>;

which modify self in-place (or not if movement is not possible) instead of moving out of self. We could then implement the into_* methods on top of those.

Would you prefer that? If so, just the to_* variants or both to_* and into_*?

@cfvescovo
Copy link
Member

Realising how important it is to hold on to an existing mutable node reference, I added a second commit which extends the signature of these methods to

fn into_axis<F>(mut self, f: F) -> Result<Self, Self>;

return the current reference as Err(self) if navigation is not possible so that information is never lost if the calling code wants to recover from such a situation.

This is a good idea which would immensely benefit from better documentation

@cfvescovo
Copy link
Member

It could still be useful, though, to have a mutable iterator which follows the CursorMut::next pattern, as you suggested.

Agreed

@adamreichold
Copy link
Member Author

This is a good idea which would immensely benefit from better documentation

Will amend the documentation if we decide to stick to the into_* methods instead of switching over to to_*/in-place modification.

@LoZack19
Copy link
Contributor

LoZack19 commented Aug 27, 2024

Would you prefer that? If so, just the to_* variants or both to_* and into_*?

Personally I would love to have both implementations. I'm just a bit concerned about code duplication. I think that, to avoid overcomplicating the NodeMut API, it is convenient to abandon in-place modification through the to_* functions and stick to into_*

When chosing, I prefer the alternative which hides the least state possible.

@LoZack19
Copy link
Contributor

If you all agree, I approve to merge this PR into master

@adamreichold
Copy link
Member Author

When chosing, I prefer the alternative which hides the least state possible.

While trying, I also realized that the implement of to_axis is requires additional unsafe code to massage the lifetimes which another argument against it. So I guess we should indeed stick to the into_* methods and I will amend the doc comments w.r.t. the Result<Self, Self> pattern.

@adamreichold
Copy link
Member Author

adamreichold commented Aug 27, 2024

If you all agree, I approve to merge this PR into master

I amended the docs as requested, so from my point of view, this is good to go.

@LoZack19
Copy link
Contributor

Totally agree. Let's finish documenting this code and merge. I just want to point out that I have created a readme in which I try to summarize the design choises made in creating ego-tree, so possibly if there are choices to point out in this PR I would suggest doing so in the readme.md

@cfvescovo
Copy link
Member

Totally agree. Let's finish documenting this code and merge. I just want to point out that I have created a readme in which I try to summarize the design choises made in creating ego-tree, so possibly if there are choices to point out in this PR I would suggest doing so in the readme.md

@adamreichold rebase your branch on top of master

@cfvescovo
Copy link
Member

I have rechecked, everything LGTM

@cfvescovo cfvescovo merged commit 65df834 into rust-scraper:master Aug 27, 2024
5 checks passed
@adamreichold
Copy link
Member Author

Thanks. I would be glad if we could make a point release (for the FusedIterator impls) so that I could eventually recover the dependent FusedIterator impls in scraper.

@cfvescovo
Copy link
Member

cfvescovo commented Aug 27, 2024

Releasing 0.8.0 ASAP

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How to insert child at index?
3 participants