Skip to content

fix: change handling #167

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jan 11, 2025
Merged

fix: change handling #167

merged 9 commits into from
Jan 11, 2025

Conversation

psteinroe
Copy link
Collaborator

@psteinroe psteinroe commented Jan 6, 2025

iteration on the change handling. after struggling with it for a few days, I simplified it now. its now much simpler and stable, but in certain cases will not be optimal. not optimal means that instead of having one ModifiedStatement, we might get up to three StatementDeleted + AddedStatement. This is relevant for treesitter, where a change can be applied faster if we pass in the prev text and the change. but: since we just parse statement by statement, it should still be fast enough.

Also refactored the Document API to make it much much simpler. Thanks for the input here @juleswritescode!

Screen.Recording.2025-01-10.at.12.48.37.mov

todos:

  • pr description
  • hands on test
Screenshot 2025-01-09 at 10 41 03

9g40bi

@psteinroe psteinroe marked this pull request as draft January 6, 2025 19:06
@psteinroe psteinroe assigned psteinroe and unassigned psteinroe Jan 10, 2025
@psteinroe psteinroe marked this pull request as ready for review January 10, 2025 13:51
@psteinroe psteinroe changed the title lets get into the mess... fix: change handling Jan 10, 2025
Copy link
Collaborator

@juleswritescode juleswritescode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to takea look at apply_change in change.rs, but I have to get going now, so here's already a part of the review :)

I love that it's much simpler now! And I love that auto completion seems to friggin' work.

Comment on lines +29 to +30
.map(|n| n.0.to_enum())
.ok_or_else(|| Error::Parse("Unable to find root node".to_string()))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I am not even sure why this is an issue. my assumption was that if pg_query parses something it always has at least one node, but that does not seem to be the case.)

@@ -368,7 +368,7 @@ impl Pattern {
/// `Pattern` using the default match options (i.e. `MatchOptions::new()`).
pub fn matches_path(&self, path: &Path) -> bool {
// FIXME (#9639): This needs to handle non-utf8 paths
path.to_str().map_or(false, |s| self.matches(s))
path.to_str().is_some_and(|s| self.matches(s))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these changes come from the lint in the justfile, right? awesome!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I don't know why they weren't already reported in a prev pr though.

} else {
None
}
pub fn iter_statements(&self) -> impl Iterator<Item = Statement> + '_ {
Copy link
Collaborator

@juleswritescode juleswritescode Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I believe it's idiomatic that an iter iterates over &T (except if T is copy?)... so idiomatically, this would be an Iterator<Item = &Statement> :)

no need to change it though!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, but we create a new Statement so we cannot return a ref to it. we might be able to optimise this is caching the Statements but its not worth it I think :D

}
}
#[derive(Debug, PartialEq, Eq)]
pub struct ModifiedStatement {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aaah where you inspired by the slatedb approach? Nice!!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not explicitly, this was just the result of simplifying the Document api. the enum was there before already. but maybe it did subconsciously. :D

let mut prev_index = None;
let mut next_index = None;

for (index, (_, pos_range)) in self.positions.iter().enumerate() {
Copy link
Collaborator

@juleswritescode juleswritescode Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Food for thought: Would it be safer to gather the affected IDs and then maybe compare with a previous set?
This obviously works, but after the change is applied, the indices of the statements will be different – which could lead to bugs further down the road.
Just a feeling!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean bugs within change.rs, or in other parts of the workspace?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

within change.rs and anything that manages changes.
It feels to me like mutating an array while iterating over it – but again, just a feeling 👍

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mhh yes I think this entire method can be improved. I didn't find a good way to do "atomic" mutations yet though.

Comment on lines +141 to +142
next_index = Some(index);
break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we're implicitly assuming that the StatementPositions are sorted ascending by range.

If we change the order that the StatementSplitter returns statements, this will break.

Should we make the ordering more explicit?

Copy link
Collaborator Author

@psteinroe psteinroe Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. I added a test to the statement splitter to ensure the output is always sorted. is that sufficient? I think it's easier to make sure its sorted there than to handle it everywhere.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, great solution 👍

Copy link
Collaborator

@juleswritescode juleswritescode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Looks much simpler now!


self.debug_statements();
let new_id = self.id_generator.next();
self.positions[affected_idx] = (new_id, new_range);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beautiful!

}
changed.push(StatementChange::Modified(ModifiedStatement {
old_stmt: Statement {
id: old_id,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at this point, that id is only stored here, because statement matching that id in the document is overwritten, right?
Do we at some point after this try to find the old statement by id?

@psteinroe psteinroe merged commit 20ee92b into main Jan 11, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants