-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable Garbage Collection for Interned Values #602
Conversation
✅ Deploy Preview for salsa-rs canceled.
|
CodSpeed Performance ReportMerging #602 will degrade performances by 24.4%Comparing Summary
Benchmarks breakdown
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. This looks good to me after adding a few tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, looks good to me!
I see there is a test failure....have you investigated it at all? |
My hunch is that the test is testing behavior that's not relevant anymore but I can look. |
@nikomatsakis I'm also hitting the assertion error in this example: #[salsa::tracked]
fn function<'db>(db: &'db dyn Database, input: Input) -> Interned<'db> {
Interned::new(db, 0)
}
let input = Input::new(&db, 0);
function(&db, input);
input.set_field1(&mut db).to(1);
function(&db, input); The function does not read the input so the interned value is memoized, meaning it is not re-interned in R2. What should be done in this case (and the similar failing test)? I'm also a little unclear about the testing instructions. Should I add a log event for when values are re-interned, or is there another way I can test that? I'm also not sure how to access the |
That would sound reasonable to me, similar to the backdate event. We could also consider adding an event when a value is garbage collected. |
If I understand it correctly the problem is that interned ingredients created outside of a query are never revalidated because there's no corresponding query that creates them. I see two options:
|
I took a closer look at I think the problem here is that |
We had a brief discussion in our meeting today--- @ibraheemdev I think values that are interned outside of any salsa query have to be concerned immortal. |
The This one seems trickier. The query does not have a dependency on its input. However, changing the input still causes the database revision to be advanced. I'm not exactly sure why, but I believe the second query call only performs a shallow memoization check, meaning that |
src/interned.rs
Outdated
|
||
// Record a dependency on this value. | ||
let index = self.database_key_index(id); | ||
zalsa_local.report_tracked_read(index, Durability::MAX, current_revision); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just making sure, is current_revision
the correct argument to pass here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels incorrect, should it be the last_interned_at
value?
Oh this is interesting. I have to take a closer look. We probably have a similar problem for when the input has a higher durability and Salsa skips the deep verification because of it. |
Okay, I think I now understand the underlying problem. I'm not yet sure what the right fix is. I'll reach out in https://salsa.zulipchat.com/#narrow/channel/333573-Contributing-to-Salsa/topic/LRU.20.28GC.29.20for.20interned.20values.20.23597 |
I thought more about this and I feel somewhat confident that we should store the What's different for interned values is that multiple queries can create the same tracked struct and the queries might also change their durability over time. For now, I'd suggest that we store the highest durability across all reads because we have to be pessimistic and assume the interned value only gets recreated after a HIGH durability change if we have one HIGH and two low durability queries that both create the same interned value. The read path should compare against the most recent revision for the interned value's durability instead of using the current revision. This should unblock this PR. What's unclear is how garbage collection would work because there's no guarantee that a query has to run in every revision. That means it's possible for interned values never to have been read in the last 1000 revisions, and they might get garbage collected because of it. However, the result of a cached query might still depend on the interned value (or even reference it in the output). |
@MichaReiser @ibraheemdev -- just to confirm, this is still pending the changes we discussed in our last sync meeting, correct? |
@nikomatsakis yes, I'll be working on it this week. |
Hmm, the accumulator regression is surprising. Did we accidentally break the optimization that skips traversing queries that are known not to have any accumulated values? |
I don't see anything obvious that should have caused the regression but I suspect it's simply the fact that we now need to look up the revision from the active query? This otherwise looks good to me. Do we need to remove any code? I remember you mentioned that we don't need the "immortal" handling. |
I'm going to try to add some more tests before this gets merged. |
@MichaReiser this should be good to merge now, I'm not sure what the CI failures are about. |
We don't have the lockfile committed so one of our transitive dependencies bumped their msrv in a point release which CI now pulls |
b644d7e
to
9f7496f
Compare
I can reproduce the perf regression locally but I think it's inherent to the change. The main reasons for the regression are:
I could see how these numbers might motivate us to support interned structs with and without garbage collection. For now, I think what we have here is good enough and we can iterate on the details in separate PRs. |
Per the mentoring instructions in #597.
Still needs some tests. Note this does not actually implement garbage collection but adds the necessary plumbing to do so.