Enable Garbage Collection for Interned Values #602

Merged · 11 commits · Mar 17, 2025

Conversation

@ibraheemdev (Contributor) commented Oct 23, 2024

Per the mentoring instructions in #597.

Still needs some tests. Note this does not actually implement garbage collection but adds the necessary plumbing to do so.


netlify bot commented Oct 23, 2024

Deploy Preview for salsa-rs canceled.

🔨 Latest commit: 6a9dca6
🔍 Latest deploy log: https://app.netlify.com/sites/salsa-rs/deploys/67d7e75e89aa280008929fd6


codspeed-hq bot commented Oct 23, 2024

CodSpeed Performance Report

Merging #602 will degrade performance by 24.4%

Comparing ibraheemdev:gc (6a9dca6) with master (1d1523b)

Summary

❌ 2 (👁 2) regressions
✅ 10 untouched benchmarks

Benchmarks breakdown

👁 accumulator: 3.4 ms (BASE) → 4.5 ms (HEAD), -24.4%
👁 new[InternedInput]: 5.3 µs (BASE) → 5.9 µs (HEAD), -9.23%

@MichaReiser (Contributor) left a comment


Nice. This looks good to me after adding a few tests.

@nikomatsakis (Member) left a comment


Yes, looks good to me!

@nikomatsakis (Member)

I see there is a test failure....have you investigated it at all?

@nikomatsakis (Member)

My hunch is that the test is testing behavior that's not relevant anymore but I can look.

@ibraheemdev (Contributor, Author) commented Nov 4, 2024

@nikomatsakis I'm also hitting the assertion error in this example:

#[salsa::tracked]
fn function<'db>(db: &'db dyn Database, input: Input) -> Interned<'db> {
    // `input` is never read, so the memoized result has no dependency on it.
    Interned::new(db, 0)
}

let input = Input::new(&db, 0);
function(&db, input);            // R1: interns the value
input.set_field1(&mut db).to(1); // advances the database to R2
function(&db, input);            // R2: memo is reused, value is not re-interned

The function does not read the input so the interned value is memoized, meaning it is not re-interned in R2. What should be done in this case (and the similar failing test)?

I'm also a little unclear about the testing instructions. Should I add a log event for when values are re-interned, or is there another way I can test that? I'm also not sure how to access the reset method that I added in tests.

@MichaReiser (Contributor)

> I'm also a little unclear about the testing instructions. Should I add a log event for when values are re-interned, or is there another way I can test that?

That sounds reasonable to me, similar to the backdate event. We could also consider adding an event for when a value is garbage collected.
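
For illustration, such a test might look roughly like the sketch below. `EventLoggerDatabase`, `logged_events`, and the `DidReintern` event kind are assumed names for the sake of the example (not existing salsa API); `Input` and `function` are the ones from the example above.

// Hypothetical test shape: `EventLoggerDatabase`, `logged_events`, and the
// `DidReintern` event kind are assumed names, not existing salsa API.
#[test]
fn reinterning_is_observable() {
    let mut db = EventLoggerDatabase::default();
    let input = Input::new(&db, 0);
    let _ = function(&db, input);

    // Advance to a new revision so the interned value must be revalidated.
    input.set_field1(&mut db).to(1);
    let _ = function(&db, input);

    // The proposed event lets the test assert that the value was re-interned;
    // a garbage-collection event could be asserted the same way.
    assert!(db
        .logged_events()
        .iter()
        .any(|event| matches!(event, salsa::EventKind::DidReintern { .. })));
}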

@MichaReiser (Contributor)

If I understand it correctly, the problem is that interned ingredients created outside of a query are never revalidated because there's no corresponding query that creates them.

I see two options:

  1. We forbid creating interned values outside of queries. That would match the behavior with tracked structs. The downside is that it technically becomes impossible to call a root salsa query with more than one argument because multi-argument functions intern the arguments (outside the query).
  2. We disable garbage collection for ingredients created outside of queries because salsa can't track their creation. Salsa has to consider them as always referenced. That's a bit of a footgun but probably fine?
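
As a minimal sketch of the case option 2 is about (the `Word` struct and the surrounding code are made up for illustration; only `#[salsa::interned]` and `salsa::DatabaseImpl` are intended as real public API), this is a value interned with no query on the stack:

// Illustrative only: an interned value created outside of any query.
#[salsa::interned]
struct Word<'db> {
    text: String,
}

fn main() {
    let db = salsa::DatabaseImpl::new();
    // No salsa query is executing here, so there is no query salsa could
    // re-run to re-create this value; under option 2 it is never collected.
    let _hello = Word::new(&db, "hello".to_string());
}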

@MichaReiser (Contributor)

I took a closer look at test_leaked_inputs_ignored and this is a slightly different variation of the above problem, in the sense that the multi-argument query is called as part of another query, but that query never re-executes and id_to_input panics.

I think the problem here is that function::maybe_changed_after never calls id_to_input when it short-circuits, where it probably should, to at least mark the input as still used. However, it's unclear to me how that would work when the query using an interner is deeper in the tree (but the outermost query short-circuits, e.g. because the durability is higher).

@nikomatsakis (Member)

We had a brief discussion in our meeting today. @ibraheemdev, I think values that are interned outside of any salsa query have to be considered immortal.

@ibraheemdev (Contributor, Author) commented Jan 12, 2025

The test_leaked_inputs_ignored test is now passing, but I added a test that still fails.

This one seems trickier. The query does not have a dependency on its input. However, changing the input still causes the database revision to be advanced. I'm not exactly sure why, but I believe the second query call only performs a shallow memoization check, meaning that maybe_changed_after is never called for the interned value, so its revision is never updated and the read panics.

src/interned.rs (outdated)

// Record a dependency on this value.
let index = self.database_key_index(id);
zalsa_local.report_tracked_read(index, Durability::MAX, current_revision);
@ibraheemdev (Contributor, Author)


Just making sure, is current_revision the correct argument to pass here?

@ibraheemdev (Contributor, Author)


This feels incorrect, should it be the last_interned_at value?

@MichaReiser (Contributor)

Oh this is interesting. I have to take a closer look. We probably have a similar problem for when the input has a higher durability and Salsa skips the deep verification because of it.

@MichaReiser (Contributor)

Okay, I think I now understand the underlying problem. I'm not yet sure what the right fix is. I'll reach out in https://salsa.zulipchat.com/#narrow/channel/333573-Contributing-to-Salsa/topic/LRU.20.28GC.29.20for.20interned.20values.20.23597

@MichaReiser (Contributor)

I thought more about this and I feel somewhat confident that we should store the Durability on the interned value, similar to what's done for tracked structs.

https://github.com/MichaReiser/salsa/blob/3e2d7f351894ec5b75df5bc1181cdf22b3243e95/src/tracked_struct.rs#L191-L196

What's different for interned values is that multiple queries can create the same interned value, and those queries might also change their durability over time. For now, I'd suggest that we store the highest durability across all reads: if we have one HIGH-durability and two LOW-durability queries that all create the same interned value, we have to be pessimistic and assume the interned value only gets recreated after a HIGH-durability change.

The read path should compare against the most recent revision for the interned value's durability instead of using the current revision.

This should unblock this PR. What's unclear is how garbage collection would work because there's no guarantee that a query has to run in every revision. That means it's possible for interned values never to have been read in the last 1000 revisions, and they might get garbage collected because of it. However, the result of a cached query might still depend on the interned value (or even reference it in the output).
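
As a rough sketch of that bookkeeping (the types and names below are illustrative stand-ins, not salsa's actual internals):

// Illustrative stand-ins for salsa's Durability/Revision types.
#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Durability { Low, Medium, High }

#[derive(Copy, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Revision(u64);

/// Per-value metadata for a garbage-collectable interned value.
struct InternedValueMeta {
    /// Highest durability across every query that has (re)interned the value.
    durability: Durability,
    /// Revision in which the value was last (re)interned.
    last_interned_at: Revision,
}

impl InternedValueMeta {
    /// Called each time a query interns (or re-interns) the value.
    fn on_intern(&mut self, query_durability: Durability, revision: Revision) {
        // Be pessimistic: keep the highest durability seen so far, since a
        // high-durability creator means the value may only be recreated
        // after a high-durability change.
        self.durability = self.durability.max(query_durability);
        self.last_interned_at = revision;
    }

    /// On the read path, report the dependency as changed at the last
    /// revision in which data of this durability changed, rather than at
    /// the current revision.
    fn changed_at(&self, last_changed_for: impl Fn(Durability) -> Revision) -> Revision {
        last_changed_for(self.durability)
    }
}

This sketch only covers the durability tracking; the open question above about values that go unread for many revisions is separate.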

@nikomatsakis (Member)

@MichaReiser @ibraheemdev -- just to confirm, this is still pending the changes we discussed in our last sync meeting, correct?

@ibraheemdev marked this pull request as draft on January 27, 2025 21:47
@ibraheemdev (Contributor, Author)

@nikomatsakis yes, I'll be working on it this week.

@MichaReiser (Contributor)

Hmm, the accumulator regression is surprising. Did we accidentally break the optimization that skips traversing queries that are known not to have any accumulated values?

@MichaReiser (Contributor)

I don't see anything obvious that should have caused the regression but I suspect it's simply the fact that we now need to look up the revision from the active query?

This otherwise looks good to me. Do we need to remove any code? I remember you mentioned that we don't need the "immortal" handling.

@ibraheemdev (Contributor, Author)

I'm going to try to add some more tests before this gets merged.

@ibraheemdev (Contributor, Author)

@MichaReiser this should be good to merge now, I'm not sure what the CI failures are about.

@Veykril (Member) commented Mar 15, 2025

We don't have the lockfile committed, so one of our transitive dependencies bumped their MSRV in a point release, which CI now pulls.

@ibraheemdev force-pushed the gc branch 2 times, most recently from b644d7e to 9f7496f on March 15, 2025 16:29
@MichaReiser (Contributor)

I can reproduce the perf regression locally but I think it's inherent to the change. The main reasons for the regression are:

  • Salsa now records a dependency for each interned value, whereas before it recorded a dependency on the ingredient. That means that salsa has to track more fine-grained dependencies than before, and that bookkeeping comes at a cost.
  • Similar to recording dependencies, maybe_changed_after now has to load the value, where previously it was sufficient to check when the table was last reset. That comes at an extra cost.

I could see how these numbers might motivate us to support interned structs with and without garbage collection. For now, I think what we have here is good enough and we can iterate on the details in separate PRs.

@MichaReiser added this pull request to the merge queue Mar 17, 2025
@MichaReiser removed this pull request from the merge queue due to a manual request Mar 17, 2025
@MichaReiser added this pull request to the merge queue Mar 17, 2025
Merged via the queue into salsa-rs:master with commit a86db59 Mar 17, 2025
13 checks passed