Refactor accumulators #575

Merged · 7 commits from nikomatsakis:refactor-accumulators into salsa-rs:master · Sep 22, 2024

Conversation

nikomatsakis (Member)

Refactor how accumulators are implemented. Instead of storing the values accumulated by a particular function X in a central map in the ingredient, attach them to the memo for X. This moves us a big step closer towards having all relevant data stored in tables and will help with speculative execution, lightweight dependencies, and a number of forward goals.

Review requested from @MichaReiser, if you have time, and others too =)

Commit messages:

• This will allow us to store the id in the struct itself.
• There is no need for IndexMap here; we don't iterate or care about indices.
• We used to have the concept of "resetting" a tracked struct, which removed all instances. We don't have that concept anymore, so we no longer need to record a read on the tracked struct table itself.
• This is a step towards refactoring `fetch` to return the memo itself rather than the value.
• Change the core operation to be `refresh_memo`. This is a nice refactoring on its own, but it will also support the refactoring of how we manage accumulator values.
• We used to store values in a central map, but now each memo has an `AccumulatorMap` that holds its accumulated values (if any).

The primary goals of this change are:

* forward compatibility with speculative execution, because it puts more data into tables;
* a step towards a refactoring to stop tracking outputs in the same array as inputs, and thus to simplify how we do versioning. We will no longer need to walk the function's outputs and refresh their versions and so forth, because they are stored in the function memo and so get refreshed automatically when the memo is refreshed.
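As a rough illustration of the representational shift (placeholder types only, not salsa's actual definitions), the values accumulated by a function move from a central per-ingredient map into the function's own memo:

```rust
use std::collections::HashMap;

// Placeholder ids; salsa's real key and value types are more involved.
type DatabaseKey = u32;   // identifies a particular tracked-function call
type AccumulatorId = u32; // identifies an accumulator ingredient
type Value = String;

// Before: the accumulator ingredient owned a central map from the
// accumulating function call to the values it accumulated.
struct CentralAccumulatorStorage {
    per_call: HashMap<DatabaseKey, Vec<Value>>,
}

// After: each memo carries its own (usually empty) map of accumulated
// values, so the data lives in the tables alongside the memoized result.
struct AccumulatorMap {
    values: HashMap<AccumulatorId, Vec<Value>>,
}

struct Memo<V> {
    value: V,
    accumulated: AccumulatorMap,
    // ...plus the verified-at revision, dependencies, and so on.
}
```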

netlify bot commented Sep 21, 2024

Deploy Preview for salsa-rs canceled.

Latest commit: b5af1d8
Latest deploy log: https://app.netlify.com/sites/salsa-rs/deploys/66eebb070b2bfb0008f71a85


codspeed-hq bot commented Sep 21, 2024

CodSpeed Performance Report

Merging #575 will not alter performance

Comparing nikomatsakis:refactor-accumulators (b5af1d8) with master (2af849b)

Summary

✅ 8 untouched benchmarks

@MichaReiser (Contributor) left a comment:

The code changes look good to me.

What I remember from the table refactor is that it avoids a hash map lookup by storing the result of query(a) -> b at the index a. This means the design is based on the assumption that queries are called for most of their arguments (a densely populated map).

I understand that we now do the same for accumulated values: the accumulated values for query(a) are stored at the index a. I think this could be problematic, because the main use case for accumulators is diagnostics, and diagnostics are rare. Only a few queries will have accumulated values. The worst case is that a is stored at a very high index, requiring salsa to allocate many pages to store a single accumulated value/diagnostic.

Do you foresee changes to the table representation that can better support sparsely populated tables?
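To make the worry concrete, here is a minimal sketch (illustrative only, not salsa's table implementation) contrasting a dense, index-addressed store with a sparse one keyed only by the ids that actually accumulated something:

```rust
use std::collections::HashMap;

/// Dense layout: one slot per argument id, even if most ids never accumulate
/// anything. A single diagnostic at id 100_000 forces the backing storage to
/// grow to cover every lower id as well.
struct DenseStore<T> {
    slots: Vec<Option<Vec<T>>>,
}

impl<T> DenseStore<T> {
    fn push(&mut self, id: usize, value: T) {
        if id >= self.slots.len() {
            self.slots.resize_with(id + 1, || None);
        }
        self.slots[id].get_or_insert_with(Vec::new).push(value);
    }
}

/// Sparse layout: only ids that actually accumulated values pay for storage.
struct SparseStore<T> {
    entries: HashMap<usize, Vec<T>>,
}

impl<T> SparseStore<T> {
    fn push(&mut self, id: usize, value: T) {
        self.entries.entry(id).or_default().push(value);
    }
}
```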

Related: it would also be nice if we could avoid the Arc inside of Memo but I don't think that's possible with today's table design because we relocate the Memo when storing a new value.

Comment on lines +22 to +32
// NOTE: We don't have a precise way to track accumulated values at present,
// so we report any read of them as an untracked read.
//
// Like tracked struct fields, accumulated values are essentially a "side channel output"
// from a tracked function, hence we can't report this as a read of the tracked function(s)
// whose accumulated values we are probing, since the accumulated values may have changed
// even when the main return value of the function has not changed.
//
// Unlike tracked struct fields, we don't have a distinct id or ingredient to represent
// "the values of type A accumulated by tracked function X". Typically accumulated values
// are read from outside of salsa anyway so this is not a big deal.
@MichaReiser (Contributor) commented:
I believe this is an existing comment and unrelated to the changes in this PR. At least I remember coming across it when evaluating whether Ruff should use accumulators.

We decided not to use accumulators and instead return the results as part of the function's output, because of the added untracked_read dependency when accumulators are read inside a query. Maybe we're using them incorrectly, but our use case is roughly the following (a sketch of the alternative we chose follows the list):

  • A check_file query that calls lint and infer
  • the lint query creates LintDiagnostics
  • the infer query creates TypeDiagnostic
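Here is a hedged sketch of that shape (plain Rust with made-up types; the salsa tracked-query attributes and database handle are omitted): diagnostics flow through ordinary return values instead of an accumulator.

```rust
// Hypothetical stand-ins for Ruff's types; the salsa query plumbing is omitted.
#[derive(Clone)]
struct File;
#[derive(Clone)]
struct LintDiagnostic(String);
#[derive(Clone)]
struct TypeDiagnostic(String);
#[derive(Clone)]
struct TypeInfo;

// Each query returns its diagnostics as part of its ordinary output, so
// downstream queries depend on them through normal value tracking rather
// than through an accumulator read.
fn lint(_file: &File) -> Vec<LintDiagnostic> {
    Vec::new()
}

fn infer(_file: &File) -> (TypeInfo, Vec<TypeDiagnostic>) {
    (TypeInfo, Vec::new())
}

fn check_file(file: &File) -> (Vec<LintDiagnostic>, Vec<TypeDiagnostic>) {
    let lint_diagnostics = lint(file);
    let (_types, type_diagnostics) = infer(file);
    (lint_diagnostics, type_diagnostics)
}
```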

@MichaReiser (Contributor):

I should have said that the code changes are great! I find it overall easier to follow what's happening.

But I am interested in your perspective on whether the "densely populated tables" assumption is an issue for accumulators. It just feels wasteful to me to allocate a bunch of pages to store a single diagnostic for the file with index 100_000.

@MichaReiser (Contributor) left a comment:

The dense table discussion is something that we can resolve separately from this PR.

One more data point for that discussion: Ruff has a check_file(File) query that runs for first-party files only. I used countme to get some rough numbers, and check_file is called for 1538 out of the 3750 files (first party, third party, and standard library files).

@nikomatsakis (Member, Author) commented Sep 22, 2024:

@MichaReiser

Actually, that's not quite right. The accumulators are not stored "densely" in the same way that tracked function memos are stored. Every tracked function memo has a hashmap, yes, but the hashmap is empty most of the time. We only add an entry when you actually accumulate something. So it's sparse in the sense that we don't assume most tracked functions will accumulate something from most accumulators.

I feel like what is needed here is probably some set of diagrams; using English paragraphs feels like a fairly imprecise way to describe what is going on, and I am afraid we may be saying similar things in different ways.

That said, let me give you an example. Suppose there are four tracked functions (F1, F2, F3, and F4) and 3 accumulators (A1, A2, A3), and suppose that

  • F1 accumulates values for A1, A2, and A3
  • F2 accumulates values for A2 and A3
  • F3 accumulates values for A3 only
  • F4 accumulates nothing

In that case, the storage would look like the following (see the sketch after this list):

  • F1's memo has a FxHashMap like { A1: [...], A2: [...], A3: [...] } (to use Python notation :)
  • F2's memo has a FxHashMap like { A2: [...], A3: [...] }
  • F3's memo has a FxHashMap like { A3: [...] }
  • F4's memo has an empty FxHashMap like { }
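A minimal, self-contained sketch of that sparse layout (std's HashMap stands in for FxHashMap, and the value types are placeholders rather than salsa's real Memo/AccumulatorMap definitions):

```rust
use std::collections::HashMap;

// Placeholder names; salsa's real Memo and AccumulatorMap types are not shown.
type AccumulatorId = &'static str;
type Accumulated = Vec<&'static str>;

fn accumulator_map(entries: &[(AccumulatorId, Accumulated)]) -> HashMap<AccumulatorId, Accumulated> {
    entries.iter().cloned().collect()
}

fn main() {
    let f1 = accumulator_map(&[("A1", vec!["..."]), ("A2", vec!["..."]), ("A3", vec!["..."])]);
    let f2 = accumulator_map(&[("A2", vec!["..."]), ("A3", vec!["..."])]);
    let f3 = accumulator_map(&[("A3", vec!["..."])]);
    let f4 = accumulator_map(&[]); // F4 accumulates nothing, so its map stays empty.

    // Each memo only stores entries for the accumulators it actually used.
    assert_eq!(f1.len(), 3);
    assert_eq!(f2.len(), 2);
    assert_eq!(f3.len(), 1);
    assert!(f4.is_empty());
}
```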

In contrast, with tracked structs, at least the way it's set up now, we use a vector of memos. So if we have a tracked struct S with 5 instances (S1, S2, S3, S4, and S5), and...

  • F1 and F2 are invoked on S1
  • F2 and F3 are invoked on S2
  • F3 and F4 are invoked on S3
  • F4 is invoked on S4
  • and nothing is invoked on S5

then the structs' memo listings would look like:

  • S1 would have [memo for F1, memo for F2] (length of 2, 0% unutilized)
  • S2 would have [null, memo for F2, memo for F3] (length of 3, 33% unutilized)
  • S3 would have [null, null, memo for F3, memo for F4] (length of 4, 50% unutilized)
  • S4 would have [null, null, null, memo for F4] (length of 4, 75% unutilized)
  • S5 would have [] (length of 0, 0% unutilized)

I strongly suspect we should use some other system for memo storage. That said, the above case doesn't look too bad to me; what's really painful is that we assign globally unique indices to functions, even though every function can be used with at most one kind of tracked struct. We should let the indices overlap. In other words, if we had two kinds of tracked structs (S and T) and a tracked function FT that gets invoked on T instances, then EVERY instance of T would have a memo table that starts with [null, null, null, null], because it has to leave space for memos from F1..F4 even though those functions can only be invoked on S instances. That's an obvious place to optimize.
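For concreteness, a hedged sketch of the memo-vector layout described above, using placeholder types rather than salsa's actual memo tables:

```rust
// Placeholder memo; in salsa this would be the full memoized result.
#[derive(Debug)]
struct Memo(&'static str);

// One memo table per tracked struct instance, indexed by a *global*
// function index; slots for functions that were never invoked (or can
// never apply to this struct) show up as `None`.
type MemoTable = Vec<Option<Memo>>;

fn utilization(table: &MemoTable) -> f64 {
    if table.is_empty() {
        return 1.0;
    }
    let used = table.iter().filter(|slot| slot.is_some()).count();
    used as f64 / table.len() as f64
}

fn main() {
    // S4 from the example above: only F4 (global index 3) was invoked on it.
    let s4: MemoTable = vec![None, None, None, Some(Memo("memo for F4"))];
    println!("S4 utilization: {:.0}%", utilization(&s4) * 100.0); // prints 25%
}
```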

@nikomatsakis (Member, Author):

> I believe this is an existing comment and unrelated to the changes in this PR. At least I remember coming across it when evaluating whether Ruff should use accumulators.

Correct.

> We decided not to use accumulators and instead return the results as part of the function's output, because of the added untracked_read dependency when accumulators are read inside a query.

Hmm. So, with the old representation, it was very hard to determine when the set of values accumulated by a function had changed, which is why I adopted this "unversioned" approach. In the new representation, however, it's actually fairly easy: we have the old memo and its AccumulatorMap, and we have the new AccumulatorMap. We can just compare them and take that into account for the last-changed revision of the function.
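A minimal sketch of that comparison, assuming (hypothetically) that the accumulator map supports equality; the names here are illustrative, not salsa's API:

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for salsa's accumulator types.
type AccumulatorId = u32;
type Value = String;
type AccumulatorMap = HashMap<AccumulatorId, Vec<Value>>;

/// If the newly computed accumulated values differ from the ones stored in the
/// old memo, the function's last-changed revision should be bumped even when
/// its main return value is unchanged.
fn accumulated_values_changed(old: &AccumulatorMap, new: &AccumulatorMap) -> bool {
    old != new
}
```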

That said, in the code I've written, I've only used accumulators for errors, and I always read errors from outside of tracked functions (e.g., in the "main IDE driver"). This also means that I structure these values differently. For example, in my dada compiler, all spans are stored relative to the enclosing item, so that they don't change when new lines are inserted, but spans for errors are AbsoluteSpan values that have absolute byte indices. If I included those in the output of the function, it would make for a lot of unnecessary churn.

Perhaps we want to make it configurable?

@nikomatsakis (Member, Author):

My hunch is that the use case you are describing is somewhat different, but worth figuring out. I am imagining something like an "accumulation field" on a tracked struct, versus a global accumulator. Maybe we should dive into your particular case a bit deeper.

This is also suggesting to me that it'd be good to start writing out the "Canonical Salsa 3.0 design patterns".

@nikomatsakis added this pull request to the merge queue on Sep 22, 2024.
Merged via the queue into salsa-rs:master with commit 198c43f on Sep 22, 2024.
8 of 10 checks passed.
@MichaReiser (Contributor):

Thanks for the detailed explanation. I read through the code changes again and realized that I had missed that the accumulated values are stored on the Memo; they're not stored in their own table the way single-argument tracked function results are.

> then the structs' memo listings would look like:

Oh, interesting. My understanding was that the table design was different, in that each function would have its own table (stored on the ingredient) rather than the memos being stored on the struct instances. Storing them on the ingredient itself would look like:

  • F1 would have [memo for S1, null, null, null, null] (20%)
  • F2 would have [memo for S1, memo for S2, null, null, null] (40%)
  • F3 would have [null, memo for S2, memo for S3, null, null] (40%)
  • F4 would have [null, null, memo for S3, memo for S4, null] (40%)
  • F5 would have [null, null, null, null, null] (0%, although tables are allocated lazily)

This would give us a worse overall yield, but a better yield when introducing a new tracked struct T, because the existing tables remain unchanged.

> If I included those in the output of the function, it would make for a lot of unnecessary churn.

Yes, this isn't the case for us because the main outputs of lint, type_check, and check_file are the diagnostics. We have more fine-grained queries that ensure changes remain local (although nothing as sophisticated as using relative offsets!). Having said that, accumulators are nice for cutting through multiple layers. In our case this isn't necessary, and not using an accumulator works just fine (for now; maybe we'll need them later).
