Refactor accumulators #575

Merged · 7 commits from nikomatsakis:refactor-accumulators into salsa-rs:master · Sep 22, 2024

Conversation

nikomatsakis (Member)

Refactor how accumulators are implemented. Instead of storing the values accumulated by a particular function X in a central map in the ingredient, attach them to the memo for X. This moves us a big step closer towards having all relevant data stored in tables and will help with speculative execution, lightweight dependencies, and a number of forward goals.

Review requested from @MichaReiser, if you have time, and others too =)

Commit messages:

• This will allow us to store the id in the struct itself.
• There is no need for IndexMap here; we don't iterate or care about indices.
• We used to have the concept of "resetting" a tracked struct, which removed all instances. We don't have that concept anymore, so we no longer need to record a read on the tracked struct table itself.
• This is a step towards refactoring `fetch` to return the memo itself rather than the value.
• Change the core operation to be `refresh_memo`. This is a nice refactoring on its own, but it will also support the refactoring of how we manage accumulator values.
• We used to store values in a central map, but now each memo has an `AccumulatorMap` that holds its accumulated values (if any).

The primary goals of this change are:

* forward compatibility with speculative execution, because it puts more data into tables;
* a step towards a refactoring to stop tracking outputs in the same array as inputs, and thus to simplify how we do versioning. We will no longer need to walk the function's outputs and refresh their versions and so forth, because they are stored in the function memo and so get refreshed automatically when the memo is refreshed.
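As a rough illustration of the representational shift (placeholder types only, not salsa's actual definitions), the values accumulated by a function move from a central per-ingredient map into the function's own memo:

```rust
use std::collections::HashMap;

// Placeholder ids; salsa's real key and value types are more involved.
type DatabaseKey = u32;   // identifies a particular tracked-function call
type AccumulatorId = u32; // identifies an accumulator ingredient
type Value = String;

// Before: the accumulator ingredient owned a central map from the
// accumulating function call to the values it accumulated.
struct CentralAccumulatorStorage {
    per_call: HashMap<DatabaseKey, Vec<Value>>,
}

// After: each memo carries its own (usually empty) map of accumulated
// values, so the data lives in the tables alongside the memoized result.
struct AccumulatorMap {
    values: HashMap<AccumulatorId, Vec<Value>>,
}

struct Memo<V> {
    value: V,
    accumulated: AccumulatorMap,
    // ...plus the verified-at revision, dependencies, and so on.
}
```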

netlify bot commented Sep 21, 2024

Deploy Preview for salsa-rs canceled.

Latest commit: b5af1d8
Latest deploy log: https://app.netlify.com/sites/salsa-rs/deploys/66eebb070b2bfb0008f71a85


codspeed-hq bot commented Sep 21, 2024

CodSpeed Performance Report

Merging #575 will not alter performance

Comparing nikomatsakis:refactor-accumulators (b5af1d8) with master (2af849b)

Summary

✅ 8 untouched benchmarks

@MichaReiser (Contributor) left a comment:

The code changes look good to me.

What I remember from the table refactor is that it avoids a hash map lookup by storing the result of query(a) -> b at the index a. This means the design is based on the assumption that queries are called for most of their arguments (a densely populated map).

I understand that we now do the same for accumulated values: the accumulated values for query(a) are stored at the index a. I think this could be problematic, because the main use case for accumulators is diagnostics, and diagnostics are rare. Only a few queries will have accumulated values. The worst case is that a is stored at a very high index, requiring salsa to allocate many pages to store a single accumulated value/diagnostic.

Do you foresee changes to the table representation that can better support sparsely populated tables?
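To make the worry concrete, here is a minimal sketch (illustrative only, not salsa's table implementation) contrasting a dense, index-addressed store with a sparse one keyed only by the ids that actually accumulated something:

```rust
use std::collections::HashMap;

/// Dense layout: one slot per argument id, even if most ids never accumulate
/// anything. A single diagnostic at id 100_000 forces the backing storage to
/// grow to cover every lower id as well.
struct DenseStore<T> {
    slots: Vec<Option<Vec<T>>>,
}

impl<T> DenseStore<T> {
    fn push(&mut self, id: usize, value: T) {
        if id >= self.slots.len() {
            self.slots.resize_with(id + 1, || None);
        }
        self.slots[id].get_or_insert_with(Vec::new).push(value);
    }
}

/// Sparse layout: only ids that actually accumulated values pay for storage.
struct SparseStore<T> {
    entries: HashMap<usize, Vec<T>>,
}

impl<T> SparseStore<T> {
    fn push(&mut self, id: usize, value: T) {
        self.entries.entry(id).or_default().push(value);
    }
}
```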

Related: it would also be nice if we could avoid the Arc inside of Memo but I don't think that's possible with today's table design because we relocate the Memo when storing a new value.

Comment on lines +22 to +32
// NOTE: We don't have a precise way to track accumulated values at present,
// so we report any read of them as an untracked read.
//
// Like tracked struct fields, accumulated values are essentially a "side channel output"
// from a tracked function, hence we can't report this as a read of the tracked function(s)
// whose accumulated values we are probing, since the accumulated values may have changed
// even when the main return value of the function has not changed.
//
// Unlike tracked struct fields, we don't have a distinct id or ingredient to represent
// "the values of type A accumulated by tracked function X". Typically accumulated values
// are read from outside of salsa anyway so this is not a big deal.
@MichaReiser (Contributor) commented:
I believe this is an existing comment and unrelated to the changes in this PR. At least I remember coming across it when evaluating whether Ruff should use accumulators.

We decided not to use accumulators and instead return the results as part of the function's output, because of the added untracked_read dependency when accumulators are read inside a query. Maybe we're using them incorrectly, but our use case is roughly the following (a sketch of the alternative we chose follows the list):

  • A check_file query that calls lint and infer
  • the lint query creates LintDiagnostics
  • the infer query creates TypeDiagnostic
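Here is a hedged sketch of that shape (plain Rust with made-up types; the salsa tracked-query attributes and database handle are omitted): diagnostics flow through ordinary return values instead of an accumulator.

```rust
// Hypothetical stand-ins for Ruff's types; the salsa query plumbing is omitted.
#[derive(Clone)]
struct File;
#[derive(Clone)]
struct LintDiagnostic(String);
#[derive(Clone)]
struct TypeDiagnostic(String);
#[derive(Clone)]
struct TypeInfo;

// Each query returns its diagnostics as part of its ordinary output, so
// downstream queries depend on them through normal value tracking rather
// than through an accumulator read.
fn lint(_file: &File) -> Vec<LintDiagnostic> {
    Vec::new()
}

fn infer(_file: &File) -> (TypeInfo, Vec<TypeDiagnostic>) {
    (TypeInfo, Vec::new())
}

fn check_file(file: &File) -> (Vec<LintDiagnostic>, Vec<TypeDiagnostic>) {
    let lint_diagnostics = lint(file);
    let (_types, type_diagnostics) = infer(file);
    (lint_diagnostics, type_diagnostics)
}
```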

@MichaReiser (Contributor):

I should have said that the code changes are great! I find it overall easier to follow what's happening.

But I am interested in your perspective on whether the "densely populated tables" assumption is an issue for accumulators. It just feels wasteful to me to allocate a bunch of pages to store a single diagnostic for the file with index 100_000.

@MichaReiser (Contributor) left a comment:

The dense table discussion is something that we can resolve separately from this PR.

One more data point for that discussion: Ruff has a check_file(File) query that runs for first-party files only. I used countme to get some rough numbers, and check_file is called for 1538 out of the 3750 files (first party, third party, and standard library files).

@nikomatsakis (Member, Author) commented Sep 22, 2024:

@MichaReiser

Actually, that's not quite right. The accumulators are not stored "densely" in the same way that tracked function memos are stored. Every tracked function memo has a hashmap, yes, but the hashmap is empty most of the time. We only add an entry when you actually accumulate something. So it's sparse in the sense that we don't assume most tracked functions will accumulate something from most accumulators.

I feel like what is needed here is probably some set of diagrams; using English paragraphs feels like a fairly imprecise way to describe what is going on, and I am afraid we may be saying similar things in different ways.

That said, let me give you an example. Suppose there are four tracked functions (F1, F2, F3, and F4) and 3 accumulators (A1, A2, A3), and suppose that

  • F1 accumulates values for A1, A2, and A3
  • F2 accumulates values for A2 and A3
  • F3 accumulates values for A3 only
  • F4 accumulates nothing

In that case, the storage would look like the following (see the sketch after this list):

  • F1's memo has a FxHashMap like { A1: [...], A2: [...], A3: [...] } (to use Python notation :)
  • F2's memo has a FxHashMap like { A2: [...], A3: [...] }
  • F3's memo has a FxHashMap like { A3: [...] }
  • F4's memo has an empty FxHashMap like { }
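A minimal, self-contained sketch of that sparse layout (std's HashMap stands in for FxHashMap, and the value types are placeholders rather than salsa's real Memo/AccumulatorMap definitions):

```rust
use std::collections::HashMap;

// Placeholder names; salsa's real Memo and AccumulatorMap types are not shown.
type AccumulatorId = &'static str;
type Accumulated = Vec<&'static str>;

fn accumulator_map(entries: &[(AccumulatorId, Accumulated)]) -> HashMap<AccumulatorId, Accumulated> {
    entries.iter().cloned().collect()
}

fn main() {
    let f1 = accumulator_map(&[("A1", vec!["..."]), ("A2", vec!["..."]), ("A3", vec!["..."])]);
    let f2 = accumulator_map(&[("A2", vec!["..."]), ("A3", vec!["..."])]);
    let f3 = accumulator_map(&[("A3", vec!["..."])]);
    let f4 = accumulator_map(&[]); // F4 accumulates nothing, so its map stays empty.

    // Each memo only stores entries for the accumulators it actually used.
    assert_eq!(f1.len(), 3);
    assert_eq!(f2.len(), 2);
    assert_eq!(f3.len(), 1);
    assert!(f4.is_empty());
}
```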

In contrast, with tracked structs, at least the way it's set up now, we use a vector of memos. So if we have a tracked struct S with 5 instances (S1, S2, S3, S4, and S5), and...

  • F1 and F2 are invoked on S1
  • F2 and F3 are invoked on S2
  • F3 and F4 are invoked on S3
  • F4 is invoked on S4
  • and nothing is invoked on S5

then the structs' memo listings would look like:

  • S1 would have [memo for F1, memo for F2] (length of 2, 0% unutilized)
  • S2 would have [null, memo for F2, memo for F3] (length of 3, 33% unutilized)
  • S3 would have [null, null, memo for F3, memo for F4] (length of 4, 50% unutilized)
  • S4 would have [null, null, null, memo for F4] (length of 4, 75% unutilized)
  • S5 would have [] (length of 0, 0% unutilized)

I strongly suspect we should use some other system for memo storage. That said, the above case doesn't look too bad to me; what's really painful is that we assign globally unique indices to functions, even though every function can be used with at most one kind of tracked struct. We should let the indices overlap. In other words, if we had two kinds of tracked structs (S and T) and a tracked function FT that gets invoked on T instances, then EVERY instance of T would have a memo table that starts with [null, null, null, null], because it has to leave space for memos from F1..F4 even though those functions can only be invoked on S instances. That's an obvious place to optimize.
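For concreteness, a hedged sketch of the memo-vector layout described above, using placeholder types rather than salsa's actual memo tables:

```rust
// Placeholder memo; in salsa this would be the full memoized result.
#[derive(Debug)]
struct Memo(&'static str);

// One memo table per tracked struct instance, indexed by a *global*
// function index; slots for functions that were never invoked (or can
// never apply to this struct) show up as `None`.
type MemoTable = Vec<Option<Memo>>;

fn utilization(table: &MemoTable) -> f64 {
    if table.is_empty() {
        return 1.0;
    }
    let used = table.iter().filter(|slot| slot.is_some()).count();
    used as f64 / table.len() as f64
}

fn main() {
    // S4 from the example above: only F4 (global index 3) was invoked on it.
    let s4: MemoTable = vec![None, None, None, Some(Memo("memo for F4"))];
    println!("S4 utilization: {:.0}%", utilization(&s4) * 100.0); // prints 25%
}
```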

@nikomatsakis (Member, Author):

> I believe this is an existing comment and unrelated to the changes in this PR. At least I remember coming across it when evaluating whether Ruff should use accumulators.

Correct.

> We decided not to use accumulators and instead return the results as part of the function's output, because of the added untracked_read dependency when accumulators are read inside a query.

Hmm. So, with the old representation, it was very hard to determine when the set of values accumulated by a function had changed, which is why I adopted this "unversioned" approach. In the new representation, however, it's actually fairly easy: we have the old memo and its AccumulatorMap, and we have the new AccumulatorMap. We can just compare them and take that into account for the last-changed revision of the function.
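A minimal sketch of that comparison, assuming (hypothetically) that the accumulator map supports equality; the names here are illustrative, not salsa's API:

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for salsa's accumulator types.
type AccumulatorId = u32;
type Value = String;
type AccumulatorMap = HashMap<AccumulatorId, Vec<Value>>;

/// If the newly computed accumulated values differ from the ones stored in the
/// old memo, the function's last-changed revision should be bumped even when
/// its main return value is unchanged.
fn accumulated_values_changed(old: &AccumulatorMap, new: &AccumulatorMap) -> bool {
    old != new
}
```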

That said, in the code I've written, I've only used accumulators for errors, and I always read errors from outside of tracked functions (e.g., in the "main IDE driver"). This also means that I structure these values differently. For example, in my dada compiler, all spans are stored relative to the enclosing item, so that they don't change when new lines are inserted, but spans for errors are AbsoluteSpan values that have absolute byte indices. If I included those in the output of the function, it would make for a lot of unnecessary churn.

Perhaps we want to make it configurable?

@nikomatsakis (Member, Author):

My hunch is that the use case you are describing is somewhat different, but worth figuring out. I am imagining something like an "accumulation field" on a tracked struct, versus a global accumulator. Maybe we should dive into your particular case a bit deeper.

This is also suggesting to me that it'd be good to start writing out the "Canonical Salsa 3.0 design patterns".

@nikomatsakis added this pull request to the merge queue on Sep 22, 2024.
Merged via the queue into salsa-rs:master with commit 198c43f on Sep 22, 2024.
8 of 10 checks passed.
@MichaReiser (Contributor):

Thanks for the detailed explanation. I read through the code changes again and realized that I had missed that the accumulated values are stored on the Memo; they're not stored in their own table the way single-argument tracked function results are.

> then the structs' memo listings would look like:

Oh, interesting. My understanding was that the table design was different, in that each function would have its own table (stored on the ingredient) rather than the memos being stored on the struct instances. Storing them on the ingredient itself would look like:

  • F1 would have [memo for S1, null, null, null, null] (20%)
  • F2 would have [memo for S1, memo for S2, null, null, null] (40%)
  • F3 would have [null, memo for S2, memo for S3, null, null] (40%)
  • F4 would have [null, null, memo for S3, memo for S4, null] (40%)
  • F5 would have [null, null, null, null, null] (0%, although tables are allocated lazily)

This would give us a worse overall yield, but a better yield when introducing a new tracked struct T, because the existing tables remain unchanged.

> If I included those in the output of the function, it would make for a lot of unnecessary churn.

Yes, this isn't the case for us because the main outputs of lint, type_check, and check_file are the diagnostics. We have more fine-grained queries that ensure changes remain local (although nothing as sophisticated as using relative offsets!). Having said that, accumulators are nice for cutting through multiple layers. In our case this isn't necessary, and not using an accumulator works just fine (for now; maybe we'll need them later).
