Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tech-debt: Context lib #4060

Merged
merged 1 commit into from
Jul 3, 2023
Merged

tech-debt: Context lib #4060

merged 1 commit into from
Jul 3, 2023

Conversation

miguelff
Copy link
Contributor

Problem

All the way down the stack, from the query engine entry point down to quiant, we pass a trace id for issuing telemetry traces. This is a very specific cross-cutting concern, the moment we need more things to be passed in, we will need more and more parameters to functions being called.

Because most of the behaviors down the stack are dynamic (i.e. a trait implementation with specific methods provides the behavior, rather than a struct holding the state) it’s each function in the trait that needs to be parameterized and changed in all the implementations. Consider this commit for instance (from a hackathon project), that passes down a prisma query, only for a subset of the read operations, it had to change more than 600 lines of code:

da24144

This lack of a mechanism to pass down a query context down the stack was also the reason why telemetry capturing was so complex. If it were in place it would have been much easier to implement that feature.

Goals

  • Define a struct for a generic Context object
  • Apply the pattern gradually down the stack, gradually to prevent a very large PR.

@miguelff miguelff requested a review from a team as a code owner June 30, 2023 11:19
@codspeed-hq
Copy link

codspeed-hq bot commented Jun 30, 2023

CodSpeed Performance Report

Merging #4060 will not alter performances

Comparing context (e8797ac) with main (e6267db)

Summary

✅ 11 untouched benchmarks

Comment on lines +13 to +15
pub fn concurrent(self) -> Arc<Mutex<Context>> {
Arc::new(Mutex::new(self))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that the Arc<Mutex<..>> overhead is optional :)


#[derive(Debug, Default)]
pub struct Context {
store: HashMap<(String, TypeId), Box<dyn Any>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we adopt something different than a String as a key type?
I'd strive to avoid too many memory allocations just to retrieve dynamically populated values from Context. Could the key be represented a closed set type, like a numeric enum, whose static list is append-only (similar to the introspection warning codes list we had some months ago)?

E.g.:

enum ContextKey {
  TRACING,
  PRISMA_QUERY,
  ...
}

and

#[derive(Debug, Default)]
pub struct Context {
    store: HashMap<(ContextKey, TypeId), Box<dyn Any>>,
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One can also try to extend this idea to TypeId as well, but I think that can happen in a future query and only if we see a noticeable slowdown in benchmarked performance.

Copy link
Contributor Author

@miguelff miguelff Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No we can't sorry, if we had an enum then the enum has to be known by the context, thus making it coupled to it (which will be a very bad design decision) On top of that if this is passed down as owned value it won't be cloned so there would only be one allocation per key (in case it's not wrapped in an Arc)

What I will do in case needed, is to make it receive a &'static str as a key, as the users of context will identify the keys by name, but rather not prematurely optimize anything until I see how this is applied when refactoring (I have one in progress)

Copy link
Contributor

@jkomyno jkomyno Jul 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I was recommending intentionally coupling Context with its available ContextKeys for the sake of avoiding String allocations, which have already bit us several times in the past: we risk using Strings now just to replace them later. Considering this is a library for the Query Engine and not a general purpose crate, I’d find the coupling acceptable

Copy link
Contributor

@jkomyno jkomyno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved with some nits

@miguelff miguelff merged commit 27eb244 into main Jul 3, 2023
48 checks passed
@miguelff miguelff deleted the context branch July 3, 2023 10:06
@janpio janpio added this to the 5.0.0 milestone Jul 3, 2023
miguelff added a commit that referenced this pull request Jul 14, 2023
miguelff added a commit that referenced this pull request Jul 14, 2023
miguelff added a commit that referenced this pull request Jul 14, 2023
miguelff pushed a commit that referenced this pull request Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants