Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: remove ScalarValue type parameter #782

Closed

Conversation

mirosval
Copy link

@mirosval mirosval commented Oct 9, 2020

The type parameter S for ScalarValue is used to open up a degree of freedom in the parsing of leaf scalar types. However because of that it is threaded through the entire codebase. This commit removes the type parameter in all possible places with the following in mind:

  • Fix the S parameter to DefaultScalarValue where necessary (mainly in Value which occurs on the return path after resolution)
  • Remove the parameter everywhere else

This commit effectively disables the feature. I'm committing here mainly because now it compiles so I want a safe point.

Next we need to figure out how to re-add the ability to control the leaf value parsing. I figured that we can do that effectively from this point either by adding it from scratch or by un-removing it from this commit.

The type parameter S for ScalarValue is used to open up a degree of freedom in the parsing of leaf scalar types. However because of that it is threaded through the entire codebase. This commit removes the type parameter in all possible places with the following in mind:
* Fix the S parameter to DefaultScalarValue where necessary (mainly in Value which occurs on the return path after resolution)
* Remove the parameter everywhere else

This commit effectively disables the feature. I'm committing here mainly because now it compiles so I want a safe point.

Next we need to figure out how to re-add the ability to control the leaf value parsing
Now the idea is that we'll let users annotate their types with a ScalarValue, but we'll generate a converter rather than threading the type all through the codebase. To that end this commit restores some of the bookkeeping previously deleted.
@tyranron tyranron added k::api Related to API (application interface) semver::breaking Breaking change in terms of SemVer enhancement Improvement of existing features or bugfix k::integration Related to integration with third-party libraries or systems labels Oct 12, 2020
@tyranron
Copy link
Member

I'm very much in favour of it. S: ScalarValue is a dreaded nightmare of this codebase. Its benefits are too small in comparance how it pollutes every single aspect of this library, and posses implementation and user-facing challenges.

@mirosval
Copy link
Author

@tyranron @ccbrown guys, after working on this for over a week, I'm at a point where I could use some guidance.

What I have done so far is to remove the <S> type parameter from everywhere apart from the proc macros. Somewhat surprisingly everything compiled and most of the tests pass as well (apart from integration tests where apparently this functionality is exercised).

My understanding of the problem is currently that we need an implementation of ScalarValue in order to:

  • Parse GraphQL variables of InputValues and pass these to the resolvers
  • Serialize the responses of resolvers back into json

The current approach achieves this by threading the ScalarValue through the entire graph so that it can be used to parse, validate and resolve the leaves.

What I wanted to do is to leave the InputVariables basically unparsed and late-bind something that can parse them only at resolution time. At the same time that something can perform the inverse conversion on the result of the resolver. In all likelihood that something needs to be passed to the validation too so we can be sure the type will convert at the time of resolution.

I have attempted to do this by changing all DefaultScalarValue variants to contain String. However this seems to lead to issues with some of the macros, because they expect &str and I have a String that needs to be parsed, always resulting in an owned object that I can't return.

The progress is slow also because there seems to be a circular dependency between juniper and juniper_codegen, so I can't just limit my changes to one of the crates, since that breaks the other.

Anyway, I thought I'd ask for opinions before spending another fruitless night fighting the borrow checker :)

@ccbrown
Copy link
Contributor

ccbrown commented Oct 14, 2020

I think my best advice is to work on this with more of a bottom-up approach that can be merged piece-meal. I have a local branch where I started to remove ScalarValue from the AST. Later today I'll see if I can polish that up into a mergeable state and get it into a PR. If we merge small pieces like that I feel like this will become much more approachable.

@mirosval
Copy link
Author

Yeah, that's what I tried in the beginning, but since everything is interconnected, I ended up removing it everywhere at once. I guess what I'm asking is

  1. what should the end result look like and
  2. where should the boundaries of these intermediate steps be.

@ccbrown
Copy link
Contributor

ccbrown commented Oct 15, 2020

The first places where ScalarValue will need to be removed are the AST nodes. For that to happen, we need to figure out what Juniper functionality actually requires parsed ScalarValues and what functionality can just work with the raw scalar token. In order to do that, I would suggest moving InputValue, FromInputValue, and ToInputValue out of ast and directly into the top level of the crate.

In the AST, we should represent values with an enum that looks something like this:

pub enum Value<'a, S> {
    Null,
    Bool(bool),
    Scalar(ScalarToken<'a>),
    Enum(String),
    Variable(String),
    List(Vec<Spanning<Value<'a, S>>>),
    Object(Vec<(Spanning<String>, Spanning<Value<'a, S>>)>),
    _ScalarValue(PhantomData<S>), // TODO: remove me
}

Because just deleting all of the <S>s is a massive undertaking in itself (as you've found), I would suggest the first change set just leave all of them and focus on moving the scalar value parsing to the right places. Hence the PhantomData<S> variant.

Once everything is updated to either use ast::Value or keep using InputValue, there will be a small handful of places in the executor and validation that need to convert from ast::Value to InputValue.

These will need to call some function that takes a ast::Value and MetaType<S> and returns an InputValue result.

Once we have those conversions done outside of the parser we can focus on deleting all of the template variables in a subsequent commit or PR.

Hope that helps!

@tyranron
Copy link
Member

tyranron commented Oct 15, 2020

@mirosval @ccbrown giving this a lot of thoughts and experiments before, I've found the following observations and conclusions:

There are two purposes of using ScalarValue in Juniper:

  1. Using custom ScalarValue for an efficient parsing of some custom GraphQL scalar (for example: we want MyId to parse directly into u128 but hold it as number in JSON, and we don't want to go trough default i32 representation).
  2. Using custom ScalarValue to efficiently back the whole GraphQL schema/root (for example: use SmolStr for representing strings, and so cut off some allocations).

Beside those I don't see any other use cases for using custom ScalarValue.

So, the whole ScalarValue thing is essentially rooted in two types: Value<S> and InputValue<S> of ast. Once we solve how to remove or move ScalarValue parametrization out of there - everything else is a trivial routine (though, a huge one).

Let's start with a simple one: InputValue<S>.
Currently, it's used in two places:

  • when a GraphQL scalar value is parsed from AST value;
  • stored in schema as default values for field arguments.

The parsing is not a big deal. Similarly to what @ccbrown has proposed above, we may just put the ScalarToken<'a> into InputValue instead S: ScalarValue:

pub enum InputValue<'a, S> {
    // ...
    Scalar(ScalarToken<'a>),
    // ...
}

And allow to parse from &str directly when implementing GraphQL scalar. This way we even omit some in-between parsing and allow arbitrary GraphQL scalars to have arbitrary parsing, which isn't possible at the moment (because the only one ScalarValue type is used for the whole schema).

However, keeping defaults in schema is a question:
In the current implementation the schema keeps already half-parsed values for field arguments, and whenever the field is invoked (which is hot path, unfortunately) it takes this half-parsed value, converts into GraphQL scalar (second half of parsing) and passes into the field as the argument.
However, if we would keep &str or String there, then we should fully parse a GraphQL scalar from its string representation every time it's used on hot path, which seems to be a reasonable performance penalty.
Another thought on this case is to parse a GraphQL scalar value only once when constructing a schema and keep in schema an erased Box<dyn Any> value. It still has a performance penalty of dynamic dispatch + downcast (which should be negligable in comparance to every-where-boxing in async), but gives us a desired level of flexibility.

Now, move on to a harder problem: Value<S>.
I've thought about this a lot, and haven't figured out any feasible solutions at all. The problem is that Value<S> can be constructed and used dynamically (see FieldError<S>), without connection to any real user Rust types, and we have even graphql_value! macro for those purposes. Given this, it's totally unclear how we can remove S type parameter from the Value<S> type, but still allow to swap the inner Value::Scalar representation. Going with trait objects instead would be a huge performance penalty, as this is a hot path stuff.
I've also checked other implementations like graphql-parser and async-graphql, but all of them don't ship an ability to swap scalar value representation.

@ccbrown
Copy link
Contributor

ccbrown commented Oct 15, 2020

Yeah I don't expect Value<S> to ever lose the template parameter.

@tyranron
Copy link
Member

@ccbrown and that's the problem:
If it remains in Value then FieldError will have it too. If FieldError has it, then this type parameter is leaked into object/interface macros in fallible fields.

@ccbrown
Copy link
Contributor

ccbrown commented Oct 15, 2020

I don't think that prevents us from getting rid of the type parameter in all of the AST types, which personally is my main hope. I'm not even expecting to remove it from InputValue necessarily. But I believe it should be perfectly feasible to delay the parsing of scalar tokens until after document parsing, which would allow us to do document parsing without knowing the schema and at least eliminate the parameter on ast::Document, ast::Selection, etc.

@mirosval
Copy link
Author

Thank you for the thoughtful advice. I see now that this PR is a little overzealous with removing it in too many places. I'd propose the following:

  1. I'm going to revert the changes to Value and all types that descend from it. It will remain with the type parameter for the time being. This means all types on the return path including FieldError and
  2. I'm going to focus on InputValue and change the Scalar representation there to include the ScalarToken

Let me restate the general problem again:

  • Given an user-defined function fn resolve(arg: A) -> R
  • We need a way to transform it using macros into a function of the form fn resolve(arg: InputValue) -> Value that we can call from the executor
  • In order to do that, we need two functions fn in(InputValue) -> Result<A> and fn out(R) -> Value

To me this looks a lot like the From trait. Can we not just call .into() and have the compiler find the right impl of the From trait for us? This way no parameter is needed. We put our default impls into the prelude. If users want to use non-standard types, they will be forced to write the conversion. If they want to interpret a standard type differently they have to stop including prelude and instead do a la carte import.

@tyranron
Copy link
Member

ack @LegNeato it would be nice to hear some thoughts from you on this topic. Getting rid of ScalarValue will simplify drastically both an implementation and user experience.

Actually, I still doubt that library users use custom ScalarValues.

@mirosval
Copy link
Author

Guys, I have re-started the work on this already 3x from different ends and it seems I can't figure out how to do this in small pieces. I'm giving up.

@mirosval mirosval closed this Oct 27, 2020
@ccbrown
Copy link
Contributor

ccbrown commented Oct 27, 2020

@mirosval Thanks for your efforts. For what it's worth, I feel we've done enough due diligence here to move forward with #776. It seems removing ScalarValue from Document is not likely to be a short term feat.

@ruslantalpa
Copy link

@tyranron

Actually, I still doubt that library users use custom ScalarValues.

I would have liked to use custom ScalarValue but i am blocked by Value using DefaultScalarValue

The need came from the fact that sometimes, one has a "json string" (coming from the database directly) and i would like to avoid a serialize/deserialize step

I was looking at having something like

enum MyScalarValue {
    Int(i32),
    Float(f64),
    String(String),
    Boolean(bool),
    //custom
    RawValue(String),
}

Then look into using serde's RawValue for that (if it does what i think it does)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improvement of existing features or bugfix k::api Related to API (application interface) k::integration Related to integration with third-party libraries or systems semver::breaking Breaking change in terms of SemVer wontfix This will not be worked on
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants