Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor code generation #373

Open
danieleades opened this issue Jan 17, 2023 · 3 comments
Open

Refactor code generation #373

danieleades opened this issue Jan 17, 2023 · 3 comments

Comments

@danieleades
Copy link
Contributor

danieleades commented Jan 17, 2023

The code generation is very bespoke, and I think there's significant scope for simplifying this, but it would be quite a lot of work, so I wanted to gauge the appetite for this before I start.

Currently the code is generated as a formatted string, using a custom code-gen framework.

I propose generating an abstract syntax tree representation instead, and then formatting it at the end.

That would allow you to use rust's standard tooling (syn and quote) to generate the AST. You could then format using prettyplease or by shelling out to rustfmt

@dwrensha
Copy link
Member

I agree that the capnpc could use a lot of refactoring. I don't think we need to bring in new dependencies, though. quote seems like overkill. We don't need to make an AST. We just need to print the generated code as a string.

@dzfranklin
Copy link
Contributor

This library already requires you to set up an environment where you can shell out to capnp so I don't see a point in adding inconvenience to avoid shelling out to rustfmt.

Automatically running the output through rustfmt sound great though. I think it could let you significantly simplify FormattedText. Maybe it could just be a Vec<String>?

@edfraenkel
Copy link

edfraenkel commented Sep 6, 2023

It is difficult for the end-user to get a grip on the code generation process, partly because the to-be-interpolated code lacks visual clarity:

let getter_result = Branch(vec![
line("#[inline]"),
Line(fmt!(ctx,
"pub fn which(self) -> ::core::result::Result<{concrete_type}, {capnp}::NotInSchema> {{"
)),
indent(vec![
Line(format!(
"match self.{field_name}.get_data_field::<u16>({doffset}) {{"
)),
indent(getter_interior),
line("}"),
]),
line("}"),
]);

Using a procedural macro which interpolates the variables (like for instance the one in the quote crate) would result in much shorter and much more readable code:

quote_like!{
    #[inline]
    pub fn which(self) -> ::core::result::Result<#concrete_type, #capnp::NotInSchema> {
        match self.#field_name.get_data_field::<u16>(#doffset) {
            #getter_interior
        }
    }
}

In principle (as per the documentation) users should have the flexibility to add functionality to the codegen (for instance with custom annotations) and in many cases this would require forking and modifying the existing codegen. I would therefore argue that the codegen needs to be especially transparent and comprehensible.

The usage of the quote library might initially appear as overkill (it is true that in principle no AST is required in in the output) but it can serve as a reliable intermediate data format, facilitating future development and maintainability. The quote library's documentation even provides a section on non macro code generation so it does seem like an acceptable use case.

Of course there might be a performance penalty and relying on another dependency would undoubtedly increase compile times.

Another option would be to write a custom procedural macro, similar to quote! which only performs the interpolation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants