Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lit-html - non quoted attribute values give ERROR #77

Closed
renxzen opened this issue Feb 14, 2024 · 19 comments
Closed

lit-html - non quoted attribute values give ERROR #77

renxzen opened this issue Feb 14, 2024 · 19 comments

Comments

@renxzen
Copy link

renxzen commented Feb 14, 2024

When working with lit-element webcomponents, i have a problem for binded attribute values.
When the attribute value is just ${...} without quotes, the next line gives an error because the parser is not recognizing the attribute value as a quoted_attribute_value

Lit problem

As you can see, i get ERROR in the tree sitter playground.
But when the binded value is enclosed with quotes the problem goes away:

lit solved

Note: for the framework, it is not necessary for attribute values to be enclosed with quotes, they work without them too.

@amaanq
Copy link
Member

amaanq commented Feb 19, 2024

Yeah I'm gonna need actual HTML code pasted as plain text with parse output from tree-sitter parse to verify this is an issue on our end and not an issue with either a) nvim-treesitter or b) the javascript grammar

Closing until it's evident it's an issue here.

@amaanq amaanq closed this as not planned Won't fix, can't repro, duplicate, stale Feb 19, 2024
@renxzen
Copy link
Author

renxzen commented Feb 19, 2024

here it is the output of tree-sitter parse with the latest git commit of this repo

image

@amaanq
Copy link
Member

amaanq commented Feb 19, 2024

can you paste the code...

@renxzen
Copy link
Author

renxzen commented Feb 19, 2024

Sure, i pasted the code with the cat command above, i'm sorry i didn't put it here

<reminder-element
	id="modify-reminder-modal"
	header-text=${this.operation.computeStyle('active')}
	modal-type="info"
	button-text=""
	fixed-footer
	@click-bottom="${() => this.operation.log()}"
></reminder-element>

@milahu
Copy link

milahu commented Feb 21, 2024

minimal repro

<img src=${'x'}>

imo this is out of scope for an html parser, and should be handled by an JSX parser
since ${expr} can contain arbitrary JS expressions

related #57

@renxzen
Copy link
Author

renxzen commented Feb 21, 2024

Is there a way to tag attribute values that do not have quotes, like =${expr} as (quoted_attribute_value)?

I'm using the html parser inside html`...` templates on lit. Your solution would be to inject the jsx parser inside those templates? The JSX parser already parses expressions as src={"x"} as (jsx_expression) but do not support the =${"x"} expression.

@milahu
Copy link

milahu commented Feb 22, 2024

Is there a way to tag attribute values that do not have quotes

that would require to change the html parser

but still, this is valid html, and a browser will show ${expr} as title

<div title=${expr}>text</div>

I'm using the html parser inside html`...` templates on lit

The JSX parser already parses expressions as src={"x"} as (jsx_expression) but do not support the =${"x"} expression.

aah, so ${expr} can appear anywhere in the template string
and it can be escaped like \${expr} or \\\${expr}

<div>${expr}</div>
<${tagName}>text</${tagName}>

so you would have to extend the html parser to parse ${expr}
and then delegate expr to the JS parser

also you will have to handle escaped quotes: \` and \\\`

@renxzen
Copy link
Author

renxzen commented Feb 22, 2024

<${tagName}>text</${tagName}>
so you would have to extend the html parser to parse ${expr} and then delegate expr to the JS parser
also you will have to handle escaped quotes: \` and \\\`

I haven't encountered that use case yet on lit-html.
Nonetheless, after your comment i ran an experiment with this code:

<element
	type="info"
	header=title
	body=${value}
	${value()}
	button=${value + 'main'}
></element>

Where i found that:

  • header=title without quotes, is correctly being parsed as (attribute (attribute_value))
  • body=${value} is correctly being parsed as (attribute (attribute_value))
  • ${value()} is correctly being parsed as (attribute (attribute_name))
  • button=${value + 'main'} here is where the error occurs.

Apparently the problem comes when there's a single quotes inside the brackets, eg. this.translate('someString')
I got this output with tree-sitter parse:

(document [0, 0] - [8, 0]
  (element [0, 0] - [6, 11]
    (start_tag [0, 0] - [6, 1]
      (tag_name [0, 1] - [0, 8])
      (attribute [1, 1] - [1, 12]
        (attribute_name [1, 1] - [1, 5])
        (quoted_attribute_value [1, 6] - [1, 12]
          (attribute_value [1, 7] - [1, 11])))
      (attribute [2, 1] - [2, 13]
        (attribute_name [2, 1] - [2, 7])
        (attribute_value [2, 8] - [2, 13]))
      (attribute [3, 1] - [3, 25]
        (attribute_name [3, 1] - [3, 5])
        (attribute_value [3, 6] - [3, 25]))
      (attribute [4, 1] - [4, 18]
        (attribute_name [4, 1] - [4, 18]))
      (attribute [5, 1] - [5, 21]
        (attribute_name [5, 1] - [5, 7])
        (attribute_value [5, 8] - [5, 21]))
      (attribute [5, 22] - [5, 23]
        (attribute_name [5, 22] - [5, 23]))
      (ERROR [5, 24] - [5, 31]
        (ERROR [5, 25] - [5, 29])
        (ERROR [5, 30] - [5, 31])))
    (end_tag [6, 1] - [6, 11]
      (tag_name [6, 3] - [6, 10]))))
html.html       0 ms    (ERROR [5, 24] - [5, 31])

@DiegoCardoso
Copy link

I started getting my whole Neovim to freeze whenever I opened some of the files in the project I am working on that use lit-html templates. Interestingly it only started to occur after I updated all my plugins (probably a noobie mistake).

After finding this thread, I tried surrounding all attributes with quotes and noticed that the error was gone. Is there any workaround for this issue? I want to use the HTML parser but it's impossible since it completely freezes the application.

@renxzen
Copy link
Author

renxzen commented Mar 31, 2024

hello @DiegoCardoso can you tell me if you have a function with a string parameter?

e.g.
atribute=${function('value')}

@DiegoCardoso
Copy link

DiegoCardoso commented Mar 31, 2024

Not in this case. I went and added quotes in all bindings and removed them one by one until I found that removing quotes on the @click event binding of the following snippet would cause the issue:

<button. 
  class="i18n-cta-button"
  .disabled=${!someStaticStringsSelected}
  @click="${() => this.internationalizeSelection()}">
    Internationalize selection
</button>

Weirdly, there are other bindings with similar content where the same error does not occur.

EDIT

I opened the inspect tree pane and performed the same operation, and I got the following stack trace (couldn't take the whole message):
image

@milahu
Copy link

milahu commented Mar 31, 2024

I'm using the html parser inside html`...` templates on lit

javascript template string != html

this will never work with an html parser
so this is clearly out of scope for tree-sitter-html

you will have to create a javascript-and-html parser
or stop using lit, and start using a JSX-based framework like qwik

   const tpl = html`<div class=${x ? "a" : 'b'}></div>`;
// ^^^^^^^^^^^^^^^^^ js        ^               ^      ^
//                  ^^^^^^^^^^^ html
//                             ^^^^^^^^^^^^^^^^ js
//                                             ^^^^^^^ html
//                                                    ^^ js

@renxzen
Copy link
Author

renxzen commented Mar 31, 2024

javascript template string != html

This is html we're talking about tho. Here it is a minimal repro:

<element
    type=${function()}
    body=${function('value')}
></element>

output:

(document [0, 0] - [4, 0]
  (element [0, 0] - [3, 11]
    (start_tag [0, 0] - [3, 1]
      (tag_name [0, 1] - [0, 8])
      (attribute [1, 1] - [1, 19]
        (attribute_name [1, 1] - [1, 5])
        (attribute_value [1, 6] - [1, 19]))
      (attribute [2, 1] - [2, 17]
        (attribute_name [2, 1] - [2, 5])
        (attribute_value [2, 6] - [2, 17]))
      (ERROR [2, 17] - [2, 26]
        (ERROR [2, 18] - [2, 23])
        (ERROR [2, 24] - [2, 26])))
    (end_tag [3, 1] - [3, 11]
      (tag_name [3, 3] - [3, 10]))))
html.html          0.10 ms         711 bytes/ms (ERROR [2, 17] - [2, 26])

the problem i'm encountering is not similar to @DiegoCardoso
since i encounter an error when the attribute value has single quotes, like calling a function with a string as parameter

or stop using lit, and start using a JSX-based framework like qwik

ironically, i love qwik. it's my favorite lmao.

@milahu
Copy link

milahu commented Mar 31, 2024

This is html we're talking about

no, that is "javascript with embedded html"

@renxzen
Copy link
Author

renxzen commented Mar 31, 2024

i provided a html example with tree-sitter-cli output using this repo's latest commit as the parser

regardless of where i found this, which is html templates from javascript, i think the fact that as long as the attribute value is unquoted, has a ${...} expression and doesn't have quote characters inside, is correctly tagged as (attribute (attribute_value)), means that these kind of expressions were expected in the parser from the beginning.

is there a way to ignore single quotes inside ${ ... } for the parser?

i'd love to implement and test it, could you maybe point me in the right direction?
since i'm not familiar with C/C++ in this project, maybe it would take me a while to figure things out

@milahu
Copy link

milahu commented Mar 31, 2024

regardless of where i found this

it does matter where this html is stored. context matters

html templates from javascript

exactly
this means the html parser will never see ${expr}

body=${function('value')}

at least this should not give a parse error
but should parse ${function('value')} as attribute value

unquoted attribute values end on space or >
so this would break on body=${function('va space lue')}
which would parse ${function('va as attribute value
space as boolean attribute and lue')} as parse error

is there a way to ignore single quotes

such hacks dont make sense
either you want to support all javascript expressions in ${expr} or none

point me in the right direction

maybe look at JSX parsers
maybe modify the template string parser of a javascript parser
maybe find a way to merge javascript and html parser

see also

@DiegoCardoso
Copy link

It is still not clear to me why the HTML parser suddenly started to be used for the string template passed to the Lit html function. The only thing I did was to update my plugins (unfortunately, I believe I won't be able to get back the versions I updated from).

Nevertheless, it feels wrong that the parser would cause a stack overflow error and freeze the whole application.

Is there a simple way to disable the parser so that it won't try to parse JS strings? Now I need to uninstall the parser so I can continue working on my project.

@renxzen
Copy link
Author

renxzen commented Apr 15, 2024

maybe look at JSX parsers maybe modify the template string parser of a javascript parser maybe find a way to merge javascript and html parser

Thank you so much for your help @milahu !!
Thanks to the documentation you provided, i ended up modifying this parser to accomodate my lit-html needs
https://github.com/renxzen/tree-sitter-html

Been testing it for these past few days and gave me an idea about creating a lit-html parser that would not take into account single quotes cases and parse ${...} into their own elements, since that's mostly what's been happening for html`` templates inside javascript.
Would like to know your opinion about it

@milahu
Copy link

milahu commented Apr 16, 2024

nice try, but that is still no javascript-in-javascript-string-interpolation parser

grammar.js.diff

--- a/grammar.js
+++ b/grammar.js
@@ -120,13 +120,16 @@ module.exports = grammar({
         choice(
           $.attribute_value,
           $.quoted_attribute_value,
+          $.js_expression_value,
         ),
       )),
     ),
 
-    attribute_name: _ => /[^<>"'/=\s]+/,
+    attribute_name: _ => /[^<>"/=\s]+/,
 
-    attribute_value: _ => /[^<>"'=\s]+/,
+    attribute_value: _ => /[^<>"=\s]+/,
+
+    js_expression_value : _ => /\${[^}]+}/,
 
     // An entity can be named, numeric (decimal), or numeric (hexacecimal). The
     // longest entity name is 29 characters long, and the HTML spec says that
@@ -134,7 +137,6 @@ module.exports = grammar({
     entity: _ => /&(#([xX][0-9a-fA-F]{1,6}|[0-9]{1,5})|[A-Za-z]{1,30});?/,
 
     quoted_attribute_value: $ => choice(
-      seq('\'', optional(alias(/[^']+/, $.attribute_value)), '\''),
       seq('"', optional(alias(/[^"]+/, $.attribute_value)), '"'),
     ),

js_expression_value : _ => /\${[^}]+}/,

this fails on

const tpl = html`
  <br id=${"}"}>
`;

and

const tpl = html`
  <br id=${await (async () => {
    // this can contain literally any javascript code, wrapped in an IIFE
    (await import("...")).asdf();
    return html`... including nested template strings`;
  })()}>
`;

nit: rename tree-sitter-html to tree-sitter-lit-html
nit: use latest tree-sitter-cli to reduce diff noise in src/tree_sitter/parser.h
nit: add online playground, example: ikatyang/tree-sitter-yaml

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants