Does not parse HTML properly #18

wosc · 2023-02-01T08:34:29Z

Our production application contains quite a few inline <script> tags holding JavaScript that has accumulated over time. An excerpt looks like this:

<head>
<script>
// snip
                            if ( something < other ) {
// snip
                            // explanatory comment: we replace " and ' as late as possible
// snip
</script>

<esi:remove>This directive is not executed</esi:remove>
</head>

When processing this kind of content, the esi crate does not execute any ESI directives (at least inside <head> in this example; directives later in <body> are picked up). I guess this is because the parser is quick_xml, which expects XML, where e.g. a < inside the script tag would have to be escaped as &lt;. What it actually gets is HTML, where the escaping rules are much more relaxed. Conversely, applying XML-style escapes in an HTML document results in JavaScript syntax errors, so that's not a solution. I think we really need an HTML-aware parser here.
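To illustrate what "HTML-aware" could mean here, below is a minimal sketch using only the Rust standard library. It is not the esi crate's code and not a full HTML tokenizer: `find_esi_tags` and `opens_script` are hypothetical names made up for this example. The only idea it shows is treating everything between <script> and </script> as raw text, so that a bare < or a stray quote in JavaScript cannot derail the scan for ESI tags. Comments, <style>, and attribute values containing < are deliberately ignored.

```rust
/// Hypothetical helper: true if `rest` begins with an opening <script> tag
/// (and not some other element whose name merely starts with "script").
fn opens_script(rest: &str) -> bool {
    rest.starts_with("<script")
        && matches!(
            rest.as_bytes().get(7),
            Some(b'>' | b'/' | b' ' | b'\t' | b'\n' | b'\r')
        )
}

/// Hypothetical helper: byte offsets of every opening `<esi:` tag that
/// appears outside of <script> elements.
fn find_esi_tags(html: &str) -> Vec<usize> {
    // ASCII lowercasing keeps byte offsets identical to the input.
    let lower = html.to_ascii_lowercase();
    let mut found = Vec::new();
    let mut pos = 0;

    while let Some(rel) = lower[pos..].find('<') {
        let start = pos + rel;
        let rest = &lower[start..];

        if opens_script(rest) {
            // Raw-text handling: nothing inside the script element is
            // markup, so jump straight to the closing tag.
            match rest.find("</script") {
                Some(end) => pos = start + end + "</script".len(),
                None => break, // unterminated script: the rest is script text
            }
        } else {
            if rest.starts_with("<esi:") {
                found.push(start);
            }
            pos = start + 1;
        }
    }
    found
}
```

Run against the excerpt above, this sketch reports exactly one tag, the <esi:remove> after the script, because the `something < other` comparison and the unbalanced quotes in the comment are skipped as script text. A real fix would more likely swap in a proper HTML tokenizer than extend this by hand.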


kailan added this to the v0.4.0 milestone Mar 13, 2023

kailan removed this from the v0.4.0 milestone Jul 24, 2024
