Does not parse HTML properly #18

Open
wosc opened this issue Feb 1, 2023 · 0 comments


wosc commented Feb 1, 2023

Our production application contains quite a few inline <script> tags with accumulated JavaScript inside. An excerpt looks like this:

<head>
<script>
// snip
                            if ( something < other ) {
// snip
                            // explanatory comment: we replace " and ' as late as possible
// snip
</script>

<esi:remove>This directive is not executed</esi:remove>
</head>

When processing this kind of content, the esi crate does not execute any ESI directives, at least not inside <head> as in the example (directives later in <body> are picked up). I guess this is because it uses quick_xml as the parser. quick_xml expects XML, where e.g. a < inside the script tag would have to be escaped as &lt;, but it is getting HTML, where the escaping rules are much more relaxed. Conversely, applying XML-style escapes in an HTML document results in JavaScript syntax errors, so that's not a solution. I think we really need an HTML-aware parser here.
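
To illustrate what "HTML-aware" could mean here, below is a minimal sketch in Rust using only the standard library. It is not how the esi crate or quick_xml works, and `find_esi_tags` is a hypothetical helper invented for this example. It assumes the only HTML rule that matters is that the body of <script> is raw text up to the closing tag; it deliberately ignores comments, <style>, and attribute values.

```rust
/// Hypothetical helper (not part of the esi crate): collect the names of
/// ESI opening tags in `html`, treating the body of <script> elements as
/// raw text, the way an HTML parser would.
fn find_esi_tags(html: &str) -> Vec<String> {
    // ASCII lowercasing keeps byte offsets identical to `html`,
    // so indices found in `lower` can be used to slice `html`.
    let lower = html.to_ascii_lowercase();
    let mut found = Vec::new();
    let mut pos = 0;

    while let Some(rel) = lower[pos..].find('<') {
        let start = pos + rel;
        let rest = &lower[start..];

        if rest.starts_with("<script") {
            // Raw text: skip everything up to the closing tag, so a bare
            // `<` inside the JavaScript is never treated as markup.
            match rest.find("</script") {
                Some(end) => pos = start + end + "</script".len(),
                None => break, // unterminated script: stop scanning
            }
        } else if rest.starts_with("<esi:") {
            // Record the tag name, e.g. "esi:remove".
            let name: String = html[start + 1..]
                .chars()
                .take_while(|c| !c.is_whitespace() && *c != '>' && *c != '/')
                .collect();
            found.push(name);
            pos = start + 1;
        } else {
            // Any other tag (or closing tag): move past this `<`.
            pos = start + 1;
        }
    }
    found
}
```

Called on the excerpt above (with the snipped parts left out), this returns a single entry, esi:remove, because the < in the comparison inside the script is never interpreted as the start of a tag.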

@kailan kailan added this to the v0.4.0 milestone Mar 13, 2023
@kailan kailan removed this from the v0.4.0 milestone Jul 24, 2024