Pass through for certain HTML elements #313

Closed
jelmervdl opened this issue Jan 27, 2022 · 0 comments · Fixed by #312
Labels: enhancement (New feature or request), mod: html (Issues related to handling HTML)

jelmervdl commented Jan 27, 2022

We currently special-case <script> and <style> elements so that their contents are not translated. Should we do the same for:

  • <code>: translating code is unlikely to go well.
  • <kbd>: semantically similar to <code>.
  • <pre>: more debatable. Right now we don't handle white space around tags very consistently, so there is a good chance we'd mess up the intended preformatted text. On the other hand, skipping it might leave large parts of certain kinds of sites untranslated, such as mailing list archives.
  • <textarea>: we probably do not want to touch its contents.

This could be handled inside the extension, but other uses of bergamot-translator might also benefit from having it here. Also, implementing it in the extension would mean adding a new mechanism there for substituting whole elements.
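
To make the proposal concrete, here is a minimal sketch of the pass-through idea. It uses Python's standard `html.parser` rather than bergamot-translator's own HTML handling, and the `PASS_THROUGH` set, the `TranslatableText` class and the `translatable_text` helper are hypothetical names made up for illustration. It only shows which text would be sent to the translator and which would be left alone.

```python
from html.parser import HTMLParser

# Hypothetical set for illustration: the two elements that are already
# special-cased plus the four candidates listed in this issue.
PASS_THROUGH = {"script", "style", "code", "kbd", "pre", "textarea"}


class TranslatableText(HTMLParser):
    """Collects text nodes that sit outside every pass-through element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0    # number of pass-through elements we are inside
        self.chunks = []  # text that would be sent to the translator

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so <CODE> matches as well.
        if tag in PASS_THROUGH:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in PASS_THROUGH and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep text only when no pass-through element is open.
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())


def translatable_text(html):
    parser = TranslatableText()
    parser.feed(html)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    page = "<p>Press <kbd>Ctrl</kbd> then run <code>make all</code>.</p>"
    print(translatable_text(page))
```

The depth counter is there so that nested cases such as <pre><code>…</code></pre> stay skipped until the outermost element closes. A real implementation would also need to put the untouched elements back in the right place in the translated output, which this sketch does not attempt.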

jelmervdl added the enhancement and mod: html labels Jan 27, 2022
jelmervdl self-assigned this Jan 27, 2022
jelmervdl changed the title from "Pass through for certain html elements" to "Pass through for certain HTML elements" Jan 27, 2022
jelmervdl added a commit that referenced this issue Feb 8, 2022
jelmervdl linked a pull request Feb 15, 2022 that will close this issue
jerinphilip pushed a commit that referenced this issue Feb 22, 2022
- Prefer spreading markup over a full word.
- Ignore certain tags whose contents are unlikely to need translation,
  such as `<code>` and `<samp>`.
- Never treat `<wbr>` as a space.
- Allow for inconsistent cases in tag names.
- Fix bug where void elements were inserted multiple times.
- Better handling of whitespace around punctuation.
- Ignore parsing `<noscript>` to be compatible with Firefox.
- Improvements to documentation and readability of `HTML` and `Scanner`
  classes.

Fixes: #313, #339
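
Two of the points in the commit message above, never treating `<wbr>` as a space and allowing inconsistent case in tag names, can be illustrated with a small sketch. This is not the code from the commit: it uses Python's standard `html.parser`, and the `INLINE` set, the `PlainText` class and the `plain_text` helper are hypothetical. The assumption is that inline elements do not separate the words around them, while every other element does.

```python
from html.parser import HTMLParser

# Hypothetical set: elements assumed not to separate the words around them.
INLINE = {"a", "b", "em", "i", "span", "strong", "wbr"}


class PlainText(HTMLParser):
    """Flattens HTML to text, adding a space only at non-inline tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def _boundary(self, tag):
        # Tag names arrive lowercased, so <WBR> and <wbr> behave alike.
        if tag not in INLINE:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        self._boundary(tag)

    def handle_endtag(self, tag):
        self._boundary(tag)

    def handle_data(self, data):
        self.parts.append(data)


def plain_text(html):
    parser = PlainText()
    parser.feed(html)
    parser.close()
    # Collapse runs of white space left behind by adjacent block tags.
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    print(plain_text("<p>Donau<WBR>dampf<wbr>schiff</p><p>ahoy<br>there</p>"))
```

With this rule a word containing `<wbr>` reaches the translator as a single word, whereas `<br>` and paragraph boundaries still split words.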