Improve handling HTML special cases #312
Commits on Jan 26, 2022
-
Aggressively try to retain markup on words if it appears on one of its source tokens
I do need those continuation delimiters for that, even though I really don't like them since they're so character set focussed!
Commit 40eabc1
-
Commit 723e725
Commits on Feb 8, 2022
-
Commit 9600c70
-
Make HTML tags case insensitive
Tag case is retained in the output though. Well, for the opening tag at least. Closing tag always matches opening tag.
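As a rough illustration of the behaviour described in this commit message, the sketch below compares tag names without regard to ASCII case and builds the closing tag from the opening tag's original spelling. The helper names `equalsCaseInsensitive` and `closingTagFor` are hypothetical and are not taken from the pull request.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper: true when two tag names match, ignoring ASCII case.
bool equalsCaseInsensitive(std::string_view a, std::string_view b) {
  return std::equal(a.begin(), a.end(), b.begin(), b.end(),
                    [](unsigned char x, unsigned char y) {
                      return std::tolower(x) == std::tolower(y);
                    });
}

// Hypothetical helper: the closing tag reuses the opening tag's spelling,
// so whatever case the author wrote for the opening tag is what comes out.
std::string closingTagFor(std::string_view openingName) {
  return "</" + std::string(openingName) + ">";
}
```

With these helpers, `equalsCaseInsensitive("DIV", "div")` holds, and `closingTagFor("DiV")` yields `</DiV>` regardless of how the closing tag was spelled in the input.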
Commit 3d6673c
-
Commit 5634c40
-
Commit e516dbd
-
Commit 19acb54
Commits on Feb 9, 2022
-
Add test for regression in ignored element code path
std::bad_alloc :( Also expand tests to make sure we're recording the full ignored tag contents.
Commit 46159ba
-
Fix bad_alloc in consumeIgnoredTag
The trouble was that `Scanner::scanEntity()` returns a `value()` that does not point inside the HTML input stream (it points to a *decoded* entity instead). So we need another API, `Scanner::start()`, to figure out where a token starts in the HTML.
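The distinction between a token's value and its start can be sketched as follows. This is not the project's `Scanner` class: `Token`, `scanAmpEntity`, and `offsetInInput` are hypothetical stand-ins, assuming only what the message says, namely that a decoded entity lives outside the input buffer while the start position points into it.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical token: `value` may refer to decoded text stored elsewhere,
// while `start` always points into the original HTML buffer.
struct Token {
  std::string_view value;
  const char* start;
};

// Illustration only: pretend "&amp;" was scanned at offset `pos` in `html`.
// The decoded "&" is stored outside the input, so `value.data()` is
// unrelated to `html.data()`.
Token scanAmpEntity(std::string_view html, std::size_t pos) {
  static const std::string decoded = "&";
  return Token{decoded, html.data() + pos};
}

// Offset of a token in the input. Deriving this from `token.value.data()`
// would be meaningless for entities, which is why `start` is used instead.
std::size_t offsetInInput(std::string_view html, const Token& token) {
  return static_cast<std::size_t>(token.start - html.data());
}
```

Computing a length or offset from a pointer that is not inside the input can produce a huge bogus size, which is one plausible route to an allocation failure such as `std::bad_alloc`.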
Commit af39c75
Commits on Feb 11, 2022
-
Prevent straggler void elements from showing up twice
When a word near the end of a translated sentence aligns with one at the beginning, it pushes `prevIt` back to the beginning. Then the next translated token will insert all straggler void elements between `prevIt` and `it`. Instead of using `prevIt` to track where we were with inserting stragglers, we keep our own iterator that never moves backwards.
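The idea of a cursor that never moves backwards can be sketched with plain indices. The function `flushRanges` and its use of positions instead of iterators are assumptions made for illustration; the actual code in the pull request works on its own token structures.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// For each aligned source position, visited in target order, return the
// half-open range [from, to) of source positions whose stragglers should be
// flushed at that point. The `flushed` cursor only ever advances, so no
// position is covered by more than one range.
std::vector<std::pair<std::size_t, std::size_t>> flushRanges(
    const std::vector<std::size_t>& alignedPositions) {
  std::vector<std::pair<std::size_t, std::size_t>> ranges;
  std::size_t flushed = 0;  // high-water mark, never decreases
  for (std::size_t pos : alignedPositions) {
    if (pos > flushed) {
      ranges.emplace_back(flushed, pos);
      flushed = pos;
    }
    // An alignment that jumps back (pos <= flushed) flushes nothing.
  }
  return ranges;
}
```

For the alignment sequence 3, 0, 5 the jump back to 0 is ignored, so the ranges are [0, 3) and [3, 5) and nothing is emitted twice.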
Commit f595c51
-
Use isContinuation function to check whether we need to insert a space after a tag
Main reason for using this instead of `std::isspace` is to prevent a space being inserted between the tag and the full stop in `This is a <b>test</b>.`. Because that has been bothering me a lot.
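The definition of `isContinuation` is not shown in this message, so the following is only a guess at the shape of such a check. `startsWithContinuationDelimiter`, `needsSpaceAfterTag`, and the delimiter set are all hypothetical; the point is that a whitespace test alone would ask for a space before the full stop, while a delimiter test does not.

```cpp
#include <string_view>

// Hypothetical stand-in for the commit's isContinuation: the text continues
// what came before when it starts with one of these ASCII punctuation marks.
// The delimiter set is an assumption, not the project's list.
bool startsWithContinuationDelimiter(std::string_view text) {
  constexpr std::string_view delimiters = ".,;:!?)]}";
  return !text.empty() &&
         delimiters.find(text.front()) != std::string_view::npos;
}

// Hypothetical decision: should a space be inserted between a closing tag
// and the text that follows it?
bool needsSpaceAfterTag(std::string_view next) {
  if (next.empty() || next.front() == ' ') {
    return false;  // nothing follows, or a space is already there
  }
  return !startsWithContinuationDelimiter(next);
}
```

Under these assumptions, the `.` after `</b>` in `This is a <b>test</b>.` needs no space, whereas a following word such as `again` does.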
Commit 32f403a
Commits on Feb 14, 2022
-
Merge branch 'main' into html-improvements
# Conflicts:
#   src/translator/html.cpp
Commit afc75f0
-
Treat more elements as opaque when parsing
These are all elements that Firefox treats as opaque in its HTML5 parser. As a consequence, when you request `noscriptElement.innerHTML` you get the raw text content of the thing, as opposed to a serialized tree. So invalid HTML? Just passed on as is! Well, we're going to do the same then. Besides, if noscript then also probably no extension.
Commit 72e54f8
-
This tag is a bit difficult. No HTML is allowed inside of it (similar to `<textarea>`), but we do want to capture its text content as text (decoding entities etc.) so we can translate it. So for now I'll just trust that nobody is insane enough to use HTML inside the title tag. And if they do, we'll be as insane back and try to maintain that (very much not allowed) structure.
Commit ea244d2
Commits on Feb 16, 2022
-
Commit dda9860
-
Commit d7e1c07
-
Add more comments and less creative variable names
Hopefully this will make the overall code more readable given you're familiar with the concept it tries to implement…
Commit 203ba0a
-
Commit a1ee8e9
Commits on Feb 21, 2022
-
Commit ac83e50
-
Commit 6a7bd21
-
Commit c90d00f
-
Commit ad612e4
-
Commit f451983
-
Commit c891eda
-
Commit 54be426
-
Commit 279462c
-
Commit 346821b
Commits on Feb 22, 2022
-
Commit 8cc695b
-
Commit 48cfc00
-
Commit a81dfdf
-
Commit bbfa4e3
-
Commit ea10e91