You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Expected behavior: The escaped UTF-16 surrogate pair should be combined into a single unicode codepoint. From the JSON RFC:
To escape an extended character that is not in the Basic Multilingual Plane, the character is represented as a twelve-character sequence, encoding the UTF-16 surrogate pair. So, for example, a string containing only the G clef character (U+1D11E) may be represented as "\uD834\uDD1E".
Actual behavior: The characters become null.
Versions
4.12.0-rc1
Additional Information
[Additional information, configuration or data that might be necessary to reproduce the issue]
Prerequisites
Please put an X between the brackets as you perform the following steps:
https://github.com/leanprover/lean4/issues
Avoid dependencies to Mathlib or Batteries.
https://live.lean-lang.org/#project=lean-nightly
(You can also use the settings there to switch to “Lean nightly”)
Description
The Json parser does not correctly handle surrogate pairs
Context
ASCII-encoded JSON (the default) serialized with Python is sometimes corrupted when loaded in Lean.
Steps to Reproduce
Expected behavior: The escaped UTF-16 surrogate pair should be combined into a single unicode codepoint. From the JSON RFC:
Actual behavior: The characters become null.
Versions
4.12.0-rc1
Additional Information
[Additional information, configuration or data that might be necessary to reproduce the issue]
Impact
Add 👍 to issues you consider important. If others are impacted by this issue, please ask them to add 👍 to it.
The text was updated successfully, but these errors were encountered: