Unicode leads to incorrect indentation #2945
Comments
Wow, this one is quite the sight 🙈.
What I imagine is happening here is that we're using the I checked to see how we do this in Darklang and found this:
That is an interesting pointer, thanks! I'll try to follow up on this.
Hmm, the range of
SynExpr.Const(constant = SynConst.String(text = "ä͖̭̈̇", synStringKind = SynStringKind.Regular, range = R("(1,2--1,10)")), range = R("(1,2--1,10)"))
That is definitely part of the problem and would need a fix on the compiler side. Most likely
EDIT: Maybe not. Not sure what to make of that.
I guess the range is using the number of bytes, and I'm guessing this is an 8-byte Unicode character. Fantomas clearly needs to use the Unicode length, but I don't know about the compiler. Is the compiler's range field supposed to be the length in bytes or the length on screen? If it's meant for error reporting, it might be the length on screen (in which case it's incorrect here). Could Fantomas use the text to get the EGC (extended grapheme cluster) length instead of using the range? I'm guessing that would fix it (though it might miss other cases, e.g. match patterns).
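To make the different notions of "length" in this comment concrete, here is a small sketch in Python (not F#, and not Fantomas code). It assumes an illustrative string made of "a" followed by five combining marks; the exact code points in the issue's string may differ. The grapheme count is only an approximation that skips combining marks; real segmentation follows the Unicode text segmentation rules, which .NET exposes through `System.Globalization.StringInfo`.

```python
import unicodedata

# Illustrative string: "a" followed by five combining marks. This is an
# assumption; the exact code points in the issue's string may differ.
SAMPLE = "a\u0308\u0356\u032d\u0308\u0307"


def measure(text: str) -> dict:
    """Return several notions of 'length' for the same string."""
    return {
        # Number of Unicode code points.
        "code_points": len(text),
        # Number of UTF-16 code units (what a .NET string length counts).
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # Number of bytes when encoded as UTF-8.
        "utf8_bytes": len(text.encode("utf-8")),
        # Rough approximation of grapheme clusters: count characters that
        # are not combining marks. Not a full UAX #29 implementation.
        "approx_graphemes": sum(
            1 for ch in text if unicodedata.combining(ch) == 0
        ),
    }


if __name__ == "__main__":
    for name, value in measure(SAMPLE).items():
        print(f"{name}: {value}")
```

If the string in the issue is indeed six UTF-16 code units, adding the two quote characters gives a width of 8, which would match the reported range (1,2--1,10). That would suggest the range counts UTF-16 code units rather than bytes or on-screen columns, but this is an inference from the sample above, not something confirmed in the thread.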
I believe that will be the length on the screen.
Possibly. Fantomas doesn't use the string value that is stored in the AST because it is an optimized representation of the value. There are multiple scenarios where this is beneficial. When we grab the string from the source text, we probably need to do something more clever when Unicode is involved, in https://github.com/fsprojects/fantomas/blob/e671f3d7c68a258d80f6440ea82aaada2c48a34d/src/Fantomas.Core/ISourceTextExtensions.fs
We can detect the difference between EGC and range in fantomas/src/Fantomas.Core/ASTTransformer.fs (lines 87 to 94 in 0ce91b7).
I'm just not really sure how to extract the right thing from the
Issue created from fantomas-online
Code
Result
Problem description
This is formatted weiiiiiird! Though the code is valid still.
FYI, we have lots of F# Unicode edge cases in the Darklang codebase, especially in string.dark
Extra information
Options
Fantomas main branch at 1/1/1990