Add option to Document.source to read files without universalized line endings. #224

Open
karthiknadig opened this issue Feb 10, 2022 · 6 comments

Comments

@karthiknadig
Contributor

Currently, when pygls reads a text document's source from disk, the line endings are universalized (CRLF and CR are translated to LF). This causes problems when computing diffs and when applying changes. Universalized line endings might not be the right thing when reading source files.

pygls/pygls/workspace.py

Lines 273 to 277 in bbf671f

def source(self) -> str:
    if self._source is None:
        with io.open(self.path, 'r', encoding='utf-8') as f:
            return f.read()
    return self._source

Here is an example of the problem: handling a refactoring request ends up adding too many line endings (pappasam/jedi-language-server#159). We had a similar problem when implementing formatting. This becomes an issue in any scenario where text edits have to be handled.
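As a rough illustration of the mismatch (a minimal sketch, not code from pygls or jedi-language-server, and not necessarily the exact mechanism behind the linked issue), the following writes CRLF text to a temporary file, reads it back with the same open() arguments as the snippet above, and compares it line by line with the CRLF text an editor would hold. The sample text and variable names are made up for the example.

```python
import difflib
import io
import os
import tempfile

# Text as an editor might hold it: CRLF line endings preserved.
client_text = "import json\r\nimport sys\r\n"

# Put the same bytes on disk, then read them back the way source() does.
fd, path = tempfile.mkstemp(suffix=".py")
try:
    with os.fdopen(fd, "wb") as f:
        f.write(client_text.encode("utf-8"))
    with io.open(path, "r", encoding="utf-8") as f:
        disk_text = f.read()  # universal newlines: "\r\n" becomes "\n"
finally:
    os.remove(path)

# Compare the two versions line by line, keeping the line endings.
matcher = difflib.SequenceMatcher(
    a=client_text.splitlines(keepends=True),
    b=disk_text.splitlines(keepends=True),
    autojunk=False,
)
unchanged = [op for op in matcher.get_opcodes() if op[0] == "equal"]

print(repr(disk_text))                    # 'import json\nimport sys\n'
print(len(client_text) - len(disk_text))  # 2
print(unchanged)                          # []
```

No line compares as equal, and character offsets drift by one per line, so any edit computed against one version does not line up with the other.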

In the message where pygls gets the text from the Language client, I see that the IDE preserves the line endings. See this trace for textDocument/didOpen, where the server receives the full text with the line endings as is.

[Trace - 11:14:48 PM] Sending notification 'textDocument/didOpen'.
Params: {
    "textDocument": {
        "uri": "file:///c%3A/GIT/s%20p/vscode-python/pythonFiles/interpreterInfo.py",
        "languageId": "python",
        "version": 1,
        "text": "# Copyright (c) Microsoft Corporation. All rights reserved.\r\n# Licensed under the MIT License.\r\n\r\nimport json\r\nimport sys\r\n\r\nobj = {}\r\nobj[\"versionInfo\"] = tuple(sys.version_info)\r\nobj[\"sysPrefix\"] = sys.prefix\r\nobj[\"sysVersion\"] = sys.version\r\nobj[\"is64Bit\"] = sys.maxsize > 2**32\r\n\r\nprint(json.dumps(obj))\r\n"
    }
}
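The trace can be checked with a small sketch: decoding a JSON payload of this shape leaves the CRLF sequences in the text untouched. The payload below is an abbreviated stand-in for the params above, and its URI is a placeholder, not the real path.

```python
import json

# Abbreviated stand-in for the didOpen params; the URI is a placeholder.
raw_params = r'''
{
    "textDocument": {
        "uri": "file:///example.py",
        "languageId": "python",
        "version": 1,
        "text": "import json\r\nimport sys\r\n"
    }
}
'''

text = json.loads(raw_params)["textDocument"]["text"]
crlf_count = text.count("\r\n")
bare_lf_count = text.count("\n") - crlf_count

print(crlf_count, bare_lf_count)  # 2 0
```

So the text received over the protocol keeps CRLF, while the same file read from disk in default text mode would not.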

It would be helpful if we could control the line endings for source files.

@tombh
Collaborator

tombh commented Dec 3, 2022

I think I'm seeing this issue too. But I don't completely understand the issue. Are you saying that forcing of UTF-8 in io.open(self.path, 'r', encoding='utf-8') is causing problems when the file itself is not UTF-8?

@zbyna

zbyna commented Dec 3, 2022

I may not understand this issue, but in my opinion encoding does not play a role here. I encountered this issue some time ago, and I had to change the line endings in the source file to finish the operation without extra lines being added, then change the line endings back. This was while editing a Python source file in VS Code.

@tombh
Collaborator

tombh commented Dec 5, 2022

You had to change line endings in the source file? How do you mean? You, as an LSP user, had to change line endings in your editor, VSCode or Vim or something? That obviously is not the solution we're looking for here, because we want to be able to support all forms of line endings.

@zbyna

zbyna commented Dec 5, 2022

OK. We are in the same boat. Having to change line endings for LSP to function properly is not a solution. To avoid confusion, here is a video of how LSP behaves with different line endings (UTF-8 encoded source file):

Code_0LCvZB7WmU.mp4

@zbyna

zbyna commented Dec 6, 2022

From "Will open() function change line-endings in files?" on python.org:

Assuming that you have CRLF endings in your file, the logic is:

    When reading, you will get a string with just LF.
    When you write that string back into the file, LF will be converted to CRLF again.
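A minimal sketch of that round trip follows. One caveat: with the default newline=None, open() translates LF to os.linesep on write, so the conversion back to CRLF happens by default only on platforms where os.linesep is CRLF (Windows). To behave the same everywhere, the sketch passes newline="\r\n" explicitly when writing. The roundtrip helper is hypothetical and exists only for this example.

```python
import os
import tempfile


def roundtrip(raw: bytes, write_newline: str):
    """Read raw bytes in default text mode, write the text back, return both."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(raw)
        # Reading with the default newline=None translates CRLF to LF.
        with open(path, "r", encoding="utf-8") as f:
            text = f.read()
        # Writing with an explicit newline translates each LF to that string.
        with open(path, "w", encoding="utf-8", newline=write_newline) as f:
            f.write(text)
        with open(path, "rb") as f:
            return text, f.read()
    finally:
        os.remove(path)


text, on_disk = roundtrip(b"a\r\nb\r\n", "\r\n")
print(repr(text))  # 'a\nb\n'
print(on_disk)     # b'a\r\nb\r\n'
```

The bytes on disk survive the round trip, but the in-memory string never contains CRLF, which is the version a server would compute edits against.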

@tombh
Collaborator

tombh commented Dec 6, 2022

Fantastic, I see exactly what you mean now. Thank you.

Also, I just read the Jedi server issue and there @karthiknadig suggested adding newline='' to Pygls' open() call. So, I think it would become this:

def source(self) -> str: 
    if self._source is None: 
        with io.open(self.path, 'r', encoding='utf-8', newline='') as f: 
            return f.read() 
    return self._source 

Could this be a potential fix?
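One way the option requested in the issue title might look is sketched below. This is an assumption for illustration only: read_source and its preserve_line_endings parameter are hypothetical names, not part of the pygls API. The sketch keeps the current behaviour as the default and passes newline='' only when the caller opts in.

```python
import io
import os
import tempfile
from typing import Optional


def read_source(path: str, preserve_line_endings: bool = False) -> str:
    # Hypothetical helper, not part of pygls.
    # newline=None enables universal newlines (CRLF and CR become LF);
    # newline="" returns line endings untranslated.
    newline: Optional[str] = "" if preserve_line_endings else None
    with io.open(path, "r", encoding="utf-8", newline=newline) as f:
        return f.read()


# Try both modes on a temporary file with CRLF line endings.
fd, path = tempfile.mkstemp(suffix=".py")
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"x = 1\r\ny = 2\r\n")
    default_text = read_source(path)
    preserved_text = read_source(path, preserve_line_endings=True)
finally:
    os.remove(path)

print(repr(default_text))    # 'x = 1\ny = 2\n'
print(repr(preserved_text))  # 'x = 1\r\ny = 2\r\n'
```

Making the behaviour opt-in would avoid changing results for servers that already rely on LF-only text; whether the default should flip is a separate design question for the maintainers.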
