Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cider-docstring--format breaks some formatting #3709

Closed
katomuso opened this issue Jun 9, 2024 · 16 comments · Fixed by #3712
Closed

cider-docstring--format breaks some formatting #3709

katomuso opened this issue Jun 9, 2024 · 16 comments · Fixed by #3712

Comments

@katomuso
Copy link
Contributor

katomuso commented Jun 9, 2024

Currently, the cider-docstring--format function used for formatting docstrings breaks some formatting by moving every single paragraph (indicated by . ) to a new line and removing a variable number of spaces from the beginning of each line. Let me demonstrate how it looks.

Snippet to get original docstring:

(thread-first
  (cider-var-info "clojure.core/reduce")
  (nrepl-dict-get "doc"))

Original docstring:

f should be a function of 2 arguments. If val is not supplied,
  returns the result of applying f to the first 2 items in coll, then
  applying f to that result and the 3rd item, etc. If coll contains no
  items, f must accept no arguments as well, and reduce returns the
  result of calling f with no arguments.  If coll has only 1 item, it
  is returned and f is not called.  If val is supplied, returns the
  result of applying f to val and the first item in coll, then
  applying f to that result and the 2nd item, etc. If coll contains no
  items, returns val and f is not called.

Snippet to get formatted docstring:

(thread-first
  (cider-var-info "clojure.core/reduce")
  (nrepl-dict-get "doc")
  cider-docstring--format)

Formatted docstring:

f should be a function of 2 arguments. If val is not supplied,
returns the result of applying f to the first 2 items in coll, then
applying f to that result and the 3rd item, etc. If coll contains no
items, f must accept no arguments as well, and reduce returns the
result of calling f with no arguments.

If coll has only 1 item, it
is returned and f is not called.

If val is supplied, returns the
result of applying f to val and the first item in coll, then
applying f to that result and the 2nd item, etc. If coll contains no
items, returns val and f is not called.

It seems that nobody has noticed this strange formatting because cider-docstring--format is currently used only for docstrings in the minibuffer and in completions and not many people look at the docstrings there (though it will also be beneficial to use it for docstrings in documentation buffers).

As discussed in #3701, removing two spaces from the beginning while formatting the docstring will be good enough.

@katomuso katomuso changed the title Change the way cider-docstring--format performs formatting cider-docstring--format breaks some formatting Jun 9, 2024
@katomuso
Copy link
Contributor Author

katomuso commented Jun 9, 2024

I should've mentioned that this formatting is done specifically before displaying the docstring to the user. We remove two spaces from the beginning because this is a literal docstring from the source code. Since there are two spaces at the beginning of lines in docstrings, we remove them to make the docstring visually align nicely.

@yuhan0
Copy link
Contributor

yuhan0 commented Jun 9, 2024

Agreed that this is a much too idiosyncratic and 'surprising' formatting choice to be applying across the board to all docstrings, especially if (as the inline comment suggests) it was intended for the specific case of clojure.core/reduce.

@vemv
Copy link
Member

vemv commented Jun 9, 2024

. is used beyond reduce - it's relatively frequent across clojure.core.

I find the original reduce docstring 100% unreadable - paragraphs give readers breathing space. It's part of the intended semantics of . which is a different construct from . .

(You used paragraphs in your OP as well 😄)

Anyway, the reformatting is primarily intended to be used within a specific context, namely trimming docstrings semantically up to N lines. Splitting by . allows to split in a more fine-grained manner while also not trimming in a random (non 'semantic') place.

Nobody else has complained about this since its introduction, so please let's default to doing nothing until real problems pop up, if any.

Thanks!

@vemv vemv closed this as completed Jun 9, 2024
@bbatsov
Copy link
Member

bbatsov commented Jun 9, 2024

@vemv I think here the question is whether those spaces are intentional or accidental, though. And to me they definitely seem accidental, so treating them specially seems like a mistake on our part.

@bbatsov bbatsov reopened this Jun 9, 2024
@vemv
Copy link
Member

vemv commented Jun 9, 2024

I'm pretty sure they're intentional, although sometimes inconsistent.

'Two spaces after a period' is definitely a thing in English.

I thought of asking over https://ask.clojure.org/ if they could use . more frequently so that tools have the chance to detect paragraphs in clojure.core.

(Especially given that larger docstring changes there seem highly unlikely)

@katomuso
Copy link
Contributor Author

katomuso commented Jun 9, 2024

@vemv It's okay if you want to leave splitting paragraphs be, but currently it is done incorrectly (note how the "If coll has only 1 item, it is returned and f is not called." paragraph is broken into two lines). And it's also worthwhile to replace not all spaces from the beginning, but exactly two because there can be some additional indentation, and it breaks it.

@bbatsov
Copy link
Member

bbatsov commented Jun 9, 2024

'Two spaces after a period' is definitely a thing in English.

But it's super controversial and rather uncommon these days. If I were to wager a guess, it'd be that some of them are to do with early usage of Emacs for Clojure development (by Rich), as Emacs was the last tool that was pushing for 2 spaces by default. Semantically there's no difference between single space and double space between sentences.

If someone can actually have a real paragraph in the docstring why rely on something obscure instead? I'm reasonably sure those are just accidental.

@katomuso
Copy link
Contributor Author

katomuso commented Jun 9, 2024

I agree with @vemv that two spaces after a paragraph's dot are intentional and a common idiom in comments, but I think this is too opinionated to make every single paragraph on a separate line.

@bbatsov
Copy link
Member

bbatsov commented Jun 9, 2024

I agree with @vemv that two spaces after a paragraph's dot are intentional and a common idiom in comments, but I think this is too opinionated to make every single paragraph on a separate line.

I'm a bit puzzled by this - if it's just our assumption those mark paragraphs, how do we know it's a common idiom to denote them? I've spent quite a bit of time exploring the elements of style in both English and programming and that's not something I've ever come across.

@katomuso
Copy link
Contributor Author

katomuso commented Jun 9, 2024

Well, I mean it's common in the Clojure standard library, and periodically I find them in Emacs as well. But yeah, outside of the standard library, they are very rare, so we should not rely on that assumption.

@bbatsov
Copy link
Member

bbatsov commented Jun 9, 2024

But in Emacs by default it's 2 spaces all the time (Emacs even has docstring lint checks) and paragraphs are just paragraphs, that's why I don't understand why you think they have some special meaning. Perhaps some examples will help me to understand what you mean?

@yuhan0
Copy link
Contributor

yuhan0 commented Jun 9, 2024

Might also be worth pointing out that periods can be used for other purposes like ASCII diagrams, which would be completely broken by such a formatting choice.

A somewhat contrived example:

(defn translate-morse
  "Convert the dots-and-dashes representation of Morse code into a string:
  .-   => A   .    => E    (etc.)
  -... => B   .--. => F
  -.-. => C   --.  => G
  -..  => D   .... => H"
  [input]
  ,,,)

More broadly it seems like an overreach for CIDER to make any assumptions on behalf of the author of the docstring and their intended layout, beyond the bare minimum (e.g. trimming the 2-spaces before each line, converting backticked symbols into links etc.)

@katomuso
Copy link
Contributor Author

katomuso commented Jun 9, 2024

Can we conclude that for now it will be okay to just trim 2 spaces before each line for docstrings which are intended to be displayed?

@yuhan0
Copy link
Contributor

yuhan0 commented Jun 11, 2024

I think this issue should remain open, the part about ". " formatting hasn't been resolved.

For clarity's sake, here's the current implementation applied to the above translate-morse docstring:

(with-current-buffer "repro.clj"
  (thread-first
    (cider-var-info "repro/translate-morse")
    (nrepl-dict-get "doc")
    cider-docstring--format
    princ))

=>

Convert the dots-and-dashes representation of Morse code into a string:
.-   => A   .

=> E    (etc.)
-... => B   .--. => F
-.-. => C   --.

=> G
-..

=> D   .... => H

@katomuso
Copy link
Contributor Author

katomuso commented Jun 11, 2024

@yuhan0 Hmm, are you sure you are using the latest version of cider-docstring--format? I guess that's the case because current cider-docstring--format is as simple as (replace-regexp-in-string "\n " "\n" string).

@yuhan0
Copy link
Contributor

yuhan0 commented Jun 11, 2024

Oops you're right, had things mixed up with git worktrees and stale elc's. All resolved on the latest version :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants