-
-
Notifications
You must be signed in to change notification settings - Fork 628
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace non-custom flake8 linting with ruff #21037
base: main
Are you sure you want to change the base?
Conversation
If it's the same speed or faster, it's a +1 for me. Question: What's the perf like once we get rid of flake entirely? |
I did some basic experiments running
Running ruff outside of pants runs much faster (e.g. (Potentially there's some processes like discovering Python interpreters that are being included in the |
That's still a decent improvement, but clearly could be a lot more. There is something to also be said about creating sandboxes and whatnot for lint/check-based operations that won't mutate the environment. Maybe something like the option to run those in the new workspace environments 🤷🏽 That's a can of worms all on its own, but overall, to answer the original question - it seems like swapping over is a win all around! And will likely become more of a win as we optimize some more steps. |
# flake8-2020 | ||
"YTT", | ||
# flake8-implicit-str-concat, but only on a single line (includes bytes) | ||
"ISC001", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to add ISC002 to match NIC002 ? If it includes bytes too, I only found one instance of that (in a test), so it might be easier to fix it there
One thing to remember is that some linters are just formatter we run and check if they've modified any files |
Yeah, case-by-case - take the optimizations where we can, for those formatters that don't come baked with a "check" option, we can default to comparing sandbox code. As much as I hate splitting up code paths, if there is any non-trivial perf improvement, I generally think build tools should tend towards those options. |
I think we can get rid of a lot of the sandbox overhead for fast tools (i.e. where the overhead of building sandboxes is a high proportion of the total time) by creating fewer sandboxes/putting more files in each one. See analysis/ideas in #18570 (comment) In particular, running things in sandboxes has correctness advantages (e.g. handling of edits that happen concurrently with a running processes #21051 (comment)), that'd be nice to avoid losing by default, I think. |
From your linked comment, what is the cost of sandboxing X files on a modern computer? In my mind, if the cost to sandbox ever exceeds the cost of running the process, we'd probably want an escape hatch to run it in a workspace or skip the sandbox. With a single Your equation from that linked comment I think is the right way, so long as the minimum number of sandboxes could be 0, not 1 (by default, 0 sandboxes would need to be opt-in though, as it's riskier to your concurrent edit points above). Running fast tools without a sandbox in CI, for example, might be nice. And I can basically guarantee that I would always opt into whatever is fastest :) |
Can I suggest we move the discussion of tweaking sandboxing behaviour to #18570 or somewhere else, doesn't feel relevant to this PR? |
Oh yeah, sure. I'm +1 for this change |
This swaps all flake8 usage to ruff via the built-in Ruff backend, except for the custom PNT20 and PNT30 custom rules.
We still have to run flake8, and it does not run much faster (still ~10s). So, this is doing "strictly" more work than previously: run flake8 and ruff.
However this change does open the door for easily enabling some additional lints that are built-in to ruff, like #21018, rather than requiring searching out and installing a flake8 plugin.
I did some basic experiments running
main
:pants --no-local-cache lint --stats-log --only=flake8 ::
pants --no-local-cache lint --stats-log --only=ruff-check ::
local_process_total_time_run_ms
(B)Running ruff outside of pants runs much faster (e.g.
ruff check .
takes like 300ms, so I'd expect a lot of the 7.7 seconds is Pants overhead, e.g. setting up sandboxes or unzipping pexes or whatever. #18570 is potentially related.Thoughts?