Replace non-custom flake8 linting with ruff #21037

huonw · 2024-06-09T03:44:01Z

This swaps all flake8 usage to ruff via the built-in Ruff backend, except for the custom PNT20 and PNT30 rules.

We still have to run flake8, and it does not run much faster (still ~10s). So, this is doing "strictly" more work than previously: run flake8 and ruff.

However, this change does open the door to easily enabling additional lints that are built into ruff, like #21018, rather than having to search out and install a flake8 plugin.

I did some basic experiments, running:

- flake8 on main: `pants --no-local-cache lint --stats-log --only=flake8 ::`
- ruff on this PR: `pants --no-local-cache lint --stats-log --only=ruff-check ::`

| timing (seconds) | flake8 | ruff |
| --- | --- | --- |
| building requirements/PEX (A) | 0.8 | 0.9 |
| `local_process_total_time_run_ms` (B) | 27.757 | 8.643 |
| time spent actually running the linter (B - A) | ~27 | 7.7 |

Running ruff outside of Pants is much faster (e.g. `ruff check .` takes like 300ms), so I'd expect a lot of the 7.7 seconds is Pants overhead, e.g. setting up sandboxes or unzipping pexes or whatever. #18570 is potentially related.

Thoughts?

sureshjoshi · 2024-06-10T01:29:24Z

If it's the same speed or faster, it's a +1 for me.

Question: What's the perf like once we get rid of flake entirely?

huonw · 2024-06-11T04:22:31Z

I did some basic experiments, running:

- flake8 on main: `pants --no-local-cache lint --stats-log --only=flake8 ::`
- ruff on this PR: `pants --no-local-cache lint --stats-log --only=ruff-check ::`

| timing (seconds) | flake8 | ruff |
| --- | --- | --- |
| building requirements/PEX (A) | 0.8 | 0.9 |
| `local_process_total_time_run_ms` (B) | 27.757 | 8.643 |
| time spent actually running the linter (B - A) | ~27 | 7.7 |

Running ruff outside of Pants is much faster (e.g. `ruff check .` takes like 300ms), so I'd expect a lot of the 7.7 seconds is Pants overhead, e.g. setting up sandboxes or unzipping pexes or whatever. #18570 is potentially related.

(Potentially there are some processes, like discovering Python interpreters, that are being included in the `local_process_total_time_run_ms` counter, but looking at the [INFO] Completed: Lint with `ruff check` ... lines shows 7 invocations that finish about a second after the [INFO] Completed: Building ruff.pex ... line, so ~7s seems plausible.)
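For anyone wanting to redo the arithmetic, here is a minimal sketch of the "B - A" estimate using the numbers from the table. The dictionary layout and function name are only illustrative, and the ratio is a rough indication from a single run, not a benchmark.

```python
# Timings in seconds, copied from the table above.
timings = {
    "flake8": {"pex_build": 0.8, "process_total": 27.757},
    "ruff": {"pex_build": 0.9, "process_total": 8.643},
}


def linter_time(row):
    """Estimate the time spent running the linter itself: B - A."""
    return row["process_total"] - row["pex_build"]


flake8_time = linter_time(timings["flake8"])
ruff_time = linter_time(timings["ruff"])

# How many times longer flake8 took than ruff in this one measurement.
ratio = flake8_time / ruff_time

print(f"flake8: {flake8_time:.1f}s, ruff: {ruff_time:.1f}s, ratio: {ratio:.1f}x")
```

This only restates the subtraction in the last table row; it says nothing about how much of the remaining ruff time is sandbox setup versus actual linting.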

sureshjoshi · 2024-06-11T12:29:22Z

That's still a decent improvement, but clearly could be a lot more. There is something to also be said about creating sandboxes and whatnot for lint/check-based operations that won't mutate the environment.

Maybe something like the option to run those in the new workspace environments 🤷🏽

That's a can of worms all on its own, but overall, to answer the original question - it seems like swapping over is a win all around! And will likely become more of a win as we optimize some more steps.

lilatomic · 2024-06-23T19:54:07Z

pyproject.toml

+  # flake8-2020
+  "YTT",
+  # flake8-implicit-str-concat, but only on a single line (includes bytes)
+  "ISC001",


I think we also need to add ISC002 to match NIC002? If it includes bytes too, I only found one instance of that (in a test), so it might be easier to fix it there.
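To illustrate the distinction being discussed, here is a small sketch of the patterns involved. The rule annotations in the comments reflect my understanding of ruff's flake8-implicit-str-concat rules (ISC001 for a single line, ISC002 for a backslash continuation) and are an assumption worth checking against the ruff rule docs; the variable names are made up.

```python
# Implicit concatenation on a single line: what ISC001 is meant to catch.
single_line = "abc" "def"

# Per the comment in the diff above, ISC001 covers bytes literals too.
bytes_single_line = b"abc" b"def"

# Implicit concatenation across a backslash continuation: the ISC002 case
# (assumed to correspond to NIC002 in flake8-no-implicit-concat).
multi_line = "abc" \
    "def"

# Implicit concatenation inside parentheses spans lines without a backslash;
# as far as I know, neither rule flags this form.
in_parens = (
    "abc"
    "def"
)

print(single_line, bytes_single_line, multi_line, in_parens)
```

All of these evaluate to the same value at runtime, which is why the lint exists: the concatenation is easy to introduce by accident, e.g. a missing comma in a list.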

lilatomic · 2024-06-24T00:24:51Z

> That's still a decent improvement, but clearly could be a lot more. There is something to also be said about creating sandboxes and whatnot for lint/check-based operations that won't mutate the environment.

One thing to remember is that some linters are just formatters we run and check if they've modified any files

sureshjoshi · 2024-06-24T00:41:33Z

> One thing to remember is that some linters are just formatters we run and check if they've modified any files

Yeah, case-by-case: take the optimizations where we can. For those formatters that don't come baked in with a "check" option, we can default to comparing sandbox code. As much as I hate splitting up code paths, if there is any non-trivial perf improvement, I generally think build tools should tend towards those options.
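As a rough sketch of the "run the formatter and compare" idea for tools without a check mode: the function names below and the trailing-whitespace formatter are hypothetical stand-ins, and this is not how Pants actually implements it.

```python
from typing import Callable, Dict, List


def lint_by_formatting(
    files: Dict[str, str], formatter: Callable[[str], str]
) -> List[str]:
    """Return the paths whose contents the formatter would change."""
    return sorted(
        path for path, content in files.items() if formatter(content) != content
    )


def strip_trailing_whitespace(source: str) -> str:
    # Hypothetical stand-in for a real formatter with no "check" option.
    return "\n".join(line.rstrip() for line in source.split("\n"))


# In-memory stand-in for the files copied into a sandbox.
files = {"a.py": "x = 1\n", "b.py": "y = 2   \n"}
would_change = lint_by_formatting(files, strip_trailing_whitespace)

# A non-empty result is treated as a lint failure.
print(would_change)
```

The comparison needs a private copy of the files to format, which is one reason this style of linter is harder to run directly in the workspace.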

huonw · 2024-06-24T01:10:42Z

I think we can get rid of a lot of the sandbox overhead for fast tools (i.e. where the overhead of building sandboxes is a high proportion of the total time) by creating fewer sandboxes/putting more files in each one. See analysis/ideas in #18570 (comment)

In particular, running things in sandboxes has correctness advantages (e.g. handling of edits that happen concurrently with a running process, #21051 (comment)), which it'd be nice to avoid losing by default, I think.
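A toy cost model for "fewer sandboxes, more files in each" might look like the following. This is my own simplification for illustration, not the equation from the linked comment: it assumes a fixed per-sandbox overhead, a fixed per-file cost, and no parallelism, and all numbers are made up.

```python
import math
from typing import List, Sequence


def estimated_total_seconds(
    n_files: int,
    files_per_sandbox: int,
    sandbox_overhead_s: float,
    per_file_s: float,
) -> float:
    """Serial estimate: one fixed overhead per sandbox plus a cost per file."""
    n_sandboxes = math.ceil(n_files / files_per_sandbox)
    return n_sandboxes * sandbox_overhead_s + n_files * per_file_s


def batch(paths: Sequence[str], files_per_sandbox: int) -> List[List[str]]:
    """Split paths into consecutive batches, one batch per sandbox."""
    return [
        list(paths[i : i + files_per_sandbox])
        for i in range(0, len(paths), files_per_sandbox)
    ]


# Hypothetical numbers: 1000 files, 1s overhead per sandbox, 1ms per file.
many_small = estimated_total_seconds(1000, 100, 1.0, 0.001)
one_large = estimated_total_seconds(1000, 1000, 1.0, 0.001)
print(many_small, one_large)
```

Under these assumptions the overhead term dominates for fast tools, so larger batches help; the trade-off is coarser caching and less parallelism, which this sketch ignores.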

sureshjoshi · 2024-06-24T01:24:40Z

From your linked comment, what is the cost of sandboxing X files on a modern computer? In my mind, if the cost to sandbox ever exceeds the cost of running the process, we'd probably want an escape hatch to run it in a workspace or skip the sandbox.

With a single format, it's not a huge deal, but a large monorepo with 20 tools across 4 languages - it really adds up.

Your equation from that linked comment I think is the right way, so long as the minimum number of sandboxes could be 0, not 1 (by default, 0 sandboxes would need to be opt-in though, as it's riskier to your concurrent edit points above).

Running fast tools without a sandbox in CI, for example, might be nice. And I can basically guarantee that I would always opt into whatever is fastest :)

huonw · 2024-06-24T01:35:09Z

Can I suggest we move the discussion of tweaking sandboxing behaviour to #18570 or somewhere else? It doesn't feel relevant to this PR.

sureshjoshi · 2024-06-24T02:01:48Z

Oh yeah, sure. I'm +1 for this change

huonw added 4 commits June 6, 2024 15:42

Enable ruff

1a1574b

Move move built-in linters from flake8 to ruff

5b2a569

Move flake8-2020, flake8-comprehensions

f330f2e

flake8-no-implicit-concat -> ruff (flake8-implicit-str-concat)

9c270a3

huonw added the category:internal CI, fixes for not-yet-released features, etc. label Jun 9, 2024

huonw marked this pull request as ready for review June 20, 2024 04:39

lilatomic approved these changes Jun 23, 2024


sureshjoshi approved these changes Jun 24, 2024
