Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: fix pylint message consider-using-any-or-all #881

Open
wants to merge 1 commit into
base: staging
Choose a base branch
from

Conversation

jenstroeger
Copy link
Contributor

@jenstroeger jenstroeger commented Sep 30, 2024

This change is part of issue #876 and addresses the consider-using-any-or-all check messages.

I’ll add the appropriate configuration to pyproject.toml in a separate PR later.

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Sep 30, 2024
@tromai
Copy link
Member

tromai commented Sep 30, 2024

Thanks for the PR.
I wonder if this checker automatically updates the code if we are using the old style of plain for loop and return? It could be a nuisance for developers that prefer using for-loop over any and all (like me) during the development process for the pre-commit hooks to pass.
In my opinion, if there isn't any strong benefit (performance wise), it's better if we leave the decision to the developer on a case-by-case basis as both styles are completely valid and equal.
However, it's my personal preference, if the other team members decide it's better to go with any and all, that would be fine by me.

@jenstroeger
Copy link
Contributor Author

I wonder if this checker automatically updates the code if we are using the old style of plain for loop and return?

It does not: pylint is a checker that doesn’t rewrite code.

It could be a nuisance for developers that prefer using for-loop over any and all (like me) during the development process for the pre-commit hooks to pass.

I’m curious: why would you? I think a loop is more code to read and more to write, so the cognitive load is higher. Once you get used to these built-in functions, the intention is clearer and they read more comfortably. For me, reading something like all( <some iterable> ) is easier than a for loop, with the intention of an “all-inclusive” check plainly visible.

In my opinion, if there isn't any strong benefit (performance wise), it's better if we leave the decision to the developer on a case-by-case basis as both styles are completely valid and equal.

I mildly disagree here: it’s the same argument for why folks use black to enforce a common coding style. Reading code that sometimes uses loops and sometimes uses built-ins becomes tedious to read. So if you prefer loops then the checker should be turned off completely. But that raises the question: why? It’s Python, so code should be “pythonic” and code should be written to be efficient.

However, there isn’t always a performance benefit as you can read e.g. in this older discussion: performance depends on the complexity of the check and the interpreter you run. In the case of Macaron, it seems like the for loop is slightly faster for CPython 3.11. I expect both Python 3.12 and 3.13 to improve here.

However, it's my personal preference, if the other team members decide it's better to go with any and all, that would be fine by me.

I think it’s a good opportunity to expand “scope” from imperative C-style programming towards a more functional and abstract style of programming.

Feel free to leave the PR open if you folks want to discuss this particular change more.

@behnazh-w
Copy link
Member

For me, reading something like all( <some iterable> ) is easier than a for loop, with the intention of an “all-inclusive” check plainly visible.

I also find that using all(<some iterable>) is often easier to read than a for loop. It took me some time to transition from imperative programming styles, but Python’s design prioritizes readability, making it feel almost like natural language. The use of all(<some iterable>) is a good example of this.

In the case of Macaron, it seems like the for loop is slightly faster for CPython 3.11. I expect both Python 3.12 and 3.13 to improve here.

That’s interesting! As we prepare to switch to Python 3.13 soon, we can also expect some performance benefits.

@tromai
Copy link
Member

tromai commented Sep 30, 2024

I’m curious: why would you?

It's just simply a personal preference. Just like how you believe in the benefits of using any and all compared to the for-loop style (e.g less code written).
For me, it's easier to read a for-loop because I prefer explicit branching logic. Meanwhile, whenever I encounter something like all(<iterable>), I need to think about all different cases of <iterable> to make sure that the all function behaves as I expected.
For example, if iterable is supposed to yield a None value or a list (non-empty), all(<iterable>) would behave the same in the case iterable yields an empty list (which can be a bug) and I will miss this if I didn't think about it.

# Tested on Python3.11
>>> all([None, "boo"]) == all(["", None])
True

Now if I did a for-loop style, the condition could be listed more explicitly that we accept empty list, but not "None" value.

for boo in some_iterable:
    # Accept empty list. 
    if is None:
        return False

return True

With that being say, I completely understand that each of us has different personal preference and I totally respect that without disregarding the opposite argument.
I don't hate any and all, even find it useful in some cases, but I prefer to decide for myself what would be the best choice on a case-by-case basis. That is the reason why I suggest we shouldn't limit the developer given that:

  • Both styles are completely valid in the Python language
  • There isn't a noticeable performance difference between them (I really appreciate the details you have given).

I mildly disagree here: it’s the same argument for why folks use black to enforce a common coding style. Reading code that sometimes uses loops and sometimes uses built-ins becomes tedious to read.

I totally agree that given a group project, consistency is key. And if we (as a team) agree it's better to pursue the usage of any and all instead of for-loop, then I am more than happy to follow.

@jenstroeger
Copy link
Contributor Author

jenstroeger commented Oct 1, 2024

Ok, there’s a lot to unpack here.

Just like how you believe in the benefits of using any and all compared to the for-loop style (e.g less code written).

I avoid terms like “believe” because, quite frankly, opinions don’t matter much in a data-driven conversation. In this case, the lines of code can be quantified easily, and so can the runtime. Thus, we don’t need to “believe” that one is “better” than the other, we can objectively weigh data and the benefits of one over the other.

For me, it's easier to read a for-loop because I prefer explicit branching logic.

I believe (pun intended) that this will change as you read more Python code that makes more use of Python’s built-in functions.

Meanwhile, whenever I encounter something like all(<iterable>), I need to think about all different cases of <iterable> to make sure that the all function behaves as I expected.

You have to make that exact same cognitive effort when writing a for or while loop — you should know what the iterable iterates over. Whether your code consumes an iterable using for or while or all() or list() or… any other Python construct that consumes an iterable object.

I don’t understand the argument or the example that follows, so let me pick this apart.

For example, if iterable is supposed to yield a None value or a list (non-empty),

An iterable produces values until it is exhausted (in which case it raises StopIteration). An iterable can not produce None or a list.

all(<iterable>) would behave the same in the case iterable yields an empty list (which can be a bug) and I will miss this if I didn't think about it.

Please rephrase more precisely, I don’t understand this statement. In this context, please also consider how Python implements truth value testing and the difference between “truthy” and “falsy” values — in your example, both None and [] are falsy.

# Tested on Python3.11
>>> all([None, "boo"]) == all(["", None])
True

I don’t know what this example demonstrates: both all() return False and two False values always compare equal. The values in the list in both cases are sometimes truthy (i.e. "boo") and mostly falsy (i.e. None or "") which is why all() returns False.

Now if I did a for-loop style, the condition could be listed more explicitly that we accept empty list, but not "None" value.

for boo in some_iterable:
    # Accept empty list. 
    if is None:
        return False

return True

Here too, I don’t understand your statement. If the list was empty then neither all() or for would iterate (because the iterable is empty). The comment # Accept empty list. is misplaced, it seems to indicate that the body of the for loop executes for an empty list which it does not:

>>> for i in []:
...     print(i)
... 
>>> 

With that being say, I completely understand that each of us has different personal preference and I totally respect that without disregarding the opposite argument. I don't hate any and all, even find it useful in some cases, but I prefer to decide for myself what would be the best choice on a case-by-case basis.

As I mentioned previously, this is not so much about “I like this, you like that” but it’s about consistency and performance and writing “pythonic” code. If Macaron folks decide to enable this code checker then it makes no sense to enable it here sometimes and disable it there sometimes, just because this developer has one preference and that developer another. We do not disable black randomly, and therefore should not disable these checkers randomly. The beauty of an opinionated toolchain is that developers’ opinions are moot, and code is consistent and thus easier to read and maintain.

That is the reason why I suggest we shouldn't limit the developer given that:

  • Both styles are completely valid in the Python language

Horribly wrong code is also valid Python code, but that doesn’t mean I want to write that.

  • There isn't a noticeable performance difference between them (I really appreciate the details you have given).

At this point and with CPython and the miniture-use cases, no, there is not. But I think this conversation is also about creating the habit of questioning code instead of just blindly sticking with a habit. Python evolves, it adds features, it improves on existing features — we should be able to adapt to these changes instead of sticking with the old.

Cheers.

@tromai
Copy link
Member

tromai commented Oct 1, 2024

Thanks for the detailed response. I am happy to approve this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants