Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C.62 etc: Mention copy-and-swap. Eliminate misinformation about self-move-assignment #1606

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
80 changes: 44 additions & 36 deletions CppCoreGuidelines.md
Original file line number Diff line number Diff line change
Expand Up @@ -5942,7 +5942,8 @@ The standard-library containers handle self-assignment elegantly and efficiently

##### Note

The default assignment generated from members that handle self-assignment correctly handles self-assignment.
If your members handle self-assignment correctly, then the default assignment generated
for your class will handle self-assignment correctly.

struct Bar {
vector<pair<int, int>> v;
Expand All @@ -5968,26 +5969,30 @@ You can handle self-assignment by explicitly testing for self-assignment, but of

Foo& Foo::operator=(const Foo& a) // OK, but there is a cost
{
if (this == &a) return *this;
s = a.s;
i = a.i;
if (this != &a) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

contradicts NR.2 "single-return rule makes it harder to concentrate error checking at the top of a function"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt that NR.2 was meant to imply "Every function must have at least two return points." Using if for a block of code that you want to be executed conditionally is actually idiomatic C++.

Also, this is a "bad code" example anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have an opinion on whether this is better or worse but I don't think "this is a bad code example" is a good justification. a "bad code" example should be bad in precisely one way so that it illustrates the point it is intended to illustrate without any distractions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also contradicts F.56
"Shallow nesting of conditions makes the code easier to follow. It also makes the intent clearer. Strive to place the essential code at outermost scope, unless this obscures intent."

"Use a guard-clause to take care of exceptional cases and return early."

s = a.s;
i = a.i;
}
return *this;
}

This is obviously safe and apparently efficient.
However, what if we do one self-assignment per million assignments?
That's about a million redundant tests (but since the answer is essentially always the same, the computer's branch predictor will guess right essentially every time).
Consider:
The code we are skipping with the test would actually work fine even for self-assignment;
so the extra test is not buying us any correctness, only speed. And whatever speed we gain
in the rare case of self-assignment may well be lost again, in that same case, due to
branch misprediction.

Furthermore, while the code above is technically safe, careless maintenance
(such as swapping the order of members `s` and `i` or changing the type of `i`) may cause it
to lose the strong exception guarantee. To preserve correctness, we would do better
to use the "copy and swap" idiom:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're suggesting to replace the almost invisible costs of a single comparison with an entire copy-and-swap?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace the almost invisible costs of the entire function with the almost invisible costs of a copy-and-swap.

It's true that string::operator= can be faster than string::string(const string&), because the latter must do a heap allocation and a memcpy into the resulting buffer, whereas the former can skip the heap allocation if its buffer is already big enough (but still pays for the memcpy).

QuickBench can be massaged to show various results. The smaller the string is, the better simple-assignment performs. Copy-and-swap beats simple-assignment on libc++; I don't really see why. http://quick-bench.com/YiLlbTxctQKuvRINZVkg1h4MKWw

As far as the C++ Core Guidelines are concerned, though, we're not talking about performance micro-optimizations; we're talking about guidelines that help programmers avoid big correctness issues like

Foo& Foo::operator=(const Foo& a) {
    i = a.i;
    s = a.s;    // ERROR: This code is not exception-safe!
    return *this;
}

The old text went so far as to claim that there was "no known general way of avoiding an if (this == &a) return *this; test for a move assignment and still get a correct answer" — as if copy-and-swap hadn't been known for decades!


Foo& Foo::operator=(const Foo& a) // simpler, and probably much better
Foo& Foo::operator=(const Foo& a) // OK, simplest and best
{
s = a.s;
i = a.i;
Foo copy(a);
this->swap(copy);
return *this;
}

`std::string` is safe for self-assignment and so are `int`. All the cost is carried by the (rare) case of self-assignment.

##### Enforcement

(Simple) Assignment operators should not contain the pattern `if (this == &a) return *this;` ???
Expand Down Expand Up @@ -6048,12 +6053,12 @@ After `y = std::move(x)` the value of `y` should be the value `x` had and `x` sh

##### Note

Ideally, that moved-from should be the default value of the type.
Ensure that unless there is an exceptionally good reason not to.
Ideally, that moved-from state should be the same as the default-constructed state.
Ensure that, unless there is an exceptionally good reason not to.
However, not all types have a default value and for some types establishing the default value can be expensive.
The standard requires only that the moved-from object can be destroyed.
Often, we can easily and cheaply do better: The standard library assumes that it is possible to assign to a moved-from object.
Always leave the moved-from object in some (necessarily specified) valid state.
Always leave the moved-from object in some (not necessarily specified) valid state.

##### Note

Expand All @@ -6067,7 +6072,7 @@ Unless there is an exceptionally strong reason not to, make `x = std::move(y); y

##### Reason

If `x = x` changes the value of `x`, people will be surprised and bad errors may occur. However, people don't usually directly write a self-assignment that turn into a move, but it can occur. However, `std::swap` is implemented using move operations so if you accidentally do `swap(a, b)` where `a` and `b` refer to the same object, failing to handle self-move could be a serious and subtle error.
If `x = std::move(x)` changes the value of `x`, people will be surprised and bad errors may occur. People don't usually directly write a self-assignment that turn into a move, but it can occur. `std::swap` is implemented using move operations so if you accidentally do `swap(a, b)` where `a` and `b` refer to the same object, failing to handle self-move could be a serious and subtle error.

##### Example

Expand All @@ -6081,42 +6086,45 @@ If `x = x` changes the value of `x`, people will be surprised and bad errors may

Foo& Foo::operator=(Foo&& a) noexcept // OK, but there is a cost
{
if (this == &a) return *this; // this line is redundant
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also NR.2

s = std::move(a.s);
i = a.i;
if (this != &a) {
s = std::move(a.s);
i = a.i;
}
return *this;
}

The one-in-a-million argument against `if (this == &a) return *this;` tests from the discussion of [self-assignment](#Rc-copy-self) is even more relevant for self-move.

##### Note

There is no known general way of avoiding an `if (this == &a) return *this;` test for a move assignment and still get a correct answer (i.e., after `x = x` the value of `x` is unchanged).
Here, the test `(this != &a)` is not redundant, because the standard library does not guarantee the result
Copy link
Member

@cubbimew cubbimew Apr 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

library guarantees the result is valid-but-unspecified (as noted in the next paragraph you're deleting), even though it wasn't specified well in the past, see http://wg21.link/lwg2468 and http://wg21.link/lwg2839 This makes the test redundant in this example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice, both libc++ and libstdc++ will set s to the empty string. So, if you want foo = std::move(foo) to be a no-op, you must either write special-case code to handle the self-assignment case, or use the copy-and-swap idiom.

https://godbolt.org/z/Bsst2o

Perhaps the Guidelines should go further, and explain that if you follow the Rule of Zero (as in the godbolt above), you will not be following rule C.65, because x = std::move(x) will not be a no-op. The defaulted move assignment operator will not use copy-and-swap; it will use std::string's move-assignment operator, which is not a no-op on self-move.

The test in this example is absolutely not redundant. https://godbolt.org/z/NjL5VR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I didn't realize that you're interpreting C.65 so that foo = std::move(foo) must be a no-op. That's definitely wrong if it gives that impression.

Copy link
Member

@cubbimew cubbimew Apr 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...actually it seems to imply it with "x is unchanged", even though it is not supported by C.65's own motivation to make self-swap work (any safe move does). This may be worth a new issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Right now, rule C.64 says "A move operation should move and leave its source in a valid state" (but potentially any valid state), which is the STL's own rule.

Rule C.65 specifically goes stronger; C.65 says "Make move assignment safe for self-assignment ... If x = x changes the value of x, people will be surprised ... after x = x the value of x is unchanged ... The ISO standard guarantees only a “valid but unspecified” state [but] The rule here is more caution and insists on complete safety."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, and this means the mention of self-swap in the lead is irrelevant, the title is misleading (should be "no-op" rather than "safe"), and that the downsides that you're correctly bringing up were omitted.

of `s = std::move(a.s)` when `s` and `a.s` are the same object. In practice, both libc++ and libstdc++
will set `s` to the empty string. So, if you want `foo = std::move(foo)` to be a no-op, you must either write
special-case code to handle the self-assignment case, or use the copy-and-swap idiom.

##### Note
Prefer to use the copy-and-swap idiom:

The ISO standard guarantees only a "valid but unspecified" state for the standard-library containers. Apparently this has not been a problem in about 10 years of experimental and production use. Please contact the editors if you find a counter example. The rule here is more caution and insists on complete safety.
Foo& Foo::operator=(Foo&& a) noexcept // OK, simplest and best
{
Foo copy(std::move(a));
this->swap(copy);
return *this;
}

##### Example

Here is a way to move a pointer without a test (imagine it as code in the implementation a move assignment):
Here is a way to move a pointer without a test (imagine it as code in the implementation of move assignment):

// move from other.ptr to this->ptr
T* temp = other.ptr;
other.ptr = nullptr;
T* temp = std::exchange(other.ptr, nullptr);
delete ptr;
ptr = temp;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the point of the temp variable in the original code was so that you could remember the value of other.ptr after you set it to null. Since std::exchange seems to be designed to do that for you then if you're using std::exchange wouldn't you just write this as

delete ptr;
ptr = std::exchange(other.ptr, nullptr);

or possibly even
delete std::exchange(ptr, std::exchange(other.ptr, nullptr)); ?

Feels like at the moment readability is suffering without any real gain. The same can probably be said of my second example though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This snippet is supposed to keep working even if &ptr == &other.ptr, so your two-liner wouldn't work.
Your one-liner seems like it would work fine, though! Totally unreadable, but it does work even when &ptr == &other.ptr as far as I can tell.

However, this entire "Note" seems to be just someone showing off their cleverness; it's not code you'd ever use in real life, given that copy-and-swap already exists. So I certainly wouldn't object to just removing the entire "Note."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point about the two liner. Pure coincidence that the one liner works. I agree the one liner is pretty unreadable. If I were reviewing this code in production I'd be advocating for a comment explaining why it's so sensitive (i.e. easy to get wrong). I have no opinion on whether the example is required.


##### Enforcement

* (Moderate) In the case of self-assignment, a move assignment operator should not leave the object holding pointer members that have been `delete`d or set to `nullptr`.
* (Not enforceable) Look at the use of standard-library container types (incl. `string`) and consider them safe for ordinary (not life-critical) uses.
* (Moderate) In the case of self-assignment, a move assignment operator should not leave the object holding pointer members that have been `delete`d.

### <a name="Rc-move-noexcept"></a>C.66: Make move operations `noexcept`

##### Reason

A throwing move violates most people's reasonably assumptions.
A throwing move violates most people's reasonable assumptions.
A non-throwing move will be used more efficiently by standard-library and language facilities.

##### Example
Expand All @@ -6139,15 +6147,15 @@ These operations do not throw.
template<typename T>
class Vector2 {
public:
Vector2(Vector2&& a) { *this = a; } // just use the copy
Vector2& operator=(Vector2&& a) { *this = a; } // just use the copy
Vector2(Vector2&& a) { *this = a; } // just use the copy assignment operator
Vector2& operator=(Vector2&& a) { *this = a; } // just use the copy assignment operator
// ...
private:
T* elem;
int sz;
};

This `Vector2` is not just inefficient, but since a vector copy requires allocation, it can throw.
This `Vector2` is not just inefficient, but, since a vector copy requires allocation, it can throw.

##### Enforcement

Expand Down