Originally posted as TotW #117 on June 8, 2016
by Geoff Romer
“Everything is so far away, a copy of a copy of a copy. The insomnia distance of everything, you can’t touch anything and nothing can touch you.” — Chuck Palahniuk
Suppose you have a class like this:
class Widget {
 public:
  …
 private:
  std::string name_;
};
How would you write its constructor? For years, the answer has been to do it like this:
// First constructor version
explicit Widget(const std::string& name) : name_(name) {}
However, there is an alternative approach which is becoming more common:
// Second constructor version
explicit Widget(std::string name) : name_(std::move(name)) {}
(If you’re not familiar with std::move(), see TotW #77, or pretend I used std::swap instead; the same principles apply.) What’s going on here? Isn’t it horribly expensive to pass std::string by copy? It turns out that no, sometimes passing by value (as we’ll see, it’s not really “by copy”) can be much more efficient than passing by reference.
To understand why, consider what happens when the callsite looks like this:
Widget widget(absl::StrCat(bar, baz));
With the first version of the Widget constructor, absl::StrCat() produces a temporary string containing the concatenated string values, which is passed by reference into Widget(), and then the string is copied into name_. With the second version of the Widget constructor, the temporary string is passed into Widget() by value, which you might think would cause the string to be copied, but this is where the magic happens. When the compiler sees a temporary being used to copy-construct an object, the compiler will[1] simply use the same storage for both the temporary and the new object, so that copying from the one to the other is literally free; this is called copy elision. Thanks to this optimization, the string is never copied, but only moved once, which is a cheap constant-time operation.
Consider what happens when the argument isn’t a temporary:
std::string local_str;
Widget widget(local_str);
In this case, both versions of the code make a copy of the string, but the second version also moves the string. That, again, is a cheap constant-time operation, whereas copying is a linear-time operation, so in many cases that will be a price well worth paying.
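One way to check these copy and move counts is to swap in an instrumented type for std::string. The following is only a sketch: Tracked, MakeTracked(), WidgetByRef, WidgetByValue, and the helper functions are hypothetical names invented for this illustration, and the counts for the temporary case assume the compiler elides the copy (which mainstream compilers do by default).

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::string that counts copy and move constructions.
struct Tracked {
  static int copies;
  static int moves;
  Tracked() = default;
  Tracked(const Tracked&) { ++copies; }
  Tracked(Tracked&&) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Plays the role of absl::StrCat(): the call expression is a temporary.
Tracked MakeTracked() { return Tracked(); }

// First constructor version: const reference parameter, copied into the member.
class WidgetByRef {
 public:
  explicit WidgetByRef(const Tracked& name) : name_(name) {}
  const Tracked& name() const { return name_; }

 private:
  Tracked name_;
};

// Second constructor version: value parameter, moved into the member.
class WidgetByValue {
 public:
  explicit WidgetByValue(Tracked name) : name_(std::move(name)) {}
  const Tracked& name() const { return name_; }

 private:
  Tracked name_;
};

struct Counts {
  int copies;
  int moves;
};

// Constructs a W from a temporary and reports the copies/moves that occurred.
template <typename W>
Counts FromTemporary() {
  Tracked::copies = Tracked::moves = 0;
  W widget(MakeTracked());
  (void)widget;
  return Counts{Tracked::copies, Tracked::moves};
}

// Constructs a W from a named local variable and reports the copies/moves.
template <typename W>
Counts FromLvalue() {
  Tracked local;
  Tracked::copies = Tracked::moves = 0;
  W widget(local);
  (void)widget;
  return Counts{Tracked::copies, Tracked::moves};
}

// Checks the counts described in the text for all four combinations.
void CheckCounts() {
  assert(FromTemporary<WidgetByRef>().copies == 1);    // one copy
  assert(FromTemporary<WidgetByRef>().moves == 0);
  assert(FromTemporary<WidgetByValue>().copies == 0);  // copy elided
  assert(FromTemporary<WidgetByValue>().moves == 1);   // one cheap move
  assert(FromLvalue<WidgetByRef>().copies == 1);       // one copy
  assert(FromLvalue<WidgetByRef>().moves == 0);
  assert(FromLvalue<WidgetByValue>().copies == 1);     // one copy...
  assert(FromLvalue<WidgetByValue>().moves == 1);      // ...plus one move
}
```

Calling CheckCounts() from a test or from main() exercises all four cases. The pass-by-value version never performs more copies than the pass-by-reference version here; what it adds is one move.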
The key property of the name parameter that makes this technique work is that name has to be copied. Indeed, the essence of this technique is to try to make the copy operation happen on the function call boundary, where it can be elided, rather than in the interior of the function. This need not necessarily involve std::move(); for example, if the function needs to mutate the copy rather than store it, it may just be mutated in place.
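As a small illustration of the mutate-in-place case, here is a sketch of a function that needs its own modifiable copy of its input. ToUpperAscii() and DemoToUpperAscii() are hypothetical names chosen for this example, not something from the original tip.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: it needs a modifiable copy of its input, so it takes the
// parameter by value and edits that copy directly instead of copying inside
// the function body.
std::string ToUpperAscii(std::string text) {
  std::transform(text.begin(), text.end(), text.begin(), [](unsigned char c) {
    return static_cast<char>(std::toupper(c));
  });
  return text;  // A by-value parameter is implicitly moved on return.
}

void DemoToUpperAscii() {
  std::string greeting = "hello";
  // Lvalue argument: the copy happens at the call boundary.
  assert(ToUpperAscii(greeting) == "HELLO");
  // The caller's string is untouched.
  assert(greeting == "hello");
  // Temporary argument: the temporary itself becomes the parameter.
  assert(ToUpperAscii(greeting + ", world") == "HELLO, WORLD");
}
```

A const-reference version of the same function would have to declare a local std::string and copy into it, which always costs a copy, even when the caller passes a temporary.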
Passing parameters by value has several drawbacks which should be borne in mind.
First of all, it makes the function body more complex, which creates a
maintenance and readability burden. For example, in the code above, we’ve added
a std::move() call, which creates a risk of accidentally accessing a moved-from
value. In this function that risk is pretty minimal, but if the function were
more complex, the risk would be higher.
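To make the moved-from risk concrete, the sketch below extends Widget with a hypothetical name_length_ member (an assumption added purely for illustration). The commented-out constructor shows the kind of mistake that becomes easy to make once std::move() is in the picture; the live constructor reads from the member instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

class Widget {
 public:
  // Risky variant, shown only as a comment: `name` has already been moved
  // from when name_length_ is initialized, so name.size() is valid to call
  // but returns an unspecified value.
  //
  //   explicit Widget(std::string name)
  //       : name_(std::move(name)), name_length_(name.size()) {}

  // Safer variant: after the move, read from the member, not the parameter.
  explicit Widget(std::string name)
      : name_(std::move(name)), name_length_(name_.size()) {}

  const std::string& name() const { return name_; }
  std::size_t name_length() const { return name_length_; }

 private:
  std::string name_;         // Declared first, so it is initialized first.
  std::size_t name_length_;  // Hypothetical extra member for this example.
};

void DemoWidgetLength() {
  Widget widget("gadget");
  assert(widget.name() == "gadget");
  assert(widget.name_length() == 6);
}
```

In a two-line constructor the problem is easy to spot, but the same pattern hidden in a longer function body is much harder to catch in review.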
Second, it can degrade performance, sometimes in surprising ways. It can sometimes be difficult to tell whether it’s a net performance win without profiling a specific workload.
If the arguments passed to Widget() are almost always very short, or are almost never temporaries, this technique may be harmful on balance. As always when considering optimization tradeoffs, when in doubt, measure.
When the parameter is assigned to an existing object rather than used to construct a new one (for example, if we added a set_name() method to Widget), the pass-by-reference version can sometimes avoid memory allocation by reusing name_’s existing buffer, in cases where the pass-by-value version would allocate new memory. In addition, the fact that pass-by-value always replaces name_’s allocation can lead to worse allocation behavior down the line: if the name_ field tends to grow over time after it is set, that growth will require further new allocations in the pass-by-value case, whereas in the pass-by-reference case we only reallocate when the name_ field exceeds its historical maximum size.
Generally speaking, you should prefer simpler, safer, more readable code, and only go for something more complex if you have concrete evidence that the complex version performs better and that the difference matters. That principle certainly applies to this technique: passing by const reference is simpler and safer, so it’s still a good default choice. However, if you’re working in an area that’s known to be performance-sensitive, or your benchmarks show that you’re spending too much time copying function parameters, passing by value can be a very useful tool to have in your toolbox.
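The set_name() tradeoff mentioned earlier can be sketched as follows. The two setter names are hypothetical (a real class would have just one set_name()), and whether the existing buffer is actually reused depends on the standard library implementation, so treat the comments as typical behavior rather than a guarantee.

```cpp
#include <cassert>
#include <string>
#include <utility>

class Widget {
 public:
  explicit Widget(std::string name) : name_(std::move(name)) {}

  // Copy assignment: can write into name_'s existing buffer when that buffer
  // is already large enough, so no allocation is needed.
  void set_name_by_ref(const std::string& name) { name_ = name; }

  // Move assignment: name_ typically adopts the parameter's buffer, so an
  // lvalue argument costs an allocation for the parameter's copy, and name_'s
  // old, possibly larger, buffer is given up.
  void set_name_by_value(std::string name) { name_ = std::move(name); }

  const std::string& name() const { return name_; }

 private:
  std::string name_;
};

// Replaces a long name with a shorter one through both setters.
void DemoSetName() {
  const std::string shorter(50, 'y');

  Widget by_ref(std::string(200, 'x'));
  by_ref.set_name_by_ref(shorter);
  assert(by_ref.name() == shorter);

  Widget by_value(std::string(200, 'x'));
  by_value.set_name_by_value(shorter);
  assert(by_value.name() == shorter);
}
```

Both setters leave the widget with the same name; the difference only shows up in allocation behavior. Comparing name().capacity() after each call, or running the two versions under an allocation profiler, is one way to observe it on a particular standard library.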
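[1] Strictly speaking the compiler is not required to perform copy elision, but it is such a powerful and vitally important optimization that it is extremely unlikely that you will ever encounter a compiler that doesn’t do it. (Since C++17, the language does require this elision when an object is initialized directly from a temporary of the same type, as in the example above.)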