Tip of the Week #117: Copy Elision and Pass-by-value

Originally posted as TotW #117 on June 8, 2016

“Everything is so far away, a copy of a copy of a copy. The insomnia distance of everything, you can’t touch anything and nothing can touch you.” — Chuck Palahniuk

Suppose you have a class like this:

class Widget {
 public:
  …

 private:
  std::string name_;
};

How would you write its constructor? For years, the answer has been to do it like this:

// First constructor version
explicit Widget(const std::string& name) : name_(name) {}

However, there is an alternative approach which is becoming more common:

// Second constructor version
explicit Widget(std::string name) : name_(std::move(name)) {}

(If you’re not familiar with std::move(), see TotW #77, or pretend I used std::swap instead; the same principles apply.) What’s going on here? Isn’t it horribly expensive to pass std::string by copy? It turns out that no: passing by value (as we’ll see, it’s not really “by copy”) can sometimes be much more efficient than passing by reference.

To understand why, consider what happens when the callsite looks like this:

Widget widget(absl::StrCat(bar, baz));

With the first version of the Widget constructor, absl::StrCat() produces a temporary string containing the concatenated string values, which is passed by reference into Widget(), and then the string is copied into name_. With the second version of the Widget constructor, the temporary string is passed into Widget() by value, which you might think would cause the string to be copied, but this is where the magic happens. When the compiler sees a temporary being used to copy-construct an object, the compiler will¹ simply use the same storage for both the temporary and the new object, so that copying from the one to the other is literally free; this is called copy elision. Thanks to this optimization, the string is never copied, but only moved once, which is a cheap constant-time operation.

Consider what happens when the argument isn’t a temporary:

std::string local_str;
Widget widget(local_str);

In this case, both versions of the code make a copy of the string, but the second version also moves the string. That, again, is a cheap constant-time operation, whereas copying is a linear-time operation, so in many cases that will be a price well worth paying.
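
The two cases above can be checked by counting operations. The following is a minimal sketch, not code from the original tip: Tracked is a hypothetical stand-in for std::string that counts its own copies and moves, MakeName() stands in for absl::StrCat(bar, baz), and WidgetByRef and WidgetByValue are illustrative names for the two constructor versions.

```cpp
#include <cassert>
#include <utility>

// Global counters, reset before each measurement (hypothetical helpers).
int g_copies = 0;
int g_moves = 0;

// Hypothetical stand-in for std::string that counts copies and moves.
struct Tracked {
  Tracked() = default;
  Tracked(const Tracked&) { ++g_copies; }
  Tracked(Tracked&&) noexcept { ++g_moves; }
};

// First constructor version: pass by const reference, copy inside.
class WidgetByRef {
 public:
  explicit WidgetByRef(const Tracked& name) : name_(name) {}

 private:
  Tracked name_;
};

// Second constructor version: pass by value, move inside.
class WidgetByValue {
 public:
  explicit WidgetByValue(Tracked name) : name_(std::move(name)) {}

 private:
  Tracked name_;
};

struct Counts {
  int copies;
  int moves;
};

// Returns a temporary, playing the role of absl::StrCat(bar, baz).
Tracked MakeName() { return Tracked(); }

// Constructs a W from a temporary and reports the copies and moves made.
template <typename W>
Counts CountForTemporary() {
  g_copies = g_moves = 0;
  W widget(MakeName());
  (void)widget;
  return {g_copies, g_moves};
}

// Constructs a W from a named local and reports the copies and moves made.
template <typename W>
Counts CountForLvalue() {
  Tracked local;
  g_copies = g_moves = 0;
  W widget(local);
  (void)widget;
  return {g_copies, g_moves};
}

// Assumes the compiler elides the copy of the temporary, which C++17
// requires and mainstream compilers also do in earlier modes by default.
void CountDemo() {
  Counts c = CountForTemporary<WidgetByRef>();
  assert(c.copies == 1 && c.moves == 0);  // temporary, by reference
  c = CountForTemporary<WidgetByValue>();
  assert(c.copies == 0 && c.moves == 1);  // temporary, by value
  c = CountForLvalue<WidgetByRef>();
  assert(c.copies == 1 && c.moves == 0);  // local, by reference
  c = CountForLvalue<WidgetByValue>();
  assert(c.copies == 1 && c.moves == 1);  // local, by value
}
```

The counts line up with the prose: pass-by-value trades the copy for a move when the argument is a temporary, and adds one move on top of the copy when it is not.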

The key property of the name parameter that makes this technique work is that name has to be copied. Indeed, the essence of this technique is to try to make the copy operation happen on the function call boundary, where it can be elided, rather than in the interior of the function. This need not necessarily involve std::move(); for example, if the function needs to mutate the copy rather than store it, it may just be mutated in place.
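
As an illustration of the mutate-in-place case, here is a small sketch. ToUpperCopy() is a hypothetical helper, not part of the original tip; it needs a modifiable copy of its input, so it takes the parameter by value and edits it directly, with no std::move() in sight.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper that needs its own modifiable copy of the input.
// Taking `s` by value places that copy at the call boundary, where it can
// be elided for temporaries; the body then mutates `s` in place.
std::string ToUpperCopy(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::toupper(c));
  });
  return s;  // A by-value parameter is moved, not copied, on return.
}

void ToUpperDemo() {
  std::string local = "widget";
  // The named argument is copied into `s`; `local` itself is untouched.
  assert(ToUpperCopy(local) == "WIDGET");
  assert(local == "widget");
  // The temporary argument initializes `s` without a separate copy.
  assert(ToUpperCopy(std::string("temp")) == "TEMP");
}
```

Had the function taken const std::string& instead, it would have needed to declare a local copy in its body, and that copy could not be elided for temporary arguments.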

When to Use Copy Elision

Passing parameters by value has several drawbacks which should be borne in mind. First of all, it makes the function body more complex, which creates a maintenance and readability burden. For example, in the code above, we’ve added a std::move() call, which creates a risk of accidentally accessing a moved-from value. In this function that risk is pretty minimal, but if the function were more complex, the risk would be higher.

Second, it can degrade performance, sometimes in surprising ways, and it can be difficult to tell whether it’s a net performance win without profiling a specific workload.

- As stated above, this technique applies only to parameters that need to be copied; it will be useless at best and harmful at worst when applied to parameters that don’t need to be copied, or only need to be copied conditionally.
- This technique often involves some amount of extra work in the function body, such as the move construction of name_ in the example above. If that extra work imposes too much overhead, the slowdown in the cases where the copy can’t be elided may not be worth the speedup in the cases where the copy can be elided. Note that this determination can depend on the particulars of your use case. For example, if the arguments to Widget() are almost always very short, or are almost never temporaries, this technique may be harmful on balance. As always when considering optimization tradeoffs, when in doubt, measure.
- When the copy in question is a copy assignment (for example, if we wanted to add a set_name() method to Widget), the pass-by-reference version can sometimes avoid memory allocation by reusing name_’s existing buffer, in cases where the pass-by-value version would allocate new memory. In addition, the fact that pass-by-value always replaces name_’s allocation can lead to worse allocation behavior down the line: if the name_ field tends to grow over time after it is set, that growth will require further new allocations in the pass-by-value case, whereas in the pass-by-reference case we only reallocate when the name_ field exceeds its historical maximum size.
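
The buffer-reuse point about copy assignment can be observed through std::string::capacity(). The sketch below is illustrative only: set_name_by_ref(), set_name_by_value(), name_capacity(), and CapacityAfterShrink() are hypothetical names, and the capacities it checks reflect typical standard library behavior rather than anything the C++ standard guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Sketch only: a Widget with two hypothetical setters for comparison.
class Widget {
 public:
  // Copy assignment: may reuse name_'s existing buffer if it is big enough.
  void set_name_by_ref(const std::string& name) { name_ = name; }

  // Move assignment: typically adopts the parameter's buffer and drops
  // whatever name_ had allocated before.
  void set_name_by_value(std::string name) { name_ = std::move(name); }

  std::size_t name_capacity() const { return name_.capacity(); }

 private:
  std::string name_;
};

// Capacity left in name_ after a long name is replaced by a shorter one.
// `by_value` selects which setter performs the second assignment.
std::size_t CapacityAfterShrink(bool by_value) {
  const std::string long_name(1000, 'x');
  const std::string short_name(100, 'y');
  Widget w;
  w.set_name_by_ref(long_name);
  if (by_value) {
    w.set_name_by_value(short_name);
  } else {
    w.set_name_by_ref(short_name);
  }
  return w.name_capacity();
}

void CapacityDemo() {
  // Not guaranteed by the standard, but typical of mainstream libraries:
  // the by-reference path keeps the large buffer ...
  assert(CapacityAfterShrink(false) >= 1000);
  // ... while the by-value path ends up with the smaller, newer one.
  assert(CapacityAfterShrink(true) < 1000);
}
```

If name_ later grows back toward 1000 characters, the by-reference Widget would likely need no new allocation, while the by-value one would have to reallocate.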

Generally speaking, you should prefer simpler, safer, more readable code, and only go for something more complex if you have concrete evidence that the complex version performs better and that the difference matters. That principle certainly applies to this technique: passing by const reference is simpler and safer, so it’s still a good default choice. However, if you’re working in an area that’s known to be performance-sensitive, or your benchmarks show that you’re spending too much time copying function parameters, passing by value can be a very useful tool to have in your toolbox.

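¹ Strictly speaking, at the time of writing the compiler was not required to perform copy elision, but it is such a powerful and vitally important optimization that it is extremely unlikely that you will ever encounter a compiler that doesn’t do it. As of C++17, the language does require it in this case, where an object is initialized from a temporary (prvalue) of the same type.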