Tip of the Week #3: String Concatenation and operator+ vs. StrCat()

Originally published as totw/3 on 2012-05-11

Updated 2017-09-18; revised 2018-01-22

Users are often surprised when a reviewer says, “Don’t use the string concatenation operator, it’s not that efficient.” How can it be that std::string::operator+ is inefficient? Isn’t it hard to get that wrong?

It turns out that such inefficiency isn’t clear-cut. In practice, these two snippets have close to the same execution time:

std::string foo = LongString1();
std::string bar = LongString2();
std::string foobar = foo + bar;

std::string foo = LongString1();
std::string bar = LongString2();
std::string foobar = absl::StrCat(foo, bar);

However, the same is not true for these two snippets:

std::string foo = LongString1();
std::string bar = LongString2();
std::string baz = LongString3();
std::string foobar = foo + bar + baz;

std::string foo = LongString1();
std::string bar = LongString2();
std::string baz = LongString3();
std::string foobar = absl::StrCat(foo, bar, baz);

The reason these two cases differ becomes clear when we pick apart what happens in the expression foo + bar + baz. Since C++ has no overloads for three-argument operators, this operation necessarily makes two calls to std::string's operator+. Between those two calls, the operation constructs (and stores) a temporary string. So std::string foobar = foo + bar + baz is really equivalent to:

std::string temp = foo + bar;
std::string foobar = std::move(temp) + baz;

Specifically, note that the contents of foo and bar must be copied to a temporary location before they are placed within foobar. (For more on std::move, see Tip of the Week #77: Temporaries, moves, and copies.)

C++11 at least allows the second concatenation to happen without creating a new string object: std::move(temp) + baz is equivalent to std::move(temp.append(baz)). However, it’s possible that the buffer initially allocated for the temporary won’t be large enough to hold the final string, in which case a reallocation (and another copy) will be required. As a result, in the worst case, chains of n string concatenations require O(n) reallocations.
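To make the worst case concrete, here is a minimal sketch that uses only the standard library. It builds a string through repeated operator+ calls and counts how often the string's capacity changes, as a rough proxy for reallocations. ConcatChain and ChainResult are hypothetical names introduced for illustration. The count depends on the standard library's growth policy and small-string optimization, so no particular value is guaranteed.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type for illustration: the concatenated string plus the number
// of steps at which its capacity changed.
struct ChainResult {
  std::string value;
  std::size_t capacity_changes;
};

// Hypothetical helper: evaluates ((("" + p0) + p1) + ...) one step at a time,
// which is how a chain of operator+ calls is evaluated.
ChainResult ConcatChain(const std::vector<std::string>& parts) {
  ChainResult out{std::string(), 0};
  for (const std::string& part : parts) {
    const std::size_t before = out.value.capacity();
    // The rvalue overload of operator+ appends to the moved-from buffer,
    // growing it (and copying the contents) only if it is too small.
    out.value = std::move(out.value) + part;
    if (out.value.capacity() != before) ++out.capacity_changes;
  }
  return out;
}
```

For example, ConcatChain({foo, bar, baz}).capacity_changes reports how many steps resized the buffer while building the same contents as foo + bar + baz.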

It is better instead to use absl::StrCat(), a nice helper function from absl/strings/str_cat.h that calculates the necessary string length, reserves that size, and concatenates all of the input data into the output, a well-optimized O(n) operation. Similarly, for cases like:

foobar += foo + bar + baz;

use absl::StrAppend(), which performs similar optimizations:

absl::StrAppend(&foobar, foo, bar, baz);
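The strategy described above (compute the total length, reserve once, then append every piece) can be sketched with the standard library alone. This is not Abseil's implementation. AppendPieces and ConcatPieces are hypothetical names, and the sketch assumes C++17 for std::string_view.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Abseil): appends every piece to *dest
// after growing the buffer at most once.
// Limitation of this sketch: no piece may refer to the contents of *dest,
// because reserve() can invalidate such a view.
void AppendPieces(std::string* dest,
                  std::initializer_list<std::string_view> pieces) {
  std::size_t total = dest->size();
  for (std::string_view piece : pieces) total += piece.size();
  dest->reserve(total);  // Single growth step, sized for the final result.
  for (std::string_view piece : pieces) dest->append(piece);
}

// Hypothetical helper: builds a new string from the pieces using the same
// reserve-then-append approach.
std::string ConcatPieces(std::initializer_list<std::string_view> pieces) {
  std::string result;
  AppendPieces(&result, pieces);
  return result;
}
```

With these helpers, ConcatPieces({foo, bar, baz}) yields the same contents as foo + bar + baz, and AppendPieces(&foobar, {foo, bar, baz}) the same as foobar += foo + bar + baz, without the intermediate temporary.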

In addition, absl::StrCat() and absl::StrAppend() operate on types other than just string types: you can use them to convert int32_t, uint32_t, int64_t, uint64_t, float, double, const char*, and string_view, like this:

std::string foo = absl::StrCat("The year is ", year);
