string_view
operator+
vs. StrCat()
absl::Status
std::bind
absl::optional
and std::unique_ptr
absl::StrFormat()
make_unique
and private
Constructors.bool
explicit
= delete
)switch
Statements Responsibly= delete
AbslHashValue
and Youcontains()
std::optional
parametersif
and switch
statements with initializersinline
Variablesstd::unique_ptr
Must Be MovedAbslStringify()
vector.at()
Originally posted as TotW #180 on June 11, 2020
Updated 2020-06-11
Quicklink: abseil.io/tips/180
Unlike many languages, C++ lacks the safety checks necessary to avoid
referencing invalid memory (aka “dangling references”). You can easily
dereference a pointer to an object that was already delete
-ed, or follow a
reference to an object that has gone out of scope. Even class types carry this
risk. Importantly, we are building naming conventions around the names view
and span
to signify “This is an object that has reference semantics and may
dangle.” These types, like all types with reference semantics, never own the
underlying data that they point to. Be mindful whenever you see instances of any
of these types stored.
If you’re coming to C++ from other languages, there are quite a few fundamental surprises. The type system is meaningfully more complicated than most languages, requiring a sometimes-subtle understanding of references, temporaries, shallow-const, pointers, object lifetimes, etc. One of the most uniquely important issues when learning C++ is recognizing that having a pointer or a reference to an object doesn’t mean the object still exists. C++ is not garbage-collected nor reference-counted, and as a result holding a handle to an object isn’t enough to ensure the object stays alive.
Consider:
int* int_handle; { int foo = 42; int_handle = &foo; } std::cout << *int_handle << "\n"; // Boom
When we dereference int_handle
with operator*
, we are following a pointer to
an object whose lifetime is ended. This is a bug. Formally, this is undefined
behavior, and anything can happen.
Distressingly, one of the “anything can happen” options is “this does what you
naively think it might” - printing 42. C++ is a language that does not promise
to diagnose or react to your bugs. The fact that your program seems to work
does not mean it is correct. It means at best that the compiler happened to
choose an outcome that worked for you. But make no mistake: this is no less
buggy than if int_handle
was a pointer to null
.
From this we draw two important points:
delete
-ed.It is critically important to understand that our informal “handle” discussion applies to values of certain class types as well as to the more-obvious pointers and references. Consider iterators:
std::vector<int>::iterator int_handle; { std::vector<int> v = {42}; int_handle = v.begin(); } std::cout << *int_handle << "\n"; // Boom
This is morally identical to the previous example. On some platforms, vector
iterators may in fact be implemented as pointers. Even if these iterators are
class types, the same language rules apply: dereferencing the iterator will
(under the hood) eventually be following a pointer or reference to an object
that is no longer in scope (in this case, v[0]
).
Because C++ does not define what happens when code uses an invalid pointer, reference, or iterator, code that does so is always incorrect (even if it appears to work). This allows debugging tools such as sanitizers and debugging iterators to report bugs with no false positives.
Over the past few years, Abseil and the C++ standard library have been
introducing additional class types with similar “handle” behavior. The most
common of these is string_view
, which is a handle to some contiguous buffer of
characters (often a string
). Holding a string_view
is exactly like holding
any other handle type: there is no general guarantee that the underlying data
lives. It is up the programmer to prove that the underlying buffer outlives the
string_view
. Importantly the handle that string_view
provides does not allow
for mutation: a string_view
cannot be used to modify the underlying data.
Another handle design that is becoming common is span<T>
, which is a
contiguous buffer of any type T
. If T
is non-const, then span
allows
mutation of the underlying data. If T
is const, then the span
cannot modify
it, in the same fashion that string_view
cannot modify the underlying buffer.
Thus, span<const char>
is similar to string_view
. Although the two types
have different APIs, reasoning about the handles or underlying buffers works in
exactly the same way.
string_view
and span
tend to be very safe to use as function parameters,
abstracting away from a variety of input argument formats. Because of the
possibility of a dangling reference, any time that types of this design are
stored, they become a significant source of programmer error. Every storage of
any handle type requires critical thinking to understand why we are sure the
underlying object stays valid for the lifetime of the handle. Using
string_view
or span
in a container is not always wrong but is a subtle
optimization that warrants clear comments describing the associated storage.
Using these types for data members of a class is rarely the right choice.
It is critically important going forward that C++ programmers understand these design patterns, and how to use these “reference parameter types.” To assist in that understanding, type designers and library providers tend toward the following meaning for types:
Since both of these naming indicators suggest reference types, any storage of a library-provided type called a “view” or a “span” needs to be accompanied by the same logic you would use when thinking about the lifetimes of a pointer or reference: how do I know that the underlying object is still alive?
The popular external range_v3
library and the upcoming C++20 ranges library have a different meaning for
“view”, although the types described by these definitions overlap. In ranges,
“view” means “a range that can be copied in O(1)”. This includes string_view
.
However, this definition does not preclude mutation of the underlying data. This
mismatch is unfortunate, and largely recognized by the C++ standards committee,
but nobody could find consensus on any alternative to “view” after the concern
was raised.
The C++20 span
type and
Abseil’s
Span
type have slightly different interfaces and semantics when it comes to
comparability and copying. The most notable difference is with
absl::Span::operator==
, which we now know to
probably be a design mistake.
For more on the design theory underlying modern reference parameter types, see Revisiting Regular Types.