string_view
operator+
vs. StrCat()
absl::Status
std::bind
absl::optional
and std::unique_ptr
absl::StrFormat()
make_unique
and private
Constructors.bool
explicit
= delete
)switch
Statements Responsibly= delete
AbslHashValue
and Youcontains()
std::optional
parametersif
and switch
statements with initializersinline
Variablesstd::unique_ptr
Must Be MovedAbslStringify()
vector.at()
auto
for Variable Declarationsvector.at()
Originally posted as TotW #224 on August 24, 2023
Updated 2024-01-24
Quicklink: abseil.io/tips/224
There is no good use of vector<T>::at()
in google3, and fairly few good uses
in other C++ environments. The same reasoning applies to at()
on other
random-access sequences like RepeatedPtrField
in protobuf, as well as to
value()
on wrapper types like optional<T>
and absl::StatusOr<T>
.
at()
Do?The specification of at(size_type pos)
is as follows:
Returns a reference to the element at specified location
pos
, with bounds checking. Ifpos
is not within the range of the container, an exception of type std::out_of_range is thrown.
This means we could view the contract of this method as two distinct behaviors:
pos >= size()
, and if so then throw a std::out_of_range
exception.pos
.Note: The specification does not directly address the case of code passing a
negative index, but std::out_of_range
will be thrown for that case too –
because size_type
is an unsigned integral type, a call to at(-5)
will
yield a very large positive value for pos
.
at()
?Since the contract of at()
depends on the bounds-checking logic, we can break
this into two cases: either we know by construction that the index is valid, or
we don’t.
If we already know that the sequence is sufficiently large and the lookup will
succeed, the extra bounds check is overhead. Most vector
accesses, for
instance, are as part of a loop from 0
to size()
and we already know the
operation will succeed. Therefore, in cases where we already know the bounds
check will be successful, it’s likely that we want the more common
operator[]()
.
{.bad} for (int i = 0; i + 1 < vec.size(); ++i) { ProcessPair(vec.at(i), vec.at(i + 1)); }
becomes
{.good} for (int i = 0; i + 1 < vec.size(); ++i) { ProcessPair(vec[i], vec[i + 1]); }
If we do not know that the sequence is sufficiently large, is throwing an
exception the right way to handle that? Usually not. In google3 builds, throwing
an exception will terminate the program, messily. Many (perhaps most) readers
won’t necessarily spot an innocuously named method like at()
as a process
termination risk.
{.bad} std::vector<absl::string_view> tokens = absl::StrSplit(user_string, ByChar(',')); LOG(INFO) << "Got leading token " << tokens.at(0);
is probably better as
{.good} std::vector<absl::string_view> tokens = absl::StrSplit(user_string, ByChar(',')); if (tokens.empty()) { return absl::InvalidArgumentError("Invalid user_string, expected ','"); }
or if aborting the program is preferable
{.good} std::vector<absl::string_view> tokens = absl::StrSplit(user_string, ByChar(',')); CHECK(!tokens.empty()) << "Invalid user_string " << std::quoted(user_string) << ", expected at least one ','";
So at least in a google3 context, none of the uses of at()
are really useful —
for any given use case, there is a more preferred alternative.
Unfortunately, reality is hardly so clean as “we know or we don’t”: we make
mistakes and code can change over time, invalidating originally correct
assumptions. Given that humans are fallible, we can imagine a use-case for
at()
. Specifically, if we are completely consistent in using at()
instead of
operator[]
, we might ensure that even if we’re crashing messily (bad), we
don’t trigger undefined behavior (UB) (worse).
While we believe “avoid UB” is a very legitimate goal, we still don’t endorse
the use of at()
, specifically, because of its exception-entangled semantics,
discussed above. The ideal future solution is a hardened-by-default
operator[]
, with compiler optimizations to remove bounds checking, when
provably safe. The at()
method is a bad approximation of this solution.
Instead, we encourage users to stick with operator[]
and reduce exposure to UB
by other means, including:
If your project can afford it, we recommend also enabling bounds check in production in other libraries where available.
If you run your code with ASAN you’ll also get diagnostics if you access an element out of range.
In fact, your project is likely already relying on some of these protections!
In Tip #202 we discussed the use of at()
on associative
containers like maps and sets. In general, the error-handling logic above
applies: it’s likely the case that a missing key should be handled by logging or
returning an error, rather than messily crashing the process.
However, the “bounds checking” overhead logic is different for these containers.
In the std::vector
case, the compute cost of doing the bounds check is similar
to the cost of doing the actual work (returning the indicated reference). For
associative containers, the “bounds check” equivalent is doing the (necessary)
lookup, whether that is tree traversal, hashing, etc.
Following that reasoning, we might use at()
when we know the key is present
already (no exception throwing) but were unable to keep an iterator or
reference, so it is necessary to perform the lookup again. This is pretty rare:
see Tip #132 for ways to avoid redundant map lookups.
In the end, there’s some minor room for usage of at()
in associative
containers. There is more room for nuance in those cases than there is for
vector
.
In an exceptions-enabled environment, opinions may differ a bit more when it
comes to at()
. It’s still broadly the case that explicit bounds checking is
likely better performance (and harder to mess up) than relying on exceptions. An
argument could be made for defense-in-depth prevention of UB, but it’s fairly
clear that the idiom is (and will continue to be) operator[]()
rather than
at()
.
Ideally, code should make as few assumptions as it can about the environment in
which it will work. Reasoning about code based on which toolchains will be used
to compile it is often fragile. For code that uses at()
(or another
exception-based API) to be correct, it needs to be correct for two different
build modes: it must be acceptable to terminate the entire process and it must
be acceptable for code at a higher level to catch the exception and continue
execution, so the library code must preserve all invariants. In practice that
means that the code must be exception-safe and that it must be OK for any
out-of-bounds use of at()
to terminate the process.
The best advice we can give about use of at()
in an exception-enabled
environment is perhaps that it trades a reduction in potential UB for hidden and
often unnecessary error handling. That isn’t always a clear tradeoff, but it
still seems unlikely to be commonly worth the cost.
When indexing into a container, be mindful of which case we are in: is the index
“correct by construction”, or does the code need to detect and handle invalid
indexes? In both cases we can do better than using the exception-based
std::vector<T>::at()
API.
Similar thinking applies to other exception-based APIs such as
std::optional<T>::value()
and absl::StatusOr<T>::value()
(See
Tip #181). For error handling in non-concurrent C++ code, prefer to
“look before you leap” – and then, having checked that things are in order,
avoid APIs that include their own checking.