Tip of the Week #224: Avoid `vector.at()`

Originally posted as TotW #224 on August 24, 2023

Updated 2024-01-24

There is no good use of vector<T>::at() in google3, and fairly few good uses in other C++ environments. The same reasoning applies to at() on other random-access sequences like RepeatedPtrField in protobuf, as well as to value() on wrapper types like optional<T> and absl::StatusOr<T>.

What Does `at()` Do?

The specification of at(size_type pos) is as follows:

Returns a reference to the element at specified location pos, with bounds checking. If pos is not within the range of the container, an exception of type std::out_of_range is thrown.

This means we could view the contract of this method as two distinct behaviors:

Check whether pos >= size(), and if so then throw a std::out_of_range exception.
Otherwise, return the element at index pos.

Note: The specification does not directly address the case of code passing a negative index, but std::out_of_range will be thrown for that case too – because size_type is an unsigned integral type, a call to at(-5) will yield a very large positive value for pos.
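The two-part contract, including the negative-index case, can be checked with a small sketch. It assumes a build with exceptions enabled; `AtThrows` and `DemonstrateAtContract` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether v.at(pos) throws std::out_of_range.
// Only meaningful in a build with exceptions enabled.
bool AtThrows(const std::vector<int>& v, std::size_t pos) {
  try {
    (void)v.at(pos);
    return false;
  } catch (const std::out_of_range&) {
    return true;
  }
}

void DemonstrateAtContract() {
  using SizeType = std::vector<int>::size_type;
  const std::vector<int> v = {10, 20, 30};

  assert(!AtThrows(v, 2));  // Last valid index: the element is returned.
  assert(AtThrows(v, 3));   // pos == size(): std::out_of_range is thrown.

  // Converting -5 to the unsigned size_type wraps around to a huge value,
  // so the same pos >= size() check rejects it.
  const SizeType wrapped = static_cast<SizeType>(-5);
  assert(wrapped == std::numeric_limits<SizeType>::max() - 4);
  assert(AtThrows(v, wrapped));
}
```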

When Would We Use `at()`?

Since the contract of at() depends on the bounds-checking logic, we can break this into two cases: either we know by construction that the index is valid, or we don’t.

If we already know that the sequence is sufficiently large and the lookup will succeed, the extra bounds check is pure overhead. Most vector accesses, for instance, happen inside a loop from 0 to size(), where we already know the access will succeed. In such cases we almost certainly want the more common operator[]().

 {.bad}
for (int i = 0; i + 1 < vec.size(); ++i) {
  ProcessPair(vec.at(i), vec.at(i + 1));
}

becomes

 {.good}
for (int i = 0; i + 1 < vec.size(); ++i) {
  ProcessPair(vec[i], vec[i + 1]);
}

If we do not know that the sequence is sufficiently large, is throwing an exception the right way to handle that? Usually not. In google3 builds, throwing an exception will terminate the program, messily. Many (perhaps most) readers won’t necessarily spot an innocuously named method like at() as a process termination risk.

 {.bad}
std::vector<absl::string_view> tokens = absl::StrSplit(user_string, absl::ByChar(','));
LOG(INFO) << "Got leading token " << tokens.at(0);

is probably better as

 {.good}
std::vector<absl::string_view> tokens = absl::StrSplit(user_string, absl::ByChar(','));
if (tokens.empty()) {
  return absl::InvalidArgumentError("Invalid user_string, expected ','");
}

or if aborting the program is preferable

 {.good}
std::vector<absl::string_view> tokens = absl::StrSplit(user_string, absl::ByChar(','));
CHECK(!tokens.empty()) << "Invalid user_string "
                       << std::quoted(user_string)
                       << ", expected at least one ','";

So, at least in a google3 context, no use of at() is really useful: for any given use case, there is a preferable alternative.
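The "check, then index" pattern can also be sketched without any Abseil dependency. The version below is an illustration only: it substitutes std::optional for absl::Status, and `LeadingToken` and `RunLeadingTokenExample` are hypothetical names that do not come from the examples above.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: returns the first token, or std::nullopt when there is
// none, so the caller decides how to report the problem.
std::optional<std::string> LeadingToken(const std::vector<std::string>& tokens) {
  if (tokens.empty()) {
    return std::nullopt;
  }
  return tokens[0];  // Valid by the check above, so no second check is needed.
}

void RunLeadingTokenExample() {
  const std::vector<std::string> tokens = {"alpha", "beta"};
  const std::vector<std::string> none;

  assert(LeadingToken(tokens) == "alpha");
  assert(!LeadingToken(none).has_value());
}
```

The empty case is visible at the call site as an ordinary return value rather than hidden inside an innocuous-looking accessor.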

What About UB?

Unfortunately, reality is hardly so clean as “we know or we don’t”: we make mistakes and code can change over time, invalidating originally correct assumptions. Given that humans are fallible, we can imagine a use-case for at(). Specifically, if we are completely consistent in using at() instead of operator[], we might ensure that even if we’re crashing messily (bad), we don’t trigger undefined behavior (UB) (worse).

While we believe “avoid UB” is a very legitimate goal, we still don’t endorse the use of at(), specifically because of its exception-entangled semantics, discussed above. The ideal future solution is a hardened-by-default operator[], with compiler optimizations that remove the bounds check when the access is provably safe. The at() method is a bad approximation of this solution.

Instead, we encourage users to stick with operator[] and reduce exposure to UB by other means, including:

If your project can afford it, we recommend also enabling bounds checks in production in other libraries where available.
If you run your code with AddressSanitizer (ASan), you’ll also get diagnostics if you access an element out of range.

In fact, your project is likely already relying on some of these protections!

What About Maps?

In Tip #202 we discussed the use of at() on associative containers like maps and sets. In general, the error-handling logic above applies: it’s likely the case that a missing key should be handled by logging or returning an error, rather than messily crashing the process.

However, the “bounds checking” overhead logic is different for these containers. In the std::vector case, the compute cost of doing the bounds check is similar to the cost of doing the actual work (returning the indicated reference). For associative containers, the “bounds check” equivalent is doing the (necessary) lookup, whether that is tree traversal, hashing, etc.

Following that reasoning, we might use at() when we know the key is present already (no exception throwing) but were unable to keep an iterator or reference, so it is necessary to perform the lookup again. This is pretty rare: see Tip #132 for ways to avoid redundant map lookups.

In the end, there’s some minor room for usage of at() in associative containers. There is more room for nuance in those cases than there is for vector.
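A minimal sketch of the two situations for associative containers follows, using std::map. `FindCount` and `RunFindCountExample` are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical helper: a single find() both tests for the key and yields the
// value, so a missing key is handled without a second lookup or an exception.
std::optional<int> FindCount(const std::map<std::string, int>& counts,
                             const std::string& key) {
  const auto it = counts.find(key);
  if (it == counts.end()) {
    return std::nullopt;
  }
  return it->second;
}

void RunFindCountExample() {
  const std::map<std::string, int> counts = {{"apple", 3}, {"pear", 1}};

  assert(FindCount(counts, "apple") == 3);
  assert(!FindCount(counts, "kiwi").has_value());

  // The narrow case the text allows: the key is known to be present, and the
  // lookup has to be repeated anyway. Note that operator[] is not available
  // on a const std::map, so at() is the direct accessor here.
  assert(counts.at("pear") == 1);
}
```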

What About C++ With Exceptions?

In an exceptions-enabled environment, opinions may differ a bit more when it comes to at(). It’s still broadly the case that explicit bounds checking is likely to perform better (and be harder to mess up) than relying on exceptions. An argument could be made for defense-in-depth prevention of UB, but it’s fairly clear that the idiom is (and will continue to be) operator[]() rather than at().

Ideally, code should make as few assumptions as it can about the environment in which it will work. Reasoning about code based on which toolchains will be used to compile it is often fragile. For code that uses at() (or another exception-based API) to be correct, it needs to be correct for two different build modes: it must be acceptable to terminate the entire process and it must be acceptable for code at a higher level to catch the exception and continue execution, so the library code must preserve all invariants. In practice that means that the code must be exception-safe and that it must be OK for any out-of-bounds use of at() to terminate the process.

The best advice we can give about use of at() in an exception-enabled environment is perhaps that it trades a reduction in potential UB for hidden and often unnecessary error handling. That isn’t always a clear tradeoff, but it still seems unlikely to be commonly worth the cost.

Closing Thoughts

When indexing into a container, be mindful of which case we are in: is the index “correct by construction”, or does the code need to detect and handle invalid indexes? In both cases we can do better than using the exception-based std::vector<T>::at() API.

Similar thinking applies to other exception-based APIs such as std::optional<T>::value() and absl::StatusOr<T>::value() (see Tip #181). For error handling in non-concurrent C++ code, prefer to “look before you leap” – and then, having checked that things are in order, avoid APIs that include their own checking.
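For std::optional, "look before you leap" might look like the sketch below: check has_value() once, then dereference with operator*, which performs no check of its own. `DescribePort` and `RunDescribePortExample` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: handles the empty case explicitly, then uses the
// unchecked operator* instead of the throwing value().
std::string DescribePort(const std::optional<int>& port) {
  if (!port.has_value()) {
    return "no port configured";
  }
  return "port " + std::to_string(*port);
}

void RunDescribePortExample() {
  assert(DescribePort(8080) == "port 8080");
  assert(DescribePort(std::nullopt) == "no port configured");
}
```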
