string_view
operator+
vs. StrCat()
absl::Status
std::bind
absl::optional
and std::unique_ptr
absl::StrFormat()
make_unique
and private
Constructors.bool
explicit
= delete
)switch
Statements Responsibly= delete
AbslHashValue
and Youcontains()
std::optional
parametersif
and switch
statements with initializersinline
Variablesstd::unique_ptr
Must Be MovedAbslStringify()
vector.at()
auto
for Variable DeclarationsOriginally posted as TotW #227 on November 16, 2023
Updated 2024-03-11
Quicklink: abseil.io/tips/227
Modern C++ code uses index-based for
loops much less often now that
range-based for
loops are available, but there are still times when we need to
use an index while we iterate. Parallel iteration over multiple containers is
one such case; another is when we want to process multiple adjacent elements
from a single container. In this tip we’ll look at a pitfall for the second of
these.
Let’s start by looking at some code that might be correct:
for (int64_t i = 0; i < v.size() - 1; ++i) { ProcessPair(v[i], v[i+1]); }
This code wisely takes some care to check for valid indexes before calling
ProcessPair()
, so why do we say that it only “might” be correct? A careful
unit test (or almost any fuzz test)
will cover the case where v
is empty. If the code surrounding our for
loop
ensures that the for
loop is never reached in that case, all is well. But if
we execute our loop with an empty v
, C++ makes trouble for us.
Recall that
our style guide warns
against use of unsigned
types in C++. The style guide also says
Because of historical accident, the C++ standard also uses unsigned integers to represent the size of containers - many members of the standards body believe this to be a mistake, but it is effectively impossible to fix at this point.
(emphasis added)
Looking carefully, we can see that our example falls afoul of the exact problem
discussed in the style guide. When checking whether we have valid v[i]
and
v[i+1]
elements, we are seemingly correct in checking whether i
is less than
v.size() - 1
given that both elements need to be valid. However, for an empty
container v.size()
is zero (so far, so good!), but because the type of
v.size()
is unsigned, when we subtract one from that zero, we don’t get the
value -1
, but instead we get the maximum value of the given type. Then the
check for whether i
is less than v.size() - 1
evaluates as true
for any
small value of i
, and so the code will use out-of-bounds indexes for v
-
yielding undefined behavior.
Interestingly, if we make the code express its intent a little more directly, our problem goes away.
What do we mean by “express its intent a little more directly”? What is the
intent here? The purpose of the loop condition here is to ensure that the
indexes i
and i + 1
used to index into v
are valid.
Given that indexes in C++ are zero-based, we test whether a (non-negative) index
i
into a container is valid by checking i < v.size()
. It would be redundant
to check validity of both indexes (though we could do so if we wished): if i +
1
is valid then we know that i
is (because i
is never negative here). “i +
1
is valid” translates directly into C++ as i + 1 < v.size()
. Our original
code i < v.size() - 1
does not have such a direct translation as a statement
about the validity of an index.
The rewritten code i + 1 < v.size()
looks almost the same as i < v.size() -
1
, but it is crucially different in that we never subtract, so we avoid the
danger of wrapping around to a huge positive value. Did we swap this for a risk
of overflowing when we calculate i + 1
? Only if i
is already the largest
value of its type (int64_t
) – so we are safe. This difference is sometimes
characterized by noting that the common, useful values of int64_t
are a long
way away from overflowing, whereas with unsigned types such as uint64_t
, the
very common value 0 is the smallest value of the type, so it’s much easier to
unintentionally wrap around.
With this one small change, our now-robust code looks like this:
for (int64_t i = 0; i + 1 < v.size(); ++i) { ProcessPair(v[i], v[i+1]); }
The indexes into v
are clearly safe, without a need to look further afield to
know whether v
might be empty.
Now we can let our fuzzer loose on the fixed code, and feel warm fuzzy feelings
that our for
loop is bug-free and reviewer-friendly.
Note: This is just one (robust) way to write this loop; there are many others.
While our fix doesn’t change many bytes of source code, it touches on a number of ideas:
container.size()
yields an unsigned type.