Originally posted as TotW #234 on August 29, 2024
By Steve Wang
Updated 2024-09-30
Quicklink: abseil.io/tips/234
Many programming languages, such as Java and Python, always access objects via references, and functions that accept objects get their own reference to the caller’s object. Others, such as C and Go, allow you to explicitly specify pointers to objects. C++ further allows you to choose whether to pass by value, giving the called function a copy of the argument, or to pass by reference, giving the called function access to the caller’s object. This tip will illustrate the various ways input-only function parameters are passed in C++ and provide recommendations and caveats.
When we talk about passing by value, we explicitly mean that the language ensures that the scope of the function call has an exclusive copy of its argument[^elision]. Reassigning a new value to this variable does not mutate the corresponding object in the caller’s scope. However, invoking the argument’s methods may still mutate its underlying state.
Meanwhile, when we talk about passing by reference, we effectively bring the object from the caller’s scope into the current function’s scope, and reassignment will mutate the underlying object.
Passing by pointer has some similarities to passing by reference, yet is technically a special case of pass-by-value, as the pointer itself is a value that corresponds to the address of the underlying object (or a null pointer, which refers to no object at all).
Consider the following:
```cpp
void AddOneToValue(int x) { ++x; }
void AddOneToReference(int& x) { ++x; }
// Here, the pointer points at a "pointee"; we're adding one to the
// pointed-at object.
void AddOneToPointee(int* x) { ++*x; }
...
int x = 5;
AddOneToValue(x);      // x is still 5.
AddOneToReference(x);  // x is now 6.
AddOneToPointee(&x);   // x is now 7.
```
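To make the "a pointer is itself passed by value" point concrete, here is a minimal sketch. The function name `Reseat` is hypothetical and not part of the original tip; it assumes `other` is non-null.

```cpp
// Hypothetical helper: `p` is a copy of the caller's pointer.
// Precondition (assumed for this sketch): `other` is not null.
void Reseat(int* p, int* other) {
  p = other;  // Changes only this function's copy of the pointer.
  *p += 1;    // Mutates `*other`, not the object the caller's pointer refers to.
}
```

If a caller holds `int* ptr = &a` and calls `Reseat(ptr, &b)`, then `ptr` still points at `a` afterwards and `a` is unchanged, while `b` has been incremented.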
As a result, when writing functions in C++, the language makes us consider how to pass parameters – should we pass by value, by pointer, by reference (if so, which kind)?
The astute reader might wonder what the issues are with always passing by reference. First off, having unnecessary `const T&` (e.g., `Add(const int& a, const int& b)`) adds visual clutter.
Second, in C++, as mentioned above, a reference is largely syntactic sugar for a pointer[^const_ref], with the associated overhead of a memory lookup when we want to use it unless the compiler is able to optimize that away. By passing small types by value, we can pass them in registers instead of needing to store them on the stack.
```cpp
// Passing a small value by value, the compiler can avoid a stack allocation and
// pass it in a register.
int foo = 5;
Bar(foo);

// However, passing a small value by reference (or, as here, by pointer)
// requires `foo` to be copied ("spilled") to the stack since you can't take
// the address of a register.
int foo = 5;
Bar(&foo);
```
Of course, if the variable is already on the stack or heap (e.g., it’s part of an array) then this concern is irrelevant, but we should still prefer to pass by value to avoid some cache misses and memory pressure if it’s already loaded in a register.
Regardless, references can further introduce concerns regarding aliasing[^aliasing] – since the function does not have an exclusive copy of the object, we have no guarantees that the object will remain unchanged throughout the lifetime of the function, even if we have a reference-to-const (which is only a promise that we will not mutate it through that particular parameter).
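The aliasing concern can be sketched with two hypothetical functions (the names are illustrative and do not come from the original tip). They differ only in how `delta` is passed:

```cpp
// `delta` may alias `total`, so its value can change between the two
// additions even though it is a reference-to-const.
void AddTwiceByReference(const int& delta, int& total) {
  total += delta;
  total += delta;
}

// `delta` is an exclusive copy, so it cannot change during the call.
void AddTwiceByValue(int delta, int& total) {
  total += delta;
  total += delta;
}
```

Starting from `int x = 1`, calling `AddTwiceByReference(x, x)` leaves `x` at 4, because the first addition also changes `delta`. Starting again from 1, `AddTwiceByValue(x, x)` leaves `x` at 3.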
On the opposite end of the spectrum, one might wonder why we don’t just pass all input parameters by value.
In C++, if you pass a variable by value, depending on how the function is called the variable’s value may be copied or moved (or neither)[^names]. On the other hand, passing by reference (or pointer) allows you to refer to an existing object and therefore avoid a copy entirely. So, in general, the larger an object, the more you should prefer passing by reference.
Passing by value can have benefits as well as drawbacks from the perspective of memory safety. On one hand, if you have the only copy of an object, then you don’t have to worry about other threads stomping over its state. On the other hand, if you retain a reference to this object and use it after the object goes out of scope, then you have a use-after-free bug (just as for any other local variable).
All of these rules apply equally. If none of them apply, a safe option is to pass by reference-to-const for required parameters, and by pointer for optional ones.
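As a sketch of that fallback, consider a type that is neither small nor cheap to copy. The `Inventory` struct and `TotalItems` function are hypothetical names invented for this illustration.

```cpp
#include <map>
#include <string>

// Hypothetical type with a non-trivial copy constructor.
struct Inventory {
  std::map<std::string, int> counts;
};

// `current` is required, so it is passed by reference-to-const.
// `pending` is optional, so it is passed by pointer and may be null.
int TotalItems(const Inventory& current, const Inventory* pending) {
  int total = 0;
  for (const auto& entry : current.counts) {
    total += entry.second;
  }
  if (pending != nullptr) {
    for (const auto& entry : pending->counts) {
      total += entry.second;
    }
  }
  return total;
}
```

Callers without a pending inventory write `TotalItems(current, nullptr)`; callers with one write `TotalItems(current, &pending)`. Neither call copies a map.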
Passing by value can be more efficient in some cases (when the relevant types are small enough that moving or copying them is more efficient than working via pointers), and is helpful for conveying ownership (typically when the called function wants to own the value, to move from it, or to otherwise modify its own copy).
Specifically, the types listed below should usually be passed by value:
Some additional types can be efficiently passed by value, as an optimization:
Types that provide an efficient move constructor, only if the called function needs its own copy of the value. Examples include `std::vector<T>`, `std::string`, `absl::flat_hash_map<K, V>` and other containers that don’t store their contents inline[^proto_move].

In these cases, you should pass by value, and `std::move` at the callsite when needed (or pass in a temporary which is subject to mandatory copy elision per Tip #166). See Tip #117 for supplemental reading on copy elision and pass-by-value.
Passing by value is especially common in a constructor that stores one of these types in a member variable.
```cpp
class Foo {
 public:
  // Here, we pass bar by reference and copy it into bar_.
  Foo(const std::vector<int>& bar) : bar_(bar) {}

  // But, we can instead use std::vector's move constructor to avoid
  // the expensive copy entirely, in some cases.
  Foo(std::vector<int> bar) : bar_(std::move(bar)) {}

 private:
  std::vector<int> bar_;
};
```
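The sketch below shows how call sites might interact with the by-value constructor. It is a variant of `Foo` that keeps only that constructor; the `bar()` accessor is a hypothetical addition so the result can be inspected.

```cpp
#include <utility>
#include <vector>

class Foo {
 public:
  // Takes its own copy of `bar`, then moves that copy into the member.
  explicit Foo(std::vector<int> bar) : bar_(std::move(bar)) {}

  // Hypothetical accessor, added only for this illustration.
  const std::vector<int>& bar() const { return bar_; }

 private:
  std::vector<int> bar_;
};

// Passing a named vector copies it once; the caller keeps its elements.
inline Foo MakeByCopy(const std::vector<int>& values) { return Foo(values); }

// Passing with std::move transfers the buffer; no elements are copied.
inline Foo MakeByMove(std::vector<int>&& values) {
  return Foo(std::move(values));
}
```

With `std::vector<int> values = {1, 2, 3};`, writing `Foo copied(values);` leaves `values` intact and costs one element-wise copy, while `Foo moved(std::move(values));` hands the existing heap buffer to `bar_` and leaves `values` in a valid but unspecified state.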
`T`, where `sizeof(T) <= 16`[^abi] and `T` is either a scalar type such as an integer or a pointer type, or a class[^calls] such that:

For types that your team does not own[^hyrum], you should only rely on this behavior if they explicitly document that they should be passed by value, such as `spanner::Database` and `absl::Duration`.
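One way to check the size criterion at compile time is sketched below. The helper `IsSmallAndTriviallyCopyable` is hypothetical, and it uses `std::is_trivially_copyable_v` only as an approximation: as the footnote on classes explains, the ABI rule is similar to, but not identical to, trivial copyability. The sizes in the comments assume a typical 64-bit platform.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: approximates "cheap to pass by value" as
// "at most 16 bytes and trivially copyable". This is not the exact ABI rule.
template <typename T>
constexpr bool IsSmallAndTriviallyCopyable() {
  return sizeof(T) <= 16 && std::is_trivially_copyable_v<T>;
}

struct Point {
  double x;
  double y;
};  // Two 8-byte doubles, trivially copyable.

struct Matrix3 {
  double values[9];
};  // Trivially copyable, but larger than 16 bytes.

static_assert(IsSmallAndTriviallyCopyable<int>());
static_assert(IsSmallAndTriviallyCopyable<Point>());
static_assert(!IsSmallAndTriviallyCopyable<Matrix3>());
static_assert(!IsSmallAndTriviallyCopyable<std::string>());
```

A check like this documents the assumption next to the type, so that adding a field that pushes the type past the threshold fails to compile rather than silently changing how it is passed.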
`std::optional<T>`, where passing `T` by value applies.

`std::optional<T>` adds some size overhead compared to `T`, which further limits the types that can be passed by value efficiently. So, for instance, `std::optional<std::span<U>>` and `std::optional<absl::string_view>` are too big, as each of these wrapped types is 16 bytes before accounting for `std::optional`’s overhead.

If `sizeof(std::optional<T>) > 16`, or if `T` has a nontrivial copy constructor, then prefer passing `absl::Nullable<const T*>` (Tip #163), and use a null pointer to represent the case that would otherwise be captured by `std::nullopt`.

Note that Tip #163 applies here – if all callers will always have a `std::optional<T>`, then you may pass by `const&`.

Do not use this idiom with smart pointers or other types that have a representation for “no value”. For instance, do not write `std::optional<std::unique_ptr<U>>`; instead, prefer to use `std::unique_ptr<U>` directly, and pass a null pointer to represent “no value”.
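The two cases above can be sketched side by side. All names here (`ScaleOrDefault`, `Report`, `RowCount`) are hypothetical, and a plain `const T*` stands in for `absl::Nullable<const T*>` so that the sketch depends only on the standard library.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Small `T`: std::optional<int> is itself small, so pass it by value.
int ScaleOrDefault(int value, std::optional<int> factor) {
  return value * factor.value_or(1);
}

// Hypothetical type with a non-trivial copy constructor.
struct Report {
  std::string title;
  std::vector<int> rows;
};

// Larger `T`: pass a pointer-to-const and let null mean "no value".
std::size_t RowCount(const Report* report) {
  return report == nullptr ? 0 : report->rows.size();
}
```

Callers write `ScaleOrDefault(3, std::nullopt)` or `ScaleOrDefault(3, 4)` in the first case, and `RowCount(nullptr)` or `RowCount(&report)` in the second. No `Report` is copied in either call.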
Note: If an argument `x` in a call `f(x)` is required to outlive the function call, do not pass it by reference.
The types listed below should usually be passed by reference (for required parameters) or by pointer (for optional parameters).
Smart pointers (e.g., `std::unique_ptr<T>`) where you don’t want to transfer ownership: dereference the smart pointer to pass `const T&` if the pointed-at value is always known (and required) to be non-null; else pass `absl::Nullable<const T*>` (Tip #188).

In cases of shared ownership[^shared_ptr] where you only sometimes want to take ownership, you may want to pass a reference to the `std::shared_ptr` to avoid the slight overhead of updating reference counts.
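A minimal sketch of the smart-pointer guidance follows. The `Counter` and `Cache` types and all function names are hypothetical, and a plain `const Counter*` stands in for `absl::Nullable<const Counter*>`.

```cpp
#include <memory>

struct Counter {
  int value = 0;
};

// Non-owning and required: the caller dereferences its smart pointer.
int ReadValue(const Counter& counter) { return counter.value; }

// Non-owning and optional: null means "no counter".
int ReadValueOrZero(const Counter* counter) {
  return counter == nullptr ? 0 : counter->value;
}

// Sometimes takes shared ownership: accepting the std::shared_ptr by
// reference-to-const means the reference count changes only when we keep it.
class Cache {
 public:
  void MaybeKeep(const std::shared_ptr<Counter>& counter, bool keep) {
    if (keep) {
      kept_ = counter;
    }
  }

 private:
  std::shared_ptr<Counter> kept_;
};
```

A caller holding `std::unique_ptr<Counter> owned` would write `ReadValue(*owned)` or `ReadValueOrZero(owned.get())`, keeping ownership in both cases.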
Containers that store their contents inline, e.g., `std::array<T, N>` and `absl::InlinedVector<T, N>`.

While `std::array<T, N>` can be efficient to pass by value if `sizeof(T) * N <= 16`, `absl::InlinedVector<T, N>` has a non-trivial copy constructor and thus will never be passed in a register.
Types with non-trivial copy constructors, where you don’t intend to use move semantics.
Protocol buffers.
You might think that the `Duration` type defined by

```proto
edition = "2023";

message Duration {
  int64 seconds = 1;
  int32 nanos = 2;
}
```

only contains an `int64` (8 bytes) and an `int32` (4 bytes) and is therefore 12 bytes (padded out to 16 bytes), but that’s not correct because protobufs may have a vtable pointer (8 bytes) or other metadata. Additionally, you shouldn’t pass protos by value by default (even if they don’t have many fields) because they do not promise that they can be trivially copied (and in practice they usually cannot).
For some types, a corresponding view type – a type that gives read-only access to the underlying data, and might support various different underlying types – can be a good way to accept inputs to a function that does not need its own copy of those inputs.
For functions accepting a `std::string`, `absl::string_view`, or `const char*`, defining the parameter as an `absl::string_view` is efficient and supports all of these input types (see Tip #179).

For functions accepting a `std::vector<T>`, defining the parameter as an `absl::Span<const T>` is more efficient and more flexible (see Tip #93), though using `const std::vector<T>&` can be a reasonable choice if constraints make `absl::Span` impractical.
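As a sketch of the string case, the hypothetical function below takes a view by value. It uses `std::string_view` as a stand-in for `absl::string_view` so that it depends only on the standard library.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical function: reads `text` without needing its own copy.
std::size_t CountSpaces(std::string_view text) {
  std::size_t count = 0;
  for (char c : text) {
    if (c == ' ') {
      ++count;
    }
  }
  return count;
}
```

The same function accepts a `std::string`, a string literal, a `const char*`, or an existing view, and none of these calls copies the character data.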
For functions accepting a callable, define the parameter either as `const Fn&` (where `Fn` is a template parameter) or as a type-erased callable such as `absl::FunctionRef` (see Tip #145).

While passing function parameters by `const&` is a good default choice, there are plenty of cases where it’s not the best option. The guidelines in this tip can help to weigh the relevant factors and design safe and efficient APIs.
We want to emphasize that they are just guidelines, though, and if you have good reason to deviate from these (e.g., benchmarking or profiling identifies potential performance gains), we encourage you to do so (and to document your rationale for the next reader).
[^abi]: In typical Google production environments, namely x86-64 Linux. See section 3.2.3 of the ELF x86-64 ABI spec. On Windows, only types that are 8 bytes or fewer are passed in registers.

[^aliasing]: In best-case scenarios, pointer aliasing prevents the compiler from making certain optimizations. In worst-case scenarios, pointer aliasing can result in violated preconditions, logic bugs, and buffer overflows.

[^calls]: Formally, this class must not be “non-trivial for the purpose of calls”. This is very similar, but not quite identical, to a trivially copyable class in C++. See the ABI specification for the formal definition of this term.

[^const_ref]: const-references are somewhat more complex – for one, they can bind to temporaries. Further, references cannot be null, so we generally recommend passing references instead of pointers for required input parameters that don’t need to outlive the function call.

[^elision]: This does not necessarily mean that the function creates a separate copy of its argument, as copy elision may have taken place.

[^hyrum]: While it can be more efficient to pass small types by value, you may accidentally make it harder to add new fields to those types or otherwise change the internal representation (see go/hyrums-law), since you’re adding an implicit dependency on the size of the type, as well as on the constructors and destructors that it defines.

[^names]: To a first approximation, this results in a copy when you pass in a named object (such as a non-reference stack variable or data member). Tip #166 covers this in more detail.

[^proto_move]: Protocol buffers also define a move constructor that is usually comparable to a shallow copy – the exception is if you’re moving between two messages that live on different arenas, or between a heap-allocated message and an arena-allocated message, in which case it is comparable to a deep copy. go/proto-cpp-arena-allocation#message-class-methods has more details.

[^shared_ptr]: As stated in the style guide, shared ownership should only be used with good reason, and not as a way to avoid thinking about object lifetimes.