Tip of the Week #49: Argument-Dependent Lookup

Originally posted as totw/49 on 2013-07-14

“…whatever disappearing trail of its legalistic argle-bargle one chooses to follow…” –Antonin Scalia, U.S. v Windsor dissenting opinion

Overview

A function call expression such as func(a,b,c), in which the function is named without the :: scope operator, is called unqualified. When C++ code refers to a function by an unqualified name, the compiler performs a search for a matching function declaration. What is surprising to some people (and different from other languages) is that in addition to the caller’s lexical scope, the set of search scopes is augmented by namespaces associated with the function argument types. This additional lookup is called Argument-Dependent Lookup (ADL). It’s definitely happening in your code, so you’ll be much better off with a basic understanding of how it works.

Name Lookup Basics

A function call must be mapped to a single function definition by the compiler. This matching is done in two independent serial processing stages. First, name lookup applies some scope searching rules to produce a set of overloads matching the name of the function. Then overload resolution takes those overloads produced by that name lookup and tries to choose a best match for the arguments given at the call site. Keep this distinction in mind. Name lookup comes first, and it doesn’t try to make any determination as to whether a function is a good match or not. It doesn’t even consider the argument count. It just searches scopes for a function name. Overload resolution is a complex topic in its own right, but it’s not our focus right now. Just know that it’s a separate processing stage that gets its inputs from name lookup.

When an unqualified function call is encountered, several independent search sequences can occur for that function name, each attempting to match the name to a set of overloads. The most obvious search is the outward search starting from the lexical scope of the call site:

namespace b {
void func();
namespace internal {
void test() { func(); } // ok: finds b::func().
} // b::internal
} // b

This name lookup has nothing to do with ADL yet (func() has no arguments). It is simply a search outward from the site of the function call, proceeding outward from local function scope (if applicable), to class scope, enclosing class scope, and base classes (if applicable), then to namespace scope, and out into enclosing namespaces, and finally the global :: namespace.

As name lookup progresses through a sequence of increasingly widening scopes, the process stops as soon as any function with the target name is found, whether or not that function’s arguments are compatible with the arguments supplied by the call site. When a scope is encountered containing at least one function declaration with the target name, the overloads in that scope become the result of that name lookup.

This is illustrated in the example below:

namespace b {
void func(const string&);  // b::func
namespace internal {
void func(int);  // b::internal::func
namespace deep {
void test() {
  string s("hello");
  func(s);  // error: finds only b::internal::func(int).
}
}  // b::internal::deep
}  // b::internal
}  // b

It’s tempting but incorrect to think a func(s) expression will overlook the obviously bad match of b::internal::func(int), and continue looking to the next outward enclosing scope to find b::func(const string&). However, name lookup doesn’t consider argument types. It finds something called func and stops in b::internal, leaving the evaluation of “obviously bad” to the overload resolution phase. The b::func(const string&) function is never even seen by overload resolution.

An important implication of the scoped search order is that overloads in a scope appearing earlier in the search order will hide overloads from later scopes.

Argument-Dependent Lookup (ADL)

If a function call passes arguments, a few more parallel name lookups are launched. These extra lookups consider each associated namespace of each of the function call’s arguments. Unlike lexical scope name lookup, these argument- dependent lookups do not proceed to enclosing scopes.

Results of lexical scope name lookup and all ADLs are merged together, to form the final set of function overloads.

The Simple Case

Consider the following code:

namespace aspace {
struct A {};
void func(const A&);  // found by ADL name lookup on 'a'.
}  // namespace aspace

namespace bspace {
void func(int);  // found by lexical scope name lookup
void test() {
  aspace::A a;
  func(a);  // aspace::func(const aspace::A&)
}
}  // namespace bspace

Two name lookups are launched to resolve the call to func(a). The lexical scope name lookup starts in the local function scope of bspace::test(). It finds no func there and proceeds to the scope of namespace bspace, in which it finds func(int) and stops. The other name lookup, which is due to ADL, starts in the namespace associated with the argument a. In this case, that’s only namespace aspace. That lookup finds aspace::func(const aspace::A&) and stops. Overload resolution therefore receives two candidates. These are ‘bspace::func(int)’ from lexical name lookup, and ‘aspace::func(const aspace::A&)’ from the single ADL lookup. In overload resolution, the func(a) call resolves to aspace::func(const aspace::A&). The bspace::func(int) overload is not a good match for the argument type and so it is rejected by overload resolution.

The lexical name search and each of the additional ADL-triggered name searches can be considered to occur in parallel, with each returning a set of candidate function overloads. The results of all such searches are thrown in a bag and they compete via overload resolution to determine the best match. If there’s a tie for best match, the compiler issues an ambiguity error; “There can be only one.” If no overload is a good match, that’s an error too. So more precisely, “there must be exactly one”, which doesn’t sound as cool in a movie trailer.

Type-Associated Namespaces

The previous example was the simple case, but a more sophisticated type can have many namespaces associated with it. The set of namespaces associated with a type includes any namespace of any type that appears as a part of the argument type’s full name, including its template parameter types. It also includes the namespaces of direct and indirect base classes. For example, a single argument that expands to the type a::A<b::B, c::internal::C*> will produce searches beginning in the a, b and c::internal namespaces (and any other namespaces associated with the constituent types a::A, b::B, or c::internal::C), each looking for the called function name. The following example shows a few of these effects:

namespace aspace {
struct A {};
template <typename T> struct AGeneric {};
void func(const A&);
template <typename T> void find_me(const T&);
}  // namespace aspace

namespace bspace {
typedef aspace::A AliasForA;
struct B : aspace::A {};
template <typename T> struct BGeneric {};
void test() {
  // ok: base class namespace searched.
  func(B());
  // ok: template parameter namespace searched.
  find_me(BGeneric<aspace::A>());
  // ok: template namespace searched.
  find_me(aspace::AGeneric<int>());
}
}  // namespace bspace

Tips

With the fundamental name lookup mechanism fresh in your mind, consider the following tips which may help you when you are working with real C++ code.

Type Aliases

Sometimes determining the set of namespaces associated with a type will take a bit of detective work. typedef and using declarations can introduce aliases for a type. In those cases the aliases are fully resolved and expanded to their source types before the list of namespaces to search are chosen. This is one way in which typedef and using declarations can be a bit misleading, because they can lead you to make incorrect predictions about which namespaces will be searched by ADL. This is demonstrated below:

namespace cspace {
// ok: note that this searches aspace, not bspace.
void test() {
  func(bspace::AliasForA());
}
}  // namespace cspace

Caveat Iterator

Be careful with iterators. You don’t really know with what namespaces they are associated, so don’t rely on ADL for resolving function calls involving iterators. They might just be pointers to the elements, or they might be in some namespace private to the implementation that has nothing to do with the container’s namespace.

namespace d {
int test() {
  std::vector<int> vec(a);
  // maybe this compiles, maybe not!
  return count(vec.begin(), vec.end(), 0);
}
}  // namespace d

The above code has a dependency on whether std::vector<int>::iterator is int* (which is possible) or some type in a namespace that has a count overload (like std::count()). It’s possible that this will work on some platforms and not others, or that it will work in debug builds with instrumented iterators, but not in optimized builds. It’s better to just qualify the function name. If you want to call std::count(), spell it that way.

Overloaded Operators

An operator (e.g. + or <<) can be thought of as a kind of function name, e.g. operator+(a,b) or operator<<(a,b), and are also unqualified. One of the most important uses of ADL is the search for operator<< used during logging. Usually we see something like std::cout << obj; for some obj, let’s say of type O::Obj. This statement is like an unqualified function call of the form operator<<(std::ostream&, const O::Obj&), which will find overloads in the std namespace from the std::ostream parameter, the O namespace from the O::Obj parameter, and of course any overloads picked up from the lexical scope search from the call site.

It’s important to place such operators in the same namespace as the user-defined type they’re meant to operate upon: in this case within namespace O. If the operator<< is placed in an outer namespace like :: (the global namespace), that operator will work for a while until someone quite innocently places an unrelated operator<< in namespace ‘O’ for some other type. It takes a bit of discipline but saves a lot of confusion later to follow the simple rule of defining all operators and other associated nonmember functions next to the type’s definition in the same namespace.

Fundamental Types

Note that the fundamental types (e.g. int, double, etc) are not associated with the global namespace. They are associated with no namespace. They do not contribute any namespaces to ADL. Pointer and array types are associated with their pointee or element types.

Refactoring Gotchas

Refactorings that change the types of arguments to an unqualified function call can affect which, if any, overloads are considered. Just moving a type into a namespace and leaving behind a typedef in the old namespace for compatibility doesn’t help, and really just makes the problem harder to diagnose. Be careful when moving types to new namespaces.

Similarly, moving a function to a new namespace and leaving behind a using declaration might mean that unqualified calls won’t find it anymore. Sadly, they might still compile by finding a different function you didn’t intend them to find. Be careful when moving functions to new namespaces.

Final Thought

Relatively few programmers understand the exact rules and corner cases involved with function lookups. The language spec contains 13 pages of rules about what exactly goes into a name search, including special cases, details about friend functions, and enclosing class scopes to keep your head spinning for years. Despite all this complexity, if you keep the basic idea of parallel name searches in mind, you’ll be on solid footing for understanding how your function calls and operators are resolving. You will now be able to see how seemingly remote declarations end up being chosen when you invoke functions or operators. And you’ll be a little better able to diagnose puzzling build errors like ambiguities or name-hiding effects when they happen.