Tip of the Week #144: Heterogeneous Lookup in Associative Containers

Originally posted as TotW #144 on March 23, 2018

By Samuel Benzaquen

Updated 2020-04-06

Quicklink: abseil.io/tips/144

Introduction

Associative containers associate an element with a key. Inserting into or looking up an element from the container requires an equivalent key. In general, containers require the keys to be of a specific type, which can lead to inefficiencies at call sites that need to convert between near-equivalent types (like std::string and absl::string_view).

To avoid this unnecessary work, some containers provide heterogeneous lookup. This feature allows callers to pass keys of any type (as long as the user-specified comparator functor supports them). See std::map::find for an example of this feature in an STL container.

Transparent Functors

A transparent functor is one that accepts more than one particular type. It must publicize this fact by providing an is_transparent inner type. The actual definition of this inner type is not relevant as it is used merely as a tag. It is common to use a using declaration of is_transparent set to void.

When a container detects a transparent functor their lookup functions will forward the user specified value intact instead of converting it to key_type first (through implicit or explicit conversion).

Implicitly supporting heterogeneous lookup can be dangerous, as the relationship between values might not be maintained after conversions. For example, 1.0 < 1.1, but static_cast<int>(1.0) == static_cast<int>(1.1). Thus, using a double to look up a value in a std::set<int> could lead to incorrect results. These potential bugs are the reason this feature is opt-in.

Using Heterogeneous Lookup For Performance

One common reason for using heterogeneous lookup is performance. We could construct the key_type, but doing so requires non-trivial work that we would rather avoid. For example:

std::map<std::string, int> m = ...;
absl::string_view some_key = ...;
// Construct a temporary `std::string` to do the query.
// The allocation + copy + deallocation might dominate the find() call.
auto it = m.find(std::string(some_key));

Instead, we can use a transparent comparator like so:

struct StringCmp {
  using is_transparent = void;
  bool operator()(absl::string_view a, absl::string_view b) const {
    return a < b;
  }
};

std::map<std::string, int, StringCmp> m = ...;
absl::string_view some_key = ...;
// The comparator `StringCmp` will accept any type that is implicitly
// convertible to `absl::string_view` and says so by declaring the
// `is_transparent` tag.
// We can pass `some_key` to `find()` without converting it first to
// `std::string`. In this case, that avoids the unnecessary memory allocation
// required to construct the `std::string` instance.
auto it = m.find(some_key);

What Else Is It Good For?

Cases exist where it is impossible or inconvenient to create a valid key_type object just to do a lookup. In those cases, we might want to use an alternative type that is much easier to produce, but contains the necessary information for the lookup. For example:

struct ThreadCmp {
  using is_transparent = void;
  // Regular overload.
  bool operator()(const std::thread& a, const std::thread& b) const {
    return a.get_id() < b.get_id();
  }
  // Transparent overloads
  bool operator()(const std::thread& a, std::thread::id b) const {
    return a.get_id() < b;
  }
  bool operator()(std::thread::id a, const std::thread& b) const {
    return a < b.get_id();
  }
  bool operator()(std::thread::id a, std::thread::id b) const {
    return a < b;
  }
};

std::set<std::thread, ThreadCmp> threads = ...;
// Can't construct an instance of `std::thread` with the same id, just to do the lookup.
// But we can look up by id instead.
std::thread::id id = ...;
auto it = threads.find(id);

STL Container Support and Alternatives

Ordered containers (std::{map,set,multimap,multiset}) support heterogeneous lookup starting in C++14. As of C++17, unordered containers still do not support it. There are proposals for adding this feature, but none have been accepted yet.

The new family of Swiss Tables, however, support heterogeneous lookup for both string-like types (std::string, absl::string_view, etc.) and smart pointers (T*,std::shared_ptr, std::unique_ptr). They require both the hasher and the equality function to be tagged as transparent. All other key types require explicit opt-in from the user.

The B-Tree containers (absl::btree_{set,map,multiset,multimap}) also support heterogeneous lookup.


Subscribe to the Abseil Blog