Tip of the Week #144: Heterogeneous Lookup in Associative Containers

Originally posted as TotW #144 on March 23, 2018

By Samuel Benzaquen

Updated 2020-04-06

Quicklink: abseil.io/tips/144

Introduction

Associative containers associate an element with a key. Inserting into or looking up an element from the container requires an equivalent key. In general, containers require the keys to be of a specific type, which can lead to inefficiencies at call sites that need to convert between near-equivalent types (like std::string and absl::string_view).

To avoid this unnecessary work, some containers provide heterogeneous lookup. This feature allows callers to pass keys of any type (as long as the user-specified comparator functor supports them). See std::map::find for an example of this feature in an STL container.

Transparent Functors

A transparent functor is one that accepts more than one particular type. It must publicize this fact by providing an is_transparent inner type. The actual definition of this inner type is not relevant as it is used merely as a tag. It is common to use a using declaration of is_transparent set to void.

When a container detects a transparent functor their lookup functions will forward the user specified value intact instead of converting it to key_type first (through implicit or explicit conversion).

Implicitly supporting heterogeneous lookup can be dangerous, as the relationship between values might not be maintained after conversions. For example, 1.0 < 1.1, but static_cast<int>(1.0) == static_cast<int>(1.1). Thus, using a double to look up a value in a std::set<int> could lead to incorrect results. These potential bugs are the reason this feature is opt-in.

Using Heterogeneous Lookup For Performance

One common reason for using heterogeneous lookup is performance. We could construct the key_type, but doing so requires non-trivial work that we would rather avoid. For example:

std::map<std::string, int> m = ...;
absl::string_view some_key = ...;
// Construct a temporary `std::string` to do the query.
// The allocation + copy + deallocation might dominate the find() call.
auto it = m.find(std::string(some_key));

Instead, we can use a transparent comparator like so:

struct StringCmp {
  using is_transparent = void;
  bool operator()(absl::string_view a, absl::string_view b) const {
    return a < b;
  }
};

std::map<std::string, int, StringCmp> m = ...;
absl::string_view some_key = ...;
// The comparator `StringCmp` will accept any type that is implicitly
// convertible to `absl::string_view` and says so by declaring the
// `is_transparent` tag.
// We can pass `some_key` to `find()` without converting it first to
// `std::string`. In this case, that avoids the unnecessary memory allocation
// required to construct the `std::string` instance.
auto it = m.find(some_key);

What Else Is It Good For?

Cases exist where it is impossible or inconvenient to create a valid key_type object just to do a lookup. In those cases, we might want to use an alternative type that is much easier to produce, but contains the necessary information for the lookup. For example:

struct ThreadCmp {
  using is_transparent = void;
  // Regular overload.
  bool operator()(const std::thread& a, const std::thread& b) const {
    return a.get_id() < b.get_id();
  }
  // Transparent overloads
  bool operator()(const std::thread& a, std::thread::id b) const {
    return a.get_id() < b;
  }
  bool operator()(std::thread::id a, const std::thread& b) const {
    return a < b.get_id();
  }
  bool operator()(std::thread::id a, std::thread::id b) const {
    return a < b;
  }
};

std::set<std::thread, ThreadCmp> threads = ...;
// Can't construct an instance of `std::thread` with the same id, just to do the lookup.
// But we can look up by id instead.
std::thread::id id = ...;
auto it = threads.find(id);

STL Container Support and Alternatives

Ordered containers (std::{map,set,multimap,multiset}) support heterogeneous lookup starting in C++14. As of C++17, unordered containers still do not support it. There are proposals for adding this feature, but none have been accepted yet.

The new family of [Swiss Tables][swisstables], however, support heterogeneous lookup for both string-like types (std::string, absl::string_view, etc.) and smart pointers (T*,std::shared_ptr, std::unique_ptr). They require both the hasher and the equality function to be tagged as transparent. All other key types require explicit opt-in from the user.

The [B-Tree][btree] containers (absl::btree_{set,map,multiset,multimap}) also support heterogeneous lookup.

[swisstables]: https://abseil.io/docs/cpp/guides/container [btree]: go/btree


Subscribe to the Abseil Blog