absl::Hash
By Evan Brown, Engineer
The Abseil container library now includes B-tree containers that generally conform to the STL sorted container APIs:
absl::btree_set
(meant to replace usage of std::set
)absl::btree_map
(meant to replace usage of std::map
)absl::btree_multiset
(meant to replace usage of std::multiset
)absl::btree_multimap
(meant to replace usage of std::multimap
)You use a B-tree container just as you would an STL ordered container:
#include "absl/container/btree_map.h"
absl::btree_map<int, std::string> ducks =
{{2, "dewey"}, {1, "huey"}, {3, "louie"},};
// Prints "huey, dewey, louie "
for (const auto& n : ducks) {
std::cout << n.second << ", ";
}
B-trees have a different implementation from STL std::map
containers, which
require binary trees commonly implemented as red-black trees.
While a red-black tree is limited to single-element nodes, with precisely two
children, a B-tree may contain multiple values per node (M), with each node
having (M+1) children. Having more values and children per node is more cache
friendly because nodes are generally allocated separately so accessing
additional nodes often results in cache misses.
For search, insertion, and deletion, the number of nodes that need to be accessed in a sorted tree is proportional to the height of the tree. In balanced trees, this height is ~logC(N), where C is the number of children per node and N is the number of values in the container; because b-trees have more children per node than binary trees, their heights are lower, and searching is faster.
For iteration, it is most cache efficient to store all values contiguously. B-trees store values contiguously in each node so it’s also generally more efficient to iterate through a B-tree than a binary tree.
B-trees also use significantly less memory per value in the tree because overhead is per node, and there are fewer nodes per value in B-trees. There is also an optimization in Abseil B-tree in which leaf nodes don’t store child pointers. Since the vast majority of nodes are leaf nodes (because of the higher branching factor due to more children per non-leaf node), this ends up saving a significant amount of memory.
When values are inserted or removed from a B-tree, nodes can be split or merged and values can be moved within and between nodes (for the purpose of maintaining tree balance). This means that when values are inserted or removed from a B-tree container, pointers and iterators to other values in the B-tree can be invalidated. Abseil B-trees therefore do not provide pointer stability or iterator stability - this is in contrast to STL sorted containers that do provide these guarantees.
B-trees are a good default choice for sorted containers, however, there are cases in which the STL alternatives may be more appropriate.
value_type
is large, fewer values can be stored per node so the
benefits of B-tree are lessened.value_type
is expensive to move, B-tree may become more expensive than
STL alternatives because B-tree needs to move values within and between nodes
to maintain balance, whereas binary trees can just move pointers instead.
std::array<int32_t, 16>
is an example of a value_type
for which STL sorted
containers currently outperform B-trees.For more information, consult the Abseil Container library documentation. Check out the B-tree Design Note for information on B-tree’s implementation.