Originally posted as Fast TotW #52 on September 30, 2021
Updated 2023-09-30
Quicklink: abseil.io/fast/52
Flags, options, and other mechanisms to override default behaviors are useful during a migration or as a short-term way to address an unusual need. In the long term they go stale (no longer providing real benefit to users), are almost always haunted (in the haunted-graveyard sense), and impede centralized consistency and optimization efforts. In this episode, we discuss the tradeoffs in technical debt and optimization velocity that come with adding configurability.
When developing a new feature, it’s straightforward and often recommended to guard it behind a flag. This approach of using feature flags makes it possible to decouple pushing the code changes to production from turning on a new feature, which might have undiscovered correctness bugs or different resource requirements.
For a commonly-used library, flags also allow early opt-ins from users. When the default is changed, the flag also provides an escape hatch to revert to the old behavior.
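The mechanics, using Abseil's flags library, might look something like the following sketch. The flag and function names here are hypothetical; only `ABSL_FLAG` and `absl::GetFlag` are real Abseil APIs.

```c++
#include "absl/flags/flag.h"

// Hypothetical feature entry points, assumed for illustration only.
void ServeFromNewCache();
void ServeFromOldPath();

ABSL_FLAG(bool, use_new_widget_cache, false,
          "If true, serve widgets from the experimental cache.");

void ServeWidget() {
  // The flag decouples deploying this code from enabling the new behavior.
  if (absl::GetFlag(FLAGS_use_new_widget_cache)) {
    ServeFromNewCache();  // New path: guarded until validated in production.
  } else {
    ServeFromOldPath();   // Old path: the escape hatch if problems surface.
  }
}
```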
For example, this was employed successfully for the rollout of TCMalloc’s Huge Page Aware Allocator optimization: many applications opted-in early, but even with extensive testing, a few applications saw changes in their resource requirements. These could be opted-out while deeper investigation occurred without rolling back the efficiency gains seen by most other users of TCMalloc.
These experiences suggest flags are an unalloyed good in theory, but practice is wholly different. Whether a flag turns out to be good depends on what percentage of users will actually use the feature, and whether they use it correctly.

The units for many flags are entirely opaque, and flags often have second- or third-order effects that are not immediately intuitive.
In his 2021 CppCon talk, Titus Winters gives a real-world example of this phenomenon: the "popcorn" button on a microwave should not be used for microwave popcorn, because the button's settings do not match what popcorn actually requires.
Moving to Google's C++ codebase: SwissMap, Abseil's family of high-performance hashtables, does not provide a functional implementation of `max_load_factor`. The low utility of `max_load_factor` was uncovered during the migration to SwissMap. Even worse, in many of the situations where `max_load_factor` was set, it was set incorrectly.
Even when the role of `max_load_factor` was correctly understood, its value was often poorly chosen for the intended goal. While `max_load_factor(0.25)` might convey an intent to "trade RAM for speed," such a setting can make CPU performance worse while simultaneously using more RAM, defeating the intent of its user.
In other situations, different implementations can be API-compatible, yet their tuned behaviors do not transfer effectively from one to the other. Open-addressing hashtables typically have load factors below 1, while chained hashtables typically have load factors of 1 or more. Switching between these implementations causes the same `max_load_factor` setting to have a surprisingly different effect.
This experience led the SwissMap authors to make `max_load_factor` a no-op, providing it only for API compatibility.
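A small sketch of the contrast (the value type and the 0.25 setting are arbitrary): in a chained table the call takes effect, while Abseil documents the SwissMap setter overload as existing only for compatibility.

```c++
#include <string>
#include <unordered_map>

#include "absl/container/flat_hash_map.h"

void LoadFactorSketch() {
  // In a chained table, a low max_load_factor takes effect: the table keeps
  // roughly 4x more buckets than elements, trading memory for shorter chains.
  std::unordered_map<int, std::string> chained;
  chained.max_load_factor(0.25f);

  // SwissMap accepts the same call for API compatibility, but it is a no-op:
  // the open-addressing implementation manages its own load factor.
  absl::flat_hash_map<int, std::string> swiss;
  swiss.max_load_factor(0.25f);
}
```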
Tuning a configuration is another optimization that does not age well.
For flags defined in commonly used libraries, the defaults themselves have probably evolved: a feature was launched or an optimization landed. The nature of Google's production configuration languages often means that once a service has hard-coded a flag's value, that value takes precedence over the default; that, after all, was the whole reason for choosing a non-default value in the first place. But with the codebase evolving at a high rate, it's easy to overlook that the underlying infrastructure has improved and that the overriding value is now worse than the default.
The key action here is to use customized flags sparingly and to reconsider their use regularly. When designing new options, prefer good defaults, or make parameters self-tune where possible. Self-tuning may come in the form of adapting automatically to the workload rather than requiring careful tuning through flags.
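As a contrast to an exposed knob, a self-tuning parameter keeps the decision inside the library. The sketch below is illustrative only; the class and member names are hypothetical.

```c++
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch: instead of exposing a --batch_size flag, the library
// grows its batch size toward what the workload actually demands.
class AdaptiveBatcher {
 public:
  // Track the largest batch the workload has produced so far.
  void Observe(size_t items_seen) {
    high_water_mark_ = std::max(high_water_mark_, items_seen);
  }

  // Reserve capacity based on observed demand rather than a hard-coded value
  // that may drift out of date as the workload or the library evolves.
  std::vector<int> MakeBatch() const {
    std::vector<int> batch;
    batch.reserve(high_water_mark_);
    return batch;
  }

 private:
  size_t high_water_mark_ = 0;
};
```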
Titus Winters notes that “If 99% of your users understand an API’s behavior through the lens of the default setting, the 1% of users that change that setting are at risk: APIs built at a higher level have a good chance of assuming the default behavior, leaving your 1% semi-supported.”
Configurability can be a great short-term boon, but in the long term it is a double-edged sword. Options increase the state space that has to be considered with every future change, making it more difficult to reason about, test, and successfully land new features in production. Beyond just optimizing costs, this complexity also hampers achieving better business objectives: extra complexity that delays an improvement to product experiences is a non-obvious externality.
For example, TCMalloc has a number of tuning options and customization points, but ultimately, several optimizations came from sanding away extra configuration complexity. The rarely used malloc hooks API required careful structuring of TCMalloc's fast path so that users who didn't use hooks (most users) did not pay for their possible presence. In another case, removing the `sbrk` allocator allowed TCMalloc to structure its virtual address space carefully, enabling several enhancements.
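The general pattern for keeping a rarely used hook off the fast path looks something like the sketch below. This is illustrative only, not TCMalloc's actual code: users who never install a hook pay only a single, well-predicted branch.

```c++
#include <atomic>
#include <cstddef>

// Illustrative sketch: an allocation hook that most callers never install.
using AllocHook = void (*)(const void* ptr, size_t size);
std::atomic<AllocHook> g_alloc_hook{nullptr};

inline void MaybeInvokeAllocHook(const void* ptr, size_t size) {
  // Fast path for the common case: one relaxed load and a predictable branch
  // when no hook is registered.
  AllocHook hook = g_alloc_hook.load(std::memory_order_relaxed);
  if (hook != nullptr) {
    hook(ptr, size);  // Rarely taken: only users who opted into hooks pay.
  }
}
```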
While this discussion has largely focused on knobs and tunables, APIs and libraries have the same challenges.
An existing library, X, might be inadequate or insufficiently expressive, which can motivate building a "better" alternative, Y, along some dimensions. Realizing the benefit of Y depends on users both discovering Y and choosing correctly between X and Y, and, in a long-lived codebase, keeping that choice optimal over time.
For some uses, this strategy is infeasible. `my::super_fast_string` will probably never replace `std::string`, because the latter is so entrenched and the impedance mismatch of living in an independent string ecosystem exceeds the benefits. Multiple vocabulary types suffer from impedance mismatch: costly interconversions can overwhelm the overall benefits. The costs of migrating the world also need to be considered upfront. Without an active migration, we simply end up maintaining both.
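To make the interconversion cost concrete, here is a hypothetical sketch: `my::super_fast_string` is the made-up type from the text, assumed to own its own buffer, so every trip across a `std::string` boundary copies the data.

```c++
#include <string>

namespace my {
// Hypothetical stand-in for the super_fast_string mentioned above.
class super_fast_string {
 public:
  explicit super_fast_string(const std::string& s) : data_(s) {}  // copy in
  std::string ToStdString() const { return data_; }               // copy out
 private:
  std::string data_;
};
}  // namespace my

// The wider codebase still speaks std::string.
std::string Canonicalize(const std::string& s) { return s; }

std::string Process(const my::super_fast_string& fast) {
  // Every call at this boundary pays conversions that can swamp whatever the
  // "faster" string saves internally.
  return Canonicalize(fast.ToStdString());
}
```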
There are times when a new library or API is truly needed. SwissMap needed to break the stability guarantees provided by `std::unordered_map` on an instance-by-instance basis to avoid waiting for every problematic usage to be fixed. In that case, however, the performance benefits it provided were only realized through active migration.

Being able to aim for a complete migration eases maintenance and educational burdens as well. A compelling performance case, simplified to "just use SwissMap," avoids the need for painstaking benchmarking at every use, where the optimal choice could get out of date.
When adding new customization points, consider how they’ll evolve over the long-term.
Flags are a powerful tool for tuning and optimization, but the author of a customization point has far more context than any individual user for how to use it effectively. Choosing good defaults or making features self-tune is often better for the codebase as a whole.
For users, discoverability, let alone optimal selection, is challenging.