The Abseil Random library provides functions and utilities for generating
pseudorandom data. This library is designed to be used as a replacement for the
random number generators and distribution functions within the
<random>
library, while
maintaining compatibility with that library.
The Abseil Random library provides several advantages over <random>
:
Easy construction of well-seeded generators
Abseil’s bit generators require no constructor arguments to be seeded
properly. Providing the initial state for a random value generator (ie.
“seeding”) is a nontrivial task which often requires knowledge of the
underlying bit-generation algorithm. When seeded, Abseil bit generators
exhibit Process Stability.
<random>
by
representing distributions as functions rather than objects, while still
decoupling bit generation from distribution sampling.To get started, add the following #include
, and analogous dependency within
your build file.
#include "absl/random/random.h"
The Abseil Random library provides a variety of distribution function
templates, which produce randomly sampled values from particular
distributions. They obtain their randomness from a user-supplied uniform
random bit generator (URBG, or bit generator for short), which should be
treated as an opaque object unless you’re implementing a distribution function.
absl::BitGen
is the preferred bit generator for most use cases.
absl::BitGen bitgen;
size_t index = absl::Uniform(bitgen, 0u, elems.size());
double fraction = absl::Uniform(bitgen, 0, 1.0);
bool coin_flip = absl::Bernoulli(bitgen, 0.5);
Bit generators produce values with the function-call operator, but this interface should never be used directly in application code.
Properly sampling from a distribution can be surprisingly subtle; it requires knowledge of the underlying URBG algorithm, and the range of values that it produces. This range of values may or may not be the full space of values representable by the output data type. Getting these details wrong can result in biased sampling.
```c++ {.bad} // AVOID: Invoking a bit generator’s call operator directly. If bitgen() produces // values in the range [0,7], then this code will produce 1 and 2 twice as often // as other values. uint32_t die_roll = 1 + (bitgen() % 6);
```c++ {.good}
// BETTER: Use a distribution function instead:
uint32_t die_roll = absl::Uniform(absl::IntervalClosed, bitgen, 1, 6);
Always use the Abseil Random library’s distribution functions instead.
Avoid continuously re-instantiating bit generators. URBG instances have state which is designed to generate a sequence, and state initialization may involve a lot of CPU cycles. It’s better to reuse generator instances unless those generators will be called very infrequently.
```c++ {.bad} for (auto& elem : v_) { // AVOID: Creating a new absl::BitGen instance in a loop. absl::BitGen gen; elem = absl::Uniform(gen, 0, 1.0); }
```c++ {.good}
// BETTER: Reuse absl::Bitgen instances.
absl::BitGen gen;
for (auto& elem : v_) {
elem = absl::Uniform(gen, 0, 1.0);
}
absl::Uniform()
The most common use case for a random value library is also the most simple: “Give me a number between A and B”.
int digit = absl::Uniform(gen, 0, 10); // Samples an integer from [0, 10)
or perhaps:
double less_than_1 = absl::Uniform(gen, 0, 1.0); // Samples from [0.0, 1.0)
or if we want to explicitly specify the desired numerical type, then perhaps:
// Casts arguments to the specified type, before sampling.
auto index = absl::Uniform<size_t>(gen, 0, v.size());
In the absence of an explicitly specified return type, the absl::Uniform()
function will use the more general of the two endpoints’ data types. Note that
if neither of these types can represent the other without loss of precision,
then the function call will not compile.
size_t index = absl::Uniform(gen, 0u, v.size()); // Both are unsigned types
auto index = absl::Uniform<size_t>(gen, 0, v.size()); // Also fine
size_t index = absl::Uniform(gen, 0, v.size()); // Error: int vs size_t
absl::Uniform()
You might sometimes find that sampling from the half-open distribution [a, b)
isn’t a natural fit for your application. For such cases, we allow endpoint
semantics to be explicitly specified, by providing one of the following
identifiers as the first function call argument:
absl::IntervalClosed // Sample from [a, b]
absl::IntervalOpen // Sample from (a, b)
absl::IntervalOpenClosed // Sample from (a, b]
absl::IntervalClosedOpen // Sample from [a, b) … (Default)
Some examples might include:
int die_roll = absl::Uniform(absl::IntervalClosed, gen, 1, 6);
double jitter = absl::Uniform(absl::IntervalOpen, gen, -0.25, 0.25);
Choose whichever endpoints and semantics most naturally fit to your use case.
One final note - Omitting the endpoints when sampling an unsigned integer provides a shorthand syntax for sampling from the entire type.
auto byte = absl::Uniform<uint8_t>(bitgen); // From [0, 255]
BitGenRef
: A Type-Erased URBG InterfaceAn instance of the absl::BitGenRef
class can be thought of as a type-agnostic
“reference” to an
URBG
instance. Functions which accept an absl::BitGenRef
can be invoked using any
type of URBG, such as absl::BitGen
, absl::InsecureBitGen
, etc.
int TakesBitGenRef(absl::BitGenRef bitgen) {
int v = absl::Uniform<int>(bitgen, 0, 1000);
}
absl::BitGenRef
has implicit conversion constructors from any URBG&
. A
absl::BitGenRef
does not copy or own the underlying URBG, to which it
points, and so the underlying URBG must outlive the BitGenRef
instance.
MockingBitGen
Importantly, absl::BitGenRef
allows mocking through the compatible
absl::MockingBitGen
type. When testing we might want to mock to provide
deterministic results. The MockingBitGen
provides such a mock for URBG
objects:
absl::MockingBitGen bitgen;
ON_CALL(absl::MockUniform<int>(), Call(bitgen, 1, 10000))
.WillByDefault(Return(20));
EXPECT_EQ(absl::Uniform<int>(bitgen, 1, 10000), 20);
EXPECT_CALL(absl::MockUniform<double>(), Call(bitgen, 0.0, 100.0))
.WillOnce(Return(5.0))
.WillOnce(Return(6.5));
EXPECT_EQ(absl::Uniform(bitgen, 0.0, 100.0), 5.0);
EXPECT_EQ(absl::Uniform(bitgen, 0.0, 100.0), 6.5);
MockingBitGen
has full support for Googletest matchers and actions.
Abseil provides absl::BitGen
as a good default choice for random number
generation. An absl::BitGen
is well seeded by default, so seeding operations
are not necessary, it has a large cycle-length, good diffusion of state, and is
reasonably fast.
It is reasonable to use absl::InsecureBitGen
in tests or when using a URBG in
small isolated tasks such as in std::shuffle
. Otherwise, default to using
absl::BitGen
or util_random::SharedBitGen
except when a low memory footprint
is critical, or benchmarks indicate a measurable improvement in application
performance.
absl::BitGen
Over absl::InsecureBitGen
?The use of values produced by insecure bit generators in security-sensitive
contexts may introduce occasional (but dangerous) security issues. Although
absl::BitGen
is not suitable for cryptographic applications such as key
generation, it provides guarantees strong enough to be resilient to misuse.
A random bit generator is built on a recursive method where an internal hidden
state generates variates (outputs) as well as the next internal state.
Initialization of this hidden state is called seeding, which maps a sequence
of values, “the seed”, onto the internal state of the bit generator. In C++ the
class std::seed_seq
provides a common algorithm for doing this mapping. To generate truly random
seeds, the entropy typically comes from an OS entropy pool, which is available
via std::random_device
.
In Abseil, both absl::BitGen
and absl::InsecureBitGen
are well seeded by
default so no additional seeding is requred, and use of std::random_device
is
unnecessary. These generators acquire seed data from an an underlying entropy
pool managed by the Randen pseudorandom
generator, initially seeded from /dev/urandom
.
```c++ {.good} // GOOD: No need to provide additional seed to absl::BitGen. absl::BitGen gen;
When seeding other random bit generators, prefer using multiple values in the
seed as well as allowing `std::seed_seq` to handle combining the values rather
than pre-merging them to a single value. Using a `std::seed_seq` with an
`absl::BitGen` is allowed and will incorporate those values plus additional
salting value to initialize the underlying state; please refer to the
[section on stability](#seed-stability) for more information.
```c++ {.bad}
// AVOID: Combining seed values outside of std::seed_seq.
uint32_t seed = MyHash(a) + MyHash(b) + MyHash(c);
std::mt19937 gen(seed);
```c++ {.good} // BETTER: Allow std::seed_seq to combine seed values. std::seed_seq my_seed_seq{a, b, c}; absl::BitGen gen(my_seed_seq);
When available, an Abseil URBG is generally an excellent replacement for short
lived instances like the following:
```c++ {.bad}
// AVOID: Initializes a large-state PRNG with a single word from std::random_device
std::random_device rd;
std::mt19937 gen(rd());
std::shuffle(data.begin(), data.end(), gen);
```c++ {.good} // BETTER: Use a well-seeded PRNG by default. absl::InsecureBitGen g; std::shuffle(data.begin(), data.end(), g);
### Why Is My Generator Producing Different Values Between Re-runs, Even With An Explicit Seed?
The Abseil libraries are continuously updated, so values produced from one
seed sequence can change at any time. Abseil tries to prevent dependencies on
these mappings from accumulating by mixing the seed data with nondeterministic
data. The same `std::seed_seq` will always produce the same generator within the
same process, but the results will vary vary between executions!
```c++
std::seed_seq my_seed_seq{1, 2, 3};
absl::BitGen gen(my_seed_seq);
ABSL_LOG(INFO) << "Fraction: " << absl::Uniform(gen, 0, 1.0);
blaze test :my_unit_test
…
INFO Fraction: 0.2578
blaze test :my_unit_test
…
INFO Fraction: 0.012732
If you believe that your generator requires stability beyond the lifetime of its process execution, please refer to the section on stability.
Like the C++ standard library random engines, neither absl::BitGen
nor
absl::InsecureBitGen
are thread safe.
Efficiently leveraging a bit generator shared between multiple threads can be
tricky and subtle. Use of locally-instantiated generators are preferred to
global application-owned bit generators protected by a Mutex
and shared across
multiple threads.
Yes - the distribution functions are compatible with any type conforming to the
UniformRandomBitGenerator
named requirement, as defined by C++11. This includes std::
types (e.g.
std::minstd_rand
and std::mt19937_64
).
We recognize that there are use cases which inherently require universally stable (ie. seed-stable) variate generation, but this represents a narrow class of applications within Google. Such use cases require stability of both generator algorithms and distribution algorithms; in some cases, they require this stability across multiple platforms, as well. Providing this would completely freeze our ability to update and improve the Abseil Random library for the (much larger) class of applications which neither require nor benefit from these constraints. We hope to revisit the question of how to provide for these use cases in the future, but for now, we (by design) offer no API that indefinitely provides the same Seed→SequenceVariate mappings.
Most applications and unit tests do not need to explicitly depend on the sequence of variates generated by a given seed. If you believe that your binary is an exceptional use case, Abseil Random may not be the right library for you. That said, there is method to our “nondeterministic-seed” madness, and it’s worth outlining.
Our experience has taught us that random value generators are the ultimate victims to Hyrum’s Law. The details of a generator’s implementation (i.e. the algorithm for generating values) is effectively equivalent to its interface (i.e. the values generated). There have been instances in which attempts to improve existing algorithms, such as the routine for sampling pseudorandom floating-point values, have been foiled by thousands of unit tests throughout Google which have become dependent on the sequences of values generated. Thus, the API and implementation for previous iterations of generators within Google is, in many respects, effectively frozen in place and cannot be improved.
In order to prevent this from befalling the Abseil Random library, we have
implemented a scheme whereby the seed material used to derive the initial state
of a generator (absl::BitGen
, absl::InsecureBitGen
) is mixed with
non-deterministic data. We refer to the conditions under which a generator
promises to produce the same variates from a fixed seed sequence, as the
stability of the generator.
In the course of our discussions, we found it useful to define the following categories of generator stability:
Our generator types provide Process Stability.
To seed an Abseil bit generator for
process stability, pass a std::seed_seq
as the constructor parameter:
// Seeding an absl::BitGen for process stability.
std::seed_seq my_seed_seq{a, b, c};
absl::BitGen gen(my_seed_seq);
There is currently no generator type in the Abseil Random library which provides Seed Stability. The motivation for this decision is as much philosophical as it is practical: The legitimate use cases for an eternally unchanging pseudorandom sequence are uncommon within Google.
The Abseil family of distribution classes and distribution functions (e.g.
absl::Uniform()
) should be considered to have
Process Stability.
We hope to provide support for seed-stable distributions in the future, but at
the moment, no API from the Abseil Random library guarantees this contract.