Description
rand() and friends are usually far from a fantastic PRNG, and you never really know what you get from a particular libc implementation. C++11 standardized both better PRNG algorithms (a Mersenne twister with a specific set of parameters, mt19937, is the most commonly used, and is quite good for its performance) and random distributions. We should switch to these.
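To give an idea of what the switch could look like, here is a minimal sketch built on the C++11 <random> header. The helper names UniformIntSketch and UniformFloatSketch are made up for illustration and are not existing Kaldi functions; the caller is assumed to own and seed the std::mt19937 engine.

```cpp
#include <random>

// Hypothetical helpers for illustration only; not part of the existing code.

// Returns an integer drawn uniformly from the closed range [lo, hi].
inline int UniformIntSketch(std::mt19937 &gen, int lo, int hi) {
  std::uniform_int_distribution<int> dist(lo, hi);
  return dist(gen);
}

// Returns a float drawn uniformly from the range the standard specifies
// for uniform_real_distribution, nominally [0, 1).
inline float UniformFloatSketch(std::mt19937 &gen) {
  std::uniform_real_distribution<float> dist(0.0f, 1.0f);
  return dist(gen);
}
```

A caller would construct one engine, e.g. std::mt19937 gen(seed), and pass it to both helpers. One caveat worth keeping in mind: the standard pins down the engine output exactly, but not the algorithm inside each distribution, so distribution results may still differ between standard library implementations.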
The current implementation causes compiler warnings
kaldi-math.cc:79:19: warning: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Wimplicit-const-int-float-conversion]
else if (prob * RAND_MAX < 128.0) {
~ ^~~~~~~~
/usr/include/stdlib.h:86:18: note: expanded from macro 'RAND_MAX'
#define RAND_MAX 2147483647
^~~~~~~~~~
kaldi-math.cc:91:29: warning: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Wimplicit-const-int-float-conversion]
return (Rand(state) < ((RAND_MAX + static_cast<BaseFloat>(1.0)) * prob));
^~~~~~~~ ~
with clang++. GCC and MSC both use RAND_MAX of 2^15-1, while clang++ goes with 2^31-1. The latter is not exactly representable in a float. The magnitude could easily have been checked with the preprocessor, but that would be more like sweeping the problem under the carpet. Most PRNGs that I've seen in various libraries are simply bad. Using the standard algorithms would also make results reproducible across platforms (and even, as seen above, across compilers!): the standard engines are guaranteed to generate the same sequence from the same seed on any platform.
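The rounding that the warning complains about is easy to reproduce, and the comparison against RAND_MAX could go away entirely with a standard distribution. The following is only a sketch: Int31MaxSurvivesFloat and WithProbSketch are hypothetical names, and the literal 2147483647 is the RAND_MAX value taken from the warning text above.

```cpp
#include <random>

// 2^31 - 1 needs 31 significant bits, but a float has a 24-bit significand,
// so the conversion rounds to the nearest representable value, 2^31.
// Returns true only if the value survives the round trip (it does not).
inline bool Int31MaxSurvivesFloat() {
  const int kMax = 2147483647;  // RAND_MAX value from the warning
  const float as_float = static_cast<float>(kMax);
  return static_cast<double>(as_float) == static_cast<double>(kMax);
}

// Hypothetical replacement for a "true with probability prob" helper.
// Assumes 0 <= prob <= 1, as std::bernoulli_distribution requires.
inline bool WithProbSketch(std::mt19937 &gen, double prob) {
  std::bernoulli_distribution dist(prob);
  return dist(gen);
}
```

With this shape there is no RAND_MAX arithmetic left to get wrong, and the special-casing of very small probabilities becomes the library's problem rather than ours.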
As an example of how strict the definition is, this is the mt19937 PRNG spec from C++11 26.5.5.3 [rand.predef]:
typedef mersenne_twister_engine<uint_fast32_t,
32,624,397,31,0x9908b0df,11,0xffffffff,7,0x9d2c5680,15,0xefc60000,18,1812433253>
mt19937;
Required behavior: The 10000th consecutive invocation of a default-constructed object of type mt19937 shall produce the value 4123659995.
I buy it without haggling!
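The required behavior quoted above is simple to check on any conforming implementation. A small sketch (the function name TenThousandthMt19937Value is mine, not from the standard):

```cpp
#include <cstdint>
#include <random>

// Returns the 10000th output of a default-constructed std::mt19937.
inline std::uint_fast32_t TenThousandthMt19937Value() {
  std::mt19937 gen;    // default-constructed, i.e. seeded with 5489
  gen.discard(9999);   // skip the first 9999 outputs
  return gen();        // the 10000th consecutive invocation
}
```

On a conforming standard library this returns 4123659995 regardless of platform or compiler, which is exactly the kind of guarantee rand() never gave us.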