Elevenize RNG #4563

Open
@kkm000

rand() et amici are usually far from fantastic PRNGs, and you never really know what you get from a particular libc implementation. C++11 standardized both better PRNG algorithms (a Mersenne Twister with a specific set of parameters, mt19937, is the most commonly used, and is quite good for its performance) and random distributions. We should switch to these.
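As a rough illustration of what the switch could look like for integer draws, here is a minimal sketch. The helper name `RandIntSketch` and its signature are hypothetical, chosen for the example only, and are not taken from the existing code.

```cpp
#include <cassert>
#include <random>

// Hypothetical replacement for a rand()-based helper; the name and
// signature are illustrative and not part of the existing code base.
// Returns an integer drawn uniformly from the closed range [lo, hi].
int RandIntSketch(std::mt19937 &engine, int lo, int hi) {
  assert(lo <= hi);
  // The distribution maps raw engine output onto [lo, hi] without the
  // modulo bias of the usual `rand() % n` idiom.
  std::uniform_int_distribution<int> dist(lo, hi);
  return dist(engine);
}
```

A caller would own the engine, e.g. `std::mt19937 engine(seed);`, and pass it by reference, so the random state is explicit rather than hidden in libc.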

The current implementation causes compiler warnings

kaldi-math.cc:79:19: warning: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Wimplicit-const-int-float-conversion]
  else if (prob * RAND_MAX < 128.0) {
                ~ ^~~~~~~~
/usr/include/stdlib.h:86:18: note: expanded from macro 'RAND_MAX'
#define RAND_MAX        2147483647
                        ^~~~~~~~~~
kaldi-math.cc:91:29: warning: implicit conversion from 'int' to 'float' changes value from 2147483647 to 2147483648 [-Wimplicit-const-int-float-conversion]
    return (Rand(state) < ((RAND_MAX + static_cast<BaseFloat>(1.0)) * prob));
                            ^~~~~~~~ ~

with clang++. GCC and MSC both use RAND_MAX of 2^15-1, while clang++ goes with 2^31-1. The latter is not exactly representable in a float. The magnitude could easily have been checked with the preprocessor, but that would be more like sweeping the problem under the carpet. Most PRNGs that I've seen in various libraries are simply bad. Using standard algorithms would also make results reproducible across platforms (and even, as seen above, across compilers!): they are guaranteed to generate the same sequence from the same seed on any platform.
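To make the warning concrete, and to show one way of avoiding RAND_MAX entirely, here is a hedged sketch. `Int31MaxAsFloat` and `WithProbSketch` are hypothetical names used only for this example; the second is one possible shape for a "true with probability prob" helper, not the existing implementation.

```cpp
#include <cassert>
#include <random>

// 2^31 - 1 needs 31 significant bits, but a float has a 24-bit
// significand, so the conversion rounds up to 2^31, exactly as the
// clang warning reports.
float Int31MaxAsFloat() { return static_cast<float>(2147483647); }

// Hypothetical RAND_MAX-free helper: returns true with probability prob.
bool WithProbSketch(std::mt19937 &engine, double prob) {
  assert(prob >= 0.0 && prob <= 1.0);
  // std::mt19937 yields values in [0, 2^32 - 1]. Both 2^32 and every
  // possible engine output are exactly representable in a double, so
  // no rounding sneaks into the comparison.
  const double kRange = 4294967296.0;  // 2^32
  return static_cast<double>(engine()) < prob * kRange;
}
```

The sketch compares the raw engine output against a threshold instead of using `std::bernoulli_distribution`. That is a deliberate choice here: the standard fixes the output sequence of the engines, but not the algorithms behind the distribution classes, so only engine-level code gets the bit-for-bit cross-platform guarantee.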

As an example of the definition strictness, this is the mt19937 PRNG spec from C++11 26.5.5.3 [rand.predef]:

typedef mersenne_twister_engine<uint_fast32_t,
        32,624,397,31,0x9908b0df,11,0xffffffff,7,0x9d2c5680,15,0xefc60000,18,1812433253>
        mt19937;

Required behavior: The 10000th consecutive invocation of a default-constructed object of type mt19937 shall produce the value 4123659995.

I buy it without haggling!
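The required behavior quoted above is easy to check against any standard library. A minimal sketch follows; the function name `NthDefaultMt19937Output` is made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Returns the n-th output (1-based) of a default-constructed std::mt19937.
std::uint_fast32_t NthDefaultMt19937Output(unsigned long long n) {
  assert(n >= 1);
  std::mt19937 engine;    // default-constructed, i.e. seeded with 5489
  engine.discard(n - 1);  // skip the first n - 1 outputs
  return engine();
}
```

On a conforming implementation, `NthDefaultMt19937Output(10000)` should return 4123659995, whatever the platform or compiler.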
