5 changes: 5 additions & 0 deletions README.md
@@ -77,6 +77,11 @@ changes to
With `-ftree-vectorize`, it seems that this trick does wonders and speeds up
the code even more.

_Update:_ gcc-UBSAN complains about undefined behavior when the above bitshifts
are used, so the line has now been changed to:

    uint32_t foo = y & 1 ? 0x12345678 : 0;
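
As a rough illustration of why the ternary is the safer choice: left-shifting a signed 32-bit value by 31 can overflow, which is undefined behavior before C++20, whereas the ternary never leaves unsigned arithmetic. The sketch below assumes `MAGIC` is the standard MT19937 twist constant `0x9908b0df`. The names `select_ternary`, `select_mask`, and `check_equivalence` are hypothetical helpers, and the branch-free `select_mask` variant is an alternative added here for comparison, not something this change uses.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: MAGIC is the standard MT19937 twist constant.
constexpr std::uint32_t MAGIC = 0x9908b0dfu;

// Ternary form from the update: MAGIC when the low bit of y is set, else 0.
constexpr std::uint32_t select_ternary(std::uint32_t y) {
    return (y & 1u) ? MAGIC : 0u;
}

// Hypothetical branch-free alternative (not from the source).
// Unsigned subtraction wraps modulo 2^32, so 0 - 1 yields an all-ones mask
// without touching signed shifts.
constexpr std::uint32_t select_mask(std::uint32_t y) {
    return (std::uint32_t{0} - (y & 1u)) & MAGIC;
}

// Compile-time spot checks on both forms.
static_assert(select_ternary(1u) == MAGIC, "odd input selects MAGIC");
static_assert(select_ternary(2u) == 0u, "even input selects 0");
static_assert(select_mask(1u) == MAGIC, "odd input selects MAGIC");
static_assert(select_mask(2u) == 0u, "even input selects 0");

// Runtime check that both forms agree on a few edge values.
inline void check_equivalence() {
    const std::uint32_t samples[] = {0u, 1u, 2u, 3u, 0x7fffffffu,
                                     0x80000000u, 0xfffffffeu, 0xffffffffu};
    for (std::uint32_t y : samples) {
        assert(select_ternary(y) == select_mask(y));
    }
}
```

Both helpers stay in unsigned arithmetic, so UBSan should have nothing to flag. Whether the compiler vectorizes one form better than the other would need to be measured.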

Finally, note that people have done SIMD and CUDA implementations. If
you are looking for even more speed, I suggest you check them out.

5 changes: 2 additions & 3 deletions mersenne-twister.cpp
@@ -47,7 +47,7 @@ static MTState state;

#define UNROLL(expr) \
y = M32(state.MT[i]) | L31(state.MT[i+1]); \
- state.MT[i] = state.MT[expr] ^ (y >> 1) ^ (((int32_t(y) << 31) >> 31) & MAGIC); \
+ state.MT[i] = state.MT[expr] ^ (y >> 1) ^ (y & 1 ? MAGIC : 0); \
++i;

static void generate_numbers()
@@ -99,8 +99,7 @@ static void generate_numbers()
{
// i = 623, last step rolls over
y = M32(state.MT[SIZE-1]) | L31(state.MT[0]);
- state.MT[SIZE-1] = state.MT[PERIOD-1] ^ (y >> 1) ^ (((int32_t(y) << 31) >> 31) & MAGIC);
+ state.MT[SIZE-1] = state.MT[PERIOD-1] ^ (y >> 1) ^ (y & 1 ? MAGIC : 0);
}

// Temper all numbers in a batch