Opened 7 years ago

Closed 5 years ago

#2013 closed defect (duplicate)

Revisit Random Number Generation

Reported by: Kevin Meagher
Owned by: Juan Carlos Díaz Vélez
Priority: normal
Milestone: Long-Term Future
Component: combo core
Keywords: random, rng, gsl, sprng
Cc: Christopher Weaver, Alex Olivas

Description

Currently icetray has its own random number generator interface, with three implementations:

  • I3TRandom: ROOT's implementation of mt19937
  • I3GSLRandom: uses GSL's implementation of mt19937 by default, but the generator can be changed with an environment variable
  • I3SPRNGRandomService: combines the output of SPRNG and GSL for reasons which are not adequately explained in the documentation.

Random number generation has not been revisited in the past 13 years. Things to consider:

  • boost/c++11 has a random number generator interface which has all the functions we need (except for the unused PoissonD)
  • we only use version 2.0a of SPRNG, although newer versions are available
  • SPRNG has a failure mode where it uses the exact same stream for every job in a batch
  • SPRNG has a nonstandard install script and who knows how long it will continue to work with current compilers
  • It is unclear whether combining the output of SPRNG and mt19937 yields a statistically valid RNG
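
For the first point, a minimal sketch of how the usual draws map onto the C++11 `<random>` header. The class name `StdRandomSketch` and its method names are illustrative assumptions, not the icetray interface, and the choice of `std::mt19937` as the engine is only an example:

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical wrapper: names are illustrative, not the icetray API.
class StdRandomSketch {
public:
    explicit StdRandomSketch(std::uint32_t seed) : engine_(seed) {}

    // Uniform double in [lo, hi).
    double Uniform(double lo, double hi) {
        assert(lo < hi);
        return std::uniform_real_distribution<double>(lo, hi)(engine_);
    }

    // Uniform integer in [0, n).
    std::uint32_t Integer(std::uint32_t n) {
        assert(n > 0);
        return std::uniform_int_distribution<std::uint32_t>(0, n - 1)(engine_);
    }

    // Normal draw. A fresh distribution object is built on every call,
    // so no cached value is carried over between calls.
    double Gaus(double mean, double sigma) {
        return std::normal_distribution<double>(mean, sigma)(engine_);
    }

    // Exponential draw with the given rate (1 / mean).
    double Exp(double rate) {
        return std::exponential_distribution<double>(rate)(engine_);
    }

    // Integer-valued Poisson draw with the given mean.
    unsigned Poisson(double mean) {
        return std::poisson_distribution<unsigned>(mean)(engine_);
    }

    // Number of successes in `trials` attempts with probability p each.
    unsigned Binomial(unsigned trials, double p) {
        return std::binomial_distribution<unsigned>(trials, p)(engine_);
    }

private:
    std::mt19937 engine_;
};
```

One caveat with this approach: the standard fixes the output of the engines, but not the algorithms used by the distributions, so the same seed can give different Gaussian or Poisson draws on different standard library implementations.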

Requirements:

  • Determine whether to continue to use our custom RNG interface or switch to the c++11 one
  • Determine if there is a better RNG for batch processing (i.e., one that can derive multiple mutually independent streams from both the dataset number and the job number). Bonus points for having a normal build system.
  • should be able to set the seed without causing all streams to be identical

Change History (8)

comment:1 Changed 7 years ago by David Schultz

  • Cc juancarlos olivas added
  • Keywords random rng gsl sprng added

comment:2 Changed 7 years ago by Alex Olivas

Note also there's a *really* slim version of SPRNG floating around which only implements the lagged fibonacci algorithm, which is the default and what we use in production. If we're not going to test out other algos, I'd recommend we switch to that.

comment:3 Changed 7 years ago by Christopher Weaver

If we do want to look at other algorithms, however, PCG might be an interesting choice to examine. I've only used it in cases where quality didn't matter and I wanted speed, but the associated paper makes some strong claims about quality as well. The library itself is quite light-weight, which is nice.

comment:4 Changed 7 years ago by Kevin Meagher

I'm not sure I trust PCG's claims. The author doesn't seem to make any distinction between cryptographically secure RNGs and deterministic high-dimensionality RNGs, and instead just claims that it is better than mt19937, arc4random, and chacha20, among others.

comment:5 Changed 7 years ago by Alex Olivas

  • Cc juancarlos removed
  • Owner set to juancarlos
  • Status changed from new to assigned

comment:6 Changed 6 years ago by Kevin Meagher

In 160150/IceCube:

add I3MT19937, an instance of I3RandomService which uses c++11's random number interface and can take a seed consisting of an arbitrary-length list of integers, suitable for distributed computing; see #2013
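
A minimal sketch of how a seed made of a list of integers can be fed to a C++11 engine through `std::seed_seq`. This is an assumption about the general technique, not the actual I3MT19937 code; the helper name `MakeEngine` and the ordering of the seed words are hypothetical:

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper, not the I3MT19937 implementation.
// Builds an mt19937 whose initial state depends on every word in the list,
// e.g. {base_seed, dataset_number, job_number}.
inline std::mt19937 MakeEngine(const std::vector<std::uint32_t>& seed_words) {
    // seed_seq mixes all input words into the 624-word engine state.
    std::seed_seq seq(seed_words.begin(), seed_words.end());
    return std::mt19937(seq);
}
```

A job would then call something like `MakeEngine({base_seed, dataset, job})`, so changing the base seed does not make every job's stream identical. Note that `std::seed_seq` only scrambles the starting state; it comes with no proof that the resulting streams are independent.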

comment:7 Changed 5 years ago by Jakob van Santen

Another library that deserves a look is Random123 (BSD licensed, and also available as a proposed boost library and in this random GitHub repo). I've used this for a strongly reproducible/parallelizable version of clsim, and it appears to work quite well.

The approach is very similar to PCG, but stripped down even further, to the point where the state is a simple counter and the randomness comes from the permutation function. Like PCG, its quality claims are based on empirical evidence from TestU01. This is in contrast to SPRNG, which predates TestU01 and so comes with proofs.
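
To make the counter-based idea concrete, here is a toy sketch of the structure only. It does not use the Random123 API or its Philox/Threefry functions; the mixer is the SplitMix64 finalizer used as a stand-in, the names `Mix64`, `At`, and `CounterRng` are hypothetical, and no statistical quality is claimed for this particular combination:

```cpp
#include <cstdint>

// Stand-in bijective mixer (the SplitMix64 finalizer).
// Random123 itself uses different functions such as Philox and Threefry.
inline std::uint64_t Mix64(std::uint64_t z) {
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

// Output number `counter` of stream `key`, computed without any stored
// history. This is what makes stream partitioning and random access cheap.
inline std::uint64_t At(std::uint64_t key, std::uint64_t counter) {
    return Mix64(key ^ Mix64(counter));
}

// The whole generator state is the stream key plus a plain counter.
struct CounterRng {
    std::uint64_t key;      // identifies the stream, e.g. derived from dataset and job
    std::uint64_t counter;  // position within the stream

    std::uint64_t Next() { return At(key, counter++); }
};
```

Because each output is a pure function of `(key, counter)`, a job can jump straight to any position in its stream, and two jobs with different keys never need to coordinate.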

Pros:

  • Small state
  • Arbitrary, cheap stream partitioning
  • Header-only (the best build system is no build system)
  • C, C++11, CUDA, and OpenCL implementations of all generators
  • BSD licensed

Cons:

  • Fails to cite The Hitchhiker's Guide to the Galaxy as reference 1
Last edited 5 years ago by Jakob van Santen

comment:8 Changed 5 years ago by Alex Olivas

  • Resolution set to duplicate
  • Status changed from assigned to closed