isUniformRNG

Sat May 10 09:34:14 PDT 2014

On 09/05/14 02:42, Nick Sabalausky via Digitalmars-d wrote:
> There can technically be multiple "instances", but yea, they're all effectively
> tied together. However, I'm leaning towards the belief that's "correct behavior"
> for a RNG. It's *definitely* correct for a crypto RNG - you certainly wouldn't
> want two crypto RNGs ever generating the same sequence, not even deliberately.
> That would defeat the whole point of a crypto RNG.

I'm not sure I follow your point about crypto RNGs here.  Assuming it's 
deterministic at all, if someone else knows the seed state (which would be 
required to get two instances generating the same sequences), aren't you 
buggered anyway?  And isn't the point of a deterministic crypto RNG that its 
seed state is kept secret, or that it is seeded with non-deterministic values 
upon instantiation?

> As for ordinary non-crypto RNGs, I honestly can't imaging any purpose for
> reliably generating the same values other than "playing back" a previous
> sequence. But if that's what you want, then IMO it's better to record the
> sequence of emitted values, or even record/replay the higher-level decisions
> which the randomness was used to influence.

Even that will not save you in Minecraft, where the results of different seeds 
will produce different results in updated versions of the game where there are 
more random possibilities available (more monsters, more flora and fauna...). ;-)

> For randomness, record/replay is just less fiddly and error-prone than reliably
> regenerating values from an algorithm that intentionally tries to imitate
> non-predictability. One slip-up while regenerating the sequence (example: trying
> to replay from the same seed on a RNG that has since had a bug fixed, or on a
> different architecture which the RNG wasn't well-tested on) and then the
> inaccuracies just cascade. Plus this way you can swap in a non-deterministic RNG
> and everything will still work. Or more easily skip back/ahead.

There's a difference between _relying_ on the deterministic sequence from a 
pseudo-RNG and the principle of design that says if I declare two separate RNG 
instances,

     SomeRNG rng1;
     SomeRNG rng2;

then the calls to rng1 should not have any effect on the results of the calls to 
rng2.  Or alternatively, if I do,

     SomeRNG rng1;
     SomeRNG rng2;

     rng1.seed(n);
     rng2.seed(n);

     foreach (a, b; lockstep(rng1, rng2).take(10))
     {
         writefln("%s\t%s", a, b);
     }

then the values I get should be the same, because rng1 and rng2 are independent 
instances that have been identically seeded.

> I'm just not seeing a legitimate use-case for multiple states of the same RNG
> engine (at least on the same thread) that wouldn't be better served by different
> approach.

We could get into broader discussions, but I don't think it needs to be more 
complicated than: if I create an instance of a variable of type T, then I should 
expect that it will be independent of any other instance of a variable of type 
T, unless I have deliberately created that link.

> AIUI an RNG is impure by definition. (Well, unless you consider the internal
> state to be part of the input, but I'd say *that* view breaks encapsulation.)

Not at all.  In fact a good theoretical model of a pseudo-RNG is

     * a state variable, and

     * a pure function that transforms one state variable into another
       such that the sequence of states is indistinguishable from a random
       sequence.

You can also add other pure transformations to map the state variable to 
particular random variates (e.g. unsigned integers in a certain range).

You can encapsulate this quite neatly as the state variable being a private 
member of a struct or class, with the pure transformation function being 
.popFront and the state-to-variate mapping being handled by .front.  Note that 
.popFront _is_ pure for all the pseudo-RNGs in std.random2 :-)

> I'm also unconvinced that it breaks encapsulation anyway. If anything, I think
> it's better encapsulated since separate instances *won't* behave like they're
> under "quantum entanglement". It forces separate instances to stay separate.

Cf. my earlier example with two individual RNG instances initialized with the 
same seed.  This ought to produce two identical sequences; with your proposed 
static internals, it won't, because calls to rng1 will affect the results of 
subsequent calls to rng2, and vice versa.  This is pretty much the definition of 
breaking encapsulation, no?

> It *is* a fairly strange situation though. Unlike most encapsulation units, the
> whole point of an RNG is to be *inconsistent* rather than consistent. So that
> kinda turns everything on its ear.

The point of a pseudo-RNG is that if you analyse the sequence of variates 
produced, you can't distinguish it from a genuine random sequence.  But it's 
also intended that different instances seeded with different values ought to 
produce statistically independent sequences with very high probability.

Your proposal would make it impossible to _have_ multiple effective instances 
seeded differently (unless in different threads).

> Yea, when I wrote it I figured on proposing an optional stream-like interface
> for std.random. The Hash_DRBG algorithm is inherently based around the idea that
> each generated number can be arbitrary-length (not only that - generating ex.
> two 8-byte values with Hash_DRGB is fundamentally different from generating
> eight 2-byte values, and produces different results). And the OS-specific
> entropy generators are basically streams, too (on both Win and Posix, you
> request any number of bytes of randomness).
>
> So that was just a natural way to implement it. But it did strike me as being
> generally useful, so I'm glad that you agree.
>
> You're right though, the current state of Phobos streams is kind of a stumbling
> block. Naturally it'd be best to use the new stream interface, but since that
> isn't in place, that's an issue. I don't know what the state of the rewrite is.
> Been awhile since I've heard anything about it.

Yup, me neither.  I will try and quiz Steven on it at some point.

> I figure, at the very least, we could just make the stream interfaces for RNGs
> private. Then we can open it up whenever we finally do have a new std.stream to
> conform to.
>
> Or we could do an OutputRange instead...but the recent discussion on those makes
> me a bit hesitant to do so. And I definitely don't wanna perform the contortions
> to invert it into a non-coroutine-like InputRange.
>
> Hmm, yea, I'm more and more liking the "keep it private for now, and then see
> what happens" approach...unfortunately... :/

Yes, that's probably for the best.