isUniformRNG

Thu May 22 14:01:57 PDT 2014

On 12/05/14 20:17, Nick Sabalausky via Digitalmars-d wrote:
> On 5/11/2014 8:16 AM, Joseph Rushton Wakeling via Digitalmars-d wrote:> On
> 11/05/14 05:58, Nick Sabalausky via Digitalmars-d wrote:
>  >> The seed doesn't need to be compromised for synchronized RNGs to fubar
>  >> security.
>  >
>  > Before we go onto the detail of discussion, thanks very much for the
>  > extensive explanation.  I was slightly worried that my previous email
>  > might have come across as dismissive of your (completely understandable)
>  > concerns.
>
> Oh, not at all. I've been finding the discussion rather interesting. :)

Me too -- sorry for the late response, things have been busy :-)

>  > Obviously one _can_ solve the problem by the internal state variables of
>  > the RNG being static, but I'd suggest to you that RNG-as-reference-type
>  > (which doesn't necessarily need to mean "class")
>
> Yea, doesn't necessarily mean class, but if it is made a reference type then
> class is likely the best option. For example, I'd typically regard struct* in a
> D API as a code smell.

Well, I wasn't going to suggest struct* as the type.  There have been various 
proposals here for a RefTypeOf template that stores an internal pointer to a 
struct instance and exposes its public interface via alias this.  Unfortunately 
this approach is probably problematic because of this issue:
https://issues.dlang.org/show_bug.cgi?id=10996

One could also just have an internal pointer to the RNG's private state 
variable(s).  monarch_dodra and I have both prototyped some designs for this, 
but I do agree that class is largely preferable, because it avoids the need for 
the developer to take responsibility for ensuring the reference type semantics.

> So I'm fine going the class route (or otherwise reference-based) and making
> internal state per-instance. Or even having a "duplicate this RNG with identical
> state" function, if people want it.
>
> I think we're pretty well agreed on non-crypto RNGs. Your stance is convincing
> here.

Excellent. :-)

> For crypto-RNGs:
>
>   [ ... snip ... ]
>
> I think my preference would still be to keep the internal state static here
> though (again, just speaking for crypto-RNGs only). As I've argued, the
> determinism is a non-feature for crypto-RNGs (they deliberately fight it every
> step of the way), and the shared state carries a couple entropy-related benefits
> (Reseeding one, ie accumulating additional fresh entropy, benefits all others.
> And all RNG activity within in the thread acts as additional entropy). So I only
> see upsides, not downsides.

Thanks for the excellent and detailed explanation here.  Your case is also 
pretty convincing.  A few remarks, though.

First, one concern I still have with static internals is essentially the same as 
the issue I have with the reference-types-made-of-structs: it's relying on the 
programmer to "do the right thing", and you know that someone is going to forget 
to mark as static a variable that needs it.  With any luck that will be an 
easily-spotted and fixed issue, but using a class avoids the need.

So, I'd still feel more comfortable with the idea of crypto-RNGs being classes 
and not structs -- you can still have the static internals to deal with your 
desire for uniqueness, of course.

Second, I think your idea about separating the deterministic part of the 
algorithm from the source of entropy, and allowing arbitrary sources (or 
combinations of sources) to be plugged in, is an interesting one and worth pursuing.

> The only potential issue I see: Are there going to be people who are looking for
> determinism in an RNG and intentionally want to use something like Hash_DRBG for
> that. Maybe for the sake of it's usage of SHA. They would have to either limit
> the number of values they generate (to avoid the algorithm's internal reseeding)
> or else provide their own entropy source (not too difficult though, since I've
> recently parametrized HashDRBG on that). But I suppose it could be done.

To be honest, I wouldn't worry about anyone wanting to use a crypto RNG 
algorithm deterministically.  Wait for someone to request the feature. :-)

>  > The benefit of doing it this way, as opposed to static internal
>  > variables, is that you aren't then constrained to have only one single,
>  > effectively thread-global instance of _every_ RNG you create.
>
> Non-crypto RNGs: Yea, that's good, I like it.
>
> Crypto RNGs: Hmm, good question. My concern is that it's something the user has
> to specifically choose to use. It's not what automatically happens when you ask
> for an instance of a specific RNG.

Yea, this is a good point.  I think you've convinced me that the natural state 
of a crypto RNG is that its state should essentially be unique -- not per-instance.

One remark: if you can separate out your algorithm into a deterministic 
algorithm templated on sources of entropy, then note that each instantiation 
will be unique _relative to the sources of entropy_ but that one could create 
multiple independent instances relying on _different_ sources of entropy.

> Fair enough, at least for non-crypto RNGs. For crypto RNGs, I'm thinking now
> that static state can be avoided as long as the design still does a sufficient
> job of steering actual crypto-purpose users away from multiple separate instances.

Having got this far through the discussion, I feel that I'm happy with the idea 
of static state for crypto RNGs, but equally I'll be happy with alternatives.  I 
probably do have a bit of a personal inclination to avoid static if at all 
possible, but in this case I think you've made a very reasonable argument for 
it.  (Some of the arguments I cited earlier, like the effect on function purity, 
etc., don't apply here because crypto RNGs' .popFront() is of necessity going to 
be non-pure.)

>  > I wrote a class-based implementation because I wanted to see how that
>  > came out to various struct-based approaches monarch_dodra and I had
>  > tried.  I do think that it offers much value in terms of simplicity and
>  > elegance of design, but there may be other costs that make it
>  > untenable.
>  >
>
> Have you found any such costs yet, or anything in particular that suggests there
> may be some? Intuitively, I wouldn't think the minor amount of (by default) GC
> heap usage would matter (just as one particular aspect of classes).

Well, the main concern would be if using classes made it impossible (or 
frustratingly difficult) to use the RNG package's full functionality in 
non-GC-using code.

I don't think there are any issues like speed hits.

> Again, for non-crypto RNGs, I'm totally with you now.
>
> For crypto-RNGs, I think it's unclear. The question hinges on:
>
> A. "Can we permit deterministic usages without making crypto users more likely
> to have multiple independent instances within a thread? (regardless of whether
> crypto users do it intentionally or unintentionally)"

Honestly, if an RNG is designed as a crypto RNG, I don't think you need to worry 
about supporting deterministic usage.  The main concern of my arguments was 
related to non-crypto pseudo-RNG use-cases.

> That question, of course, needs to be balanced with:
>
> B. "Is 'non-crypo deterministic' an inappropriate misuse of crypto-RNGs (when we
> already have RNGs designed for non-crypo deterministic purposes anyway?)"

Probably. :-)

>
> and C. "How much should we even care about B?"
>
> I'm concerned that the global-singleton-instance pattern may be insufficient for
> "A".

I don't think that matters.  We don't need to support deterministic usages for 
crypto RNGs.

> What I think I'd like to see is this: For *all* RNGs (crypto and non-crypto), a
> reference-based design where obtaining a common thread-global instance is always
> the implicit default behavior, and individual or duplicate instances can always
> be obtained *explicitly* in a way that cannot be misread/misinterpreted. That
> would succeed at "A" and render "B" and "C" non-issues. Plus, I think it would
> be generally appropriate, even for deterministic non-crypto RNGs, anyway.

Well, I guess that what I feel is: the general class-based approach of 
std.random2 handles most of this.  Where crypto RNGs are concerned I'm fine with 
the idea of the internal state being static if you feel that will maximize 
effective use of the entropy supplied.

If we combine that with the idea of templating the deterministic parts of crypto 
RNGs on their sources of entropy, then it should be clear that there _can_ be 
multiple independent instances of a crypto RNG if that's desired, but the user 
needs to provide different sources of entropy to each in order to make that happen.

Then, provide a sensible "default" version of each crypto RNG type, with 
explicitly specified entropy sources, so that the typical version that will be 
instantiated by users will (i) be suitable for crypto and (ii) assuming it does 
have static internals, will be unique per thread.

> Also, I don't want to forget the issue of stream interfaces. What do you think
> about including them, but just with a big red "Subject to change pending a new
> std.stream" banner in the docs? I think that's a perfectly pragmatic "best of
> both worlds" compromise. Think it would be well/poorly-received?

If the DConf discussion related to an experimental part of the standard library 
are anything to go by, I think we will have plenty of opportunity to implement 
functionality that is subject to change, so I don't think we need fear doing that.