isUniformRNG

Thu May 22 22:47:19 PDT 2014

On 5/22/2014 5:01 PM, Joseph Rushton Wakeling via Digitalmars-d wrote:
> On 12/05/14 20:17, Nick Sabalausky via Digitalmars-d wrote:
>>
>> Yea, doesn't necessarily mean class, but if it is made a reference
>> type then
>> class is likely the best option. For example, I'd typically regard
>> struct* in a
>> D API as a code smell.
>
> Well, I wasn't going to suggest struct* as the type.  There have been
> various proposals here for a RefTypeOf template that stores an internal
> pointer to a struct instance and exposes its public interface via alias
> this.  Unfortunately this approach is probably problematic because of
> this issue:
> https://issues.dlang.org/show_bug.cgi?id=10996
>
> One could also just have an internal pointer to the RNG's private state
> variable(s).  monarch_dodra and I have both prototyped some designs for
> this, but I do agree that class is largely preferable, because it avoids
> the need for the developer to take responsibility for ensuring the
> reference type semantics.
>

Cool, yea, we're agreed that classes are probably the best approach. 
(Pending your concern below about "does it work for people not using 
the GC"...)
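
For anyone skimming the thread, here's a minimal sketch of the value-vs-reference issue we keep circling. Mt19937 is the real struct from the current std.random; RefRNG and its steps counter are hypothetical names I made up for illustration, not anything from std.random2.

```d
import std.random : Mt19937;

// Hypothetical class wrapper: copies of a RefRNG variable share one generator.
final class RefRNG
{
    private Mt19937 impl;
    size_t steps;                 // counts popFront calls, for illustration only

    this(uint seed) { impl = Mt19937(seed); }

    enum empty = false;           // infinite range
    uint front() { return impl.front; }
    void popFront() { impl.popFront(); ++steps; }
}

unittest
{
    // Struct: assignment copies the whole state, so the "two" generators
    // produce the same numbers.
    auto s1 = Mt19937(42);
    auto s2 = s1;
    s1.popFront();
    s2.popFront();
    assert(s1.front == s2.front);

    // Class: assignment copies the reference, so advancing one advances both.
    auto c1 = new RefRNG(42);
    auto c2 = c1;
    c2.popFront();
    assert(c1 is c2);
    assert(c1.steps == 1);
}
```

With the class, nobody has to remember to pass by ref or wrap anything; the reference semantics come for free.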

> Thanks for the excellent and detailed explanation here.  Your case is
> also pretty convincing.  A few remarks, though.
>
> First, one concern I still have with static internals is essentially the
> same as the issue I have with the reference-types-made-of-structs: it's
> relying on the programmer to "do the right thing", and you know that
> someone is going to forget to mark as static a variable that needs it.
> With any luck that will be an easily-spotted and fixed issue, but using
> a class avoids the need.
>
> So, I'd still feel more comfortable with the idea of crypto-RNGs being
> classes and not structs -- you can still have the static internals to
> deal with your desire for uniqueness, of course.
>

Yea, again, classes FTW.

> Second, I think your idea about separating the deterministic part of the
> algorithm from the source of entropy, and allowing arbitrary sources (or
> combinations of sources) to be plugged in, is an interesting one and
> worth pursuing.

Yup, as one example, I felt it was a good idea to keep the door open for 
anyone wanting to directly use one o' them fancy hardware-based 
true-randomness generators. Or facilitate anyone who may need to avoid 
reliance on the randomness provided by their OS. Or enhance 
unittests/debugging with dependency injection.
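
A rough sketch of the dependency-injection angle, assuming the generator is templated on its entropy source. All names here (FixedEntropy, ToyGenerator, fill) are hypothetical, and the mixing step is a placeholder LCG, not a cryptographic algorithm and not the Hash_DRBG code.

```d
/// Test-only entropy source (hypothetical): fills the buffer with a fixed pattern.
struct FixedEntropy
{
    static void fill(ubyte[] buf) @safe nothrow @nogc
    {
        foreach (i, ref b; buf)
            b = cast(ubyte)(0xA5 ^ i);
    }
}

/// Toy generator templated on its entropy source.
/// The LCG step below only stands in for a real deterministic algorithm.
final class ToyGenerator(EntropySource)
{
    private uint state;

    this()
    {
        ubyte[4] seed;
        EntropySource.fill(seed[]);
        foreach (b; seed)
            state = (state << 8) | b;   // pack the seed bytes big-endian
    }

    enum empty = false;                 // infinite range
    uint front() const { return state; }
    void popFront() { state = state * 1664525u + 1013904223u; }
}

unittest
{
    // With an injected fixed source, two generators start out identical,
    // which makes test runs repeatable.
    auto a = new ToyGenerator!FixedEntropy;
    auto b = new ToyGenerator!FixedEntropy;
    assert(a.front == b.front);
}
```

A hardware or OS-backed source would just be another type with the same fill function.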

> To be honest, I wouldn't worry about anyone wanting to use a crypto RNG
> algorithm deterministically.  Wait for someone to request the feature. :-)
>

Alright. Sounds good :)

> Yea, this is a good point.  I think you've convinced me that the natural
> state of a crypto RNG is that its state should essentially be unique --
> not per-instance.
>
> One remark: if you can separate out your algorithm into a deterministic
> algorithm templated on sources of entropy, then note that each
> instantiation will be unique _relative to the sources of entropy_ but
> that one could create multiple independent instances relying on
> _different_ sources of entropy.
>

That's certainly true. But I don't think it's a problem since using 
multiple types of crypto-secure entropy in the same program would be a 
rather oddball usecase (and it wouldn't actually *be* a security flaw in 
and of itself).

And in the rare case where someone actually might want to use multiple 
types of entropy from a single RNG, they can still just provide a custom 
"aggregate" entropy source, to draw from other sources however they deem 
fit.
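
Roughly what I have in mind, as a sketch only. SourceA, SourceB, Combined and SharedStateGen are hypothetical names, the seed values are dummies, and XOR is just a stand-in combiner (a real design would need a vetted one). It relies on D's static members being thread-local and separate for each template instantiation.

```d
// Dummy entropy sources with fixed values, for illustration only.
struct SourceA { static uint seed() { return 1; } }
struct SourceB { static uint seed() { return 2; } }

/// Hypothetical aggregate source: draws from every listed source.
struct Combined(Sources...)
{
    static uint seed()
    {
        uint s;
        foreach (S; Sources)    // unrolled at compile time over the type list
            s ^= S.seed();
        return s;
    }
}

/// Toy generator whose state is static: one copy per thread and per
/// EntropySource instantiation, shared by all instances.
final class SharedStateGen(EntropySource)
{
    private static uint state;
    private static bool seeded;

    this()
    {
        if (!seeded) { state = EntropySource.seed(); seeded = true; }
    }

    enum empty = false;
    uint front() const { return state; }
    void popFront() { state = state * 1664525u + 1013904223u; }  // placeholder step
}

unittest
{
    auto x = new SharedStateGen!SourceA;
    auto y = new SharedStateGen!SourceA;
    auto z = new SharedStateGen!(Combined!(SourceA, SourceB));

    x.popFront();
    assert(x.front == y.front);  // same instantiation: one shared state
    assert(z.front == 3);        // different source: independent state (1 ^ 2)
}
```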

>> Fair enough, at least for non-crypto RNGs. For crypto RNGs, I'm
>> thinking now
>> that static state can be avoided as long as the design still does a
>> sufficient
>> job of steering actual crypto-purpose users away from multiple
>> separate instances.
>
> Having got this far through the discussion, I feel that I'm happy with
> the idea of static state for crypto RNGs, but equally I'll be happy with
> alternatives.  I probably do have a bit of a personal inclination to
> avoid static if at all possible,

Avoiding statics *is* usually a prudent approach in most situations, I 
agree.

> but in this case I think you've made a
> very reasonable argument for it.  (Some of the arguments I cited
> earlier, like the effect on function purity, etc., don't apply here

Sounds good then :)

> because crypto RNGs' .popFront() is of necessity going to be non-pure.)
>

To make sure I understand (it seems my understanding of D's pure isn't 
quite as strong as I'd thought): It cannot be pure *because* of the 
static internal state, right?
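
Here's the distinction as I currently understand it, in a small sketch (names are made up, and the LCG step is only a placeholder). There may well be other reasons too, such as pulling entropy from the OS.

```d
// Module-level variables in D are thread-local by default, like static members.
uint sharedState = 1;

// Impure: reads and writes mutable static data.
uint nextShared() nothrow @nogc @safe
{
    sharedState = sharedState * 1664525u + 1013904223u;
    return sharedState;
}

// The same body is rejected by the compiler once it is marked pure.
static assert(!__traits(compiles, () pure {
    sharedState = sharedState * 1664525u + 1013904223u;
}));

// Per-instance state is fine: mutating through `this` is allowed in a
// (weakly) pure member function.
struct InstanceRNG
{
    uint state = 1;

    void popFront() pure nothrow @nogc @safe
    {
        state = state * 1664525u + 1013904223u;
    }
}

unittest
{
    InstanceRNG r;
    r.popFront();
    assert(r.state == 1015568748u);
}
```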

>> Have you found any such costs yet, or anything in particular that
>> suggests there
>> may be some? Intuitively, I wouldn't think the minor amount of (by
>> default) GC
>> heap usage would matter (just as one particular aspect of classes).
>
> Well, the main concern would be if using classes made it impossible (or
> frustratingly difficult) to use the RNG package's full functionality in
> non-GC-using code.
>
> I don't think there are any issues like speed hits.
>

I see.

FWIW, if simply making them class-based *did* cause any problems using 
them "in non-GC-using code", I would think that could only be due to a 
much more general failing in D. Because D's classes aren't supposed to 
require GC allocation. GC allocation is only supposed to be the default 
(for classes).
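
As a sketch of what I mean, a class instance can be constructed in malloc'd memory with std.conv.emplace, no GC heap involved. Counter is a hypothetical stand-in for an RNG class; whether the full RNG package works smoothly this way is exactly the open question.

```d
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

// Hypothetical stand-in for a class-based RNG.
final class Counter
{
    uint n;
    this(uint start) { n = start; }
    uint next() { return n++; }
}

unittest
{
    // Size of the instance itself, not of the reference.
    enum size = __traits(classInstanceSize, Counter);

    void* mem = malloc(size);
    assert(mem !is null);
    scope (exit) free(mem);          // runs last, after destroy() below

    // Construct the object in place inside the malloc'd block.
    Counter c = emplace!Counter(mem[0 .. size], 7);
    assert(c.next() == 7);
    assert(c.next() == 8);

    destroy(c);                      // run the destructor before freeing
}
```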

> I don't think that matters.  We don't need to support deterministic
> usages for crypto RNGs.
>

[In the legendary words of Sgt Rick Hunter:]

"Works for me!"

> Well, I guess that what I feel is: the general class-based approach of
> std.random2 handles most of this.  Where crypto RNGs are concerned I'm
> fine with the idea of the internal state being static if you feel that
> will maximize effective use of the entropy supplied.
>
> If we combine that with the idea of templating the deterministic parts
> of crypto RNGs on their sources of entropy, then it should be clear that
> there _can_ be multiple independent instances of a crypto RNG if that's
> desired, but the user needs to provide different sources of entropy to
> each in order to make that happen.
>

Agreed.

> Then, provide a sensible "default" version of each crypto RNG type, with
> explicitly specified entropy sources, so that the typical version that
> will be instantiated by users will (i) be suitable for crypto and (ii)
> assuming it does have static internals, will be unique per thread.
>

Right. And my Hash_DRBG implementation already does this "sensible default".

>> Also, I don't want to forget the issue of stream interfaces. What do
>> you think
>> about including them, but just with a big red "Subject to change
>> pending a new
>> std.stream" banner in the docs? I think that's a perfectly pragmatic
>> "best of
>> both worlds" compromise. Think it would be well/poorly-received?
>
> If the DConf discussions related to an experimental part of the standard
> library are anything to go by, I think we will have plenty of
> opportunity to implement functionality that is subject to change, so I
> don't think we need fear doing that.

I think I may have missed that particular discussion (I've only been 
catching the livestreams of certain talks). Recap?

If there's truly "no need to fear adding functionality that's subject to 
change", then that also answers another question I was about to raise: 
Submit a PR for this algorithm now, or just incorporate it as part of 
your new std.random, which has yet to go through the approval process? 
May as well go with "now".

That leaves one last matter: Should my initial PR for Hash_DRBG be 
struct-based or class-based? While we both agree that class-based is a 
better approach overall, the current std.random is still struct-based 
and your class-based version hasn't gone through the approvals process 
yet (I assume it needs to, since it's a fairly big reworking of a whole 
phobos module).