New Lagged Fib. PRNG gen and random2.d

Mon Aug 26 08:26:49 PDT 2013

On Sunday, 25 August 2013 at 16:58:12 UTC, Joseph Rushton 
Wakeling wrote:
>
> Nice! :-)
>

Thanks ;)

> I think that as a general approach it's nice, but we should 
> preferably make generic the method of wrapping a payload.

To be honest, I think that, for now, this is an "implementation 
detail". All our PRNG's are created with an explicit "value 
type", that can easily be made the "Payload", no matter what we 
do.

> I think this is a matter of taste, but I don't think it's bad 
> to force the user to seed things.  Note that the existing RNGs 
> go to the extent of actually checking inside popFront() etc. if 
> the RNG has been initialized, and if not, seeding it with a 
> default value.
>
> The only thing I'll remark is that these default-seed ideas are 
> quite prevalent in other RNG implementations in other 
> languages.  Some of the existing unittests may even be based on 
> that default seeding.  So, there may be some expectation of 
> that option being available.

An "option" I had dabbled in was making the PRNG proper a 
"Container", which you can *slice* to obtain a high performance 
range. EG:

//----
auto prng = Random(); //Creates a PRNG.
auto first = prng.front; //lazilly seeds
prng.popFront(); //Checks seeded
auto second = prng.front; //Cheks seeded again...

auto fastPRNG = prng[]; //Lazilly seeds, then returns a type 
*guaranteed* seeded.
auto third = fastPRNG.front; //No check of isSeeded.
//----

The initial reason I played around with this, is that it was an 
option to "solve" the reference problem: Make the PRNG's 
themselves value types, and iterate using reference type 
*slices*. The end result (for "std.random1") has a couple *major* 
drawbacks though, IMO:
1. Opt-*in*: Only those aware will use it. The rest of us mortals 
will still get hit by the bugs.
2. Easily escapes references to locals.
3. As always, if we introduce std.random2, it'll be that much 
more to clean-up.

Still, I thought I'd share my ideas, even if they aren't the best 
solutions. It *is* an option for the (don't)auto-seed problem.

One more thing. One of the ways I see it (specifically for LF): 
We can add auto-seed later if we decide it was a bad idea to not 
have it. The opposite isn't possible.

> I think the two of us agree on the general principle that RNGs 
> should be re-done as structs that wrap a reference to a 
> payload.  Performance- and design-wise, I'd say that the 
> following things are worth considering:
>
>   (1) Should the memory management be done via GC (new) as in 
> your
>       implementation, or manually managed?  Bear in mind the 
> general
>       desirable principle that Phobos code shouldn't rely on 
> the GC if it
>       doesn't have to.
>
>       I don't know what impact the use of the GC here can be 
> expected to have
>       on overall performance of a piece of code, and whether it 
> might rule out
>       use of this design in highly performance-conscious 
> use-cases such as
>       games, etc.
>
>       My own use of RefCounted to ensure manual memory 
> management (no GC)
>       seems to have some scope-related issues, see:
>
> http://forum.dlang.org/post/mailman.445.1377370120.1719.digitalmars-d@puremagic.com
>
>       ... and I assume that if payload was allocated manually 
> via malloc
>       instead of new, then the same might be true.

Another option I had thought of, was to make the value-types 
Payloads as public, with *massive* "do not use signs". *We* then 
provide simple and generic (and pure/nothrow) GC-based wrappers.

 From there, if, for whatever reason, a user needs malloc 
allocation, or heck, static allocation, he still has an 
"underhand" access to the payload implementation, but lifecycle 
management remains *his* responsibility.

I think: Simple and safe by default, with the possibility for 
extension.

>   (2) Should we provide a generic payload wrapper, or require 
> RNG creators
>       to implement everything manually?  I strongly support a 
> generic wrapper,
>       as there is too great a risk of implementation error if 
> we require
>       the wrapping-of-a-reference to be done manually each time.
>
>   (3) As we discussed off-list, if we do implement a generic 
> wrapper, how
>       should it work?  Should it work as per my implementation,
>
>           alias RandomGenerator!MyEngine MyRNG;
>
>       or instead as you suggested, as a template mixin,
>
>           struct MyRNG
>           {
>               mixin RandomGenerator!MyEngine;
>           }
>
>       I can see the case for the latter as it will result in 
> much more readable
>       type names in error messages.  However, I think it has 
> the potential to
>       obscure implementation details in a way that isn't 
> helpful.  Consider what
>       happens if we do:
>
>           alias RandomGenerator!MtEngine19937 Mt19937_64
>           // ... WRONG!! we forgot to tweak the engine when we 
> copy-pasted,
>           // but at least we'll see in any error messages what 
> type of internal
>           // engine is being used
>
>       ... compared to:
>
>           struct Mt19937
>           {
>               mixin RandomGenerator!MtEngine19937;
>           }
>           // ... Copy-paste failure again, but this time it's 
> obscured and we'll
>           // never find out unless we look at the actual source 
> code.

Again, for now, I think this is implementation detail. That said, 
I don't buy much into the typo argument. In particular, this is 
something that gets copy pasted only once, so we should be 
relatively safe.

One of the arguments in favor of "internal" mixins is that of 
extension: For example, laggedFib has a very real reason to 
implement popFrontN, as it can potentially pop thousands of 
elements at once in o(1). "Internal" mixin makes it easy to 
extend a PRNG's capabilities past that of the "lowest common 
denominator". With a "alias Random = 
RandomGenerator!RandomPayload", you are really limited to 
whatever RandomGenerator wrapped.

>   (4) The devil's advocate position -- should we take the 
> simple route to
>       reference-type RNGs by making them final classes?  It's 
> trivial to do but
>       to me it feels "un-Phobos-ish" and will also have the 
> problem of requiring
>       a lot more code rewrites on the part of std.random users 
> who want to
>       upgrade to std.random2.
>
> That seems like a good opening summary of issues -- destroy. :-)
>
> Best wishes,
>
>     -- Joe

Honestly, it might just be the simplest thing to do. For one, it 
would elegantly solve the "must be seeded" issue (allocation is 
initialization). It *guarantees* Reference semantics. Finally, 
the (supposed) overhead should be inexistant compared to the 
compelxity of a PRNG.

The option of allowing "public Payload" definitions could still 
leave an open door for those that need PRNG's, but don't use the 
GC (think vidjagames).