hap.random: a new random number library for D

Tue Jun 10 23:41:33 PDT 2014

On Tuesday, 10 June 2014 at 23:08:33 UTC, Chris Cain wrote:
> I had an opportunity to give the entire code a good once-over 
> and I have a few comments.

Thanks! :-)

> 1. Biggest thing about the new hap.random is how much nicer it 
> is to actually READ. The first few times I went through the 
> current std.random, I remember basically running out of breath. 
> hap.random was almost a refreshing read, in contrast. I'm 
> guessing it has a lot to do with breaking it down into smaller, 
> more manageable pieces. Regardless, good work on that. I 
> suspect it'll make it easier to contribute to in the future.

That's great to hear, as it was a design goal.  I think at some 
point there will probably be a need to split things further (e.g. 
hap.random.generator will probably have to be broken up, as will 
hap.random.distribution), but always keeping to the principle of 
implementing them as packages, so that it stays possible to just 
"import hap.random" (or "import hap.random.generator", or whatever).

> 2. Something I'd really like to see is for the seed-by-range 
> functions to take the range by reference instead of by value to 
> ensure that the seed values used are less likely to be used in 
> another RNG inadvertently later. Basically, I envision a 
> similar problem with seedRanges as we currently have with RNGs 
> where we have to make sure people are careful with what they do 
> with the ranges in the end. This should cover use cases where 
> users do things like `blah.seed(myEntropyRange.take(3))` as 
> well, so that might take some investigation to figure out how 
> realistic it would be to support.

Yeah, that's an interesting point.  I mean, you'd hope that 
myEntropyRange would be a reference type anyway, but every little 
bit helps :-)
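
To make the concern concrete, here is a minimal sketch of the difference between seeding from a range taken by value and one taken by reference.  Counter, seedByValue and seedByRef are hypothetical names invented for this illustration; none of them exist in hap.random.

```d
import std.range.primitives : isInputRange;

// Hypothetical value-type "entropy" range; a stand-in for illustration only.
struct Counter
{
    uint value;
    enum bool empty = false;
    @property uint front() const { return value; }
    void popFront() { ++value; }
}

// Takes the range by value: the caller's copy is left untouched.
void seedByValue(R)(ref uint[4] state, R source)
    if (isInputRange!R)
{
    foreach (ref cell; state)
    {
        cell = source.front;
        source.popFront();
    }
}

// Takes the range by reference: the caller's range is advanced.
void seedByRef(R)(ref uint[4] state, ref R source)
    if (isInputRange!R)
{
    foreach (ref cell; state)
    {
        cell = source.front;
        source.popFront();
    }
}

unittest
{
    auto entropy = Counter(0);
    uint[4] a, b, c;

    seedByValue(a, entropy);
    seedByValue(b, entropy);   // same values again: entropy was copied
    assert(a == b);

    seedByRef(b, entropy);     // consumes 0..3 from the caller's range
    seedByRef(c, entropy);     // so this call sees 4..7
    assert(b != c);
    assert(entropy.front == 8);
}
```

Note that an rvalue such as myEntropyRange.take(3) will not bind to a plain ref parameter, which is presumably part of what would need investigating.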

> 3. I'd also REALLY like to see seed support ranges/values 
> giving ANY type of integer and guarantee that few bytes are 
> wasted (so, if it supplies 64-bit ints and the generator's 
> internal state array only accepts 32-bit ints, it should spread 
> the 64-bit int across two cells in the array). I have working 
> code in another language that does this, and I wouldn't mind 
> porting it to D for the standard library. I think this would 
> greatly simplify the seeding process in user code (since they 
> wouldn't have to care what the internal representation of the 
> Random state is, then).

That would be very cool.  Can you point me at your code examples?
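
In the meantime, here is a rough sketch of how I understand the idea (this is my own guess at it, not Chris's code).  fillState is a hypothetical helper name, and writing the low 32 bits first is an arbitrary choice made for the example.

```d
import std.range.primitives;
import std.traits : isUnsigned;

// Hypothetical helper: fill a uint state array from a range of any unsigned
// integer type, using every byte of each source value consumed.
// Returns the number of cells written.  Any partial cell left over when the
// source runs dry is dropped.
size_t fillState(R)(uint[] state, R source)
    if (isInputRange!R && isUnsigned!(ElementType!R))
{
    enum uint bitsPerValue = ElementType!R.sizeof * 8;
    size_t i = 0;      // next state cell to write
    ulong buffer = 0;  // source bits not yet written out
    uint bits = 0;     // number of valid bits in buffer

    while (i < state.length)
    {
        // Top up the buffer until a full 32-bit cell is available.
        while (bits < 32)
        {
            if (source.empty) return i;
            buffer |= (cast(ulong) source.front) << bits;
            bits += bitsPerValue;
            source.popFront();
        }
        state[i++] = cast(uint) (buffer & 0xFFFF_FFFF);
        buffer >>= 32;
        bits -= 32;
    }
    return i;
}

unittest
{
    uint[4] state;

    // Each 64-bit value is spread across two 32-bit cells, low half first.
    ulong[] wide = [0x1111_2222_3333_4444UL, 0x5555_6666_7777_8888UL];
    assert(fillState(state[], wide) == 4);
    assert(state == [0x3333_4444u, 0x1111_2222u, 0x7777_8888u, 0x5555_6666u]);

    // Four bytes are packed into a single cell.
    ubyte[] narrow = [0x44, 0x33, 0x22, 0x11];
    assert(fillState(state[], narrow) == 1);
    assert(state[0] == 0x1122_3344u);
}
```

The point is that the caller never needs to know the width of the generator's internal cells.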

> 4. I'd just like to say the idea of using ranges for seeds gets 
> me giddy because I could totally see a range that queries 
> https://random.org for true random bits to seed with, wrapped 
> by a range that zeroes out the memory on popFront. Convenient 
> and safe (possibly? Needs review before I get excited, 
> obviously) for crypto purposes!

The paranoiac in me feels that anything that involves getting 
random data via HTTPS is probably insecure crypto-wise :-)  
However, I think sourcing random.org is a perfect case for an 
entry in hap.random.device.  I think the best thing to do would 
probably be to offer a RandomOrgClient (which offers a very thin 
API around the random.org HTTP API) and then to wrap that in a 
range type that uses the client internally to generate random 
numbers with particular properties.
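
Roughly the shape I have in mind is sketched below.  Everything here is hypothetical: EntropyClient, StubClient and ClientRange are made-up names, and the stub just counts upwards instead of talking to random.org, so nothing about the real HTTP API is modelled.

```d
// Hypothetical client interface: a real RandomOrgClient would perform the
// HTTP request inside fetch().
interface EntropyClient
{
    uint[] fetch(size_t count);
}

// Stand-in client returning predictable values, for demonstration only.
class StubClient : EntropyClient
{
    private uint next = 0;

    uint[] fetch(size_t count)
    {
        auto result = new uint[count];
        foreach (ref r; result) r = next++;
        return result;
    }
}

// Input range that pulls values from a client in batches.
struct ClientRange
{
    private EntropyClient client;
    private uint[] buffer;
    private size_t batch;

    this(EntropyClient client, size_t batch = 4)
    {
        assert(batch > 0, "batch size must be positive");
        this.client = client;
        this.batch = batch;
        this.buffer = client.fetch(batch);
    }

    enum bool empty = false;
    @property uint front() const { return buffer[0]; }

    void popFront()
    {
        buffer = buffer[1 .. $];
        // Refill from the client once the current batch is used up.
        if (buffer.length == 0) buffer = client.fetch(batch);
    }
}

unittest
{
    auto r = ClientRange(new StubClient, 2);
    assert(r.front == 0);
    r.popFront();
    assert(r.front == 1);
    r.popFront();              // first batch exhausted, second one fetched
    assert(r.front == 2);
}
```

Keeping the client separate from the range means the network code can be tested (or swapped out) without touching the range logic.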

> 5. Another possible improvement would be something akin to a 
> "remix" function. It should work identically to reseeding, but 
> instead of setting the internal state to match the seed (as I 
> see in 
> https://github.com/WebDrake/hap/blob/master/source/hap/random/generator.d#L485), 
> remixing should probably be XOR'd into the current state. That 
> way if you have a state based on some real entropy, you can 
> slowly, over time, drip in more entropy into the state.

Also a very interesting suggestion.  Is there a standard name for 
this kind of procedure?
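
For reference, a minimal sketch of how I read the suggestion, with remix as a hypothetical free function over a plain state array.  A real generator would also have to guard against ending up in an invalid state (all zeros, in the case of Xorshift).

```d
// Hypothetical sketch: XOR fresh entropy into an existing state instead of
// overwriting it, wrapping around if there is more entropy than state.
void remix(uint[] state, const(uint)[] entropy)
{
    assert(state.length > 0, "state must not be empty");
    foreach (i, e; entropy)
        state[i % state.length] ^= e;
}

unittest
{
    uint[2] state = [0xF0F0_F0F0, 0x0000_FFFF];

    remix(state[], [0xFFFF_FFFF]);
    assert(state == [0x0F0F_0F0Fu, 0x0000_FFFFu]);  // only cell 0 changed

    // XOR-ing the same value in again restores the previous state.
    remix(state[], [0xFFFF_FFFF]);
    assert(state == [0xF0F0_F0F0u, 0x0000_FFFFu]);
}
```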

> 6. I'd like to see about supporting xorshift1024 as well 
> (described here: http://xorshift.di.unimi.it/ and it's public 
> domain code, so very convenient to port ... I'd do it too, of 
> course, if that seems like an okay idea). This is a really 
> small thing because xorshift1024 isn't really much better than 
> xorshift128 (but some people might like the idea of it having 
> significantly longer period).

Fantastic, I will see about implementing those.  I wasn't 
previously aware of that work, but I _was_ aware that the 
standard Xorshift generators have some statistical flaws, so it's 
great to have some alternatives.  It should be straightforward to 
implement things like XorshiftP128 or XorshiftS1024 and 
XorshiftS4096 (using P and S in place of + and *).

With these in place we might even be able to deprecate the old 
Xorshift generators.
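
As a first rough sketch of what XorshiftP128 (xorshift128+) might look like: the shift constants 23, 17 and 26 are as I recall them from the reference C code, so treat them as an assumption to be checked against the original.  A real implementation would also expose the usual uniform RNG range interface rather than a bare next() method.

```d
// Sketch of xorshift128+; constants are an assumption, see above.
struct XorshiftP128
{
    private ulong[2] s;

    this(ulong s0, ulong s1)
    {
        assert(s0 != 0 || s1 != 0, "state must not be all zero");
        s = [s0, s1];
    }

    ulong next()
    {
        ulong s1 = s[0];
        immutable ulong s0 = s[1];
        s[0] = s0;
        s1 ^= s1 << 23;
        s[1] = s1 ^ s0 ^ (s1 >> 17) ^ (s0 >> 26);
        return s[1] + s0;      // the "+" in xorshift128+
    }
}

unittest
{
    auto a = XorshiftP128(1, 2);
    auto b = XorshiftP128(1, 2);

    // Worked by hand for the state (1, 2) with the constants above.
    assert(a.next() == 8_388_677);

    // Identical seeds give identical sequences.
    b.next();
    foreach (i; 0 .. 10)
        assert(a.next() == b.next());
}
```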

Just for clarity, here's how I see things rolling out for the 
future:

   * First goal is to ensure the existing codebase "plays nice" with
     people's programs and that it works OK with dub, rdmd, etc. and
     doesn't have any serious architectural or other bugs.  The 1.0.0
     release will not have any new functionality compared to what is
     in place now.

   * Once it seems to be reasonably stable then work can begin on a
     1.x release series that brings in successive pieces of new
     functionality.

> Done :) ... if I get a response, I'll make sure to incorporate 
> everything said.

Great, let me know how that goes. :-)