Mir Random [WIP]
Andrei Alexandrescu via Digitalmars-d
digitalmars-d at puremagic.com
Wed Nov 23 05:41:25 PST 2016
On 11/23/2016 12:58 AM, Ilya Yaroshenko wrote:
> On Tuesday, 22 November 2016 at 23:55:01 UTC, Andrei Alexandrescu wrote:
>> On 11/22/16 1:31 AM, Ilya Yaroshenko wrote:
>>> - `opCall` API instead of range interface is used (similar to C++)
>>
>> This seems like a gratuitous departure from common D practice. Random
>> number generators are most naturally modeled in D as infinite ranges.
>> -- Andrei
>
> It is a safe low-level architecture without performance and API issues.
I don't understand this. Can you please be more specific? I don't see a
major issue wrt offering opCall() vs. front/popFront. (empty is always
false.)
> It prevents users from doing stupid things implicitly (like copying RNGs).
An input range can be made noncopyable.
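
A minimal sketch of what a noncopyable input range could look like, assuming a hypothetical `Counter` type that merely stands in for a real generator. `@disable this(this);` turns accidental copies into compile errors, while passing by `ref` keeps working:

```d
// Hypothetical stand-in for an RNG; not part of std.random or mir.random.
struct Counter
{
    private uint current;

    @disable this(this); // any attempt to copy is rejected at compile time

    enum bool empty = false; // infinite range: never exhausted
    @property uint front() const { return current; }
    void popFront() { ++current; }
}

// Copy-initialization does not compile.
static assert(!__traits(compiles, { Counter a; Counter b = a; }));

// Consumers take the range by reference, so the state is shared, not duplicated.
uint next(ref Counter c)
{
    immutable value = c.front;
    c.popFront();
    return value;
}

unittest
{
    Counter c;
    assert(next(c) == 0);
    assert(next(c) == 1);
    assert(c.front == 2);
}
```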
> A high-level range interface can be added in the future (it will hold a
> _pointer_ to an RNG).
Is there a reason to not have that now? Again, I'm literally talking
about offering front/popFront in lieu of opCall(). The only
implementation difference is you keep the currently-generated number as
a member instead of returning it from opCall. I doubt one could measure
a performance difference.
If you implement it as a noncopyable input range, you get a large
support network working for you. With opCall, we have virtually no such
support - you need to do everything once again.
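
To make the comparison concrete, here is a rough sketch of the same generator written both ways. `CallEngine`, `RangeEngine`, and the free function `xorshift32` are names made up for this illustration (a plain 13/17/5 xorshift step), not code from mir.random or std.random:

```d
import std.array : array;
import std.range : take;
import std.range.primitives : isInfinite, isInputRange;

// One xorshift32 step (shifts 13, 17, 5); shared by both interfaces below.
uint xorshift32(uint x)
{
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

// opCall style: a single call advances the state and returns the new value.
struct CallEngine
{
    private uint state;
    this(uint seed) { state = seed ? seed : 1; } // the state must be nonzero
    uint opCall() { return state = xorshift32(state); }
}

// Range style: the current number is kept as a member and exposed via front.
struct RangeEngine
{
    private uint current;
    this(uint seed)
    {
        current = seed ? seed : 1;
        popFront(); // eagerly produce the first number
    }
    enum bool empty = false;
    @property uint front() const { return current; }
    void popFront() { current = xorshift32(current); }
}

static assert(isInputRange!RangeEngine && isInfinite!RangeEngine);

unittest
{
    auto call = CallEngine(42);
    auto range = RangeEngine(42);
    foreach (_; 0 .. 5)
    {
        assert(call() == range.front); // both yield the same sequence
        range.popFront();
    }
    // Existing range algorithms work without extra glue.
    assert(RangeEngine(42).take(3).array.length == 3);
}
```

The two structs differ only in where the current number lives; the range version additionally plugs into `take`, `array`, and the rest of std.range and std.algorithm.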
> In addition, when you need to write algorithms
> or distributions, opCall is much more convenient than the range API.
Could you please be more specific? On the face of it I'd agree one call
is less than two, but I don't see a major drawback here.
> In addition, users would not use the Engine API in 99% of cases: they will
> just want to call `rand` or `uniform`, or another distribution.
>
> I am sure that almost any library should first have a low-level API that
> fits its implementation. Additional API levels may also be added.
Is there a large difference between opCall and front/popFront?
Actually I can think of one - the matter of getting things started.
Ranges have this awkwardness of starting the iteration: either you fill
the current front eagerly in the constructor, or you have some sort of
means to detect initialization has not yet been done and do it lazily
upon the first use of front. The best strategy would depend on the
actual generator, and admittedly would be a bit more of a headache
compared to opCall. Was this the motivation?
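
The two start-up strategies could be sketched roughly as follows. `EagerRange`, `LazyRange`, and `xorshift32` are hypothetical names used only for this illustration:

```d
// One xorshift32 step; used only to have something to generate.
uint xorshift32(uint x)
{
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

// Eager: the constructor computes the first number up front.
struct EagerRange
{
    private uint current;
    this(uint seed) { current = xorshift32(seed ? seed : 1); }
    enum bool empty = false;
    @property uint front() const { return current; }
    void popFront() { current = xorshift32(current); }
}

// Lazy: a flag records whether the first number has been produced yet.
struct LazyRange
{
    private uint current;
    private bool primed;
    this(uint seed) { current = seed ? seed : 1; }
    enum bool empty = false;

    private void prime()
    {
        if (!primed)
        {
            current = xorshift32(current);
            primed = true;
        }
    }

    // front can no longer be const because it may have to initialize.
    @property uint front() { prime(); return current; }
    void popFront() { prime(); current = xorshift32(current); }
}

unittest
{
    auto eager = EagerRange(7);
    auto lazyRange = LazyRange(7);
    foreach (_; 0 .. 4)
    {
        assert(eager.front == lazyRange.front);
        eager.popFront();
        lazyRange.popFront();
    }
}
```

The lazy variant pays for a branch on every access and loses `const` on `front`; the eager variant does its work even if no number is ever requested.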
> Current Phobos evolution is generic degradation: more generic and
> "universal" code hides more uncovered bugs in the code. The std.range is
> a good example of degradation, it has a lot of API and implementation bugs.
Do you have examples of issues outside random number generators?
> ### Example of API+implementation bug:
>
> #### Bug: RNGs have min and max params (hello C++). But they are not
> used when a uniform integer number is generated: `uniform!ulong` /
> `uniform!ulong(0, 100)`.
>
> #### Solution: In Mir Random every RNG must generate all 8/16/32/64 bits
> uniformly. How to do that is the RNG's problem.
Min and max are not parameters, they are bounds provided by each
generator. I agree their purpose is unclear. We could require all
generators to provide min = 0 and max = UIntType.max without breaking
APIs. In that case we only need to renounce LinearCongruentialEngine
with c = 0 (see
https://github.com/dlang/phobos/blob/master/std/random.d#L258) - in fact
that's the main reason for introducing min and max in the first place.
All other code stays unchanged, and we can easily deprecate min and max
for RNGs.
(I do see min and max used by uniform at
https://github.com/dlang/phobos/blob/master/std/random.d#L1281 so I'm
not sure I get what you mean, but anyhow the idea that we require RNGs
to fill an uint/ulong with all random bits simplifies a lot of matters.)
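
For reference, a small check of the bounds in question, as far as I can tell from the current std.random declarations. `MinstdRand0` is a `LinearCongruentialEngine` with c = 0, which is why its `min` is 1 rather than 0:

```d
import std.random : MinstdRand0, Mt19937;

// With c = 0 the generator can never produce 0, so min is 1.
static assert(MinstdRand0.min == 1);
static assert(MinstdRand0.max == 2_147_483_646); // modulus 2^31 - 1, minus one

// Mt19937 already covers the full range of uint.
static assert(Mt19937.min == 0);
static assert(Mt19937.max == uint.max);

unittest
{
    auto gen = MinstdRand0(1);
    foreach (_; 0 .. 1000)
    {
        assert(gen.front >= MinstdRand0.min && gen.front <= MinstdRand0.max);
        gen.popFront();
    }
}
```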
> I will not file this bug, nor another dozen std.random bugs, because
> the module should be rewritten anyway and I am working on it. std.random
> is a collection of bugs from C/C++ libraries extended with D generic
> idioms. For example, there is no reason for a 64-bit Xorshift. It is 32-bit
> by design. Furthermore, a 64-bit expansion of a 32-bit algorithm must be
> proved theoretically before we allow it for end users. 64-bit analogs
> exist, but they have different implementations.
One matter that I see is there's precious little difference between
mir.random and std.random. Much of the code seems copied, which is an
inefficient way to go about things. We shouldn't fork everything if we
don't like a bit of it, though admittedly the path toward making changes
in std is more difficult. Is your intent to work on mir.random on the
side and then submit it as a wholesale replacement of std.random under a
different name? In that case you'd have my support, but you'd need to
convince me the replacement is necessary. You'd probably have a good
case for eliminating xorshift/64, but then we may simply deprecate that
outright. You'd possibly have a more difficult time with opCall.
> Phobos degrades because
> we add a lot of generic specializations and small utilities without
> understanding use cases.
This is really difficult to parse. Are you using "degrades" the way it's
meant? What is a "generic specialization"? What are examples of "small
utilities without understanding use cases"?
> Phobos really follows a stupid idealistic idea:
> more generic is better, more API is better, more universal algorithms are
> better. The problem is that Phobos/DRuntime is a soup where everything
> (because of its "universality") interacts with everything.
I do think more generic is better, of course within reason. It would be
a tenuous statement that generic designs in Phobos such as ranges,
algorithms, and allocators are stupid and idealistic. So I'd be quite
interested in hearing more about this. What's that bouillabaisse about?
Thanks,
Andrei