[OT] Generating distribution of N dice rolls

Thu Nov 10 12:21:30 UTC 2022

On 10.11.22 12:59, jmh530 wrote:
> On Thursday, 10 November 2022 at 10:45:50 UTC, Timon Gehr wrote:
>> [snip]
>>
>> The distribution of the output is multinomial(n,k,[1/k,1/k,...,1/k]), 
>> i.e., n trials, k events, and the events follow an uniform distribution.
>> https://en.wikipedia.org/wiki/Multinomial_distribution
>>
>> It is at least as hard to sample as binomial(n,1/k), because that is 
>> the marginal distribution of each component of the result.
>>
>> I guess one approach is this (if you can find a way to sample from 
>> binomial distributions that is good enough for your use case):
>>
>> [snip]
> 
> mir-random has the ability to sample from both the binomial [1] and 
> multinomial [2] distributions (you would want the multinomial here, 
> unless I'm missing something).

Well, I said that he needs to sample from a multinomial distribution. I 
just pointed out that you can sample from multinomial(n,k,...) by 
sampling from binomial distributions k times, and that to be able to 
sample from multinomial, you need to be able to sample from binomial anyway.

It turns out mir-random is doing precisely what I suggested:
https://github.com/libmir/mir-random/blob/master/source/mir/random/ndvariable.d#L344-L365

> I don't know how well mir-random handles large values of N or K,

FWIW this is what it does (haven't looked at it closely though): 
https://github.com/libmir/mir-random/blob/master/source/mir/random/variable.d#L1730-L1770

> but there are also well-known normal and poisson 
> approximations of the binomial distribution (I make some use of them 
> here [3] for binomials) at least.
> ...

Well, that seems a bit more involved. In case that is more 
accurate/efficient than mir-random, I guess one can sample from 
multinomial using that

> I wasn't exactly sure what you meant by the events follow a uniform 
> distribution,

A multinomial is the distribution of a frequency table obtained after 
drawing n times from a categorical distribution. The OP is only 
interested in the special case where that categorical distribution is 
uniform. n is the number of samples, k is the number of events. It's the 
terminology used by the Wikipedia article I linked.

> but it's certainly possible to simulate from the 
> multinomial to get the counts. You should also be able to calculate the 
> PMF for it to compare the simulations against. The PMF shouldn't be 
> uniform.
> ...

Well, it should be multinomial. :)

> 
> [1] http://mir-random.libmir.org/mir_random_variable.html#BinomialVariable
> [2] 
> http://mir-random.libmir.org/mir_random_ndvariable.html#MultinomialVariable
> [3] 
> https://github.com/libmir/mir-stat/blob/master/source/mir/stat/distribution/binomial.d