A Friendly Challenge for D
Jabari Zakiya
jzakiya at gmail.com
Wed Oct 10 22:05:23 UTC 2018
On Wednesday, 10 October 2018 at 20:43:01 UTC, Kagamin wrote:
> On Wednesday, 10 October 2018 at 16:15:56 UTC, Jabari Zakiya
> wrote:
>> https://gist.github.com/jzakiya/6c7e1868bd749a6b1add62e3e3b2341e
>
> As i understand, main thread preallocates global memory and
> tracks it, and other threads don't track it?
Here's an abreviated elementary explanation of the algorithm and
implementation.
You will need to read (download) my paper "The Segmented Sieve of
Zakiya (SSoZ)"
because I will make reference to things in it, to keep this as
short as possible.
https://www.scribd.com/doc/228155369/The-Segmented-Sieve-of-Zakiya-SSoZ
To really understand the math and theory of the algorithm
requires primarily
just understanding Table 3 on page 3 of the paper, as it
encapsulates everything.
You can read the paper to understand the language of the
algorithm used in the code.
Twinprimes are 2 (odd) primes that differ by only 2 eg. 3-5, 5-7,
11-13, 29-31.
Table 3 shows all the residues values (prime candidates|pcs) and
residues groups
(resgroups|columns) to find all the primes upto 541 using P5
(modpg = 30, rescnt = 8).
For a given value N, it will be represented with a PG table of
some number of
resgroups, with max size I call Kmax (the regroups value N
residues in).
Using P5, I only need to sieve primes along the residue tracks
(restracks) that can
produce twinprimes, here 11-13, 17-19, 29-31. Thus I create 3
byte arrays, one for each twinpair, and use the lower 2 bits to
represent the upper and lower twinprime restracks.
Then for each twinpair (here 3) I run 3 thread which perform the
SSoZ on the entire Kmax length of resgroups in parallel. At the
end I accumulate the results and print them out. This, in a
nutshell, is what the algorithm does. The paper gives you enough
to understand the fundamental nature of the algorithm, though
I've learned so much more than in 2014. :)
The larger the PG the more twinpair restracks (see page 14) there
are to use. For larger numbers you want to use the largest PG
possible that 'fits' into the hardware cpu. All my
development|testing was done on laptops using Intel I5|I7 cpus
with 4 or 8 threads. I'm really interested how it performs on
other platforms (ARM, AMD, PowerPC, etc).
The main proc "twinprimes_ssoz" manages program flow and set as
follows:
1) accepts the input number (an integer) in "val" from the cli
2) sets "num" to be first odd number < "val" if "val" even
3) calls "selectPG" with "num" to select optimum Prime Generator
(PG) parameters
4) compute various working parameters per selected PG and number
value (see refs)
5) compute global values for number of primes, and their values,
<= sqrt(num) for
selected PG with proc "sozpg"
6) Set initial twinprime count for selected PG
7) Then with proc "segsieve" allocate one thread to perform SSoZ
(segmented sieve of zakiya)
on each twinprime pair residues (determined by selected PG), and
count number of twinprmes
computed in each thread.
8) Then determine true total twinprime count and last twinprime
<= "num"
It also is timing different intervals and prints out diagnostics
and final output.
The proc "twins_sieve" is the primary function that manages and
performs the SSoZ for
a given twinprim pair parameters in each thread. The more threads
the faster the process goes.
I'll provide more info later. I have to run now. I wanted to get
this out now while I
was at my laptop, and online.
More information about the Digitalmars-d
mailing list